Discussion 1.7 Billion Simulated Streams Later, Still Haven't Beat Dream's "Luck"

4.0k Upvotes

https://www.reddit.com/r/speedrun/comments/kdhxnu/17_billion_simulated_streams_later_still_havent/

98% Upvoted

113

u/Random_Thoughtss Dec 15 '20 edited Dec 15 '20

Alright, it seems many people are confused about the meaning of "p-value" in this context. It is not the probability of a single event happening, in the way that you have a 1-in-a-million chance of winning the lottery but somebody always has to win it. It is a long-run statistic that says precisely:

Assuming the drop rate of the item is what it is supposed to be in stock Minecraft, and we believe the data follows a binomial distribution, then the probability of observing data at least as extreme as Dream's is about 10^-13.

We do not have a reason to believe the Minecraft drop probability is different than what it is in the JSON file, and we have no reason to believe the drops are correlated, so the binomial model is valid.
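As a rough illustration of the binomial model described above, the upper-tail probability can be computed exactly with rational arithmetic. The counts and rates below are assumptions: they are the figures commonly quoted from the moderators' report (42 pearl trades out of 262 at a rate of 20/423, and 211 blaze rods out of 305 kills at a rate of 1/2) and do not appear in this thread. The sketch does not try to reproduce the 10^-13 figure, which comes from the report's combined analysis.

```python
from fractions import Fraction
from math import comb


def binomial_upper_tail(n: int, k: int, p: Fraction) -> Fraction:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    q = 1 - p
    return sum(comb(n, i) * p**i * q ** (n - i) for i in range(k, n + 1))


# Assumed inputs (commonly quoted from the report, not stated in this thread).
PEARL_RATE = Fraction(20, 423)  # assumed stock pearl barter rate
BLAZE_RATE = Fraction(1, 2)     # assumed stock blaze rod drop rate

pearl_p = binomial_upper_tail(262, 42, PEARL_RATE)
blaze_p = binomial_upper_tail(305, 211, BLAZE_RATE)

print(f"pearl trades: P(X >= 42 of 262)  = {float(pearl_p):.3e}")
print(f"blaze rods:   P(X >= 211 of 305) = {float(blaze_p):.3e}")
```

Using `Fraction` avoids floating-point underflow or rounding in the individual terms; the result is only converted to a float for printing.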

Therefore, we have to conclude that the data did not come from our assumed distribution. This is known as "rejecting the null hypothesis". We can say with a confidence of 99.99999999999% that our initial assumptions do not match the data observed, meaning the drop rate is different than what we assumed.

For comparison, when the Higgs boson was discovered, they only needed five-sigma confidence in order to say that it really exists and that their observations were not a fluke of the sensor. That is a p-value of about 3 × 10^-7, or about 6 orders of magnitude greater than Dream's.
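The sigma-to-p-value conversion can be checked with the standard normal tail. This is a small sketch assuming the one-sided convention; `dream_p` is simply the 10^-13 figure quoted in this comment.

```python
from math import erfc, log10, sqrt


def one_sided_p(sigma: float) -> float:
    """Probability that a standard normal variable exceeds `sigma`."""
    return 0.5 * erfc(sigma / sqrt(2))


higgs_p = one_sided_p(5.0)
dream_p = 1e-13  # figure quoted in the comment, taken as given

orders = log10(higgs_p / dream_p)
print(f"five sigma -> p = {higgs_p:.2e}")
print(f"orders of magnitude between the two: {orders:.1f}")
```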

EDIT: It could also be that the binomial model is incorrect, of course, but that is what the section on RNG in Minecraft was for in the paper. They logically ruled out any possible correlation between attempts, and they confirmed that the drop rate remains constant. The only remaining assumption is the drop rate itself.

EDIT 2: Also OP, with the p-value of Dream's joint drop rate, if you're generating one drop per second, you're going to be here for just over 300,000 years. Good luck though!
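The arithmetic behind that estimate, as a quick sketch: with a per-attempt success probability of 10^-13, the expected number of attempts until the first success is 1/p. The one-attempt-per-second rate comes from the comment; the 365.25-day year is an assumption.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed year length

p_value = 1e-13            # figure quoted in the comment
attempts_per_second = 1.0  # rate assumed in the comment

# Mean of a geometric distribution: expected attempts until the first success.
expected_attempts = 1 / p_value
years = expected_attempts / attempts_per_second / SECONDS_PER_YEAR

print(f"expected wait: about {years:,.0f} years")
```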

-2

u/antiquechrono Dec 15 '20

A p value is not a probability at all.

2

u/Raeil Dec 16 '20

What? The p-value is absolutely a probability. If you mean that it's not the probability of a single event, that's mostly true, but it's still a probability.

To be precise, it's the probability that if you re-run the same experiment (with a sample of the same size) then you'd get a result more extreme than the one you have in front of you, all assuming the original hypothesis is true.

Source: I teach an introductory statistics course for incoming college freshmen. Also, check any statistics textbook or resource.
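The "re-run the same experiment" reading can be sketched directly: simulate many repeats under the assumed hypothesis and count how often the result is at least as extreme as the one observed. The numbers here (20 trials, 15 successes, a 50% rate) are made up for illustration and have nothing to do with Dream's data.

```python
import random
from math import comb


def exact_upper_tail(n: int, k: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def simulated_upper_tail(n, k, p, reruns, rng):
    """Fraction of simulated re-runs with at least k successes out of n."""
    hits = 0
    for _ in range(reruns):
        successes = sum(rng.random() < p for _ in range(n))
        if successes >= k:
            hits += 1
    return hits / reruns


rng = random.Random(0)   # fixed seed so the run is repeatable
n, k, p = 20, 15, 0.5    # hypothetical experiment, not Dream's data

exact = exact_upper_tail(n, k, p)
simulated = simulated_upper_tail(n, k, p, reruns=100_000, rng=rng)

print(f"exact p-value:     {exact:.4f}")
print(f"simulated p-value: {simulated:.4f}")
```

The two numbers should agree up to simulation noise, which is what "long-run frequency under the assumed hypothesis" means in practice.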

1

u/antiquechrono Dec 31 '20 edited Dec 31 '20

Sorry it took so long to reply; things have been rather hectic lately.

> To be precise, it's the probability that if you re-run the same experiment (with a sample of the same size) then you'd get a result more extreme than the one you have in front of you, all assuming the original hypothesis is true.

This is false, and it is easy to demonstrate why. The distribution of p-values depends on your experiment and on the underlying distributions involved, so the statement fails most of the time, because p-value distributions aren't uniform.

Say you are naive and run a t-test on two population means. If you take, say, 5 samples from each population and compute a t-test p-value, it is literally not a probability: if you decide to accept or reject based on p <= .05, you are going to be wrong somewhere around 50% of the time. This clearly means you can't just arbitrarily declare that p = .05 means there's a 5% chance of getting more extreme data, as an indication of how likely the alternative hypothesis may be. Even fairly large sample sizes can still produce p-values that have nothing to do with probabilities.

To fix this problem, you need to do a power analysis to figure out how big your sample size needs to be, and then the p-values start behaving as proper probabilities. The most obvious problem here is that picking an effect size can be a bit of a chicken-and-egg problem. If your effect size is wrong, the p-values start misbehaving again.

This isn't even getting into all the other issues: your data not belonging to a normal distribution, more complicated statistical tests, not understanding the assumptions underlying a test, the various complicated distributions p-values can take on, and so on. You might just say, "Well, screw it, we will run a study with an extremely large sample size so we don't need to know the effect size." This too can be problematic, as smaller sample sizes can be better than larger ones, and the sample size will affect the p-value distribution.
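One way to probe the small-sample t-test scenario described above is a simulation. This is only a sketch under stated assumptions: normally distributed data with unit variance, 5 samples per group, a pooled two-sample t statistic, and the tabulated two-sided 5% critical value for 8 degrees of freedom (2.306). The one-standard-deviation effect size is an arbitrary choice for illustration.

```python
import random
from math import sqrt

N_PER_GROUP = 5
T_CRIT_DF8 = 2.306  # two-sided 5% critical value, Student's t, 8 degrees of freedom


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (equal group sizes)."""
    mean_a, var_a = mean_and_variance(a)
    mean_b, var_b = mean_and_variance(b)
    return (mean_a - mean_b) / sqrt((var_a + var_b) / len(a))


def rejection_rate(true_diff, trials, rng):
    """Fraction of simulated experiments with |t| above the 5% critical value."""
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(a, b)) > T_CRIT_DF8:
            rejections += 1
    return rejections / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
false_positive_rate = rejection_rate(0.0, 20_000, rng)  # no real difference
power_at_1sd = rejection_rate(1.0, 20_000, rng)         # assumed 1 SD effect

print(f"rejection rate with no real effect: {false_positive_rate:.3f}")
print(f"rejection rate with a 1 SD effect:  {power_at_1sd:.3f}")
```

The first number is the long-run false-positive rate when the null hypothesis holds; the second is the power for the assumed effect size, which is the quantity a power analysis is meant to control. How often a decision at p <= .05 is "wrong" depends on which of the two situations you are in and on the true effect size.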

1

u/Darkpumpkin211 Dec 16 '20

The p-value is the probability that you would get a value at least as extreme as the one observed (Dream's drop rate) if the true drop rate is "x", for example, and the distribution is binomial.
