Thread:TheHumanAmbassador/@comment-26907577-20190924151043/@comment-32182236-20190926181355

Predictive power is the only method I'm aware of that lets us spot overfitted theories and eliminate them, while retaining the more accurate theories that happened to make successful predictions.

And yes, induction by itself isn't making testable predictions. Making testable predictions is science.

If evidence contradicts a hypothesis, then yes, it needs revision (or in some cases, we abandon it altogether, as we did with geocentrism).

When the hypothesis holds up to new evidence, therefore, it becomes more probable that we were onto something in our theory, since we created it with no way of knowing that said event would happen.

I suppose that means if one created a theory without looking at all the data, and then compared it against the remaining data points, one could essentially recreate predictive power.
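That hold-out idea can be sketched in a few lines of Python. (The numbers and the "average of the training days" model are invented for illustration; they're not from anything above.)

```python
# Build the model on part of the data, then score it on points it never saw.
data = [10, 12, 11, 13, 12, 11, 13, 12]
train, held_out = data[:6], data[6:]

# Toy "theory": tomorrow looks like the average of the training days.
prediction = sum(train) / len(train)

# The held-out points act like predictions the theory had to make blind.
errors = [abs(prediction - actual) for actual in held_out]
print(prediction, errors)
```

If the errors on the held-out points are large, the "theory" was fitted to noise rather than to a real pattern.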

And I'd say the p-value depends quite a bit on how many assumptions you're making. I'd use Bayesian inference, in fact. (Then again, I don't think I can do very well at setting a good prior probability.)
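For what it's worth, the Bayesian update itself is simple; the hard part really is the prior. Here's a minimal sketch with made-up probabilities (none of these numbers come from anything above):

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
prior = 0.5              # assumed prior: 50% chance the hypothesis is true
p_e_given_h = 0.9        # assumed: the evidence is likely if H is true
p_e_given_not_h = 0.3    # assumed: the evidence is less likely if H is false

# Total probability of seeing the evidence at all.
p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)

posterior = p_e_given_h * prior / p_e
print(posterior)  # 0.75: the evidence raised our confidence from 50% to 75%
```

Notice that a different prior would give a different posterior, which is exactly the worry about setting a good prior.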

Let's suppose that Grillby's, as a restaurant, earns a certain amount of gold each day. Here's my set of data. (Just know that I'm making these numbers up; I haven't even tried to find out how much gold Grillby's makes in a day.)

These numbers are sorted by date; each value comes one day after the previous.

102, 119, 88, 110, 102.

Now, these appear to be somewhat random. But suppose I claimed that they are not random at all! Suppose instead that I tell you there is a complex function that "predicts" these very values (one that I got after looking at all 5 data points).

Don't believe me? Then watch and learn!

We'll say that t=0 for the first day, 102. M is the amount of money Grillby's made during that day. So our predictive function should pass through (0, 102), (1, 119), (2, 88), (3, 110), and (4, 102).

So, here's the function!

M = -(23/3)t^4 + (377/6)t^3 - (953/6)t^2 + (362/3)t + 102

It passes through every single point. So, can we say that this is the real function?

...Have you thought of your answer? Think of it BEFORE proceeding.

...Got your answer? Okay, the answer is NO.

Let's generate the next value.

119.

Our model predicted that Grillby's would make -203 gold the next day. That is, they'd lose 203 gold.

That's WAY off.

And it's an example of overfitting.

(I just generated a random number between 80 and 120 the whole time.)
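If you want to check my arithmetic, here's a quick Python sketch that rebuilds the same degree-4 polynomial via Lagrange interpolation (the data is the made-up gold-per-day series above):

```python
def lagrange_eval(points, t):
    """Evaluate the unique lowest-degree polynomial through `points` at t."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (t - xj) / (xi - xj)
        total += term
    return total

data = [(0, 102), (1, 119), (2, 88), (3, 110), (4, 102)]

# The polynomial reproduces every observed point exactly...
for t, m in data:
    assert abs(lagrange_eval(data, t) - m) < 1e-9

# ...but its "prediction" for day 5 is absurd:
print(lagrange_eval(data, 5))  # about -203: Grillby's loses 203 gold?!
```

Five points pin down a degree-4 polynomial exactly, so a perfect fit here tells you nothing; only the day-5 prediction exposes the overfit.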

-

The null hypothesis idea is fine, but... in most cases, a hypothesis isn't a null hypothesis.

I suppose we can make the null hypothesis be that the law we theorize... doesn't exist? But even then, there could be a better hypothesis that fits the data.