On (machine learning) Model Designs...

I was reading about the X-learner, an individual treatment effect (ITE) estimator by Künzel et al. (2019), and I have some quick thoughts (rants!) about a model our lab designed earlier, which I’ll call the “split half” (unpublished).

You see, creating a machine learning model or an estimator isn’t as simple as building something that you ‘think’ is going to work. It consists of several steps, each of which should be derived with statistical and logical rigor.

Take the split-half validation steps as an example. Intuitively, it makes sense that the average of the estimates from 20 causal forests is going to be more stable than a single estimate from a single causal forest. But you need to be able to prove it.
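To see why the intuition alone is not enough, here is a minimal simulation sketch. It does not use real causal forests: each "forest" is a hypothetical stand-in that returns the true effect plus a noise component shared by all forests (they see the same dataset) plus its own independent noise. The function name, the noise levels, and the additive noise model are all assumptions made for illustration.

```python
import random
import statistics

def simulate_estimates(n_trials, n_forests, shared_sd, own_sd, true_effect=1.0, seed=0):
    """Return (single_estimates, averaged_estimates) over n_trials repetitions.

    Hypothetical noise model (an assumption, not a real causal forest):
    estimate = true_effect + noise shared by all forests + forest-specific noise.
    """
    rng = random.Random(seed)
    singles, averages = [], []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, shared_sd)  # inherited from the common dataset
        estimates = [
            true_effect + shared + rng.gauss(0.0, own_sd)
            for _ in range(n_forests)
        ]
        singles.append(estimates[0])                 # one forest on its own
        averages.append(statistics.fmean(estimates))  # average of all forests
    return singles, averages

singles, averages = simulate_estimates(
    n_trials=5000, n_forests=20, shared_sd=0.5, own_sd=0.5
)
var_single = statistics.pvariance(singles)
var_average = statistics.pvariance(averages)

# Under this noise model: Var(average) = shared_sd**2 + own_sd**2 / n_forests
theory_average = 0.5 ** 2 + 0.5 ** 2 / 20

print(f"variance of a single estimate: {var_single:.3f}")
print(f"variance of the average:       {var_average:.3f}")
print(f"theoretical value for average: {theory_average:.3f}")
```

Under these assumptions the average is more stable, but only the forest-specific noise shrinks with the number of forests. The shared component does not average out, which is exactly the kind of detail a proof forces you to state.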

In the X-learner paper, where the expected mean squared error (EMSE) of the T-learner is compared to that of the X-learner, the authors used intuition, but still within the frame of a mathematical proof.

The T-learner has a minimax rate bounded by

\(\mathcal{O}(m^{-\alpha_\mu} + n^{-\alpha_\mu})\) for the entire dataset of size $N = m + n$, where $m$ is the number of control units and $n$ the number of treated units.

The X-learner, by contrast, has

\[\mathcal{O}(m^{-\alpha_\tau} + n^{-\alpha_\mu})\]

for $\hat{\tau}_0$ with size $m=N-n$

and

\(\mathcal{O}(m^{-\alpha_\mu} + n^{-\alpha_\tau})\) for $\hat{\tau}_1$ with size $n$.

By empirical evidence, $\alpha_\tau$ is often larger than $\alpha_\mu$: intuitively, it’s easier to estimate the treatment effect than the potential outcomes. A larger exponent means the corresponding term shrinks faster as the sample grows, so replacing an $\alpha_\mu$ with an $\alpha_\tau$ tightens the bound.

This makes sense both intuitively and mathematically! The only part that relied on intuition was the claim that the minimax rate of the X-learner is smaller than that of the T-learner.
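A quick numeric sketch makes the comparison concrete. The exponents ($\alpha_\mu = 0.25$, $\alpha_\tau = 0.5$) and the group sizes are hypothetical values picked for illustration, and the constants hidden by the $\mathcal{O}$ notation are ignored, so only the relative ordering is meaningful.

```python
def rate_bound(m, n, alpha_m, alpha_n):
    """Evaluate m**(-alpha_m) + n**(-alpha_n), ignoring constant factors."""
    return m ** (-alpha_m) + n ** (-alpha_n)

# Hypothetical unbalanced design: one group of size m, one of size n.
m, n = 10_000, 100

# Hypothetical exponents: the treatment effect is assumed easier to estimate.
alpha_mu, alpha_tau = 0.25, 0.5

t_learner = rate_bound(m, n, alpha_mu, alpha_mu)  # m^-a_mu  + n^-a_mu
x_tau0 = rate_bound(m, n, alpha_tau, alpha_mu)    # m^-a_tau + n^-a_mu
x_tau1 = rate_bound(m, n, alpha_mu, alpha_tau)    # m^-a_mu  + n^-a_tau

print(f"T-learner bound:      {t_learner:.3f}")
print(f"X-learner, tau_0 hat: {x_tau0:.3f}")
print(f"X-learner, tau_1 hat: {x_tau1:.3f}")
```

With these made-up numbers both X-learner bounds sit below the T-learner bound, and the gain is largest for the estimator whose $\alpha_\tau$ term is attached to the smaller group.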

If I were to justify this design merely by intuition, I would just say, ‘Oh, the X-learner is surely more robust than the T-learner because we used more data to make the predictions.’
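For reference, here is a minimal sketch of where the "more data" intuition comes from: the X-learner imputes treatment effects for each group using the outcome model fitted on the other group. This is a toy version under loud assumptions: a single covariate, ordinary least squares as the first-stage learner, a constant (mean) predictor as the second-stage learner, and the treated fraction as the weight $g$. All function names and the simulated data are hypothetical, not the paper's implementation.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_constant(xs, ys):
    """Deliberately simple second-stage learner: always predicts the mean of ys."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def t_learner(x0, y0, x1, y1):
    """T-learner: difference of two separately fitted outcome models."""
    mu0, mu1 = fit_line(x0, y0), fit_line(x1, y1)
    return lambda x: mu1(x) - mu0(x)

def x_learner(x0, y0, x1, y1):
    """X-learner sketch: cross-impute effects, fit them, then blend."""
    mu0, mu1 = fit_line(x0, y0), fit_line(x1, y1)
    # Imputed effects: each group borrows the other group's outcome model.
    d1 = [y - mu0(x) for x, y in zip(x1, y1)]  # treated units
    d0 = [mu1(x) - y for x, y in zip(x0, y0)]  # control units
    tau1 = fit_constant(x1, d1)
    tau0 = fit_constant(x0, d0)
    g = len(x1) / (len(x0) + len(x1))  # assumed weight: treated fraction
    return lambda x: g * tau0(x) + (1 - g) * tau1(x)

# Simulated data (hypothetical): true effect is 2.0 everywhere.
rng = random.Random(0)
x0 = [rng.random() for _ in range(500)]               # many control units
y0 = [3 * x + rng.gauss(0, 0.5) for x in x0]
x1 = [rng.random() for _ in range(20)]                # few treated units
y1 = [3 * x + 2.0 + rng.gauss(0, 0.5) for x in x1]

t_estimate = t_learner(x0, y0, x1, y1)(0.5)
x_estimate = x_learner(x0, y0, x1, y1)(0.5)
print(f"T-learner estimate at x=0.5: {t_estimate:.3f}")
print(f"X-learner estimate at x=0.5: {x_estimate:.3f}")
```

A single run like this says nothing about which learner is better. That comparison is what the rate argument is for.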

Thus, when it comes to model design, no matter how much sense you think your model makes, you should still prove and justify each step. Besides convincing your readers, this surfaces problems that would otherwise go unnoticed, such as scenarios where your model fails or basic statistical assumptions it violates.

This post is licensed under CC BY 4.0 by the author.