Discussion about this post

Raghu Ram:

This really highlights the 'data-first' trap. We've been conditioned to believe that if the dataset is large enough, the 'truth' will eventually emerge through sheer computation. But as you point out, if the causal structure is missing or misidentified, more data just leads to more precisely wrong answers. Starting with the DAG isn't just a 'step' in the process; it's the only way to ensure the math actually maps to reality. Do you think the current obsession with black-box ML is making it harder for new practitioners to adopt this 'backwards' (but correct) workflow?
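The "more precisely wrong" point can be made concrete with a small simulation. The snippet below is a minimal sketch, not from the post, assuming a hypothetical DAG Z → X, Z → Y, X → Y with a true effect of 1.0: a naive regression of Y on X that ignores the confounder Z converges to a biased estimate no matter how much data you collect.

```python
import numpy as np

rng = np.random.default_rng(0)

def naive_effect(n):
    # Assumed DAG (illustrative): Z -> X, Z -> Y, X -> Y, true effect = 1.0
    z = rng.normal(size=n)                       # confounder
    x = z + rng.normal(size=n)                   # treatment, driven partly by Z
    y = 1.0 * x + 2.0 * z + rng.normal(size=n)   # outcome
    # Naive OLS slope of Y on X, ignoring Z entirely
    return np.cov(x, y)[0, 1] / np.var(x)

# The naive estimate converges to 2.0 (true effect 1.0 plus
# confounding bias 2*Cov(Z,X)/Var(X) = 1.0). More samples only
# tighten the interval around the wrong number.
for n in (1_000, 1_000_000):
    print(f"n={n:>9}: naive estimate = {naive_effect(n):.3f}")
```

Adjusting for Z (i.e., reading the adjustment set off the DAG first) recovers the true effect; without the DAG, no sample size fixes this.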
