How hard is it to get counting right?
Posted August 1, 2021 by skybrian
Tags: statistics, philosophy, lyme disease, ticks, substack.desystemize, author.collin lysford
https://desystemize.substack.com/p/desystemize-1

Link information (scraped automatically and may be incorrect)
Title: Desystemize #1
Authors: collin
Word count: 2131 words

skybrian (OP), August 1, 2021

This is the first article of what looks like an interesting substack about the philosophy of science. (It's clearly influenced by David Chapman's work.)

Counting is a system, one a lot more profound and powerful than we give it credit for. After all, if I pick up an animal, count the ticks on it (the “tick burden”), and then set it loose again, I’ve created something from scratch. We now have a number, where before there was only a messy and detailed world. When we set that animal loose, we can never travel back in time to the world that was then and have another look, but the number will survive as far in the future as we care to take it. We can study a tract of past we can’t revisit because we’ve made a number that serves as a mirror to it. Systems are great at creating things!

[...]

Dr. Ostfeld didn’t start that paragraph by noting such-and-such statistical technique clearly indicated something was off with the tick counts. He started with the sentence “My research group has set and checked many hundreds of thousands of live animal traps over the years.” In other words, it was familiarity with the data-generating process that enabled the lab group to imagine this potential vulnerability and come up with this experiment. By the time the data gets into the hands of analysts, it’s too late to fix.
You can’t math your way out of a wrong number. This mistake was caught only because it was the same people generating the data as analyzing it. Which, great for ecology - but as data science becomes more and more specialized, it will be increasingly done by people who are explicitly and solely data scientists. And they’ll inherit datasets from repositories somewhere and never catch a single one of these systemic errors because they couldn’t sift through the wet mouse turds even if they wanted to.