12 votes

Is there any way to estimate how many people in a region currently have coronavirus?

I've always wanted to be able to run probabilities when considering doing something that could infect me with coronavirus. I know how many people test positive, how many are dying and how many are getting tested. But what I really want to know is what are the odds that a mask-less interaction with one person will infect me.

9 comments

  1. [4]
    dredmorbius
    (edited )
    Link
    The approach I've used, and see used by those who are not Space Alien Cats (i.e., Internet randoms), is to look at reported deaths and work backward.

    • Mortality is based on IFR (infection fatality rate) and case duration.
    • IFR seems to be 0.5% -- 1%. This is not the case fatality rate (CFR, closer to 3% worldwide), which is based only on reported cases, but the overall mortality for all cases, detected or not. Mortality is strongly influenced by age. Regions with markedly younger populations (say, some parts of HIV/AIDS-ravaged Africa) might have surprisingly low COVID-19 mortality, as there simply aren't enough people alive aged 45+ who would be susceptible to dying of coronavirus. See COVID-19 pandemic death rates by country, where true IFR is likely near the lower end of reported CFR for countries with reliable reporting. This clusters near 0.5%.
    • Mortality seems to occur 2--3 weeks after diagnosis ("12 days from onset of symptoms"). As testing improves, more cases are caught earlier, and the apparent time from diagnosis to death increases.
    • Reported COVID deaths are themselves only a subset of all actual mortality. The New York Times has been looking at this via excess mortality (all deaths in 2020 vs. prior years), which seems to be running at about 130% of reported COVID-19 deaths. As of 2 Dec. 2020, reported deaths were just under 280,000, whilst the NYT reported 345,000 total excess mortality, or 123% of reported.
    • Deaths can take anywhere from a few days to a week or two to be reported. The NYT discusses this on its reporting page. The US right now is still seeing a bobble in reported data and lags from the Thanksgiving holiday (though that should be mostly over). Sweden's data, which I've been watching closely for the past 8 weeks, takes about 10 days to fully settle. Even older data get revised and moved around.
    • Infection trends tend to follow the current infection level, modulo tightening or loosening of restrictions, which take about two weeks to show up in new-infections data. I've seen several cases of sharp reversals, notably India in September and France peaking 7 November. So a sharp rise can turn around suddenly. Rises tend to take longer to become apparent, though because they usually follow exponential growth, a small initial trend can escalate rapidly. "Natural" (uncontrolled) doubling time is about 3 days, or roughly a 5x weekly growth rate (if you see this, get very worried).
    • Different areas report and manage data differently. Russia's infections data show no weekly variation (e.g., a weekend "valley", like most regions) ... but its deaths data do. India's case and death counts have moved in near synchronicity without lags for deaths. Sweden seems to dump or delay data days after actual incidents. North Korea has reported no cases at all (not credibly), and Suriname's "active cases" number fell yesterday ... to negative six (as reported by Worldometers). Places with small populations have thin data, making trending harder. Turkey had drastically underreported COVID-19 cases until late November; reported cases have since risen from less than 500k and will cross the 1 million mark in another day or two.
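
For anyone converting between doubling time and weekly growth, here is a minimal sketch. It assumes clean exponential growth, and the function names are made up for illustration. Under that assumption, a 3-day doubling time works out to about 5x per week, while 10x per week would need a doubling time near 2.1 days.

```python
import math


def weekly_growth_factor(doubling_time_days: float) -> float:
    """Growth factor over 7 days for a given doubling time (in days)."""
    return 2 ** (7 / doubling_time_days)


def doubling_time_for_weekly_factor(weekly_factor: float) -> float:
    """Doubling time (in days) that produces the given 7-day growth factor."""
    return 7 * math.log(2) / math.log(weekly_factor)


# 3-day doubling time -> weekly growth factor
print(round(weekly_growth_factor(3), 2))
# 10x weekly growth -> implied doubling time in days
print(round(doubling_time_for_weekly_factor(10), 2))
```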

    TL;DR: it's complicated.

    But you might:

    • Take the current 7-day reported deaths rate.
    • Multiply by 1.2.
    • Divide by 0.5% (high estimate) and 1% (low estimate)

    ... and the resulting numbers are roughly the range of people infected 2--3 weeks ago, if your region has decent monitoring and reporting practices. Testing reports should give an indication of how that number has trended since.
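
The three steps above can be sketched as follows. This is only a back-of-envelope version: the function name is made up, the 1.2 multiplier and the 0.5%--1% IFR range are the figures from this comment, and the 1,000 weekly deaths in the example is a purely illustrative input, not real data.

```python
def infected_range(weekly_deaths, excess_factor=1.2, ifr_low=0.005, ifr_high=0.01):
    """Rough (low, high) estimate of infections 2-3 weeks ago.

    weekly_deaths: reported deaths over the last 7 days.
    excess_factor: multiplier for deaths missing from official reports.
    ifr_low / ifr_high: assumed infection fatality rate range.
    """
    adjusted_deaths = weekly_deaths * excess_factor
    low = adjusted_deaths / ifr_high   # higher IFR -> fewer implied infections
    high = adjusted_deaths / ifr_low   # lower IFR -> more implied infections
    return low, high


# Illustrative input only: 1,000 reported deaths in the past week.
low, high = infected_range(1000)
print(f"{low:,.0f} to {high:,.0f} people infected 2-3 weeks ago")
```

The output covers only infections that began in that one week; it says nothing about how many of those people are still infectious today.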

    Note that total cases does not equal active cases (the second is smaller), and that infectious cases (the ones you actually care about) are a subset of active cases. A course of COVID-19 typically lasts 4--8 weeks by my understanding.

    By rough numbers, total infecteds are 6--8x the number of reported cases in the US and Europe. Wide-scale antibody testing would have to be performed to confirm this.

    10 votes
    1. [3]
      vord
      Link Parent
      Wide-scale antibody testing would have to be performed to confirm this.

      Explains why there hasn't been any push to do so in the USA. "Don't want to cause a panic" is a weird way I've heard of phrasing "Don't want people to know just how bad we fucked up."

      5 votes
      1. [2]
        dredmorbius
        Link Parent
        Not just in the US. Turkey's true case count is likely 2.1 -- 5 million, rather than the 925,000 currently reported.

        5 votes
  2. [5]
    enso
    Link
    Here is a link that allows you to look at the probability that someone at an event of size N people will have covid at a county level. It allows you to change the ratio of tested:total cases between 5 and 10.
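
The site's exact method isn't described here, so what follows is only a naive sketch of that kind of calculation. It assumes attendees are independent draws from the county population and that true active cases are a fixed multiple of reported ones. The function name, the parameter names, and the example county figures are all hypothetical.

```python
def event_risk(reported_active_cases, population, event_size, ascertainment_ratio=5.0):
    """Chance that at least one of `event_size` attendees is infected.

    ascertainment_ratio: assumed number of true cases per reported case
    (the comment above mentions a range of 5 to 10).
    """
    # Estimated share of the population currently infected, capped at 100%.
    prevalence = min(1.0, reported_active_cases * ascertainment_ratio / population)
    # P(at least one infected) = 1 - P(nobody infected).
    return 1 - (1 - prevalence) ** event_size


# Hypothetical county: 100,000 residents, 200 reported active cases.
for ratio in (5, 10):
    print(ratio, f"{event_risk(200, 100_000, 10, ratio):.1%}")
```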

    8 votes
    1. [4]
      teaearlgraycold
      Link Parent
      Thanks. It won't tell me the probability of meeting with 1 person, but I figure I can invert the probability with this equation:

      p_individual = 1 - (1 - p_multi) ^ (1 / n_multiple)
      

      Which means that for a region where a 10 person gathering has a probability of 10% of an attendee infecting you, each person has a 1.05% chance of being infected.
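
As a quick check of that arithmetic, a minimal sketch that reuses the variable names from the equation. It assumes every attendee has the same, independent chance of being infected.

```python
def p_individual(p_multi: float, n_multiple: int) -> float:
    """Invert 'at least one of n is infected' to a per-person probability."""
    if not 0 <= p_multi < 1:
        raise ValueError("p_multi must be in [0, 1)")
    if n_multiple < 1:
        raise ValueError("n_multiple must be at least 1")
    return 1 - (1 - p_multi) ** (1 / n_multiple)


p = p_individual(0.10, 10)
print(f"{p:.2%}")  # 1.05%

# Round trip: ten such people give back the original 10% group risk.
print(f"{1 - (1 - p) ** 10:.2%}")  # 10.00%
```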

      5 votes
      1. [3]
        archevel
        Link Parent
        Wouldn't you have to consider a few additional factors other than just the number of people and the current infection rate when estimating the risk, e.g.:

        • probability of a covid infected person attending the meeting
        • probability of infection from another person at meeting location (might also vary based on number of people)
        1 vote
        1. [2]
          teaearlgraycold
          Link Parent
          probability of a covid infected person attending the meeting

          That's what this calculation is

          probability of infection from another person at meeting location (might also vary based on number of people)

          The site says this:

          The risk level is the estimated chance (0-100%) that at least 1 COVID-19 positive individual will be present at an event in a county, given the size of the event.

          So their estimate is naive enough for my math to work.

          2 votes
          1. archevel
            Link Parent
            I thought the calculation was for:

            what are the odds that a mask-less interaction with one person will infect me.

            Given you attend an event of a certain size.

            probability of a covid infected person attending the meeting

            That's what this calculation is

            Is the likelihood of a person attending the meeting the same in case they are infected or not?

            With regard to the meeting location, the likelihood of infection is probably higher in an enclosed space than somewhere outdoors. Whether the space is crowded or not probably also impacts the risk... But then again, finding out those variables might be impossible.