COVID Round-up: Excess Mortality, COVID Death Undercounting, Spreadsheet Errors, and More  


18 October 2020, 14:48

Let’s take a look at U.S. COVID mortality, and some other COVID-related issues.

U.S. COVID mortality as of October 16, 2020

Using data from the CDC, the official (provisional) COVID death count is 204,613 as of 16 October 2020.

With total deaths of 2,226,473 recorded (from any cause, for the week ending 2/1/2020 to 10/10/2020), and using the CDC’s estimate that this is 112% of expected deaths, that gives us a total excess death count of about 238,550. So, the official COVID count is about 86% of the excess deaths measured in 2020.

Remember that 86% number, as I will be getting back to it.

For this period (the week ending 2/1/2020 up to 10/10/2020), official COVID deaths make up 9% of deaths.
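To make the arithmetic above easy to re-run, here is a minimal Python sketch that uses only the figures quoted in this section. The variable names are illustrative, and it assumes the CDC's "112% of expected" can be treated as exactly 1.12, so the results are approximate.

```python
# Figures quoted in the text (CDC provisional counts as of 16 October 2020).
covid_deaths = 204_613        # official (provisional) COVID death count
total_deaths = 2_226_473      # deaths from any cause, weeks ending 2/1/2020 to 10/10/2020
pct_of_expected = 1.12        # assumption: "112% of expected" taken as exactly 1.12

# Observed = 1.12 x expected, so expected = observed / 1.12.
expected_deaths = total_deaths / pct_of_expected
excess_deaths = total_deaths - expected_deaths

covid_share_of_excess = covid_deaths / excess_deaths
covid_share_of_all = covid_deaths / total_deaths

print(f"Expected deaths:          {expected_deaths:,.0f}")
print(f"Excess deaths:            {excess_deaths:,.0f}")
print(f"COVID as share of excess: {covid_share_of_excess:.1%}")
print(f"COVID as share of all:    {covid_share_of_all:.1%}")
```

Swapping in a later week's totals only requires changing the three inputs at the top.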

Comparing top causes of death: COVID vs. 2017

For comparison, in 2017 (the last year for which we have full, final data), the top causes of death had these shares of total deaths (Table C on page 9):

1. Heart disease: 23.0%
2. Cancer: 21.3%
3. Accidental causes: 6.0%
4. CLRD (chronic lower respiratory diseases): 5.7%
5. Stroke: 5.2%
6. Alzheimer’s: 4.3%
7. Diabetes: 3.0%
8. Flu and pneumonia: 2.0%
9. Kidney disease: 1.8%
10. Suicide: 1.7%

Drug overdoses show up in accidental causes or suicide. I think that, unless there is clear evidence of intent, an overdose death will generally be coded as accidental.

Here’s a figure, broken out by sex:

Some interesting differences:
  • Suicide is #8 for causes of death for men (2.6%), and not in the top 10 for females
  • Huge difference in proportion of deaths for some major causes on both lists:
    • Stroke is 6.2% of female deaths, and 4.3% of male
    • Alzheimer’s is 6.1% of female deaths, and 2.6% of male
    • Accidental causes (includes drug overdoses) are 7.6% of male deaths and 4.4% of female

In any case, getting back to COVID: it’s 9% of deaths recorded from the week ending 2/1/2020 through the week ending 10/10/2020 (i.e., 1/26/2020 to 10/10/2020).

Even if all the COVID deaths stop now (they won’t), it looks like COVID will be #3 on the cause of death list for the U.S. in 2020, after cancer and heart disease.
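As a rough check on that ranking, the sketch below drops the 9% COVID share into the 2017 list from earlier in this post and sorts. It is only an approximation: it mixes 2017 full-year shares with a partial-year, provisional 2020 share, and the label used for COVID is just an illustrative name.

```python
# Shares of total deaths in 2017, as listed earlier in the post (percent).
shares_2017 = {
    "Heart disease": 23.0,
    "Cancer": 21.3,
    "Accidental causes": 6.0,
    "CLRD": 5.7,
    "Stroke": 5.2,
    "Alzheimer's": 4.3,
    "Diabetes": 3.0,
    "Flu and pneumonia": 2.0,
    "Kidney disease": 1.8,
    "Suicide": 1.7,
}

# COVID's share of recorded 2020 deaths so far, per the text (provisional).
covid_label = "COVID-19 (2020, provisional)"
covid_share = 9.0

# Combine and sort from largest share to smallest.
combined = {**shares_2017, covid_label: covid_share}
ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
covid_rank = [name for name, _ in ranked].index(covid_label) + 1

for position, (name, share) in enumerate(ranked, start=1):
    print(f"{position:>2}. {name}: {share:.1f}%")
print(f"COVID rank: {covid_rank}")
```

With these inputs COVID lands third, behind heart disease and cancer and ahead of accidental causes.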

Yes, cancer and heart disease will still outstrip COVID deaths in number.

No, there aren’t a huge number of actual cancer & heart disease deaths being misattributed to COVID. If anything, I think it’s gone in the other direction… but mainly in the early, first wave of COVID.

On the undercounting of COVID deaths

Yes, undercounting. I’ve mentioned it before, but I’m going to keep bringing it up as long as people stupidly try to claim that COVID deaths are being over-counted by orders of magnitude.

First, I’ve started playing around with Tableau, though it looks like I can’t embed it on either STUMP or substack, so go here if you want to play with it. I basically duplicated the CDC’s dashboard and have started playing with the elements to make something more useable for me. I have barely made any changes thus far.

So here is a graph of U.S. deaths, broken out by with/without COVID:

I want you to notice where the light green portion peaks over the orange line. Now, these could be extra deaths due to lockdowns and other secondary aspects of COVID… but I think the early spike, which came when many foolishly believed the lockdowns would last only two weeks, is undercounting of COVID.

Now, that second wave (or rather, plateau)…. mmmm, I still think it’s COVID, mainly because lockdowns have stayed in place well after peaks in cases (duh) and deaths (less of a duh). If people are being driven to despair by lockdowns, that portion of mortality should be elevated even as COVID deaths come down. We’re not seeing that. Yet. Wait til winter (ugh).

You don’t have to take my word for it on undercounting. Here is Excess Deaths From COVID-19 and Other Causes, March-July 2020, a research letter in JAMA:

Previous studies of excess deaths (the gap between observed and expected deaths) during the coronavirus disease 2019 (COVID-19) pandemic found that publicly reported COVID-19 deaths underestimated the full death toll, which includes documented and undocumented deaths from the virus and non–COVID-19 deaths caused by disruptions from the pandemic.1,2 A previous analysis found that COVID-19 was cited in only 65% of excess deaths in the first weeks of the pandemic (March-April 2020); deaths from non–COVID-19 causes (eg, Alzheimer disease, diabetes, heart disease) increased sharply in 5 states with the most COVID-19 deaths.1 This study updates through August 1, 2020, the estimate of excess deaths and explores temporal relationships with state reopenings (lifting of coronavirus restrictions).
Between March 1 and August 1, 2020, 1,336,561 deaths occurred in the US, a 20% increase over expected deaths (1,111,031 [95% CI, 1,110,364 to 1,111,697]). The 10 states with the highest per capita rate of excess deaths were New York, New Jersey, Massachusetts, Louisiana, Arizona, Mississippi, Maryland, Delaware, Rhode Island, and Michigan.
Of the 225,530 excess deaths, 150,541 (67%) were attributed to COVID-19. Joinpoint analyses revealed an increase in deaths attributed to causes other than COVID-19, with 2 reaching statistical significance. US mortality rates for heart disease increased between weeks ending March 21 and April 11 (APC, 5.1 [95% CI, 0.2-10.2]), driven by the spring surge in COVID-19 cases. Mortality rates for Alzheimer disease/dementia increased twice, between weeks ending March 21 and April 11 (APC, 7.3 [95% CI, 2.9-11.8]) and between weeks ending June 6 and July 25 (APC, 1.5 [95% CI, 0.8-2.3]), the latter coinciding with the summer surge in sunbelt states.
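The headline figures in the letter can be cross-checked with simple arithmetic. The sketch below uses only the point estimates quoted above (it ignores the confidence interval), and the variable names are illustrative.

```python
# Point estimates quoted from the JAMA research letter (March 1 to August 1, 2020).
observed_deaths = 1_336_561
expected_deaths = 1_111_031
covid_attributed = 150_541

# Excess deaths are the gap between observed and expected deaths.
excess_deaths = observed_deaths - expected_deaths

pct_increase_over_expected = excess_deaths / expected_deaths
covid_share_of_excess = covid_attributed / excess_deaths

print(f"Excess deaths:            {excess_deaths:,}")
print(f"Increase over expected:   {pct_increase_over_expected:.1%}")
print(f"COVID as share of excess: {covid_share_of_excess:.1%}")
```

The results line up with the letter's 225,530 excess deaths, 20% increase, and 67% attribution.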

The argument is that the increases in both Alzheimer’s deaths and heart disease deaths really were COVID deaths, at least in the period of March to July 2020. I concur.

It has to do with the spike in such deaths. When you have such a rapid increase to a much higher level… and it’s in places where they definitely had a spike in COVID deaths simultaneously, then yeah, most of it is COVID. You can get some marginal increases in deaths for Alzheimer’s due to reduced visits and heart attacks because people don’t go to the hospital when they should… but that should be everywhere, not only places like New York City.

And the effect is strongest in the places where the spikes were fastest and largest.

No, it’s not a plot if the official cause of death changes months later

The above numbers are the current official stats, and, as I keep explaining to people, there is a reason it takes a few years before we have a clean data release on causes of death in the U.S. We would normally have the report on 2018 deaths in the summer of 2020, but it’s obviously not a CDC priority right now, so I have to use the report on 2017 deaths released in 2019.

Yes, some deaths from 2020 are going to get recategorized as having been COVID months after the fact, and some will get recategorized from COVID to something else, like heart disease.

The total deaths do not change much — every so often, there are unreported deaths (from murders, or death while living alone), but it’s tough to hide a dead body, Faulkner stories notwithstanding. Only a handful of deaths get added to the total years after the fact. (And it’s even rarer for deaths to be removed.)

But cause of death can change. If there’s only one underlying cause on the death certificate, it’s because the coroner (or other person officially filling out the death certificate) was busy. When there was a huge spike of deaths in New York City in April, yes, the folks were busy. There will be deaths that get recategorized.

Just take a gander at New York City deaths this year:

During that spike, about 30% of the excess mortality was recorded as deaths where the person did not have COVID, though, in all probability, they did.

As noted in my first section, official COVID deaths account for 86% of excess mortality, which is higher than the 67% from the study (and the 65% from the earlier analysis it cites). Part of the difference is the delay between doing a study and getting it published, even as a quick letter.

Part of it, though, is that as time has gone on, more testing has occurred, and COVID deaths are more likely to be actually flagged as COVID deaths. The official COVID count from those early months, especially March and April, will likely rise as records are reviewed and updated. There will still likely be undercounting of COVID deaths from those months, though, as many of the sufferers weren’t tested and were unable to convey the symptoms they suffered.

More on the spreadsheet screw-up in the UK

Wired UK: Meet the Excel warriors saving the world from spreadsheet disaster

Last week, the [UK] government stumbled into its own spreadsheet nightmare when it admitted contact-tracing efforts were stymied by a simple data processing mistake. They’re not the first to fall victim to the curse of Excel – and they won’t be the last either. Last year, Canadian marijuana grower Canopy Growth had to correct its quarterly earnings after incorrectly posting a £40 million loss — the real figure was £88m, miscalculated by a formula error. The company’s stock fell two per cent. Boeing leaked employees’ personal data in a hidden spreadsheet column. An investment bank analysis of Tesla’s purchase of Solar City undervalued the company by $400m after double counting its debt in a spreadsheet. These may be egregious errors, but they are hardly uncommon.

Research suggests more than 90 per cent of spreadsheets have errors, and half of spreadsheet models used in large businesses have “material defects”. Given some 750 million people use Excel globally, there are plenty of errors needing attention. One prominent researcher calls spreadsheets the dark matter of corporate IT. And that’s why people like Lyford-Smith have become defenders of the spreadsheet, mitigating the risks by fixing everyone else’s mistakes.

They’re an organised bunch, which is perhaps no surprise for spreadsheet specialists. The European Spreadsheet Risks Interest Group (EUSpRig) runs an annual conference to gather research (cancelled this year in favour of a webinar series), and collates best practice, training materials, and horror stories on its website — and there’s also a Yahoo Groups mailing list, where members offer tips and tricks, share links to resources and pick apart press coverage of the contact tracing debacle. In short, they can’t figure out what the real problem was because the reporting is so disjointed and unclear. One popular share was a YouTube clip of a satirical Spreadsheet News Network from Matt Parker of Stand-up Maths — as one member posted, it “made my day”.

Yes, it was funny (and the head of the group got quoted in the video), and I posted it last week. The video is here, and there are a lot of spreadsheet jokes in the chyrons, so watch out for those.

In the end, the problem isn’t spreadsheets, but people. Lyford-Smith says horror stories such as the contact-tracing chaos tend to spark a backlash against Excel. “But Excel is universally available and very accessible,” he says. “People are going to keep using it and usually the problems aren’t systems problems, they’re management or risk problems.” Until we get better at Excel, we’ll need people to protect the world from its own dependence on poorly designed spreadsheets.

It doesn’t even require getting better at Excel. Many of the worst Excel disasters were created by experts… who have no control mindset. We’ve been keeping a log for years, and you will see names such as Goldman Sachs in there.

One of my fellow EuSpRIG members, Patrick O’Beirne: UK Covid-19 Track & Trace Excel snafu: Uncontrolled spreadsheets lead to data loss

TL;DR: if they had applied basic data controls, they could have detected the problem as soon as it happened rather than lose precious days. Whenever data is being transformed from one system into another, there are many opportunities for data going missing, being duplicated, or corrupted. Whatever the skill level of the developer, nobody can know in advance what data might actually arrive in use and every possible failure mode. Any such system, whatever the technology used to implement it, should have at a minimum a simple check that the number of records output reconciles to the number input; and ideally other checks on data quality and hash totals.
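The minimum control described above (output record count reconciles to input, plus a hash total) can be sketched in a few lines of Python. This is an illustration under assumptions, not how the PHE pipeline worked: the record layout, the `id` field, and the function names are all hypothetical, and the hash total here is simply the sum of record IDs.

```python
from dataclasses import dataclass

# Row limit of a legacy .xls worksheet, per the quoted text (ignoring any header row).
XLS_MAX_ROWS = 65_536


@dataclass
class ControlTotals:
    record_count: int
    hash_total: int  # sum of record IDs; meaningless alone, useful for comparison


def control_totals(rows):
    """Compute the control totals for a batch of records."""
    return ControlTotals(
        record_count=len(rows),
        hash_total=sum(row["id"] for row in rows),
    )


def reconcile(source_rows, loaded_rows):
    """Return a list of discrepancies; an empty list means the load reconciles."""
    src = control_totals(source_rows)
    dst = control_totals(loaded_rows)
    problems = []
    if src.record_count != dst.record_count:
        problems.append(
            f"record count mismatch: {src.record_count:,} in, {dst.record_count:,} out"
        )
    if src.hash_total != dst.hash_total:
        problems.append(
            f"hash total mismatch: {src.hash_total:,} in, {dst.hash_total:,} out"
        )
    return problems


# Hypothetical batch of 70,000 test results, silently truncated at the row limit.
source = [{"id": i, "result": "positive"} for i in range(1, 70_001)]
loaded = source[:XLS_MAX_ROWS]

problems = reconcile(source, loaded)
for problem in problems:
    print(problem)
```

A check like this does not prevent the truncation, but it flags the loss on the day it happens rather than days later.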

If you open a text file in Excel 2003 with more than 65536 lines, or try to copy and paste from another application, it raises an error message “File not loaded completely”. If you import in any version of Excel into a worksheet in compatibility mode (ie 65536 rows) it would report “The text file contains more data than will fit on a single worksheet”. So why did this not happen? This was, in Sherlock Holmes terms, the dog that did not bark in the night.

I tried to imagine scenarios involving XLSWriter or Apache POI trying to create XLS files with more than 65K rows; I worked out an unlikely scenario involving copying to the clipboard from another application and in VBA using Worksheets(1).Paste which does not raise an error, it truncates the data. Other Excel experts suggested that the PHE developers simply suppressed Application.DisplayAlerts or On Error Resume Next, or even deliberately imported line by line with a cut off at 65535. They point to “Hanlon’s razor”: “Don’t ascribe to malice what can be explained by incompetence”, and to the collection of reports on and And Arthur Conan Doyle wrote “Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth”.

In most of these cases, what matters is BEING AWARE OF POTENTIAL ERRORS.
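One way to turn that awareness into practice is to make the failure loud instead of silent. The sketch below is a hypothetical pre-flight check in Python, not a reconstruction of what happened at PHE: the function names and the exception are made up for illustration, and the 65,536-row limit is the one quoted above.

```python
# Row limit of a legacy .xls worksheet, per the quoted text.
XLS_MAX_ROWS = 65_536


class RowLimitExceeded(Exception):
    """Raised when a batch would not fit on a single sheet."""


def fit_to_sheet(rows, max_rows=XLS_MAX_ROWS):
    """Return the rows unchanged if they fit; raise instead of truncating."""
    rows = list(rows)
    if len(rows) > max_rows:
        lost = len(rows) - max_rows
        raise RowLimitExceeded(
            f"{len(rows):,} rows will not fit in a {max_rows:,}-row sheet; "
            f"{lost:,} rows would be lost"
        )
    return rows


def split_into_sheets(rows, max_rows=XLS_MAX_ROWS):
    """Alternative: split the batch into sheet-sized chunks so nothing is dropped."""
    rows = list(rows)
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


# Hypothetical batch that is too large for one sheet.
batch = list(range(70_000))

try:
    fit_to_sheet(batch)
except RowLimitExceeded as err:
    print(f"Refusing to write: {err}")

sheets = split_into_sheets(batch)
print([len(sheet) for sheet in sheets])
```

Either approach works; the point is that the oversized batch cannot pass through unnoticed.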

More COVID Stories

Let me jump off that last item (in case you can’t read it):

Early in the coronavirus outbreak, hospital data from China revealed a startling disparity: Covid-19, the disease caused by the virus, was killing far more men than women.

That difference persisted in other Asian countries, such as South Korea, as well as in European countries, such as Italy. Then, it appeared in the United States.

By mid-October, the coronavirus had killed almost 17,000 more American men than women, according to data from the Centers for Disease Control and Prevention. For every 10 women claimed by the disease in the United States, 12 men have died, found an analysis by Global Health 50/50, a U.K.-based initiative to advance gender equality in health care.

In developed countries, of course, far more men die each year than women. The “natural” mortality for men is higher than for women at every age. (You can see some stats in this 2017 post.)

However, the sex gap in mortality for coronaviruses is worse than regular mortality — this happened with the original SARS in 2002, and with MERS later.

Women generally have stronger immune systems, thanks to sex hormones, as well as chromosomes packed with immune-related genes. About 60 genes on the X chromosome are involved in immune function, Johns Hopkins University microbiologist Sabra Klein told The Washington Post in April. People with two X chromosomes can benefit from the double helping of some of those genes.
The power of the immune system wanes as people age, regardless of sex. But what is a gentle decline for women is an abrupt dive off a cliff for men: Iwasaki’s work indicates the T-cell response of men in their 30s and 40s is equivalent to that of a woman in her 90s.

Wow, I hadn’t heard that one before, though I knew immune response differed.

Perhaps, when men get the flu or colds, they really do get it worse than we women. Huh.
