Summary: This blog is part II of a series showcasing management and analytics of the daily U.S. Covid-19 case/death data published by the Center for Systems Science and Engineering at Johns Hopkins University.
The source files are in wide format: a new column is added each day, holding the cumulative counts for each geography.
The data munging revolves around pivoting, or melting, the data into long R data.tables and computing daily counts as the differences between successive cumulative records.
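To make the melt-and-difference idea concrete, here is a minimal sketch on a toy table. The column names (state, county, cumcases, daily), the date-header format, and all values are made up for illustration and are not taken from the actual files or from the code later in the post.

```r
library(data.table)

# Toy wide table: one row per geography, one cumulative-count column per day.
# Names and values are hypothetical.
wide <- data.table(
  state     = c("A", "A", "B"),
  county    = c("a1", "a2", "b1"),
  `1/22/20` = c(0, 1, 2),
  `1/23/20` = c(3, 1, 5),
  `1/24/20` = c(4, 6, 5)
)

# Melt the date columns into rows: one record per geography per day.
long <- melt(
  wide,
  id.vars         = c("state", "county"),
  variable.name   = "date",
  value.name      = "cumcases",
  variable.factor = FALSE
)
long[, date := as.Date(date, format = "%m/%d/%y")]

# Sort within geography, then difference successive cumulative records.
# fill = 0 treats the first day's cumulative count as that day's count.
setkey(long, state, county, date)
long[, daily := cumcases - shift(cumcases, fill = 0), by = .(state, county)]

print(long)
```

Grouping by geography before differencing keeps the first day of one geography from being subtracted from the last day of another.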
I work around the data problems by generally avoiding counts for specific state geographies and days, and instead work at the state level with moving averages.
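A minimal sketch of that state-level smoothing follows, assuming a long table of daily counts by geography. The table, its column names, the 3-day window, and the negative count (standing in for one kind of irregularity a daily difference might show) are all assumptions for illustration; the window actually used in the post may differ.

```r
library(data.table)

# Toy daily counts by geography; names and values are hypothetical.
daily <- data.table(
  state  = rep(c("A", "A", "B"), each = 4),
  county = rep(c("a1", "a2", "b1"), each = 4),
  date   = rep(as.Date("2020-04-01") + 0:3, times = 3),
  cases  = c(3, 0, 9, 0,   3, 6, -3, 6,   4, 4, 4, 4)
)

# Roll the geographies up to one row per state per day.
state_daily <- daily[, .(cases = sum(cases)), by = .(state, date)]
setkey(state_daily, state, date)

# Trailing 3-day moving average within each state.
# The first two days of each state are NA because the window is incomplete.
state_daily[, ma3 := frollmean(cases, n = 3, align = "right"), by = state]

print(state_daily)
```

Summing to the state level first and averaging second means that a single odd geography-day has less influence on the series being plotted or compared.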
The R data.table, tidyverse, pryr, plyr, fst, and knitr packages are featured, as well as functions from my personal stash, detailed below.