  • Stars: 255
  • Rank: 154,651 (Top 4 %)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created about 4 years ago
  • Updated 6 months ago

Repository Details

Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa

Coronavirus COVID-19 (2019-nCoV) Data Repository for South Africa

Coronavirus COVID-19 (2019-nCoV) Data Repository for South Africa, created, maintained and hosted by the Data Science for Social Impact research group, led by Dr. Vukosi Marivate, at the University of Pretoria.

Disclaimer: We have worked to keep the data as accurate as possible. We collate the COVID-19 reporting data from the NICD and DoH, and we only update that data once there is an official report or statement. For the other data, we also work to keep it as accurate as possible. If you find errors, please make a pull request.

If you use this repo for any research/development/innovation, please contact us (see contacts below).

See our blog posts:

If you are interested in the Africa-wide effort: Go to https://github.com/dsfsi/covid19africa

For information on daily updates on the repo, go to https://twitter.com/vukosi/status/1239184086633242630?s=20

Licenses

Code License: MIT | Data License: CC BY-SA 4.0

Data Available [/data]

Please note that these are the daily reports as released by the National Department of Health or the NICD. The new cases reported are based on newly released positive test results, and there may be a significant lag between the test date and the report date. For example, in epidemiological Week 1 of 2021 (3-9 Jan), approximately 33k new cases were reported in the daily announcements. However, the NICD Testing Summary Report for Week 3 of 2021 (which also covers the two previous weeks) shows 43,635 positive tests for Week 1 of 2021. The difference is due to reporting lag: some of the 33k cases in the daily announcements were from tests done in prior weeks, while many people tested between 3 and 9 January had their cases reported only from the 10th onwards. Analyses should take this lag into account.
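
One way to explore this reporting lag is to sum the daily announcements per epidemiological week and compare the total with the count by test date in the NICD Testing Summary Report. The following is a minimal sketch, assuming weeks run Sunday to Saturday (consistent with Week 1 of 2021 being 3-9 Jan). The function names and the daily figures are made up for illustration and are not taken from this repository.

```python
from collections import defaultdict
from datetime import date, timedelta


def epi_week_start(day: date) -> date:
    """Return the Sunday that starts the week containing `day`.

    Assumption: epidemiological weeks run Sunday to Saturday.
    """
    # date.weekday() is 0 for Monday ... 6 for Sunday.
    days_since_sunday = (day.weekday() + 1) % 7
    return day - timedelta(days=days_since_sunday)


def weekly_totals(daily_new_cases: dict) -> dict:
    """Sum daily new-case counts per week, keyed by the week's start date."""
    totals = defaultdict(int)
    for day, count in daily_new_cases.items():
        totals[epi_week_start(day)] += count
    return dict(totals)


# Illustrative, made-up daily figures (NOT real data from the repository).
example_daily = {
    date(2021, 1, 3): 100,
    date(2021, 1, 9): 250,
    date(2021, 1, 10): 75,
}
example_weekly = weekly_totals(example_daily)
```

A weekly total by report date computed this way can then be set against the NICD count by test date. For Week 1 of 2021, the figures quoted above give a gap of roughly 43,635 - 33,000, or about 10,600 cases.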

Active

dataset | url | raw_url[file]
provincial_cumulative_timeline_confirmed | provincial_cumulative_timeline_confirmed | provincial_cumulative_timeline_confirmed.csv
provincial_cumulative_timeline_recoveries | provincial_cumulative_timeline_recoveries | provincial_cumulative_timeline_recoveries.csv
provincial_cumulative_timeline_testing | provincial_cumulative_timeline_testing | provincial_cumulative_timeline_testing.csv
provincial_cumulative_timeline_deaths | provincial_cumulative_timeline_deaths | provincial_cumulative_timeline_deaths.csv
vaccination | covid19za_timeline_vaccination | covid19za_timeline_vaccination.csv
death_statistics | covid19za_timeline_death_statistics | covid19za_timeline_death_statistics.csv
transmission_type | covid19za_timeline_transmission_type | covid19za_timeline_transmission_type.csv
testing | covid19za_timeline_testing | covid19za_timeline_testing.csv
district_data | district_data | -
DoH PDFs and Extracted CSVs | doh_pdf | -
DoH Whatsapp case update archive | doh_whatsapp | -
health facility data [public and private] | health_system_za_hospitals_v1 | health_system_za_hospitals_v1.csv
nicd_daily_national_report | nicd_daily_national_report | nicd_daily_national_report.csv
nicd_hospital_surveillance_data | nicd_hospital_surveillance_data | nicd_hospital_surveillance_data.csv
samrc_excess_deaths_province | samrc_excess_deaths_province | samrc_excess_deaths_province.csv
Apple, Google, Facebook Mobility Data | mobility | -
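
As a rough sketch of how one of the files listed above might be used: the provincial timeline files are named as cumulative series, so daily new counts would come from differencing consecutive values. The raw-URL pattern (including the `master` branch and the `data/` prefix), the helper names, and the sample numbers below are assumptions for illustration. Check the repository for the actual paths and column names before relying on them.

```python
# Assumed raw-file host pattern for this repository (not confirmed by the README).
RAW_BASE = "https://raw.githubusercontent.com/dsfsi/covid19za"


def raw_data_url(filename: str, branch: str = "master") -> str:
    """Build a raw-file URL for a CSV under data/ (branch and path are assumptions)."""
    return f"{RAW_BASE}/{branch}/data/{filename}"


def cumulative_to_daily(cumulative: list) -> list:
    """Convert a cumulative series into per-step increments.

    The first entry is kept as-is, since there is no earlier value to subtract.
    """
    daily = []
    previous = 0
    for value in cumulative:
        daily.append(value - previous)
        previous = value
    return daily


url = raw_data_url("provincial_cumulative_timeline_confirmed.csv")
increments = cumulative_to_daily([10, 15, 15, 40])  # made-up numbers
```

Keep in mind that increments derived this way follow report dates, so they inherit the reporting lag described earlier.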

Deprecated

NOTE: Since around 24 March 2020, we have not received individual case data from DoH or NICD. For now, if you need provincial counts, use the provincial_cumulative_timeline datasets. For individual cases up to 25 March 2020, use confirmed_cases.

dataset | url | raw_url[file]
confirmed_cases* [updated to 25 March 2020] | covid19za_timeline_confirmed | covid19za_timeline_confirmed.csv
deaths | covid19za_timeline_deaths | covid19za_timeline_deaths.csv

* NICD no longer gives individual case data. Please use provincial_cumulative_timeline from 26 March 2020 onwards.
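
The cut-off described above can be expressed as a small helper. This is only a sketch: the function name `dataset_for` is hypothetical, and the returned strings are the dataset names used in the tables above, not verified file paths.

```python
from datetime import date

# Cut-off taken from the note above: individual case data ends on 25 March 2020.
LAST_INDIVIDUAL_CASE_DATE = date(2020, 3, 25)


def dataset_for(day: date) -> str:
    """Suggest which dataset to consult for case data on a given date."""
    if day <= LAST_INDIVIDUAL_CASE_DATE:
        return "confirmed_cases"
    return "provincial_cumulative_timeline"
```

For example, `dataset_for(date(2020, 3, 26))` points to the provincial cumulative timeline, matching the footnote.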

Dashboard

Data Sources:

  • NICD - South Africa URL
  • Department of Health - South Africa Main Site, Twitter
  • South African Government Media Statements URL
  • National Department of Health Data Dictionary URL
  • MedPages URL
  • Statistics South Africa URL

Contributing

Options

  • I want to help, but don't have an idea: You can take a look at the issues to see which one you might be interested in tackling.
  • I have an idea or new feature: Create a new issue first, assign it to yourself and then fork the repo.

Adopting a file

Once you have chosen how you are going to contribute, list the files you will be working on by adding your name to the adopt-a-file CSV file: edit covid19za_volunteer_adopted_file.

Submitting Changes [Pull Request]

Resources [Get some ideas]

Contributors

Contact

Citing the dataset

On a visualisation/notebook/webapp:

Data Science for Social Impact Research Group @ University of Pretoria, Coronavirus COVID-19 (2019-nCoV) Data Repository for South Africa. Available on: https://github.com/dsfsi/covid19za.

In a publication

Data Science Journal

@article{marivate2020use,
  Author = {Vukosi Marivate and Herkulaas MvE Combrink},
  Journal = {Data Science Journal},
  Number = {1},
  Pages = {1-7},
  Title = {Use of Available Data To Inform The COVID-19 Outbreak in South Africa: A Case Study.},
  Volume = {19},
  Year = {2020},
  url = {https://doi.org/10.5334/dsj-2020-019}
}

and Dataset

@dataset{marivate_vukosi_2020_3819126,
  author = {Marivate, Vukosi and Arbi, Riaz and Combrink, Herkulaas and de Waal, Alta and Dryza, Henkho and Egersdorfer, Derrick and Garnett, Shaun and Gordon, Brent and Greyling, Lizel and Lebogo, Ofentswe and Mackie, Dave and Merry, Bruce and Mkhondwane, S'busiso and Mokoatle, Mpho and Moodley, Shivan and Mtsweni, Jabu and Mtsweni, Nompumelelo and Myburgh, Paul and Richter, Jannik and Rikhotso, Vuthlari and Rosen, Simon and Sefara, Joseph and van der Walt, Anelda and van Heerden, Schalk and Welsh, Jay and Hazelhurst, Scott and Petersen, Chad and Mbuvha, Rendani and Dhlamini, Nelisiwe and James, Vaibhavi},
  title = {{Coronavirus disease (COVID-19) case data - South Africa}},
  month = mar,
  year = 2020,
  publisher = {Zenodo},
  doi = {10.5281/zenodo.3819126},
  url = {https://doi.org/10.5281/zenodo.3819126}
}

Showcase

Web Projects

Some of the COVID-19 data for South Africa in this repo is currently being used by the independent projects shown in the table below:

Project Name | Project Description | Project Demo | Project owner | Country
1. Covid-19 SA Data | Data visualizations corresponding to the current Covid-19 outbreak in South Africa | [Website], [GitHub Repo] | Simon Rosen | South Africa
2. Covid-19 testing areas | A Covid-19 Testing Facilities Map | [Website], [GitHub Repo] | Yannick Zehnder | Switzerland
3. Covid-19 Map | A Coronavirus Map | [Website], [GitHub Repo] | Jay Welsh | South Africa
4. Covid-19 Telegram Bot | Coronavirus statistics via Telegram | Link | CodeChap | South Africa
5. Covid-19 Xitsonga Dashboard | Xitsonga Dashboard | Link | xitsonga.org | South Africa
6. Hospitals' capacity to respond to Covid-19 | Data visualization mapping local hospitals (private and public) in South Africa | [Map Viz], [Repo] | Nompumelelo | South Africa
7. Covid-19 Trends | Covid-19 analytics dashboard for South Africa | [Website], [Repo] | Schalk van Heerden | South Africa
8. Covid-19 Tshivenda Dashboard | Tshivenda Dashboard | Link | luvenda.com | South Africa
9. Map of Health facilities around me | Map showing comparable details of hospitals around my location in response to Covid-19 | [Webpage], [GitHub Repo] | These authors | South Africa
10. R-based Interactive health facilities Map | Afrimapr, mapping health facilities using R-building blocks | [Webpage], [Repo] | Dr Andy South | United Kingdom
11. Estimating the Reproductive Number of COVID-19 | Estimating the effective reproductive number for SA, its provinces and other countries. | [Website] | Louis Rossouw | South Africa
12. Modelling COVID-19 in South Africa at a Provincial Level | Modelling COVID-19 in South Africa at a Provincial Level using reported and excess deaths. | [Website] | Louis Rossouw | South Africa
13. South African Provincial COVID-19 Visualization | Visualize deaths, cases and recoveries alongside mobility data on a provincial level. Additionally, visualize the change in cases on a weekly basis. | [Website] | Christopher Marais | South Africa
14. Differential Evolution to Optimize A Long-term Multi-strain Model of COVID-19 in South Africa | Uses Differential Evolution (an Evolutionary Optimization Algorithm) for data fitting and parameter estimation. | Link to be provided. | CJ Pretorius and MC du Plessis | South Africa

Scholarly Work

See Google Scholar

Support

We want to acknowledge support from these organisations
