Data Science for Social Impact Research Group @ University of Pretoria (@dsfsi)

Top repositories

1

textaugment

TextAugment: Text Augmentation Library
Python
370
star
2

covid19za

Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Jupyter Notebook
255
star
3

covid19africa

Africa open COVID-19 data working group
Jupyter Notebook
48
star
4

masakhane-web

Masakhane Web is a translation web application solely for African languages.
Jupyter Notebook
35
star
5

PuoBERTa

A RoBERTa-based language model designed specifically for Setswana, trained on the new PuoData dataset.
Makefile
3
star
6

gov-za-multilingual

The dataset contains cabinet statements from the South African government. Data was scraped from the government's website: https://www.gov.za/cabinet-statements
Jupyter Notebook
3
star
7

project-state-capture

Zondo Commission or State Capture Commission Transcripts
2
star
8

sa-parliament

South African Member of Parliament Data
Python
2
star
9

za-terminology

DSFSI South African Terminology Lists and Lexicon Project
Makefile
2
star
10

dsfsi-datasets

Datasets made available for different small projects
Jupyter Notebook
2
star
11

PuoData

Curated corpora for Setswana. Used to train PuoBERTa.
2
star
12

Higher_Education_EDA

An exploratory data analysis (EDA) repository for education researchers and practitioners
Jupyter Notebook
2
star
13

embedding-eval-data

Embedding Evaluation Data for South African Languages
1
star
14

2020-AMMI-salomon

Jupyter Notebook
1
star
15

dsfsi-dataset-template

Makefile
1
star
16

za-bank-risk

This repository is an initial pipeline for reading, processing, labelling and classifying unstructured annual reports of South African (SA) banks, with the aim of identifying financial risk. It leverages work by the Corporate Financial Information Environment-Final Report Structure Extractor (CFIE–FRSE) project of El-Haj et al., which created a corpus of annual reports of United Kingdom (UK) companies.
Jupyter Notebook
1
star