textaugment
TextAugment: Text Augmentation Library
covid19za
Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
covid19africa
Africa open COVID-19 data working group
masakhane-web
Masakhane Web is a translation web application solely for African languages.
gov-za-multilingual
The dataset contains cabinet statements from the South African government. Data was scraped from the government's website: https://www.gov.za/cabinet-statements
PuoBERTa
A RoBERTa-based language model specially designed for Setswana, using the new PuoData dataset.
project-state-capture
Zondo Commission or State Capture Commission Transcripts
za-terminology
DSFSI South African Terminology Lists and Lexicon Project
dsfsi-datasets
Datasets made available for different small projects
PuoData
Curated corpora for Setswana. Used to train PuoBERTa.
sa-parliament
South African Member of Parliament Data
embedding-eval-data
Embedding Evaluation Data for South African Languages
2020-AMMI-salomon
dsfsi-dataset-template
za-bank-risk
This repository is an initial pipeline for reading, processing, labelling and classifying unstructured annual reports of South African (SA) banks with the aim of identifying financial risk. It leveraged work by the Corporate Financial Information Environment-Final Report Structure Extractor (CFIE–FRSE) of El-Haj et al. which created a corpus of annual reports of United Kingdom (UK) companies.