• Stars
    4
  • Rank 3,304,323 (Top 66 %)
  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created over 2 years ago
  • Updated 7 months ago

Repository Details

The dataset contains cabinet statements from the South African government. The data was scraped from the government's website: https://www.gov.za/cabinet-statements

More Repositories

1

textaugment

TextAugment: Text Augmentation Library
Python
395
star
2

covid19za

Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Jupyter Notebook
255
star
3

covid19africa

Africa open COVID-19 data working group
Jupyter Notebook
48
star
4

masakhane-web

Masakhane Web is a translation web application solely for African languages.
Jupyter Notebook
34
star
5

PuoBERTa

A RoBERTa-based language model designed specifically for Setswana, using the new PuoData dataset.
Makefile
4
star
6

Higher_Education_EDA

This is an exploratory data analysis (EDA) repository for education researchers and practitioners.
Jupyter Notebook
3
star
7

project-state-capture

Zondo Commission or State Capture Commission Transcripts
2
star
8

za-mavito

DSFSI South African Terminology Lists and Lexicon Project
HTML
2
star
9

dsfsi-datasets

Datasets made available for different small projects
Jupyter Notebook
2
star
10

PuoData

Curated corpora for Setswana. Used to train PuoBERTa.
2
star
11

za-bank-risk

This repository is an initial pipeline for reading, processing, labelling, and classifying unstructured annual reports of South African (SA) banks, with the aim of identifying financial risk. It leveraged work by the Corporate Financial Information Environment-Final Report Structure Extractor (CFIE–FRSE) of El-Haj et al., which created a corpus of annual reports of United Kingdom (UK) companies.
Jupyter Notebook
2
star
12

sa-parliament

South African Member Of Parliament Data
Python
2
star
13

embedding-eval-data

Embedding Evaluation Data for South African Languages
1
star
14

2020-AMMI-salomon

Jupyter Notebook
1
star
15

dsfsi-dataset-template

Makefile
1
star
16

zabantu-beta

ZaBantu is a fleet of light-weight Masked Language Models for Southern Bantu Languages
Python
1
star
17

gov-za-sona-multilingual

Python
1
star
18

izindaba-zesizulu

Categorised isiZulu News. Source data is the isiZulu news from the SABC social media posts.
1
star