petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading (a minimal inference sketch appears after this list).
promptsource
Toolkit for creating, sharing and using natural language prompts (a prompt-loading sketch appears after this list).
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
xmtf
Crosslingual Generalization through Multitask Finetuning
t-zero
Reproduce results and replicate training of T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
biomedical
Tools for curating biomedical training data for large-scale language modeling
data-preparation
Code used for sourcing and cleaning the BigScience ROOTS corpus
lam
Libraries, Archives and Museums (LAM)
data_tooling
Tools for managing datasets for governance and training.
multilingual-modeling
BLOOM+1: Adapting BLOOM model to support a new unseen language
evaluation
Code and Data for Evaluation WG
data_sourcing
This directory gathers the tools developed by the Data Sourcing Working Group
metadata
Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining.
model_card
tokenization
carbon-footprint
A repository for `codecarbon` logs (an emissions-tracking sketch appears after this list).
bloom-dechonk
A repo for running model shrinking experiments
catalogue_data
Scripts to prepare catalogue data
pii_processing
PII Processing code to detect and remediate PII in BigScience datasets. Reference implementation for the PII Hackathon
training_dynamics
bibliography
A list of BigScience publications
scaling-laws-tokenization
datasets_stats
Generate statistics over datasets used in the context of BigScience
evaluation-robustness-consistency
Tools for evaluating model robustness and consistency
interpretability-ideas
evaluation-results
Dump of results for bigscience.
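A minimal inference sketch for the petals repository referenced above, assuming the `petals` package is installed and a public swarm is actually serving the chosen checkpoint; the model name and prompt are illustrative placeholders, not anything the repo prescribes.

```python
# Sketch: distributed inference over a Petals swarm.
# Assumptions: `pip install petals`, and the named checkpoint is being
# served by reachable swarm peers; model name and prompt are placeholders.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "bigscience/bloom-560m"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

input_ids = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
output_ids = model.generate(input_ids, max_new_tokens=5)
print(tokenizer.decode(output_ids[0]))
```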
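A short prompt-loading sketch for promptsource, assuming `promptsource` and `datasets` are installed; the dataset (`ag_news`) is only an example, and template names should be taken from whatever the installed prompt collection actually provides.

```python
# Sketch: load the prompt templates promptsource ships for a dataset and
# apply one to a single example (dataset choice is an illustrative assumption).
from datasets import load_dataset
from promptsource.templates import DatasetTemplates

dataset = load_dataset("ag_news", split="train")
example = dataset[0]

templates = DatasetTemplates("ag_news")
print(templates.all_template_names)              # names of available prompts
template = templates[templates.all_template_names[0]]

result = template.apply(example)                 # typically [prompted_input, target]
print(result[0])
```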
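An emissions-tracking sketch for the carbon-footprint repository, showing one way `codecarbon` logs are typically produced; the project name, output directory, and workload are placeholders rather than anything the repo mandates.

```python
# Sketch: wrap a workload in codecarbon's EmissionsTracker so an
# emissions.csv log is written (project name and workload are placeholders).
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="bigscience-demo", output_dir=".")
tracker.start()
try:
    total = sum(i * i for i in range(10_000_000))  # placeholder workload
finally:
    emissions_kg = tracker.stop()  # estimated kg CO2eq; also written to emissions.csv

print(f"Estimated emissions: {emissions_kg} kg CO2eq")
```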