neuml/txtinstruct

Stars: 168
Rank: 225,507 (Top 5%)
Language: Python
License: Apache License 2.0
Created over 1 year ago
Updated over 1 year ago

📚 Datasets and models for instruction-tuning

txtinstruct is a framework for training instruction-tuned models.

The objective of this project is to support open data, open models and integration with your own data. One of the biggest problems today is the lack of licensing clarity around instruction-following datasets and large language models. txtinstruct makes it easy to build your own instruction-following datasets and use those datasets to train instruction-tuned models.
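To make the idea of an instruction-following dataset concrete, here is a minimal sketch in plain Python. It does not use the txtinstruct API. The record fields (`context`, `question`, `answer`), the prompt template and the helper names `to_prompt` and `to_training_rows` are all assumptions made for illustration, not the format txtinstruct itself uses.

```python
# Hypothetical sketch: field names and the prompt template are illustrative
# assumptions, not the txtinstruct data format.

def to_prompt(record):
    """Render one record as an instruction prompt (assumed template)."""
    return (
        "Answer the following question using only the context below.\n"
        f"### Context:\n{record['context']}\n"
        f"### Question:\n{record['question']}\n"
        "### Answer:\n"
    )


def to_training_rows(records):
    """Pair each rendered prompt with the answer the model should produce."""
    return [
        {"prompt": to_prompt(record), "completion": record["answer"]}
        for record in records
    ]


# Example records built from your own data
records = [
    {
        "context": "txtinstruct is a framework for training instruction-tuned models.",
        "question": "What is txtinstruct?",
        "answer": "A framework for training instruction-tuned models.",
    }
]

rows = to_training_rows(records)
print(rows[0]["prompt"] + rows[0]["completion"])
```

Because the records come from data you own or have licensed, the licensing status of the resulting dataset is clear, which is the problem the project sets out to address.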

txtinstruct is built with Python 3.7+ and txtai.

Installation

The easiest way to install is via pip and PyPI:

pip install txtinstruct

You can also install txtinstruct directly from GitHub. Using a Python Virtual Environment is recommended.

pip install git+https://github.com/neuml/txtinstruct

Python 3.7+ is supported

See this link to help resolve environment-specific install issues.
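After installing, a quick check along these lines can confirm that the interpreter meets the stated Python 3.7+ requirement and that the package is visible to it. This is a sketch using only the standard library; the helper names `installed_version` and `python_supported` are made up for illustration, and `importlib.metadata` itself needs Python 3.8 or later (on 3.7 the `importlib_metadata` backport would be needed instead).

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onward


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_supported(minimum=(3, 7)):
    """Check the running interpreter against a minimum (major, minor) version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_supported())
    version = installed_version("txtinstruct")
    if version is None:
        print("txtinstruct is not installed in this environment")
    else:
        print("txtinstruct version:", version)
```

Running this inside the virtual environment you installed into helps catch the common case where pip installed the package for a different interpreter.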

Examples

The following example notebooks show how to build models with txtinstruct.

Notebook	Description
Introducing txtinstruct	Build instruction-tuned datasets and models

Further Reading

Instruction-tune models using your own data with txtinstruct

txtai

💡 Semantic search and workflows powered by language models

paperai

📄 🤖 Semantic search and workflows for medical/scientific papers

codequestion

🔎 Semantic search for developers

tldrstory

📊 Semantic search for headlines and story text

paperetl

📄 ⚙️ ETL processes for medical and scientific papers

txtchat

💭 Conversational search and workflows for all

txtai.js

Semantic search and workflows in JavaScript

txtai.rs

Semantic search and workflows in Rust

txtmarker

Highlight text in documents

cord19q

COVID-19 Open Research Dataset (CORD-19) Analysis

txtai.go

Semantic search and workflows in Go

txtai.java

Semantic search and workflows in Java

webelapse

Generate time-lapse video for a website

py27hash

Python 2.7 hashing and iteration in Python 3+

neuspo

🏈 Fact-driven, real-time sports event and news site

kernelpipes

Run Kaggle kernel pipeline jobs

txtai.weaviate

Example Weaviate integration with txtai

.github

Default community files for all NeuML projects

rag

🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data.