A New Tamil Large Language Model (LLM) Based on Llama 2

Tamil-Llama: A Family of LLaMA-based LLMs focused on Tamil Language

Description

This repository contains the code and models for "Tamil-Llama", a project focused on enhancing the performance of language models for the Tamil language. It builds upon the open-source LLaMA model, introducing additional Tamil tokens and employing the LoRA methodology for efficient training. Please read the technical report for more details.

Technical Report: https://arxiv.org/abs/2311.05845
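
To give a rough sense of why LoRA makes training efficient, the sketch below counts the trainable parameters in a low-rank adapter for one square projection matrix and compares that with the full matrix. The hidden size of 4096 is that of LLaMA 7B; the rank of 64 is a hypothetical value chosen for illustration, not the setting used by this project (see the technical report for the actual configuration).

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_matrix_params(d_out: int, d_in: int) -> int:
    """Parameters in the original dense weight matrix (d_out x d_in)."""
    return d_out * d_in


HIDDEN_SIZE = 4096  # hidden size of LLaMA 7B
RANK = 64           # hypothetical LoRA rank, for illustration only

lora = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
full = full_matrix_params(HIDDEN_SIZE, HIDDEN_SIZE)
print(f"LoRA adapter: {lora:,} trainable parameters")
print(f"Full matrix:  {full:,} parameters")
print(f"Ratio:        {lora / full:.4f}")
```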

Available Models

| Model | Type | Data | Base Model | # Params | Download Links |
| --- | --- | --- | --- | --- | --- |
| Tamil LLaMA 7B Base | Base model | 12GB | LLaMA 7B | 7B | HF Hub |
| Tamil LLaMA 13B Base | Base model | 4GB | LLaMA 13B | 13B | HF Hub |
| Tamil LLaMA 7B Instruct | Instruction following model | 145k instructions | Tamil LLaMA 7B Base | 7B | HF Hub |
| Tamil LLaMA 13B Instruct | Instruction following model | 145k instructions | Tamil LLaMA 13B Base | 13B | HF Hub |

Quantized Version of Available Models

| Model | Format | Bits | Download Links |
| --- | --- | --- | --- |
| Tamil LLaMA 7B Base | GGUF | Q4_K_M, Q5_K_M, Q8_0 | HF Hub |
| Tamil LLaMA 13B Base | GGUF | Q4_K_M, Q5_K_M, Q8_0 | HF Hub |
| Tamil LLaMA 7B Instruct | GGUF | Q4_K_M, Q5_K_M, Q8_0 | HF Hub |
| Tamil LLaMA 13B Instruct | GGUF | Q4_K_M, Q5_K_M, Q8_0 | HF Hub |

Benchmark Scores

Scores are calculated using the HuggingFace Open LLM Leaderboard.

Note: These benchmarks test the models' reasoning capabilities in English. Although the Tamil LLaMA models were not trained on quality English reasoning tasks, they show decent performance across most benchmarks.

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Tamil LLaMA 13B Instruct | 51.59 | 54.52 | 79.35 | 50.37 | 41.22 | 76.56 | 7.51 |
| Tamil LLaMA 13B Base | 49.5 | 52.82 | 79.95 | 52.05 | 36.56 | 75.61 | 0 |
| Tamil LLaMA 7B Instruct | 45.52 | 48.04 | 70.97 | 39.95 | 41.7 | 70.64 | 1.82 |
| Tamil LLaMA 7B Base | 44.52 | 46.67 | 72.85 | 40.95 | 35.93 | 70.72 | 0 |
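
The Average column appears to be the unweighted mean of the six task scores. That reading is inferred from the numbers rather than stated in this README, so the sketch below simply recomputes the mean from the scores listed above and prints it next to the reported value.

```python
from statistics import mean

# Scores copied from the benchmark table, in column order:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K
SCORES = {
    "Tamil LLaMA 13B Instruct": (51.59, [54.52, 79.35, 50.37, 41.22, 76.56, 7.51]),
    "Tamil LLaMA 13B Base": (49.5, [52.82, 79.95, 52.05, 36.56, 75.61, 0]),
    "Tamil LLaMA 7B Instruct": (45.52, [48.04, 70.97, 39.95, 41.7, 70.64, 1.82]),
    "Tamil LLaMA 7B Base": (44.52, [46.67, 72.85, 40.95, 35.93, 70.72, 0]),
}


def recompute_average(task_scores):
    """Unweighted mean of the task scores, rounded to two decimals."""
    return round(mean(task_scores), 2)


for model, (reported, task_scores) in SCORES.items():
    recomputed = recompute_average(task_scores)
    print(f"{model}: reported={reported}, recomputed={recomputed}")
```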

Demo

A simple interactive demo of Tamil-LLaMA-7B-Instruct-v0.1 is hosted as a HuggingFace Space: abhinand/tamil-llama-playground

Getting Started

Using LMStudio:

LM Studio is an easy-to-use and powerful local GUI for Windows and macOS (Apple Silicon), with GPU acceleration. A Linux version is also available, in beta as of 27/11/2023.

  1. Download and Install LM Studio: Begin by downloading LM Studio from the official website.

  2. Locate the Tamil Llama Model: After installation, open LM Studio and use the search bar to find the "Tamil Llama" model. Alternatively, if you have the GGUF model ID, paste it directly into the search bar.

  3. Download the Appropriate Model Variant: Depending on your system's specifications, select the appropriate variant of the Tamil Llama model. Click on the 'Download' button to start the download process.

  4. Import the Preset JSON File: Once the model is downloaded, navigate to the 'Chat' tab in LM Studio. In the settings, find the 'Preset' menu and click on the dropdown. Select "Import Preset From File" and import the preset JSON file located at config/lm_studio/model_config.json in the repository.

  5. Select and Load the Model: Click on "Select a model to load" located on the top bar. From the list, choose the Tamil Llama variant that you previously downloaded.

  6. Initiate Conversations with the Model: The Tamil Llama model is now ready to use. You can start engaging in conversations in the chat area of LM Studio.

Using with Ollama:

  1. Verify Ollama Installation: First, ensure that Ollama is correctly installed on your system. If not, install it from the official source.

  2. Download the Modelfile: Access the GitHub repository and download the Modelfile. This file is necessary for setting up the Tamil Llama model in Ollama.

  3. Prepare the Working Directory: Place the downloaded Modelfile in the directory that will also hold the model's GGUF file (downloaded in the next step), then use the cd command in your terminal to change to that directory.

  4. Download the Tamil Llama Model: Execute the following command in your terminal to download the desired Tamil Llama model from the HuggingFace Hub:

    curl -L https://huggingface.co/abhinand/tamil-llama-7b-instruct-v0.1-gguf/resolve/main/tamil-llama-7b-v0.1-q8_0.gguf -o tamil-llama.gguf

    This command downloads the Tamil Llama model GGUF file and saves it as tamil-llama.gguf in your current directory.

  5. Import the Model into Ollama: After downloading the model, use the following command to create the Tamil Llama model in Ollama:

    ollama create tamil-llama -f Modelfile 

    This command imports the Tamil Llama model into Ollama and prepares it for use.

Optionally, depending on your system's capabilities, configure these parameters in the Modelfile as well:

PARAMETER num_thread 8
PARAMETER num_gpu 0
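
As a minimal sketch, the helper below sets PARAMETER lines in the text of an existing Modelfile: a parameter that is already present is overwritten, and a missing one is appended. The function name and the sample Modelfile content (a single FROM line pointing at tamil-llama.gguf) are hypothetical; the Modelfile shipped with the repository may contain more than this and should be used as the starting point.

```python
def set_parameters(modelfile_text: str, params: dict) -> str:
    """Return Modelfile text with the given PARAMETER values set.

    Existing PARAMETER lines for the same name are overwritten in place;
    names not yet present are appended at the end.
    """
    remaining = dict(params)
    lines = []
    for line in modelfile_text.splitlines():
        parts = line.split()
        is_target = (
            len(parts) >= 3
            and parts[0].upper() == "PARAMETER"
            and parts[1] in remaining
        )
        if is_target:
            lines.append(f"PARAMETER {parts[1]} {remaining.pop(parts[1])}")
        else:
            lines.append(line)
    # Append any parameters that were not already in the file.
    lines.extend(f"PARAMETER {name} {value}" for name, value in remaining.items())
    return "\n".join(lines) + "\n"


# Hypothetical starting Modelfile, for illustration only.
base = "FROM ./tamil-llama.gguf\nPARAMETER num_thread 4\n"
updated = set_parameters(base, {"num_thread": 8, "num_gpu": 0})
print(updated)
```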

For more information about the parameters available in a Modelfile, see the official Ollama docs.
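
Once the model has been created, it can also be queried programmatically. The sketch below assumes an Ollama server running locally on its default port (11434) and the model name tamil-llama from the create command above; the helper names are illustrative and not part of this repository.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumed; adjust if your server is configured differently).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "tamil-llama") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "tamil-llama", timeout: float = 120.0) -> str:
    """Send the prompt to Ollama and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


# Example call (requires a running Ollama server with the model created):
# print(generate("வணக்கம்!"))
```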

Datasets

The repository includes a Tamil-translated version of the Alpaca dataset and a subset of the OpenOrca dataset, which are used for instruction fine-tuning and evaluation.

Tamil Alpaca: abhinand/tamil-alpaca

Tamil Alpaca Orca: abhinand/tamil-alpaca-orca

Tamil LLaMA Eval: abhinand/tamil-llama-eval

Prompting Format for Instruction Models

Prompt Template Without Input

{system_prompt}

### Instruction:
{instruction or query}

### Response:
{response}

Prompt Template With Input

{system_prompt}

### Instruction:
{instruction or query}

### Input:
{input}

### Response:
{response}
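
The two templates can be assembled with a small helper such as the one below. This is a sketch: the function name is illustrative, and leaving the prompt open after "### Response:" so that the model writes the response is an assumption about how the template is meant to be used at inference time.

```python
from typing import Optional


def build_prompt(system_prompt: str, instruction: str, input_text: Optional[str] = None) -> str:
    """Assemble a prompt in the instruction format described above.

    The "### Input:" section is included only when input_text is provided.
    """
    sections = [system_prompt, f"### Instruction:\n{instruction}"]
    if input_text:
        sections.append(f"### Input:\n{input_text}")
    # Leave the response section empty for the model to complete.
    sections.append("### Response:\n")
    return "\n\n".join(sections)


# Placeholder strings, for illustration only.
print(build_prompt("You are a helpful assistant.", "Summarise the text.", "Some text."))
```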

Usage Note

It's important to note that the models have not undergone detoxification. Therefore, while they possess impressive linguistic capabilities, there is a possibility for them to generate content that could be deemed harmful or offensive. We urge users to exercise discretion and supervise the model's outputs closely, especially in public or sensitive applications.

Contributions

We welcome contributions to this project. If you have suggestions or improvements, please open an issue or a pull request.

License

This project is licensed under the GNU GPL v3.0 license - see the LICENSE.md file for details.

IMPORTANT: The GPL 3.0 License is applicable solely to the source code and datasets provided. As this project is a derivative of Meta's LLaMA 2 model, it is subject to the original licensing of LLaMA 2, which cannot be altered. Therefore, for comprehensive details regarding the licensing of the model, please consult the LLAMA2-LICENSE file.

Citation

If you use this model or the Tamil-Llama dataset in your research, please cite:

@misc{balachandran2023tamilllama,
      title={Tamil-Llama: A New Tamil Language Model Based on Llama 2}, 
      author={Abhinand Balachandran},
      year={2023},
      eprint={2311.05845},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contact

For any queries regarding the codebase or research, please reach out to Abhinand Balachandran at [email protected].
