  • Stars: 833
  • Rank: 54,737 (Top 2%)
  • Language: Jupyter Notebook
  • License: Apache License 2.0
  • Created over 1 year ago
  • Updated 26 days ago

Repository Details

🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring safety & security. 🛡️ Features include text quality, relevance metrics, & sentiment analysis. 📊 A comprehensive tool for LLM observability. 👀

LangKit

LangKit is an open-source text metrics toolkit for monitoring language models. It offers an array of methods for extracting relevant signals from the input and/or output text, which are compatible with the open-source data logging library whylogs.

💡 Want to experience LangKit? Go to this notebook!

Motivation 🎯

Productionizing language models, including LLMs, carries a range of risks: the space of possible inputs is effectively unbounded, and so is the space of outputs those inputs can elicit. The unstructured nature of text poses a challenge for ML observability, and it is a challenge worth solving, since a lack of visibility into a model's behavior can have serious consequences.

Features 🛠️

The currently supported metrics include:

  • Text Quality
    • readability score
    • complexity and grade scores
  • Text Relevance
    • similarity scores between prompts and responses
    • similarity scores against user-defined themes
  • Security and Privacy
    • patterns - count of strings matching a user-defined regex pattern group
    • jailbreaks - similarity scores with respect to known jailbreak attempts
    • prompt injection - similarity scores with respect to known prompt injection attacks
    • refusals - similarity scores with respect to known LLM refusal of service responses
  • Sentiment and Toxicity
    • sentiment analysis
    • toxicity analysis

Installation 💻

To install LangKit, use the Python Package Index (PyPI) as follows:

pip install langkit[all]

Usage 🚀

LangKit modules contain user-defined functions (UDFs) that automatically wire into the collection of UDFs on string features that whylogs provides by default. All we have to do is import the LangKit modules and then instantiate a custom schema, as shown in the example below.

import whylogs as why
from langkit import llm_metrics

results = why.log({"prompt": "Hello!", "response": "World!"}, schema=llm_metrics.init())

The code above produces a set of metrics comprising the default whylogs metrics for text features and all the metrics defined in the imported modules. The resulting profile can be visualized and monitored in the WhyLabs platform, or analyzed further by users on their own.
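
To make the idea of "importing a module wires in its UDFs" more concrete, here is a minimal toy sketch of a registry pattern in plain Python. It is not the whylogs or LangKit implementation: `STRING_UDFS`, `register_string_udf`, `apply_udfs`, and the two example metrics are hypothetical names invented for this illustration.

```python
from typing import Callable, Dict

# Toy registry standing in for "the collection of UDFs on string features".
# Illustrative pattern only; not the whylogs or LangKit implementation.
STRING_UDFS: Dict[str, Callable[[str], float]] = {}


def register_string_udf(name: str):
    """Decorator that adds a function to the registry when its module is imported."""
    def decorator(fn: Callable[[str], float]) -> Callable[[str], float]:
        STRING_UDFS[name] = fn
        return fn
    return decorator


@register_string_udf("char_count")
def char_count(text: str) -> float:
    return float(len(text))


@register_string_udf("word_count")
def word_count(text: str) -> float:
    return float(len(text.split()))


def apply_udfs(row: Dict[str, str]) -> Dict[str, float]:
    """Compute every registered metric for every string column in the row."""
    return {
        f"{column}.{name}": fn(value)
        for column, value in row.items()
        for name, fn in STRING_UDFS.items()
    }


print(apply_udfs({"prompt": "Hello!", "response": "World!"}))
```

In this sketch, registration happens as a side effect of defining the decorated functions, which is why simply importing a module is enough to make its metrics available to whatever later applies the registry.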

More examples are available here.

Modules 📦

More information about the different modules and their metrics is available here.
