Discover octoml/octoml-llm-qa Open Source project by OctoML (@octoml)

Stars
18
Rank 1,208,065 (Top 24%)
Language
Python
Created over 1 year ago
Updated about 1 year ago

octoml

A code sample that shows how to use 🦜️🔗langchain, 🦙llama_index, and a hosted LLM endpoint for standard chat or Q&A about a PDF document.

Apple-M1-BERT

3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1

Python

135

octoml-profile

Home for OctoML PyTorch Profiler

105

synr

A library for syntactically rewriting Python programs, pronounced "sinner".

Python

octoai-textgen-cookbook

Simple getting-started code examples for LLM applications powered by OctoAI

Python

deformable-attention-kernel

TVMScript kernel for deformable attention

Python

triton-client-rs

A client library in Rust for Nvidia Triton.

Rust

tvm2onnx

An open-source tool created by OctoML that converts TVM-optimized models to code runnable in ONNX Runtime.

Python

relax

A fork of tvm/unity

Python

octoml-cli-tutorials

A repository containing full end-to-end examples of the OctoML CLI workflow.

Python

TransparentAI

An example of building your own ML cloud app using OctoML.

Python

public-tvm-docker

Build a TVM Docker image for production compilation deployments

qualcomm

dockercon23-octoai

DockerCon 2023 OctoAI AI/ML Workshop GitHub Repo

Jupyter Notebook

tvm-build

A library for building TVM programmatically.

Rust

mlops

CK MLOps components

octoml-examples

A collection of test models for the OctoML AI acceleration service

octoai-apps

A collection of OctoAI-based demos.

TypeScript

macho-dyld

Custom dyld version inherited from original Apple dyld implementation

C++

cm-mlops

Collective Mind repository with unified automations to automatically co-design, optimize and deploy intelligent and Pareto-efficient systems across continuously changing software and hardware stacks.

Python

mlperf-loadgen-harness

A simple Python harness to run an ONNX model in various concurrency and replication configurations against MLCommons' LoadGen to measure throughput.

Python

octoai-template-apps

Python

fern-config

Configuration for generating SDKs and Documentation.

MDX

mlcommons-inference

Fork of MLCommons inference repository to test TVM integration

Python

azsphere

TVM on Azure Sphere Platform

venv

CK virtual environment

Python

octoai-launch-examples

Examples of how to build Generative AI applications powered by the OctoAI compute service.

Jupyter Notebook

octocloud-templates

Python

.github
