  • Stars: 18
  • Rank 1,208,065 (Top 24%)
  • Language: Python
  • Created over 1 year ago
  • Updated about 1 year ago

Repository Details

A code sample that shows how to use 🦜️🔗 langchain, 🦙 llama_index, and a hosted LLM endpoint for standard chat or Q&A over a PDF document.

More Repositories

1. Apple-M1-BERT
   3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1
   Python, 135 stars

2. octoml-profile
   Home for OctoML PyTorch Profiler
   105 stars

3. synr
   A library for syntactically rewriting Python programs, pronounced (sinner).
   Python, 70 stars

4. octoai-textgen-cookbook
   Simple getting-started code examples for LLM applications powered by OctoAI
   Python, 42 stars

5. deformable-attention-kernel
   TVMScript kernel for deformable attention
   Python, 24 stars

6. triton-client-rs
   A client library in Rust for Nvidia Triton.
   Rust, 23 stars

7. tvm2onnx
   An open-source tool created by OctoML that converts TVM-optimized models to code runnable in ONNX Runtime.
   Python, 15 stars

8. relax
   A fork of tvm/unity
   Python, 15 stars

9. octoml-cli-tutorials
   A repository containing full end-to-end examples of the OctoML CLI workflow.
   Python, 14 stars

10. TransparentAI
    An example of building your own ML cloud app using OctoML.
    Python, 13 stars

11. public-tvm-docker
    Build TVM docker image for production compilation deployments
    13 stars

12. qualcomm
    C, 8 stars

13. dockercon23-octoai
    DockerCon 2023 OctoAI AI/ML Workshop GitHub Repo
    Jupyter Notebook, 7 stars

14. tvm-build
    A library for building TVM programmatically.
    Rust, 7 stars

15. mlops
    CK MLOps components
    6 stars

16. octoml-examples
    A collection of test models for the OctoML AI acceleration service
    5 stars

17. octoai-apps
    A collection of OctoAI-based demos.
    TypeScript, 5 stars

18. macho-dyld
    Custom dyld version inherited from original Apple dyld implementation
    C++, 4 stars

19. cm-mlops
    Collective Mind repository with unified automations to automatically co-design, optimize and deploy intelligent and Pareto-efficient systems across continuously changing software and hardware stacks.
    Python, 4 stars

20. mlperf-loadgen-harness
    A simple Python harness to run an ONNX model in various concurrency and replication configurations against MLCommons' LoadGen to measure throughput.
    Python, 4 stars

21. octoai-template-apps
    Python, 3 stars

22. fern-config
    Configuration for generating SDKs and Documentation.
    MDX, 3 stars

23. mlcommons-inference
    Fork of MLCommons inference repository to test TVM integration
    Python, 2 stars

24. azsphere
    TVM on Azure Sphere Platform
    C, 2 stars

25. venv
    CK virtual environment
    Python, 2 stars

26. octoai-launch-examples
    Examples of how to build Generative AI applications powered by the OctoAI compute service.
    Jupyter Notebook, 1 star

27. octocloud-templates
    Python, 1 star

28. .github
    1 star