  • Stars: 434
  • Rank: 100,274 (top 2%)
  • Language: Python
  • Created: over 7 years ago
  • Updated: over 4 years ago

Repository Details

Accelerating network inference over video

NoScope

This is the official project page for the NoScope project.

Please read the blog post and paper for more details!

Requirements

This repository contains the code for the optimization step in the paper. The inference code lives in the tensorflow-noscope repository (https://github.com/stanford-futuredata/tensorflow-noscope).

You will need the following installed:

  • python 2.7
  • pip python-setuptools python-tk
  • keras
  • CUDA, CUDNN, tensorflow-gpu
  • OpenCV 3.2 with FFmpeg bindings
  • g++ 5.4 or later

Your machine will need at least:

  • AVX2 capabilities
  • 300+GB of memory
  • 500+GB of space
  • A GPU (this has only been tested with NVIDIA K80 and P100)
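
A quick sanity check for these prerequisites (a sketch; the commands assume a Linux machine with the NVIDIA drivers already installed):

grep -c avx2 /proc/cpuinfo   # non-zero means AVX2 is supported
free -g                      # total memory, in GB
df -h .                      # free disk space
g++ --version | head -n 1    # should report 5.4 or later
nvidia-smi --query-gpu=name --format=csv,noheader   # GPU model, e.g. Tesla K80
python -c "import cv2; print(cv2.__version__)"      # should print 3.2.x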

Guides on Installing the Requirements

Setting up the inference engine

To set up the inference engine, run the following commands. Note: it is recommended that you create a single folder that contains this repository, the tensorflow-noscope repository, and the data folder referred to below.

git clone https://github.com/stanford-futuredata/tensorflow-noscope.git
cd tensorflow-noscope
git checkout speedhax
git submodule init
git submodule update
./configure
cd tensorflow
bazel build -c opt --copt=-mavx2 --config=cuda noscope

The build will fail. To fix this, update the BUILD file to point at your OpenCV install and add that directory to your PATH environment variable. The BUILD file is in the tensorflow-noscope git repository at tensorflow/noscope/BUILD. You will need to change all references to "/lfs/0/ddkang/", most likely to "/usr/" if you installed OpenCV using the directions above.
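
For example, if OpenCV is installed under /usr, a one-line sed run from the tensorflow-noscope root can rewrite every reference (a sketch; adjust the replacement prefix to match your actual install):

sed -i 's|/lfs/0/ddkang/|/usr/|g' tensorflow/noscope/BUILD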

Please encourage the Tensorflow developers to support non-bazel building and linking. Due to a quirk in bazel, it may occasionally "forget" that tensorflow-noscope was built. If this happens, rebuild.

Setting up the optimization engine

To set up the optimization engine, install the NoScope Python package: from the root directory of your checkout of https://github.com/stanford-futuredata/noscope, run "pip install -e ./".
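
For example, assuming the checkout lives in a folder named noscope:

cd noscope
pip install -e ./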

Running the example

Once you have the inference engine set up, the example/ subfolder within this repository contains the script to reproduce Figure 5d in the paper.

In order to run this:

  1. Create a folder named data that sits in the same directory as your noscope and tensorflow-noscope folders
  2. Create the following folders within the data folder: videos, csv, cnn-avg, cnn-models, and experiments
  3. Download the coral-reef video and labels, putting the csv file in the csv folder and the mp4 file in the videos folder (steps 1-3 are sketched as shell commands after this list):
wget https://storage.googleapis.com/noscope-data/csvs-yolo/coral-reef-long.csv
wget https://storage.googleapis.com/noscope-data/videos/coral-reef-long.mp4
  4. Update the code and data paths in example/run.sh. code should point to the folder that contains both the noscope and tensorflow-noscope folders; this value is how the optimization and inference engines find each other. data should point to the data folder created in this section.
  5. Download the YOLO neural network weights file from https://pjreddie.com/media/files/yolo.weights. It is suggested that you place the file at tensorflow-noscope/tensorflow/noscope/darknet/weights/. Note that you will need to create the weights folder.
  6. Update example/noscope_motherdog.py to point to the YOLO configuration and weights files. The config file is tensorflow-noscope/tensorflow/noscope/darknet/cfg/yolo.cfg and the weights file is the one you downloaded. If you put the weights file in the suggested location, this step should be unnecessary.
  7. Run example/run.sh. The output summary.csv will be written to data/experiments/$VIDEO_NAME.
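
A minimal shell sketch of steps 1-3, run from the directory that contains the noscope and tensorflow-noscope folders:

mkdir -p data/videos data/csv data/cnn-avg data/cnn-models data/experiments
wget -P data/csv https://storage.googleapis.com/noscope-data/csvs-yolo/coral-reef-long.csv
wget -P data/videos https://storage.googleapis.com/noscope-data/videos/coral-reef-long.mp4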

Datasets

The datasets that are currently available are coral-reef-long and jackson-town-square. Due to the expense of hosting these files, we have turned on requester pays for download. Please use an authenticated gsutil to download the files.

The mp4 video files are available at https://storage.googleapis.com/noscope-data/videos/VIDEO_NAME.mp4

The CSVs with ground truth labels are available at https://storage.googleapis.com/noscope-data/csvs-yolo/VIDEO_NAME.csv
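
For example, with an authenticated gsutil and a Google Cloud project to bill the requester-pays download to (YOUR_PROJECT_ID is a placeholder):

gsutil -u YOUR_PROJECT_ID cp gs://noscope-data/videos/coral-reef-long.mp4 .
gsutil -u YOUR_PROJECT_ID cp gs://noscope-data/csvs-yolo/coral-reef-long.csv .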

If the above links do not work, you can download the data from Google Drive here.

More Repositories

1. ColBERT - ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23) (Python, 2,998 stars)
2. macrobase - MacroBase: A Search Engine for Fast Data (Java, 661 stars)
3. ARES - Automated Evaluation of RAG Systems (Python, 460 stars)
4. sparser - Sparser: Raw Filtering for Faster Analytics over Raw Data (C, 427 stars)
5. dawn-bench-entries - DAWNBench: An End-to-End Deep Learning Benchmark and Competition (Python, 257 stars)
6. ASAP - ASAP: Prioritizing Attention via Time Series Smoothing (Jupyter Notebook, 184 stars)
7. FrugalGPT - FrugalGPT: better quality and lower cost for LLM applications (Python, 167 stars)
8. index-baselines - Simple baselines for "Learned Indexes" (HTML, 156 stars)
9. FAST - End-to-end earthquake detection pipeline via efficient time series similarity search (Jupyter Notebook, 144 stars)
10. gavel - Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020 (Jupyter Notebook, 124 stars)
11. equivariant-transformers - Equivariant Transformer (ET) layers are image-to-image mappings that incorporate prior knowledge on invariances with respect to continuous transformation groups (ICML 2019). Paper: https://arxiv.org/abs/1901.11399 (Jupyter Notebook, 88 stars)
12. stk (Python, 86 stars)
13. selection-via-proxy (Python, 77 stars)
14. sinkhorn-label-allocation - Sinkhorn Label Allocation is a label assignment method for semi-supervised self-training algorithms. The SLA algorithm is described in full in this ICML 2021 paper: https://arxiv.org/abs/2102.08622 (Python, 53 stars)
15. readinggroup (45 stars)
16. cs145-2017 (Jupyter Notebook, 43 stars)
17. Willump - Willump Is a Low-Latency Useful Machine learning Platform (Python, 42 stars)
18. Baleen - Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21) (Python, 42 stars)
19. msketch - Moments Sketch Code (Jupyter Notebook, 39 stars)
20. Uniserve - A runtime implementation of data-parallel actors (Java, 38 stars)
21. wmsketch - Sketching linear classifiers over data streams with the Weight-Median Sketch (SIGMOD 2018) (C++, 38 stars)
22. dawn-bench-models (Python, 36 stars)
23. momentsketch - Simplified Moment Sketch Implementation (Java, 36 stars)
24. blazeit - It's BlazeIt because it's blazing fast (C++, 28 stars)
25. optimus-maximus - To Index or Not to Index: Optimizing Exact Maximum Inner Product Search (Python, 26 stars)
26. ACORN - State-of-the-art search over vector embeddings and structured data (SIGMOD '24) (C++, 25 stars)
27. acidrain - 2AD analysis prototype and logs from sample applications (Python, 22 stars)
28. lit-code - Code for LIT, ICML 2019 (Python, 21 stars)
29. POP - Code for "Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP", which appeared at SOSP 2021 (Python, 20 stars)
30. loa - Public code for LOA (Python, 18 stars)
31. omg (Python, 17 stars)
32. pytorch-distributed - Fork of diux-dev/imagenet18 (Python, 13 stars)
33. tasti - Semantic Indexes for Machine Learning-based Queries over Unstructured Data (SIGMOD 2022) (Python, 13 stars)
34. cs245-as1 - Student files for CS245 Programming Assignment 1: In-memory data layout (Java, 12 stars)
35. offload-annotations - A new approach for bringing heterogeneous computing to existing libraries and workloads (Python, 9 stars)
36. Willump-Simple - Willump Is a Low-Latency Useful Machine learning Platform (Python, 8 stars)
37. cs245-as3-public - Durable transactions assignment for CS245 (Java, 7 stars)
38. InQuest - Accelerating Aggregation Queries on Unstructured Streams of Data (Python, 7 stars)
39. cs245-as2-public (Scala, 7 stars)
40. training_on_a_dime - Scripts and logs for "Analysis and Exploitation of Dynamic Pricing in the Public Cloud for ML Training", which is to appear at DISPA 2020 (Jupyter Notebook, 7 stars)
41. SparseJointShift - Model Performance Estimation and Explanation When Labels and a Few Features Shift (Python, 7 stars)
42. DROP (Java, 6 stars)
43. tKDC - Repository for tKDE Experiments (Jupyter Notebook, 6 stars)
44. sketchstore - Algorithms for compressing and merging large collections of sketches (Jupyter Notebook, 5 stars)
45. parallel-lb-simulator (Java, 4 stars)
46. crosstrainer - CrossTrainer: Practical Domain Adaptation with Loss Reweighting (Python, 4 stars)
47. smol (C++, 4 stars)
48. supg (Python, 3 stars)
49. fast-tree (C++, 3 stars)
50. abae - Accelerating Approximate Aggregation Queries with Expensive Predicates (VLDB 21) (Python, 3 stars)
51. graphIO - Automated Lower Bounds on the I/O Complexity of Computation Graphs (Python, 3 stars)
52. futuretea-whyrust - Why Rust presentation at FutureTea, 3/13 (Rust, 3 stars)
53. ezmode - An iterative algorithm for selecting rare events in large, unlabeled datasets (Python, 1 star)
54. willump-dfs - Applying Willump design to deep feature synthesis (Python, 1 star)
55. fexipro-benchmarking (C++, 1 star)
56. macrobase-cpp (1 star)
57. swag-python - Situationally aWAre decodinG (Python, 1 star)