This repository contains the code for the Neural Search for startups demo.
The demo is based on the vector search engine Qdrant.
Install the Python requirements:
pip install poetry
poetry install
You will also need Docker and docker-compose.
To launch this demo locally, you will need to download the data first.
The source of the original data is https://www.startups-list.com/
You can download the data via the following command:
wget https://storage.googleapis.com/generall-shared-data/startups_demo.json -P data/
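Before indexing, it can help to confirm that the download succeeded and to look at a few records. The sketch below assumes the file stores one JSON object per line (JSON Lines); this README does not state the layout, so adjust the parsing if the file turns out to be a single JSON array. The helper name `preview_records` is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def preview_records(path, limit=3):
    """Return the first `limit` records of a JSON Lines file.

    Assumption: one JSON object per line. Blank lines are skipped.
    """
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            records.append(json.loads(line))
            if len(records) >= limit:
                break
    return records
```

For example, `preview_records("data/startups_demo.json")` returns a short list of dictionaries whose keys show which fields are available.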
To launch the service locally, use:
docker-compose -f docker-compose-local.yaml up
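The containers can take a moment to come up. A minimal sketch for waiting until a port accepts TCP connections is shown here; `wait_for_port` is a hypothetical helper, and which port to poll is an assumption (this README only mentions port 8000 for the API). An open port shows that something is listening, not that the application has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 8000)` blocks for up to 30 seconds and reports whether the port became reachable.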
After the service has started, you can upload the initial data to the search engine.
# Init neural index
python -m qdrant_demo.init_collection_startups
After a successful upload, the neural search API will be available at http://localhost:8000/docs
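Once the API is up, it can also be queried from a script. The sketch below uses only the standard library. The route `/api/search` and the query parameter `q` are assumptions that this README does not confirm; check the interactive documentation at the URL above for the actual route and parameters. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"
SEARCH_PATH = "/api/search"  # assumption: verify the real route in /docs


def build_search_url(query, base_url=BASE_URL, path=SEARCH_PATH):
    """Build a search URL with the query text URL-encoded as parameter `q`."""
    return f"{base_url.rstrip('/')}{path}?{urlencode({'q': query})}"


def search(query, timeout=10.0):
    """Send the query to the running service and return the decoded JSON body."""
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return json.load(response)
```

With the service running, `search("machine learning")` would return whatever JSON the endpoint produces. The response shape is not described here, so inspect it before relying on specific fields.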
You can play with the data in the following Colab Notebook.
Alternatively, you can use a larger dataset of companies provided by Crunchbase.
You will need to register at https://www.crunchbase.com/ and get an API key.
# Download data
wget 'https://api.crunchbase.com/odm/v4/odm.tar.gz?user_key=<CRUNCHBASE-API-KEY>' -O odm.tar.gz
Decompress the data and put organizations.csv into the ./data folder.
# Decompress data
tar -xvf odm.tar.gz
mv odm/organizations.csv ./data
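A quick sanity check of the extracted file before indexing can catch a truncated download or a misplaced file. This sketch reads the header row and counts the data rows; `summarize_csv` is a hypothetical helper, and the sketch assumes the file is UTF-8 encoded with a header on the first line. It makes no assumption about which columns Crunchbase provides.

```python
import csv


def summarize_csv(path):
    """Return (column_names, data_row_count) for a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        columns = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return columns, row_count
```

For example, `summarize_csv("data/organizations.csv")` returns the column names and the number of organizations in the file.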
After that, you can run indexing of Crunchbase data into Qdrant.
# Init neural index
python -m qdrant_demo.init_collection_crunchbase