Improve your Elasticsearch, OpenSearch, Solr, Vectara, Algolia and Custom Search search quality.

Quepid

Quepid makes improving your app's search results a repeatable, reliable engineering process that the whole team can understand. It deals with three issues:

  1. Our collaboration stinks: Making holistic progress on search requires deep, cross-functional collaboration. Shooting emails or tracking search requirements in spreadsheets won't cut it.

  2. Search testing is hard: Search changes are cross-cutting: most changes will cause problems. Testing is difficult: you can't run hundreds of searches after every relevance change.

  3. Iterations are slow: Moving forward seems impossible. To avoid sliding backwards, progress is slow. Many simply give up on search, depriving users of the means to find critical information.

To learn more, please check out the Quepid website and the Quepid wiki.

If you are ready to dive right in, you can use the Hosted Quepid service right now or follow the installation steps to set up your own instance of Quepid.

Below is information related to developing the Quepid open source project, primarily for people interested in extending what Quepid can do!

Development Setup

I. System Dependencies

Using Docker Compose

Provisioning from an already built machine takes approximately 3 - 4 minutes. Provisioning from scratch takes approximately 20 minutes.

1. Prerequisites

Make sure you have installed Docker and that the Docker app is running. See https://www.docker.com/community-edition#/download for installation instructions.

To install using brew follow these steps:

brew install --cask docker
brew install --cask docker-toolbox

NOTE: you may get a warning about trusting Oracle on the first try. Open System Preferences > Security & Privacy, click the Allow Oracle button, and then try again to install docker-toolbox

2. Set up your environment

Run the local Ruby-based setup script to set up your Docker images:

bin/setup_docker

If you want to create some cases that have hundreds or thousands of queries, run:

bin/docker r bin/rake db:seed:large_cases

This is useful for stress testing Quepid, especially the front end application!

Lastly, to run the Jupyter notebooks, you need to run:

bin/setup_jupyterlite

3. Running the app

Now fire up Quepid locally at http://localhost:

bin/docker server

It can take up to a minute for the server to respond as it compiles all the front end assets on the first call.

We've created a helper script to run and manage the app through docker that wraps around the docker-compose command. You will need Ruby installed. You can still use docker-compose directly, but for the basic stuff you can use the following:

  • Start the app: bin/docker server or bin/docker s
  • Connect to the app container with bash: bin/docker bash or bin/docker ba
  • Connect to the Rails console: bin/docker console or bin/docker c
  • Run any command: bin/docker run [COMMAND] or bin/docker r [COMMAND]
  • Run dev mode as daemon: bin/docker daemon or bin/docker q
  • Destroy the Docker env: bin/docker destroy or bin/docker d
  • Run front end unit tests: bin/docker r rails test:frontend
  • Run back end unit tests: bin/docker r rails test
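
To make the alias list above concrete, here is a minimal sketch of how a wrapper like this could translate its short commands into docker-compose invocations. It is an illustration only, not the actual contents of bin/docker: the `compose_command` method, the `ALIASES` table, and the exact docker-compose arguments are assumptions (the `app` service name is taken from the production-build steps later in this README).

```ruby
# Hypothetical sketch of an alias-to-command wrapper; the real bin/docker may differ.
ALIASES = {
  "s"  => "server",
  "ba" => "bash",
  "c"  => "console",
  "r"  => "run",
  "q"  => "daemon",
  "d"  => "destroy"
}.freeze

# Returns the docker-compose argv for a wrapper invocation such as
# %w[r rails test]. The compose arguments below are assumptions.
def compose_command(args)
  name, *rest = args
  case ALIASES.fetch(name, name)
  when "server"  then %w[docker-compose up]
  when "daemon"  then %w[docker-compose up -d]
  when "bash"    then %w[docker-compose run --rm app bash]
  when "console" then %w[docker-compose run --rm app bin/rails console]
  when "run"     then %w[docker-compose run --rm app] + rest
  when "destroy" then %w[docker-compose down]
  else raise ArgumentError, "unknown command: #{name.inspect}"
  end
end

puts compose_command(%w[r rails test]).join(" ")
```

A real wrapper would hand the resulting array to `exec` or `system`; returning it as data keeps the sketch easy to inspect.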

II. Development Log

While running the app under foreman, you'll only see a request log. For more detailed logging, run the following:

tail -f log/development.log

III. Run Tests

There are several types of tests that you can run:

Minitest

These run the tests for the Rails side (mainly API controllers and models):

bin/docker r rails test

Run a single test file via:

bin/docker r rails test test/models/user_test.rb

Or even a single test in a test file by passing in the line number!

bin/docker r rails test test/models/user_test.rb:33

If you need to reset your test database setup then run:

bin/docker r bin/rake db:drop RAILS_ENV=test
bin/docker r bin/rake db:create RAILS_ENV=test

To view the logs generated during testing, set config.log_level = :debug in test.rb and then tail the log file via:

tail -f log/test.log

JS Lint

To check the JS syntax:

bin/docker r rails test:jshint

Karma

Runs tests for the Angular side. There are two modes for the karma tests:

  • Single run: bin/docker r rails karma:run
  • Continuous/watched run: bin/docker r bin/rake karma:start

Note: The karma tests require the assets to be precompiled, which adds a significant amount of time to the test run. If you are only making changes to the test/spec files, then it is recommended you run the tests in watch mode (bin/docker r bin/rake karma:start). The caveat is that any time you make a change to the app files, you will have to restart the process (or use the single run mode).

Rubocop

To check the Ruby syntax:

bin/docker r bundle exec rubocop

Rubocop can often autocorrect many of the lint issues it runs into via --autocorrect-all:

bin/docker r bundle exec rubocop --autocorrect-all

If there is a new "Cop" (as RuboCop calls its rules) that we don't like, you can add an entry for it to the .rubocop.yml file.

All Tests

If you want to run all of the tests in one go (before you commit and push for example), just run these two commands:

bin/docker r rails test
bin/docker r rails test:frontend

For some reason we can't run both with one command, though we should be able to!

Performance Testing

If you want to create a LOT of queries for a user for testing, then run:

bin/docker r bin/rake db:seed:large_cases

You will have two users, [email protected] and [email protected] to test with.

IV. Debugging

Debugging Ruby

How you debug Ruby usually depends on the situation. The simplest way is to print the object to STDOUT:

puts object         # Prints out the .to_s method of the object
puts object.inspect # Inspects the object and prints it out (includes the attributes)
pp object           # Pretty Prints the inspected object (like .inspect but better)

In the Rails application you can use the logger for the output:

Rails.logger.debug object.inspect
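
The logger call above can be tried outside Rails, because Rails.logger is built on Ruby's standard Logger class. The sketch below uses a plain Logger writing to an in-memory buffer as a stand-in; the `user` hash and the INFO threshold are assumptions chosen to show how log levels filter messages.

```ruby
require "logger"
require "stringio"

# Stand-in for Rails.logger so the sketch runs without Rails.
LOG_BUFFER = StringIO.new
LOGGER = Logger.new(LOG_BUFFER)
LOGGER.level = Logger::INFO

# Hypothetical object to inspect.
user = { email: "user@example.com", cases: 2 }

LOGGER.debug("user: #{user.inspect}")          # dropped: DEBUG is below INFO
LOGGER.info("user: #{user.inspect}")           # written to the buffer
LOGGER.debug { "expensive: #{user.inspect}" }  # block is not evaluated at INFO

puts LOG_BUFFER.string
```

The block form is worth using when building the message is costly, since the string is only constructed if the level is enabled.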

If that's not enough and you want to run a debugger, the debug gem is included for that. See https://guides.rubyonrails.org/debugging_rails_applications.html#debugging-with-the-debug-gem.

Also, we have the derailed gem available which helps you understand memory issues.

bin/docker r bundle exec derailed bundle:mem

Debugging JS

While running the application, you can debug the javascript using your favorite tool, the way you've always done it.

The javascript files will be concatenated into one file, using the rails asset pipeline.

You can turn that off by toggling the following flag in config/environments/development.rb:

# config.assets.debug = true
config.assets.debug = false

to

config.assets.debug = true
# config.assets.debug = false

This is turned off by default because the application has many AngularJS files, and in debug mode Rails loads every file separately. That slows the application down, and waiting for the scripts to load becomes really annoying in development mode.

PS: Don't forget to restart the server when you change the config.

Also please note that the files secure.js, application.js, and admin.js are used to load all the JavaScript and CSS dependencies via the Rails Asset pipeline. If you are debugging Bootstrap, then you will want individual files. So replace //= require sprockets with //= require bootstrap-sprockets.

Webpacker

To use Webpacker, which compiles JavaScript code into packs and loads changes faster, run:

bin/rails webpacker:install

Before that, I had to install MySQL:

brew install mysql

Debugging Splainer and other NPM packages

Copy docker-compose.override.yml.example to docker-compose.override.yml and use it to override the environment variables defined in docker-compose.yml, or to work with a local copy of the splainer-search JS library during development. An example is included; just update the path to splainer-search to point at your local checkout. See https://docs.docker.com/compose/extends/ for details.

Convenience Scripts

This application has two ways of running scripts: rake & thor.

Rake is great for simple tasks that depend on the application environment, and for the default tasks that ship with Rails.

Thor is a more powerful tool for writing scripts, and it handles arguments much more nicely than Rake.
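
As a rough illustration of the argument handling mentioned above, here is a small Rake task that takes one positional argument. The `demo:greet` task and the `GREETINGS` array are hypothetical and not part of Quepid; the sketch only shows Rake's bracket-style arguments.

```ruby
require "rake"

# Collects output so the result can be inspected after the task runs.
GREETINGS = []

# Hypothetical task, for illustration only.
namespace :demo do
  desc "Build a greeting for NAME"
  task :greet, [:name] do |_task, args|
    # Rake arguments are positional and arrive via the args object.
    name = args[:name] || "world"
    GREETINGS << "Hello, #{name}!"
  end
end

# Programmatic equivalent of running: rake "demo:greet[Ada]"
Rake::Task["demo:greet"].invoke("Ada")
puts GREETINGS.last
```

On the command line the argument has to be passed in brackets and usually quoted for the shell, which is the awkwardness that named options in Thor avoid.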

Rake

To see what rake tasks are available run:

bin/docker r bin/rake -T

Note: the use of bin/rake makes sure that the version of rake that is running is the one locked to the app's Gemfile.lock (to avoid conflicts with other versions that might be installed on your system). This is equivalent to bundle exec rake.

Common rake tasks that you might use:

# db
bin/docker r bin/rake db:create
bin/docker r bin/rake db:drop
bin/docker r bin/rake db:migrate
bin/docker r bin/rake db:rollback
bin/docker r bin/rake db:schema:load
bin/docker r bin/rake db:seed
bin/docker r bin/rake db:setup

# show routes
bin/docker r bin/rails routes

# tests
bin/docker r rails test
bin/docker r rails test:frontend
bin/docker r bin/rake test:jshint

Thor

To see available tasks:

bin/docker r thor list

Examples include:

case
----
thor case:create NAME ...      # creates a new case
thor case:share CASEID TEAMID  # shares case with an team

ratings
-------
thor ratings:generate SOLRURL FILENAME  # generates random ratings into a .csv file
thor ratings:import CASEID FILENAME     # imports ratings to a case

user
----
thor user:create EMAIL USERNAME PASSWORD    # creates a new user
thor user:grant_administrator EMAIL         # grant administrator privileges to user
thor user:reset_password EMAIL NEWPASSWORD  # resets user's password

To see more details about any of the tasks, run bin/docker r thor help TASKNAME:

thor help user:create
Usage:
  thor user:create EMAIL USERNAME PASSWORD

Options:
  -a, [--administrator], [--no-administrator]

Description:
  `user:create` creates a new user with the passed in email, name and password.

  EXAMPLES:

  $ thor user:create [email protected] "Eric Pugh" mysuperstrongpassword

  With -a option, will mark the user as Administrator

  EXAMPLES:

  $ thor user:create -a [email protected] Administrator mysuperstrongpassword

Elasticsearch

You will need to configure Elasticsearch to accept requests from the browser using CORS. To enable CORS, add the following to Elasticsearch's config file. Usually, this file is located near the Elasticsearch executable at config/elasticsearch.yml.

http.cors:
  enabled: true
  allow-origin: /https?:\/\/localhost(:[0-9]+)?/

See more details on the wiki at https://github.com/o19s/quepid/wiki/Troubleshooting-Elasticsearch-and-Quepid
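
To see which origins the allow-origin pattern above accepts, it can be tried out in Ruby. This is a sketch under one assumption: that the whole Origin header value must match the pattern, which is why the Ruby version is anchored with \A and \z. The `origin_allowed?` helper is hypothetical.

```ruby
# The allow-origin pattern from the config above, anchored to the whole string
# (assumption: the full Origin header value has to match).
ALLOWED_ORIGIN = %r{\Ahttps?://localhost(:[0-9]+)?\z}

# Hypothetical helper that mirrors the CORS origin check.
def origin_allowed?(origin)
  ALLOWED_ORIGIN.match?(origin)
end

%w[http://localhost:3000 https://localhost http://example.com].each do |origin|
  puts "#{origin}: #{origin_allowed?(origin)}"
end
```

If Quepid is served from a host other than localhost, the pattern in elasticsearch.yml needs to be widened to cover that origin.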

Dev Errata

I'd like to use a new Node module, or update an existing one

Typically you would simply do:

bin/docker r yarn add foobar

or

bin/docker r yarn upgrade foobar

which will install/upgrade the Node module, and then save that dependency to package.json.

Then check in the updated package.json and yarn.lock files.

Use bin/docker r yarn outdated to see which packages you can update.

I'd like to use a new Ruby Gem, or update an existing one

Typically you would simply do:

bin/docker r bundle add foobar

which will install the new Gem, and then save that dependency to Gemfile.

You can also upgrade a gem that doesn't have a specific version in Gemfile via:

bin/docker r bundle update foobar

You can remove a gem via:

bin/docker r bundle remove foobar --install

Then check in the updated Gemfile and Gemfile.lock files. For good measure, run bin/setup_docker.

To understand if you have gems that are out of date run:

bin/docker r bundle outdated --groups

How to test nesting Quepid under a domain.

In docker-compose.yml, uncomment the setting - RAILS_RELATIVE_URL_ROOT=/quepid-app and then open http://localhost:3000/quepid-app.

I'd like to run and test out a local PRODUCTION build

These steps should get a production build (versus the developer build) of Quepid up and running locally.

  • Make the desired changes to the code
  • From the root dir in the project run the following to build a new docker image:
docker build -t o19s/quepid -f Dockerfile.prod .

This could error on the first run. Try again if that happens.

  • Tag a new version of your image.
  • You can either hard code your version or use a sys var for it (like QUEPID_VERSION=10.0.0) or if you prefer use 'latest'
docker tag o19s/quepid o19s/quepid:$QUEPID_VERSION
  • Bring up the mysql container
docker-compose up -d mysql
  • Run the initialization scripts. This can take a few seconds
docker-compose run --rm app bin/rake db:setup
  • Update your docker-compose.prod.yml file to use your image by updating the image version in the app image: o19s/quepid:10.0.0

  • Start up the app either as a Daemon (-d) or as an active container

docker-compose up [-d]

I'd like to test SSL

There's a directory .ssl that contains the key and cert files used for SSL. This is a self-signed certificate generated for use in development ONLY!

The key/cert were generated using the following command:

openssl req -new -newkey rsa:2048 -sha1 -days 365 -nodes -x509 -keyout .ssl/localhost.key -out .ssl/localhost.crt

PS: It is not necessary to do that again.

The docker-compose.yml file contains an nginx reverse proxy that uses these certificates. You can access Quepid at https://localhost or http://localhost. (Quepid will still be available over http on port 80.)

I'd like to test OpenID Auth

Add dev docs here!

In the developer deployment, the Keycloak Admin console credentials are admin and password.

Modifying the database

Here is an example of generating a migration:

bin/docker r bundle exec bin/rails g migration FixCuratorVariablesTriesForeignKeyName

Followed by bin/docker r bundle exec rake db:migrate

You should also update the schema annotation data by running bin/docker r bundle exec annotations when you change the schema.

Updating RubyGems

Modify the file Gemfile and then run:

bin/docker r bundle install

You will see an updated Gemfile.lock. Go ahead and check it and Gemfile into Git.

How does the Frontend work?

We use Angular 1 for the front end, and as part of that we use the angular-ui-bootstrap package for all our UI components. This package is tied to Bootstrap version 3. We import the Bootstrap 3 CSS directly via the file bootstrap.css.

For the various Admin pages, we actually are using Bootstrap 5! That is included via the package.json using NPM. See admin.js for the line //= require bootstrap/dist/js/bootstrap.bundle, which is where we include it.

We currently use Rails Sprockets to compile everything, but do have dreams of moving the JavaScript over to Webpacker.

I'd like to develop Jupyterlite

Run ./bin/setup_jupyterlite to update the archive file ./jupyterlite/notebooks.gz. This also sets up the static files in the ./public/notebooks directory. However, so that we don't check in hundreds of files, that directory is ignored by Git. At assets:precompile time we unpack the ./jupyterlite/notebooks.gz file instead. This works on Heroku and with the production Docker image.

To update the version of Jupyterlite edit Dockerfile.dev and Dockerfile.prod and update the pip install version.

Open question: does Jupyterlite work on localhost?

How do Personal Access Tokens work?

See this great blog post: https://keygen.sh/blog/how-to-implement-api-key-authentication-in-rails-without-devise/.
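
As a rough picture of the general idea behind token-based API authentication, here is a self-contained sketch: the server issues a random token, stores only a keyed digest of it, and later looks up incoming tokens by digest. This is a generic illustration and not Quepid's implementation; `SECRET`, `TOKEN_OWNERS`, `issue_token`, and `authenticate` are all hypothetical names, and the in-memory hash stands in for a database table.

```ruby
require "openssl"
require "securerandom"

# Hypothetical server-side secret; a real app would load this from configuration.
SECRET = "development-only-secret"

# Maps token digest => owner; stands in for a database table in this sketch.
TOKEN_OWNERS = {}

# Keyed digest, so the raw token never needs to be stored.
def token_digest(token)
  OpenSSL::HMAC.hexdigest("SHA256", SECRET, token)
end

# Creates a token for an owner and returns the raw value exactly once.
def issue_token(owner)
  token = SecureRandom.hex(32)
  TOKEN_OWNERS[token_digest(token)] = owner
  token
end

# Resolves an "Authorization: Bearer <token>" header value to an owner, or nil.
def authenticate(header)
  scheme, token = header.to_s.split(" ", 2)
  return nil unless scheme == "Bearer" && token

  TOKEN_OWNERS[token_digest(token)]
end

token = issue_token("demo-user")
puts authenticate("Bearer #{token}").inspect
puts authenticate("Bearer not-a-real-token").inspect
```

Storing only the digest means a leaked table does not reveal usable tokens, which is the main design point of this approach.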

QA

There is a code deployment pipeline to the http://quepid-staging.herokuapp.com site that is run on successful commits to main.

If you have pending migrations you will need to run them via:

heroku run bin/rake db:migrate -a quepid-staging
heroku restart -a quepid-staging

Seed Data

The accounts below are created through the seeds. They all use this format:

email: quepid+[type]@o19s.com
password: password

where type is one of the following:

  • admin: An admin account
  • 1case: A user with 1 case
  • 2case: A user with 2 cases
  • solr: A user with a Solr case
  • es: A user with an ES case
  • realisticActivity: A user with a Solr case that has 10s of queries and 30 tries
  • 100sOfQueries: A user with a Solr case that has 100s of queries (usually disabled)
  • 1000sOfQueries: A user with a Solr case that has 1000s of queries (usually disabled)
  • oscOwner: A user who owns the team 'OSC'
  • oscMember: A user who is a member of the team 'OSC'
  • CustomScorer: A user who has a custom scorer
  • CustomScorerDefault: A user who has a custom scorer that is set as their default

Data Map

Check out the Data Mapping file for more info about the data structure of the app.

Rebuild the ERD via bin/docker r bundle exec rake erd:image

App Structure

Check out the App Structure file for more info on how Quepid is structured.

Operating Documentation

Check out the Operating Documentation file for more information on how Quepid can be operated and configured for your company.

Thank You's

Quepid would not be possible without the contributions from many individuals and organizations.

Specifically we would like to thank Erik Bugge and the folks at Kobler for funding the Only Rated feature released in Quepid 6.4.0.

Quepid wasn't always open source! Check out the credits for a list of contributors to the project.

If you would like to fund development of a new feature for Quepid do get in touch!
