• This repository has been archived on 06/May/2021

Elasticsearch plugin offering Neo4j integration for Personalized Search

GraphAware Graph-Aided Search - RETIRED

GraphAware Graph-Aided Search Has Been Retired

As of May 2021, this repository has been retired.

Elasticsearch plugin providing integration with Neo4j

GraphAware Graph-Aided Search is an enterprise-grade bi-directional integration between Neo4j and Elasticsearch. It consists of two independent modules plus test suites. Both modules can be used independently or together to achieve full integration.

The first module is a plugin for Neo4j (more precisely, a GraphAware Transaction-Driven Runtime Module), which can be configured to transparently and asynchronously replicate data from Neo4j to Elasticsearch.

The second module (this module) is a plugin for Elasticsearch that can query the Neo4j graph database during a search query to enrich the result (boost the score) by results that are more efficiently calculated in a graph database, e.g. recommendations.

Both modules are now open source and production-ready. They are also officially supported by GraphAware for GraphAware Enterprise subscribers.

Feature Overview: Graph-Aided Search

This module is a plugin for Elasticsearch that enables users to improve search results by boosting or filtering them using data stored in the Neo4j graph database. After performing a search in Elasticsearch, just before returning the results to the user, this plugin requests additional information from Neo4j via its REST API in order to boost or filter the results.

Two main features are exposed by the plugin:

  • Result Boosting: This feature allows changing the scores of the results. The score can be changed in different ways: mixing the graph score with the Elasticsearch score or replacing it entirely are just two examples. This behaviour can be customized with different formulas by rewriting some methods of the Graph-Aided Search Booster. Usage examples include boosting (i) based on interest prediction (recommendations), (ii) based on friends' interests/likes, and (iii) any other use case that is a good fit for Neo4j.

  • Result Filtering: This feature allows filtering, thus removing documents from the results list. By providing a Cypher query, it is possible to return to the user only documents with IDs matching the results of the Cypher query.

Detailed workflow:

  1. Intercept and parse any search query and look for the GraphAidedSearch extension parameter;
  2. Process the query extension, identify its type (booster or filter), and instantiate the related class;
  3. Perform the operation required to boost or filter by calling the Neo4j REST API (or a Neo4j extension such as the GraphAware Recommendation Engine), passing all necessary information, e.g. Cypher query, target user, etc.;
  4. Return the filtered/boosted result set to the user.


Usage: Installation

Install Graph-Aided Search Binary

Elasticsearch 2.2.2:

$ $ES_HOME/bin/plugin install com.graphaware.es/graph-aided-search/2.2.2.0

Elasticsearch 2.3.1:

$ $ES_HOME/bin/plugin install com.graphaware.es/graph-aided-search/2.3.2.0

Build from source

$ git clone git@github.com:graphaware/graph-aided-search.git
$ cd graph-aided-search
$ mvn clean package
$ $ES_HOME/bin/plugin install file:///path/to/project/graph-aided-search/target/releases/graph-aided-search-2.X.X.0.zip

Start elasticsearch

Configuration

Next, configure each index with the URL of Neo4j. This can be done in two ways. First, through the index settings API:

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.hostname=http://localhost:7474
$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.enable=true

Replace indexname with the name of your index, e.g. curl -XPUT http://localhost:9200/neo4j-index-node/_settings?index.gas.enable=true

If the Neo4j REST API is protected by Basic Authentication, configure the Neo4j username and password as follows:

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.user=neo4j
$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.password=password

If the Neo4j server supports Bolt, it can be enabled and configured with the following settings (index.gas.neo4j.bolt.secure defaults to true):

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.boltHostname=bolt://localhost:7687
$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.bolt.secure=false

Because Bolt is not yet in a stable release, the default protocol is HTTP. To use Bolt, add the following line to the booster or filter configuration:

"protocol": "bolt"

Second, you can use an index template to apply the settings:

    curl -XPOST http://localhost:9200/_template/template_gas -d \
    '{
      "template": "*",
      "settings": {
        "index.gas.neo4j.hostname": "http://localhost:7474",
        "index.gas.neo4j.boltHostname": "bolt://localhost:7687",
        "index.gas.enable": true,
        "index.gas.neo4j.user": "neo4j",
        "index.gas.neo4j.password": "password"
      }
    }'

Disable Plugin

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.enable=false

Queries will continue to work even if they contain Graph-Aided-Search-specific elements, e.g. "gas-booster" and "gas-filter".

Usage: Search Phase

The integration with a pre-existing search query is seamless, since the plugin only requires the addition of new elements into the query.

Booster example

Boosters change the score of the results using an external score source. This could be a recommender, a Cypher query, or any custom booster provider. A simple Elasticsearch query could have the following structure:

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    }
  }';

In this case all the Elasticsearch result hits will have a relevancy score value of 1. If you would like to boost these results according to user interest computed by Graphaware Recommendation Plugin on top of Neo4j, you would change the query in the following way.

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    },
    "gas-booster" :{
          "name": "SearchResultNeo4jBooster",
          "target": "2",
          "maxResultSize": 10,
          "keyProperty": "objectId",
          "neo4j.endpoint": "/graphaware/recommendation/movie/filter/"
       }
  }';

The gas-booster clause identifies the type of operation; in this case it defines a boost operation. The name parameter is mandatory and specifies the Booster class. The remaining parameters depend on the type of booster. The available boosters are described in the following sections.

SearchResultNeo4jBooster

This booster uses Neo4j through custom REST APIs available as plugins for the database. In this case, the name value must be set to SearchResultNeo4jBooster.

The following parameters are available for this booster:

  • target: (Mandatory) The identifier of the target for which the boosting values are computed. Since the boosting is customized according to a target, this parameter is mandatory and makes it possible to get different results for different targets (typically users).

  • maxResultSize: (Defaults to Elasticsearch's maximum result window size, defined by the index.max_result_window setting) Before the search query is submitted to the Elasticsearch engine, its "size" value is replaced with this parameter. This is necessary because applying the boosting function may change the order: some results that would not otherwise "make it" may be boosted into the "size" window.

  • keyProperty: (Default value is uuid) The id of each document in the search results must match some property value of the nodes in the graph. In order to avoid ambiguities in the results, this property must identify a single node. Using GraphAware UUID with Neo4j is recommended for this purpose.

  • operator: (Default is multiply [*]) Specifies how to combine the Elasticsearch score with the score provided by Neo4j. Available operators are: * (multiply), + (sum), - (subtract), / (divide), replace (replace score).

  • neo4j.endpoint: (Default /graphaware/recommendation/filter) Defines the endpoint to which the request is submitted in order to get the boosting values. It is appended to the Neo4j host value defined for the index.
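
To make the operator parameter concrete, here is a minimal Java sketch of how the two scores could be combined. It is an illustration only, not the plugin's actual code: the class name ScoreOperatorSketch and the method combine are hypothetical, and the operand order for - and / (Elasticsearch score first) is an assumption.

```java
/** Hypothetical helper illustrating the documented operators; not part of the plugin. */
final class ScoreOperatorSketch {

    /**
     * Combines an Elasticsearch score with a Neo4j score.
     * Assumption: for "-" and "/" the Elasticsearch score is the left operand.
     */
    static float combine(float esScore, float neo4jScore, String operator) {
        switch (operator) {
            case "*":
                return esScore * neo4jScore;
            case "+":
                return esScore + neo4jScore;
            case "-":
                return esScore - neo4jScore;
            case "/":
                // Float division by zero yields Infinity or NaN rather than throwing.
                return esScore / neo4jScore;
            case "replace":
                return neo4jScore;
            default:
                throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    public static void main(String[] args) {
        // A match_all hit (score 1) boosted by a graph score of 3 with the default operator.
        System.out.println(combine(1.0f, 3.0f, "*"));
    }
}
```

With the default multiply operator, a hit whose Elasticsearch score is 1 and whose graph score is 3 ends up with a score of 3, while replace would discard the Elasticsearch score altogether.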

Information about the list of IDs that should be boosted as well as the target is passed to the API running atop Neo4j. The REST API should expose a POST endpoint that accepts the following parameters:

  • target (URL parameter): The value of target defined above; it identifies the user or item for which the recommender computes the score;

  • limit: Limits the number of results returned by the REST API;

  • from: Skips the given number of results, in order to support pagination;

  • keyProperty: The node property used to identify the nodes. This property is used to filter the results according to the list of "ids";

  • ids: Comma-separated list of node identifiers that must be evaluated and then returned.

Example Call:

http://localhost:7474/graphaware/recommendation/movie/filter/2

Parameters:
limit=2147483647&from=0&keyProperty=objectId&ids=99,166,486,478,270,172,73,84,351,120
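
The example call above can be assembled from the index and booster settings roughly as follows. This is a sketch under assumptions: the class and method names are hypothetical, the values are assumed to be URL-safe (no encoding is applied), and how the plugin actually transmits the parameter string is not shown here.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that rebuilds the example call; not part of the plugin. */
final class BoosterRequestSketch {

    /** Joins the Neo4j host, the neo4j.endpoint value, and the target into one URL. */
    static String endpointUrl(String neo4jHost, String endpoint, String target) {
        return stripTrailingSlash(neo4jHost) + stripTrailingSlash(endpoint) + "/" + target;
    }

    /** Builds the parameter string; assumes all values are already URL-safe. */
    static String parameterString(int limit, int from, String keyProperty, List<String> ids) {
        return "limit=" + limit
                + "&from=" + from
                + "&keyProperty=" + keyProperty
                + "&ids=" + String.join(",", ids);
    }

    private static String stripTrailingSlash(String value) {
        return value.endsWith("/") ? value.substring(0, value.length() - 1) : value;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("99", "166", "486", "478", "270");
        System.out.println(endpointUrl("http://localhost:7474",
                "/graphaware/recommendation/movie/filter/", "2"));
        System.out.println(parameterString(Integer.MAX_VALUE, 0, "objectId", ids));
    }
}
```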

The booster expects the results as a JSON array with the following structure:

[
  {
    "nodeId": 1212,
    "objectId": "270",
    "score": 3
  },
  {
    "nodeId": 1041,
    "objectId": "99",
    "score": 1
  },
  {
    "nodeId": 1420,
    "objectId": "478",
    "score": 1
  },
  {
    "nodeId": 1428,
    "objectId": "486",
    "score": 1
  }
]
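
The following Java sketch shows why the booster needs a larger result window than the "size" the user asked for: a hit that Elasticsearch ranks low can move to the top once its graph score is applied. The class RerankSketch is hypothetical, the default multiply operator is used, and the handling of hits that Neo4j did not score (they keep their Elasticsearch score) is an assumption rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical re-ranking sketch; not the plugin's implementation. */
final class RerankSketch {

    /** Returns the ids of the top `size` hits after multiplying in the graph scores. */
    static List<String> rerank(Map<String, Float> esScores, Map<String, Float> graphScores, int size) {
        Map<String, Float> boosted = new LinkedHashMap<>();
        for (Map.Entry<String, Float> hit : esScores.entrySet()) {
            Float graphScore = graphScores.get(hit.getKey());
            // Assumption: hits that Neo4j did not score keep their Elasticsearch score.
            boosted.put(hit.getKey(),
                    graphScore == null ? hit.getValue() : hit.getValue() * graphScore);
        }
        List<String> ids = new ArrayList<>(boosted.keySet());
        // Stable sort, highest boosted score first; ties keep the Elasticsearch order.
        ids.sort(Comparator.comparing((String id) -> boosted.get(id)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(size, ids.size())));
    }

    public static void main(String[] args) {
        Map<String, Float> es = new LinkedHashMap<>();
        for (String id : new String[] {"99", "166", "486", "478", "270"}) {
            es.put(id, 1.0f); // match_all gives every hit a score of 1
        }
        Map<String, Float> graph = new LinkedHashMap<>();
        graph.put("270", 3.0f);
        graph.put("99", 1.0f);
        graph.put("478", 1.0f);
        graph.put("486", 1.0f);
        System.out.println(rerank(es, graph, 2));
    }
}
```

In this example the document with objectId 270 is the last Elasticsearch hit, yet it ranks first after boosting. Had only the first two hits been fetched, it would never have been considered.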

SearchResultCypherBooster

This booster computes the boosting values by running a Cypher query against Neo4j. In this case the name value must be set to SearchResultCypherBooster.

The following parameters are available for this booster:

  • query: (Mandatory) This parameter contains the query to submit to the Neo4j instance.

  • scoreName: (Default value is "score") The name of the returned value that is used as the score.

  • identifier: (Default value is "id") The name of the returned value that is used for matching IDs.

  • maxResultSize: (Defaults to Elasticsearch's maximum result window size, defined by the index.max_result_window setting) Before the search query is submitted to the Elasticsearch engine, its "size" value is replaced with this parameter. This is necessary because applying the boosting function may change the order: some results that would not otherwise "make it" may be boosted into the "size" window.

  • operator: (Default is multiply [*]) Specifies how to combine the Elasticsearch score with the score provided by Neo4j. Available operators are: * (multiply), + (sum), - (subtract), / (divide), replace (replace score).

The ids of the Elasticsearch result hits are passed to the Cypher query as a parameter named items, which is a list of strings.

Example Use:

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    },
    "gas-booster" :{
          "name": "SearchResultCypherBooster",
          "query": "MATCH (input:User) WHERE id(input) = 2
                    MATCH p=(input)-[r:RATED]->(movie)<-[r2:RATED]-(other)
                    WITH other, collect(p) as paths
                    WITH other, reduce(x=0, p in paths | x + reduce(i=0, r in rels(p) | i+r.rating)) as score
                    WITH other, score
                    ORDER BY score DESC
                    MATCH (other)-[:RATED]->(reco)
                    RETURN reco.objectId as id, score
                    LIMIT 500",
          "maxResultSize": 1000,
          "scoreName": "score",
          "identifier": "id"
       }
  }';
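
As an illustration of how identifier and scoreName relate to the query's RETURN clause, the sketch below turns already-parsed result rows into an id-to-score map. Everything here is hypothetical: the class name, the representation of rows as maps, skipping incomplete rows, and letting the last row win when an id appears more than once are all assumptions.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical mapping of Cypher result rows to scores; not the plugin's code. */
final class CypherRowsSketch {

    /** Reads the columns named by `identifier` and `scoreName` from each row. */
    static Map<String, Float> toScores(List<Map<String, Object>> rows,
                                       String identifier, String scoreName) {
        Map<String, Float> scores = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            Object id = row.get(identifier);
            Object score = row.get(scoreName);
            if (id == null || !(score instanceof Number)) {
                continue; // Assumption: rows without an id or a numeric score are skipped.
            }
            // Assumption: if an id is returned more than once, the last row wins.
            scores.put(String.valueOf(id), ((Number) score).floatValue());
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", "270");   // RETURN reco.objectId as id
        row.put("score", 3);    // ..., score
        System.out.println(toScores(java.util.Collections.singletonList(row), "id", "score"));
    }
}
```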

Filter Example

Filters restrict the results using information stored in the graph. For example, you can filter movies based on what the user's friends have seen. To filter results according to the ratings given by a user's friends, change the Elasticsearch query as follows:

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    },
    "gas-filter" :{
          "name": "SearchResultCypherFilter",
          "query": "MATCH (input:User) WHERE id(input) = 2
                   MATCH (input)-[f:FRIEND_OF]->(friend)-[r:RATED]->(movie)
                   WHERE r.rate > 3
                   RETURN movie.uuid as id",
          "exclude": false
       }
  }';

The gas-filter clause identifies the type of operation; in this case a filter operation. The name parameter is mandatory and specifies the Filter class. The remaining parameters depend on the type of filter. The available filters are described in the following section.

SearchResultCypherFilter

This filter restricts the results using a Cypher query on Neo4j. In this case the name value must be set to SearchResultCypherFilter.

The following parameters are available for this filter:

  • query: (Mandatory) This parameter contains the query to submit to the Neo4j instance.

  • maxResultSize: (Defaults to Elasticsearch's maximum result window size, defined by the index.max_result_window setting) Before the search query is submitted to the Elasticsearch engine, its "size" value is replaced with this parameter. This is necessary because, once the filter is applied, some results that would not otherwise "make it" may fall into the "size" window.

  • exclude: (Default true) Defines the behaviour of the filter. If set to true (default), the documents returned by Neo4j are removed from the results provided by Elasticsearch. If set to false, only the intersection of the Neo4j and Elasticsearch results is kept, i.e. everything that has not been returned by Neo4j is excluded.
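
The two modes of exclude amount to a set difference and a set intersection. A minimal Java sketch, assuming the Elasticsearch hits and the Cypher results have already been reduced to document ids (the class FilterSketch is hypothetical and not part of the plugin):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of the exclude parameter; not the plugin's code. */
final class FilterSketch {

    /** Keeps the Elasticsearch order and applies the documented exclude semantics. */
    static List<String> filter(List<String> esHitIds, Set<String> neo4jIds, boolean exclude) {
        List<String> kept = new ArrayList<>();
        for (String id : esHitIds) {
            boolean returnedByNeo4j = neo4jIds.contains(id);
            // exclude=true drops ids returned by Neo4j; exclude=false keeps only those ids.
            if (returnedByNeo4j != exclude) {
                kept.add(id);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("a", "b", "c");
        Set<String> graphIds = new HashSet<>(Arrays.asList("b", "d"));
        System.out.println(filter(hits, graphIds, true));
        System.out.println(filter(hits, graphIds, false));
    }
}
```

With hits a, b, c and a Cypher result of b, d, exclude=true leaves a and c, while exclude=false leaves only b.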

Customize the plugin

The plugin supports custom boosters and filters. To create a booster, implement SearchResultBooster and add the following annotation to the class:

@SearchBooster(name = "MyCustomBooster")

Moreover, it should be in the package com.graphaware.es.gas.

In order to implement a filter, SearchResultFilter must be implemented and it needs to have the following annotation:

@SearchFilter(name = "MyCustomFilter")

Also in this case, it should be in the package com.graphaware.es.gas.

Version Matrix

The following versions are currently supported:

Version (this project)   Elasticsearch
master                   2.4.4
2.3.2.2                  2.3.2
2.3.1.0                  2.3.1
2.2.2.0                  2.2.2

Issues/Questions

Please file an issue.

License

Copyright (c) 2016 GraphAware

GraphAware is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
