Neo4j based Recommendation Engine Framework for PHP

GraphAware Reco4PHP

GraphAware Reco4PHP is a library for building complex recommendation engines atop Neo4j.

Features:

  • Clean and flexible design
  • Built-in algorithms and functions
  • Ability to measure recommendation quality
  • Built-in Cypher transaction management

Requirements:

  • PHP 7.0+
  • Neo4j 2.2.6+ (Neo4j 3.0+ recommended)

The library imposes a specific recommendation engine architecture, which has emerged from our experience building recommendation engines and solves the architectural challenge of running recommendation engines remotely via Cypher. In return, it handles all the plumbing so that you only write the recommendation business logic specific to your use case.

Recommendation Engine Architecture

Discovery Engines and Recommendations

The purpose of a recommendation engine is to recommend something, whether users you should follow, products you should buy, or articles you should read.

The first step in the recommendation process is to find items to recommend; this is called the discovery process.

In Reco4PHP, a DiscoveryEngine is responsible for discovering items to recommend in one possible way.

Generally, recommender systems contain multiple discovery engines. If you were writing the "who you should follow on GitHub" recommendation engine, you might end up with the following non-exhaustive list of discovery engines:

  • Find people who contributed to the same repositories as me
  • Find people who FOLLOW the same people I follow
  • Find people who WATCH the same repositories I'm watching
  • ...

Each Discovery Engine produces a set of Recommendations, each of which contains the discovered item as well as the score for this item (more below).
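As a rough illustration of this idea (plain PHP, not the Reco4PHP API), the sketch below merges the output of several discovery functions into one set keyed by item, keeping a per-engine score for each item. The function names, item identifiers and scores are all made up for the example; in Reco4PHP the candidates would come from Cypher queries.

```php
<?php

// Hypothetical discovery functions: each returns [itemId => score].
function sameRepositories(): array
{
    return ['alice' => 3, 'bob' => 1];
}

function sameFollowing(): array
{
    return ['bob' => 2, 'carol' => 5];
}

/**
 * Merge the results of several discovery engines.
 *
 * @param callable[] $engines engine name => callable returning [itemId => score]
 * @return array itemId => ['scores' => [engine => score], 'total' => number]
 */
function discover(array $engines): array
{
    $recommendations = [];
    foreach ($engines as $name => $engine) {
        foreach ($engine() as $itemId => $score) {
            if (!isset($recommendations[$itemId])) {
                $recommendations[$itemId] = ['scores' => [], 'total' => 0];
            }
            // Remember which engine contributed which part of the score.
            $recommendations[$itemId]['scores'][$name] = $score;
            $recommendations[$itemId]['total'] += $score;
        }
    }

    return $recommendations;
}

$recommendations = discover([
    'same_repositories' => 'sameRepositories',
    'same_following' => 'sameFollowing',
]);
```

An item found by several engines (here `bob`) ends up with one score entry per engine, which is what makes a recommendation explainable afterwards.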

Filters and BlackLists

The purpose of Filters is to compare the original input to the discovered item and decide whether this item should be recommended to the user. A very straightforward filter is ExcludeSelf, which excludes the item if it is the same node as the input; this can happen fairly easily in a densely connected graph.

BlackLists, on the other hand, are sets of predefined nodes that should not be recommended to the user. For example, if you recommend products a user should buy, you could create a BlackList containing the items that user has already purchased.
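The difference between the two can be sketched in plain PHP (this is not the Reco4PHP API; `applyFilters`, the string identifiers and the sample data are assumptions made for the illustration). A filter is a per-item decision that looks at the input and the item, while a blacklist is a precomputed set of items to drop.

```php
<?php

/**
 * Apply filters and a blacklist to discovered candidates.
 *
 * @param string     $input      identifier of the input node
 * @param string[]   $candidates identifiers of the discovered items
 * @param callable[] $filters    each takes ($input, $item) and returns bool
 * @param string[]   $blacklist  identifiers that must never be recommended
 * @return string[]
 */
function applyFilters(string $input, array $candidates, array $filters, array $blacklist): array
{
    // Flip the blacklist so membership checks become key lookups.
    $blacklisted = array_flip($blacklist);
    $kept = [];

    foreach ($candidates as $item) {
        if (isset($blacklisted[$item])) {
            continue;
        }
        foreach ($filters as $filter) {
            if (!$filter($input, $item)) {
                continue 2; // one rejecting filter is enough to drop the item
            }
        }
        $kept[] = $item;
    }

    return $kept;
}

// Plain-PHP equivalent of the ExcludeSelf filter described above.
$excludeSelf = function (string $input, string $item): bool {
    return $input !== $item;
};

$kept = applyFilters('me', ['me', 'alice', 'bob'], [$excludeSelf], ['bob']);
```

Here `me` is rejected by the filter and `bob` by the blacklist, so only `alice` remains.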

PostProcessors

PostProcessors let you post-process a recommendation after it has passed the filtering and blacklisting steps.

For example, if you want to reward a recommended person for living in the same city as you, it wouldn't make sense to load everyone who lives in that city from the database during the discovery phase (this could be millions of people if you take London as an example).

Instead, you would create a RewardSameCity post processor that adjusts the score of a produced recommendation if the input node and the recommended item live in the same city.
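A minimal sketch of that idea in plain PHP follows. It does not use the Reco4PHP classes; the array shapes, the `rewardSameCity` function and the bonus of 10 are assumptions chosen for the example. The point is that only the already discovered recommendations are inspected, never the whole population of a city.

```php
<?php

/**
 * Add a bonus to every recommended person living in the same city as the input.
 *
 * @param array $input           ['name' => string, 'city' => string]
 * @param array $recommendations list of ['name' => string, 'city' => string, 'score' => number]
 * @param int   $bonus           hypothetical reward for a city match
 * @return array the recommendations with adjusted scores
 */
function rewardSameCity(array $input, array $recommendations, int $bonus = 10): array
{
    foreach ($recommendations as &$recommendation) {
        if ($recommendation['city'] === $input['city']) {
            $recommendation['score'] += $bonus;
        }
    }
    unset($recommendation); // break the reference left by the foreach

    return $recommendations;
}

$me = ['name' => 'me', 'city' => 'London'];
$rewarded = rewardSameCity($me, [
    ['name' => 'alice', 'city' => 'London', 'score' => 3],
    ['name' => 'bob', 'city' => 'Paris', 'score' => 5],
]);
```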

Summary

To summarize, a typical recommendation engine is a set of:

  • one or more Discovery Engines
  • zero or more Filters and BlackLists
  • zero or more PostProcessors

Let's get started!

Usage by example

We will use the small dataset available from MovieLens containing movies, users and ratings as well as genres.

The dataset is publicly available at http://grouplens.org/datasets/movielens/. The data set to download is in the MovieLens Latest Datasets section and is named ml-latest-small.zip.

Once you have downloaded and extracted the archive, you can run the following Cypher statements to import the dataset. Adapt the file URLs to match the actual path to your files:

CREATE CONSTRAINT ON (m:Movie) ASSERT m.id IS UNIQUE;
CREATE CONSTRAINT ON (g:Genre) ASSERT g.name IS UNIQUE;
CREATE CONSTRAINT ON (u:User) ASSERT u.id IS UNIQUE;

// Import the movies and link each one to its genres
LOAD CSV WITH HEADERS FROM "file:///Users/ikwattro/dev/movielens/movies.csv" AS row
WITH row
MERGE (movie:Movie {id: toInt(row.movieId)})
ON CREATE SET movie.title = row.title
WITH movie, row
UNWIND split(row.genres, '|') as genre
MERGE (g:Genre {name: genre})
MERGE (movie)-[:HAS_GENRE]->(g);

// Import the users and their ratings, committing every 500 rows
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:///Users/ikwattro/dev/movielens/ratings.csv" AS row
WITH row
MATCH (movie:Movie {id: toInt(row.movieId)})
MERGE (user:User {id: toInt(row.userId)})
MERGE (user)-[r:RATED]->(movie)
ON CREATE SET r.rating = toInt(row.rating), r.timestamp = toInt(row.timestamp);

For the purpose of the example, we will assume we are recommending movies for the User with ID 460.

Installation

Require the dependency with Composer:

composer require graphaware/reco4php

Usage

Discovery

To recommend movies people should watch, we have decided to find potential recommendations in the following way:

  • Find movies rated by people who rated the same movies as me, but that I haven't rated yet

As mentioned before, the Reco4PHP framework handles all the plumbing so you only have to concentrate on the business logic. It provides base classes that you extend, implementing only the methods required by their interfaces. Here is how you would create your first discovery engine:

<?php

namespace GraphAware\Reco4PHP\Tests\Example\Discovery;

use GraphAware\Common\Cypher\Statement;
use GraphAware\Common\Type\Node;
use GraphAware\Reco4PHP\Context\Context;
use GraphAware\Reco4PHP\Engine\SingleDiscoveryEngine;

class RatedByOthers extends SingleDiscoveryEngine
{
    public function discoveryQuery(Node $input, Context $context)
    {
        $query = 'MATCH (input:User) WHERE id(input) = {id}
        MATCH (input)-[:RATED]->(m)<-[:RATED]-(o)
        WITH distinct o
        MATCH (o)-[:RATED]->(reco)
        RETURN distinct reco LIMIT 500';

        return Statement::create($query, ['id' => $input->identity()]);
    }

    public function name()
    {
        return "rated_by_others";
    }
}

The discoveryQuery method should return a Statement object containing the query for finding recommendations. The name method should return a string naming your engine (this is mostly for logging purposes).

The query is deliberately bounded. Returning every movie found as a candidate would yield 10k+ results on this small dataset, so imagine what it would be on a 100M dataset. The query above therefore limits the result to 500 potential recommendations. Summing the ratings in the query and returning the sum as score would additionally favor the most rated movies.

The base class assumes that the recommended node is returned under the identifier reco and the score of the produced recommendation under the identifier score. The score is optional: a recommendation without one is given a default score of 1.

All these defaults are customizable by overriding the methods from the base class (see the Customization section).
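To make these conventions concrete, here is a plain-PHP sketch of how a result row could be turned into an item and a score. It is not the library's implementation: `toRecommendation` and its parameters are hypothetical, and only the reco and score identifiers and the default score of 1 come from the text above.

```php
<?php

/**
 * Turn one result row into an item/score pair.
 *
 * @param array  $row      one row of a query result, keyed by identifier
 * @param string $itemKey  identifier holding the recommended node
 * @param string $scoreKey identifier holding the score
 * @param float  $default  score used when the row has no score column
 * @return array ['item' => mixed, 'score' => number]
 */
function toRecommendation(array $row, string $itemKey = 'reco', string $scoreKey = 'score', float $default = 1.0): array
{
    return [
        'item' => $row[$itemKey],
        // Fall back to the default when the query returned no score.
        'score' => $row[$scoreKey] ?? $default,
    ];
}

$withScore = toRecommendation(['reco' => 'Matrix, The (1999)', 'score' => 42]);
$withoutScore = toRecommendation(['reco' => 'Toy Story (1995)']);
```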

This discovery engine will then produce a set of up to 500 scored Recommendation objects that you can use in your filters or post processors.

Filtering

As an example of a filter, we will filter out the movies that were released before 1999. The year is written in the movie title, so we will use a regex to extract the year in the filter.

<?php

namespace GraphAware\Reco4PHP\Tests\Example\Filter;

use GraphAware\Common\Type\Node;
use GraphAware\Reco4PHP\Filter\Filter;

class ExcludeOldMovies implements Filter
{
    public function doInclude(Node $input, Node $item)
    {
        // MovieLens titles carry the release year in parentheses, e.g. "Matrix, The (1999)".
        $title = $item->value("title");

        // Capture the digits between the parentheses; exclude the movie if no year is found.
        if (!preg_match('/\((\d+)\)/', $title, $matches)) {
            return false;
        }

        // Keep only movies released in 1999 or later.
        return (int) $matches[1] >= 1999;
    }
}

The Filter interface requires you to implement the doInclude method, which should return a boolean. The method arguments give you access to the recommended node as well as the input.
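The year extraction can be tried out on its own, without a database. The sketch below isolates it in two helper functions, `extractYear` and `isRecentEnough`, which are hypothetical names introduced for this illustration and are not part of Reco4PHP.

```php
<?php

/**
 * Extract the release year from a MovieLens-style title such as "Matrix, The (1999)".
 * Returns null when the title contains no number in parentheses.
 */
function extractYear(string $title)
{
    if (preg_match('/\((\d+)\)/', $title, $matches) !== 1) {
        return null;
    }

    return (int) $matches[1];
}

/**
 * Decide whether a movie passes the filter: a year must be present
 * and must not be earlier than $minYear.
 */
function isRecentEnough(string $title, int $minYear = 1999): bool
{
    $year = extractYear($title);

    return $year !== null && $year >= $minYear;
}
```

A title without a year is excluded, which mirrors the final `return false` of the filter above.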

Blacklist

Of course, we do not want to recommend movies that the current user has already rated. For this, we will create a Blacklist that builds a set of these already rated movie nodes.

<?php

namespace GraphAware\Reco4PHP\Tests\Example\Filter;

use GraphAware\Common\Cypher\Statement;
use GraphAware\Common\Type\Node;
use GraphAware\Reco4PHP\Filter\BaseBlacklistBuilder;

class AlreadyRatedBlackList extends BaseBlacklistBuilder
{
    public function blacklistQuery(Node $input)
    {
        $query = 'MATCH (input) WHERE id(input) = {inputId}
        MATCH (input)-[:RATED]->(movie)
        RETURN movie as item';

        return Statement::create($query, ['inputId' => $input->identity()]);
    }

    public function name()
    {
        return 'already_rated';
    }
}

You only need to add the logic for matching the nodes that should be blacklisted; the framework takes care of filtering the recommended nodes against the blacklists provided.

Post Processors

Post Processors are meant to add additional scoring to the recommended items. In our example, we reward each produced recommendation with the sum of the ratings the movie has received:

<?php

namespace GraphAware\Reco4PHP\Tests\Example\PostProcessing;

use GraphAware\Common\Cypher\Statement;
use GraphAware\Common\Result\Record;
use GraphAware\Common\Type\Node;
use GraphAware\Reco4PHP\Post\RecommendationSetPostProcessor;
use GraphAware\Reco4PHP\Result\Recommendation;
use GraphAware\Reco4PHP\Result\Recommendations;
use GraphAware\Reco4PHP\Result\SingleScore;

class RewardWellRated extends RecommendationSetPostProcessor
{
    public function buildQuery(Node $input, Recommendations $recommendations)
    {
        $query = 'UNWIND {ids} as id
        MATCH (n) WHERE id(n) = id
        MATCH (n)<-[r:RATED]-(u)
        RETURN id(n) as id, sum(r.rating) as score';

        $ids = [];
        foreach ($recommendations->getItems() as $item) {
            $ids[] = $item->item()->identity();
        }

        return Statement::create($query, ['ids' => $ids]);
    }

    public function postProcess(Node $input, Recommendation $recommendation, Record $record)
    {
        $recommendation->addScore($this->name(), new SingleScore($record->get('score'), 'total_ratings_relationships'));
    }

    public function name()
    {
        return "reward_well_rated";
    }
}

Wiring it all together

Now that our components are created, we need to actually build our recommendation engine:

<?php

namespace GraphAware\Reco4PHP\Tests\Example;

use GraphAware\Reco4PHP\Engine\BaseRecommendationEngine;
use GraphAware\Reco4PHP\Tests\Example\Filter\AlreadyRatedBlackList;
use GraphAware\Reco4PHP\Tests\Example\Filter\ExcludeOldMovies;
use GraphAware\Reco4PHP\Tests\Example\PostProcessing\RewardWellRated;
use GraphAware\Reco4PHP\Tests\Example\Discovery\RatedByOthers;

class ExampleRecommendationEngine extends BaseRecommendationEngine
{
    public function name()
    {
        return "example";
    }

    public function discoveryEngines()
    {
        return array(
            new RatedByOthers()
        );
    }

    public function blacklistBuilders()
    {
        return array(
            new AlreadyRatedBlackList()
        );
    }

    public function postProcessors()
    {
        return array(
            new RewardWellRated()
        );
    }

    public function filters()
    {
        return array(
            new ExcludeOldMovies()
        );
    }
}

Your recommender service might host multiple recommendation engines serving different recommendations, so the last step is to create this service and register each RecommendationEngine you have created. You also need to provide a connection to your Neo4j database. In your application, this could look like this:

<?php

namespace GraphAware\Reco4PHP\Tests\Example;

use GraphAware\Reco4PHP\Context\SimpleContext;
use GraphAware\Reco4PHP\RecommenderService;

class ExampleRecommenderService
{
    /**
     * @var \GraphAware\Reco4PHP\RecommenderService
     */
    protected $service;

    /**
     * ExampleRecommenderService constructor.
     * @param string $databaseUri
     */
    public function __construct($databaseUri)
    {
        $this->service = RecommenderService::create($databaseUri);
        $this->service->registerRecommendationEngine(new ExampleRecommendationEngine());
    }

    /**
     * @param int $id
     * @return \GraphAware\Reco4PHP\Result\Recommendations
     */
    public function recommendMovieForUserWithId($id)
    {
        $input = $this->service->findInputBy('User', 'id', $id);
        // The name must match the value returned by ExampleRecommendationEngine::name().
        $recommendationEngine = $this->service->getRecommender("example");

        return $recommendationEngine->recommend($input, new SimpleContext());
    }
}

Inspecting recommendations

The recommend() method of a recommendation engine returns a Recommendations object, which contains a set of Recommendation objects, each holding the recommended item and its scores.

Each score is recorded separately, so you can easily inspect why a recommendation was produced. For example:

$recommender = new ExampleRecommenderService("http://localhost:7474");
$recommendations = $recommender->recommendMovieForUserWithId(460);

print_r($recommendations->getItems(1));

Array
(
    [0] => GraphAware\Reco4PHP\Result\Recommendation Object
        (
            [item:protected] => GraphAware\Bolt\Result\Type\Node Object
                (
                    [identity:protected] => 13248
                    [labels:protected] => Array
                        (
                            [0] => Movie
                        )

                    [properties:protected] => Array
                        (
                            [id] => 2571
                            [title] => Matrix, The (1999)
                        )

                )

            [scores:protected] => Array
                (
                    [rated_by_others] => GraphAware\Reco4PHP\Result\Score Object
                        (
                            [score:protected] => 1067
                            [scores:protected] => Array
                                (
                                    [0] => GraphAware\Reco4PHP\Result\SingleScore Object
                                        (
                                            [score:GraphAware\Reco4PHP\Result\SingleScore:private] => 1067
                                            [reason:GraphAware\Reco4PHP\Result\SingleScore:private] =>
                                        )

                                )

                        )

                    [reward_well_rated] => GraphAware\Reco4PHP\Result\Score Object
                        (
                            [score:protected] => 261
                            [scores:protected] => Array
                                (
                                    [0] => GraphAware\Reco4PHP\Result\SingleScore Object
                                        (
                                            [score:GraphAware\Reco4PHP\Result\SingleScore:private] => 261
                                            [reason:GraphAware\Reco4PHP\Result\SingleScore:private] =>
                                        )

                                )

                        )

                )

            [totalScore:protected] => 261
        )
)

License

This library is released under the Apache v2 License. Please read the attached LICENSE file.

Commercial support or custom development/extension available upon request to [email protected].
