

Library extending Jupyter notebooks to integrate with Apache TinkerPop, openCypher, and RDF SPARQL.

Graph Notebook: easily query and visualize graphs

The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the Apache TinkerPop, openCypher or the RDF SPARQL graph models. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including knowledge graphs and identity graphs.

A colorful graph picture

Visualizing Gremlin queries:

Gremlin query and graph

Visualizing openCypher queries:

openCypher query and graph

Visualizing SPARQL queries:

SPARQL query and graph

Instructions for connecting to the following graph databases:

Endpoint          Graph model              Query language
Gremlin Server    property graph           Gremlin
Blazegraph        RDF                      SPARQL
Amazon Neptune    property graph or RDF    Gremlin or SPARQL
Neo4J             property graph           Cypher

We encourage others to contribute configurations they find useful. There is an additional-databases folder where more information can be found.

Features

Notebook cell 'magic' extensions in the IPython 3 kernel

%%sparql - Executes a SPARQL query against your configured database endpoint. Documentation

%%gremlin - Executes a Gremlin query against your database using web sockets. The results are similar to those a Gremlin console would return. Documentation

%%opencypher or %%oc - Executes an openCypher query against your database. Documentation

%%graph_notebook_config - Sets the executing notebook's database configuration to the JSON payload provided in the cell body.

%%graph_notebook_vis_options - Sets the executing notebook's vis.js options to the JSON payload provided in the cell body.

%%neptune_ml - Set of commands to integrate with NeptuneML functionality, as described here. Documentation

TIP 👉 %%sparql, %%gremlin, and %%oc share a suite of common arguments that can be used to customize the appearance of rendered graphs. Example usage of these arguments can also be found in the sample notebooks under 02-Visualization.

TIP 👉 There is syntax highlighting for language query magic cells to help you structure your queries more easily.

Notebook line 'magic' extensions in the IPython 3 kernel

%gremlin_status - Obtain the status of Gremlin queries. Documentation

%sparql_status - Obtain the status of SPARQL queries. Documentation

%opencypher_status or %oc_status - Obtain the status of openCypher queries. Documentation

%load - Generate a form to submit a bulk loader job. Documentation

%load_ids - Get ids of bulk load jobs. Documentation

%load_status - Get the status of a provided load_id. Documentation

%cancel_load - Cancels a bulk load job. You can either provide a single load_id, or specify --all-in-queue to cancel all queued (and not actively running) jobs. Documentation

%neptune_ml - Set of commands to integrate with NeptuneML functionality, as described here. You can find a set of tutorial notebooks here. Documentation

%status - Check the Health Status of the configured host endpoint. Documentation

%seed - Provides a form to add data to your graph, using sets of insert queries instead of a bulk loader. Sample RDF and Property Graph data models are provided with this command. Alternatively, you can select a language type and provide a file path (or a directory path containing one or more of these files) to load the queries from.

%stream_viewer - Interactively explore the Neptune CDC stream (if enabled)

%graph_notebook_config - Returns a JSON payload that contains connection information for your host.

%graph_notebook_host - Set the host endpoint to send queries to.

%graph_notebook_version - Print the version of the graph-notebook package

%graph_notebook_vis_options - Print the Vis.js options being used for rendered graphs

TIP 👉 You can list all the magics installed in the Python 3 kernel using the %lsmagic command.

TIP 👉 Many of the magic commands support a --help option in order to provide additional information.

Example notebooks

This project includes many example Jupyter notebooks. It is recommended to explore them. All of the commands and features supported by graph-notebook are explained in detail with examples within the sample notebooks. You can find them here. As this project has evolved, many new features have been added. If you are already familiar with graph-notebook but want a quick summary of new features added, a good place to start is the Air-Routes notebooks in the 02-Visualization folder.

Keeping track of new features

It is recommended to check the ChangeLog.md file periodically to keep up to date as new features are added.

Prerequisites

You will need:

  • Python 3.7.x-3.10.11
  • A graph database that provides one or more of:
    • A SPARQL 1.1 endpoint
    • An Apache TinkerPop Gremlin Server compatible endpoint
    • An endpoint compatible with openCypher
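
To confirm that an interpreter falls inside the supported range before installing, a check along these lines can help. This is a minimal sketch: the helper name `is_supported_python` is hypothetical and not part of graph-notebook, and the bounds are simply the ones listed above.

```python
import sys

# Bounds taken from the prerequisites list above (3.7.x through 3.10.11).
MIN_VERSION = (3, 7, 0)
MAX_VERSION = (3, 10, 11)


def is_supported_python(version=None):
    """Return True if a (major, minor, micro) version is within the listed range.

    Hypothetical helper for illustration; not part of graph-notebook.
    """
    if version is None:
        version = sys.version_info[:3]
    return MIN_VERSION <= tuple(version)[:3] <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the documented range"
    print(f"Python {current} is {status}")
```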

Installation

Begin by installing graph-notebook and its prerequisites, then follow the remaining instructions for either Jupyter Classic Notebook or JupyterLab.

# install the package
pip install graph-notebook

Jupyter Classic Notebook

# Enable the visualization widget
jupyter nbextension enable  --py --sys-prefix graph_notebook.widgets

# copy static html resources
python -m graph_notebook.static_resources.install
python -m graph_notebook.nbextensions.install

# copy premade starter notebooks
python -m graph_notebook.notebooks.install --destination ~/notebook/destination/dir

# create nbconfig file and directory tree, if they do not already exist
mkdir -p ~/.jupyter/nbconfig
touch ~/.jupyter/nbconfig/notebook.json

# start jupyter notebook
python -m graph_notebook.start_notebook --notebooks-dir ~/notebook/destination/dir

JupyterLab 3.x

# install jupyterlab
pip install "jupyterlab>=3,<4"

# copy premade starter notebooks
python -m graph_notebook.notebooks.install --destination ~/notebook/destination/dir

# start jupyterlab
python -m graph_notebook.start_jupyterlab --jupyter-dir ~/notebook/destination/dir

Loading magic extensions in JupyterLab

When attempting to run a line/cell magic on a new notebook in JupyterLab, you may encounter the error:

UsageError: Cell magic `%%graph_notebook_config` not found.

To fix this, run the following command, then restart JupyterLab.

python -m graph_notebook.ipython_profile.configure_ipython_profile

Alternatively, the magic extensions can be manually reloaded for a single notebook by running the following command in any empty cell.

%load_ext graph_notebook.magics

Upgrading an existing installation

# upgrade graph-notebook
pip install graph-notebook --upgrade

After the above command completes, rerun the commands given at Jupyter Classic Notebook or JupyterLab 3.x based on which flavour is installed.

Connecting to a graph database

Gremlin Server

In a new cell in the Jupyter notebook, change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. Optionally, modify traversal_source if your graph traversal source name differs from the default value, username and password if required by the graph store, or message_serializer for a specific data transfer format. For a local Gremlin server (HTTP or WebSockets), you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 8182,
  "ssl": false,
  "gremlin": {
    "traversal_source": "g",
    "username": "",
    "password": "",
    "message_serializer": "graphsonv3"
  }
}
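
Before pasting a payload into a %%graph_notebook_config cell, it can be useful to check that it is valid JSON and that the top-level fields named above have the expected types. The following is a minimal sketch in plain Python; `find_config_problems` and `EXPECTED_FIELDS` are hypothetical names for illustration, and graph-notebook's own handling of the payload may differ.

```python
import json

# Top-level fields named in the text above, with the JSON types shown in the example.
EXPECTED_FIELDS = {"host": str, "port": int, "ssl": bool}


def find_config_problems(payload_text):
    """Return a list of human-readable problems found in a config payload.

    Hypothetical helper for illustration; not part of graph-notebook.
    """
    config = json.loads(payload_text)
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in config:
            problems.append(f"missing field: {name}")
            continue
        value = config[name]
        # bool is a subclass of int in Python, so reject True/False for "port".
        wrong_type = not isinstance(value, expected_type) or (
            expected_type is int and isinstance(value, bool)
        )
        if wrong_type:
            problems.append(f"{name} should be of type {expected_type.__name__}")
    return problems


example = """
{
  "host": "localhost",
  "port": 8182,
  "ssl": false,
  "gremlin": {"traversal_source": "g", "message_serializer": "graphsonv3"}
}
"""
print(find_config_problems(example))  # prints []
```

A payload with `"port": "8182"` (a string) or without an `ssl` field would be reported by this sketch instead of producing an empty list.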

To set up a new local Gremlin Server for use with the graph notebook, check out additional-databases/gremlin server.

Blazegraph

Change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. For a local Blazegraph database, you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 9999,
  "ssl": false,
  "sparql": {
    "path": "sparql"
  }
}

You can also make use of namespaces for Blazegraph by specifying the path graph-notebook should use when sending SPARQL queries, as shown below:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 9999,
  "ssl": false,
  "sparql": {
    "path": "blazegraph/namespace/foo/sparql"
  }
}

This will result in the URL localhost:9999/blazegraph/namespace/foo/sparql being used when executing any %%sparql magic commands.
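
How the host, port, and path fields combine into an endpoint can be sketched as follows. This is an illustration only, not graph-notebook's implementation: the function name `build_sparql_url` is hypothetical, and choosing the scheme from the `ssl` flag and falling back to a `sparql` path are assumptions made for this example.

```python
def build_sparql_url(config):
    """Compose a SPARQL endpoint URL from a graph-notebook style config dict.

    Hypothetical helper for illustration; the scheme selection and the
    default path are assumptions, not documented behavior.
    """
    scheme = "https" if config.get("ssl") else "http"
    # Strip stray slashes so "sparql" and "/sparql/" produce the same URL.
    path = config.get("sparql", {}).get("path", "sparql").strip("/")
    return f"{scheme}://{config['host']}:{config['port']}/{path}"


namespace_config = {
    "host": "localhost",
    "port": 9999,
    "ssl": False,
    "sparql": {"path": "blazegraph/namespace/foo/sparql"},
}
print(build_sparql_url(namespace_config))
# prints http://localhost:9999/blazegraph/namespace/foo/sparql
```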

To set up a new local Blazegraph database for use with the graph notebook, check out the Quick Start from Blazegraph.

Amazon Neptune

Change the configuration using %%graph_notebook_config and modify the defaults as they apply to your Neptune cluster:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "DEFAULT",
  "load_from_s3_arn": "",
  "ssl": true,
  "ssl_verify": true,
  "aws_region": "your-neptune-region"
}

To set up a new Amazon Neptune cluster, check out the Amazon Web Services documentation.

When connecting the graph notebook to Neptune, make sure your network is set up to communicate with the VPC that Neptune runs in. If not, you can follow this guide.

Authentication (Amazon Neptune)

If you are running a SigV4 authenticated endpoint, ensure that your configuration has auth_mode set to IAM:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "IAM",
  "load_from_s3_arn": "",
  "ssl": true,
  "ssl_verify": true,
  "aws_region": "your-neptune-region"
}

Additionally, you should have the following Amazon Web Services credentials available in a location accessible to Boto3:

  • Access Key ID
  • Secret Access Key
  • Default Region
  • Session Token (OPTIONAL. Use if you are using temporary credentials)

These variables must follow a specific naming convention, as listed in the Boto3 documentation.

A list of all locations checked for Amazon Web Services credentials can also be found here.
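
When the credentials are supplied through environment variables, a quick check like the one below can show which ones are missing. It is a minimal sketch that looks only at the environment, using the variable names from the Boto3 documentation; credentials stored in other locations that Boto3 checks (such as a shared credentials file) are not detected, and the helper name `missing_credential_vars` is hypothetical.

```python
import os

# Environment variable names documented by Boto3.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")
# Only needed when using temporary credentials.
OPTIONAL_VARS = ("AWS_SESSION_TOKEN",)


def missing_credential_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; it inspects environment
    variables only, not the other locations Boto3 searches.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_credential_vars()
    if missing:
        print("Not set in the environment:", ", ".join(missing))
    else:
        print("All required variables are set.")
    if not os.environ.get(OPTIONAL_VARS[0]):
        print("AWS_SESSION_TOKEN is not set (fine unless you use temporary credentials).")
```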

Neo4J

Change the configuration using %%graph_notebook_config and modify the fields for host, port, ssl, and neo4j authentication.

If your Neo4J instance supports multiple databases, you can specify a database name via the database field. Otherwise, leave the database field blank to query the default database.

For a local Neo4j Desktop database, you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 7687,
  "ssl": false,
  "neo4j": {
    "username": "neo4j",
    "password": "password",
    "auth": true,
    "database": ""
  }
}

Ensure that you also specify the %%oc bolt option when submitting queries to the Bolt endpoint.

To set up a new local Neo4J Desktop database for use with the graph notebook, check out the Neo4J Desktop User Interface Guide.

Building From Source

A pre-release distribution can be built from the graph-notebook repository via the following steps:

# 1) Clone the repository and navigate into the clone directory
git clone https://github.com/aws/graph-notebook.git
cd graph-notebook

# 2) Create a new virtual environment

# 2a) Option 1 - pyenv
pyenv install 3.10.11  # Only if not already installed; this can be any supported Python 3 version in Prerequisites
pyenv virtualenv 3.10.11 build-graph-notebook
pyenv local build-graph-notebook

# 2b) Option 2 - venv
rm -rf /tmp/venv
python3 -m venv /tmp/venv
source /tmp/venv/bin/activate

# 3) Install build dependencies
pip install --upgrade pip setuptools wheel twine
pip install "jupyterlab>=3,<4"

# 4) Build the distribution
python3 setup.py bdist_wheel

You should now be able to find the built distribution at

./dist/graph_notebook-3.8.2-py3-none-any.whl

And use it by following the installation steps, replacing

pip install graph-notebook

with

pip install ./dist/graph_notebook-3.8.2-py3-none-any.whl
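
Because the version number in the wheel's file name changes between releases, it may be convenient to look the file up rather than type it. The sketch below assumes the build wrote its output to a `dist` directory as shown above and that the file name starts with `graph_notebook-`; the helper name `find_built_wheels` is hypothetical.

```python
from pathlib import Path


def find_built_wheels(dist_dir="dist"):
    """Return the graph_notebook wheel files found in dist_dir, sorted by name.

    Hypothetical helper for illustration; assumes the wheel naming
    pattern shown in the build output above.
    """
    return sorted(Path(dist_dir).glob("graph_notebook-*.whl"))


if __name__ == "__main__":
    wheels = find_built_wheels()
    if wheels:
        # Print a ready-to-run install command for each wheel found.
        for wheel in wheels:
            print(f"pip install ./{wheel.as_posix()}")
    else:
        print("No wheel found in ./dist; run the build step first.")
```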

Contributing Guidelines

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
