• Stars
    5
  • Rank 2,860,262 (Top 57 %)
  • Language
    Python
  • Created over 4 years ago
  • Updated almost 3 years ago

Repository Details

Objective: Given a graph, learn embeddings of the nodes using only the graph structure and the node features, without using any known node class labels.

Unsupervised GraphSAGE model: In the Unsupervised GraphSAGE model, node embeddings are learnt by solving a simple classification task. Given a large set of "positive" '(target, context)' node pairs generated from random walks performed on the graph (i.e., node pairs that co-occur within a certain context window in random walks), and an equally large set of "negative" node pairs that are randomly selected from the graph according to a certain distribution, the model learns a binary classifier that predicts whether arbitrary node pairs are likely to co-occur in a random walk performed on the graph. Through learning this simple binary node-pair classification task, the model automatically learns an inductive mapping from attributes of nodes and their neighbors to node embeddings in a high-dimensional vector space, which preserves structural and feature similarities of the nodes.

Unlike embeddings obtained by algorithms such as 'node2vec', this mapping is inductive: given a new node (with attributes) that was unseen during model training, together with its links to other nodes in the graph, we can evaluate its embedding without having to re-train the model.

In our implementation of Unsupervised GraphSAGE, the training set of node pairs is composed of an equal number of positive and negative '(target, context)' pairs from the graph. The positive '(target, context)' pairs are the node pairs co-occurring on random walks over the graph, whereas the negative node pairs are sampled randomly from a global node degree distribution of the graph.

The architecture of the node pair classifier is as follows. Input node pairs (with node features) are fed, together with the graph structure, into a pair of identical GraphSAGE encoders, producing a pair of node embeddings. These embeddings are then fed into a node pair classification layer, which applies a binary operator to those node embeddings (e.g., concatenating them), and passes the resulting node pair embeddings through a linear transform followed by a binary activation (e.g., sigmoid), thus predicting a binary label for the node pair.

The entire model is trained end-to-end by minimizing the loss function of choice (e.g., binary cross-entropy between predicted node pair labels and true link labels) using stochastic gradient descent (SGD) updates of the model parameters, with minibatches of 'training' links generated on demand and fed into the model.

Node embeddings obtained from the encoder part of the trained classifier can be used in various downstream tasks. In this demo, we show how these can be used for predicting node labels.
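The pair-generation and pair-scoring steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the repository's code: the function names (`build_adjacency`, `random_walk`, `sample_pairs`, `pair_score`, `binary_cross_entropy`), the toy edge list, the forward-only context window, and the use of raw node degree as the negative-sampling weight are all assumptions made for illustration. The GraphSAGE encoders themselves are omitted, so the embeddings passed to `pair_score` are placeholders.

```python
import math
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build undirected adjacency lists from a list of (u, v) edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def random_walk(adj, start, length, rng):
    """Uniform random walk visiting at most `length` nodes from `start`."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def sample_pairs(adj, walks_per_node=2, walk_length=5, window=2, seed=0):
    """Return (pairs, labels) with one negative pair per positive pair.

    Positive: nodes at most `window` steps apart on the same walk.
    Negative: the same target paired with a node drawn with probability
    proportional to its degree (an assumed form of the distribution).
    A negative may occasionally be a true neighbour; the sketch ignores this.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    degrees = [len(adj[n]) for n in nodes]
    pairs, labels = [], []
    for start in nodes:
        for _ in range(walks_per_node):
            walk = random_walk(adj, start, walk_length, rng)
            for i, target in enumerate(walk):
                for context in walk[i + 1 : i + 1 + window]:
                    if context == target:
                        continue  # skip trivial self-pairs
                    pairs.append((target, context))
                    labels.append(1)
                    negative = rng.choices(nodes, weights=degrees, k=1)[0]
                    pairs.append((target, negative))
                    labels.append(0)
    return pairs, labels


def pair_score(z_target, z_context, weights, bias):
    """Concatenate two embeddings, apply a linear transform, then a sigmoid."""
    features = list(z_target) + list(z_context)
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def binary_cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


# Toy graph (hypothetical data, every node has at least one neighbour).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (2, 4)]
adj = build_adjacency(edges)
pairs, labels = sample_pairs(adj)
print(len(pairs), "pairs,", sum(labels), "positive")
```

In a full model, the two embeddings given to `pair_score` would come from the shared GraphSAGE encoders, and the weights would be updated by SGD on minibatches of such pairs rather than fixed by hand.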

More Repositories

1

Linear-Regression-on-Wine-Data

Linear regression to predict wine quality on the Wine dataset
Python
5 stars
2

Link-Prediction---Inductive-Representation-Learning-on-Large-Graphs

To address this problem, we build a base 'GraphSAGE' model. First we build a two-layer GraphSAGE model that takes labeled node pairs corresponding to possible citation links, and outputs a pair of node embeddings for the nodes of the pair. These embeddings are then fed into a link classification layer, which first applies a binary operator to those node embeddings (e.g., concatenating them) to construct the embedding of the potential link. The link embeddings thus obtained are passed through the dense link classification layer to obtain link predictions, i.e., the probability that each candidate link actually exists in the network. The entire model is trained end-to-end by minimizing the loss function of choice (e.g., binary cross-entropy between predicted link probabilities and true link labels, with true/false citation links having labels 1/0) using stochastic gradient descent (SGD) updates of the model parameters, with minibatches of 'training' links fed into the model.
Python
5 stars
3

MNIST

2 stars
4

Big-Sorting-using-Python---HackerRank

Big Sorting using Python - HackerRank
Python
2 stars
5

Missing-Numbers-Using-Python---HackerRank

Missing Numbers Using Python - HackerRank
Python
2 stars
6

ashikrafi.github.io

HTML
2 stars
7

HouseRent

House Rent Prediction with XGBoost & Linear Regression Using Python
Python
2 stars
8

Employee-Attrition-Analysis-Logistic-Regression-Model-

Employee Attrition Analysis (Logistic Regression Model)
Python
2 stars
9

Running-Time-of-Algorithm-Using-Python---HackerRank

Running Time of Algorithm Using Python - HackerRank
Python
2 stars
10

node-embeddings-with-a-traditional-community-detection-method

The goal of this use case is to demonstrate how node embeddings from graph convolutional neural networks trained in an unsupervised manner compare with standard community detection methods based on graph partitioning. Here, we demonstrate, using the terrorist group dataset, that the Infomap communities and the GraphSAGE embedding clusters (GSEC) provide qualitatively different insights into underlying data patterns.
Python
2 stars
11

Ice-Cream-Parlor-Binary-Search-Python-HackerRank

Ice Cream Parlor-Binary Search-Python-HackerRank
Python
2 stars
12

TestLang

1 star