• Stars
    star
    700
  • Rank 62,555 (Top 2 %)
  • Language
    Python
  • License
    MIT License
  • Created almost 4 years ago
  • Updated 7 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Library of community detection algorithms and visualization tools

communities

communities is a Python library for detecting community structure in graphs. It implements the following algorithms:

  • Louvain method
  • Girvan-Newman algorithm
  • Hierarchical clustering
  • Spectral clustering
  • Bron-Kerbosch algorithm

You can also use communities to visualize these algorithms. For example, here's a visualization of the Louvain method applied to the karate club graph:

Demo

Installation

communities can be installed with pip:

$ pip install communities

Getting Started

Each algorithm expects an adjacency matrix representing an undirected graph, which can be weighted or unweighted. This matrix should be a 2D numpy array. Once you have this, simply import the algorithm you want to use from communities.algorithms and plug in the matrix, like so:

import numpy as np
from communities.algorithms import louvain_method

adj_matrix = np.array([[0, 1, 1, 0, 0, 0],
                       [1, 0, 1, 0, 0, 0],
                       [1, 1, 0, 1, 0, 0],
                       [0, 0, 1, 0, 1, 1],
                       [0, 0, 0, 1, 0, 1],
                       [0, 0, 0, 1, 1, 0]])
communities, _ = louvain_method(adj_matrix)
# >>> [{0, 1, 2}, {3, 4, 5}]

The output of each algorithm is a list of communities, where each community is a set of nodes. Each node is referred to by the index of its row in the adjacency matrix.

Some algorithms, like louvain_method and girvan_newman, will return two values: the list of communities and data to plug into a visualization algorithm. More on this in the Visualization section.

Algorithms

Louvain's Method

louvain_method(adj_matrix : numpy.ndarray, n : int = None) -> list

Implementation of the Louvain method, from Fast unfolding of communities in large networks. This algorithm does a greedy search for the communities that maximize the modularity of the graph. A graph is said to be modular if it has a high density of intra-community edges and a low density of inter-community edges.

Louvain's method runs in O(nα†žlog2n) time, where n is the number of nodes in the graph.

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph
  • n (int or None, optional (default=None)): Terminates the search once this number of communities is detected; if None, then the algorithm will behave normally and terminate once modularity is maximized

Example Usage:

from communities.algorithms import louvain_method

adj_matrix = [...]
communities, _ = louvain_method(adj_matrix)

Girvan-Newman algorithm

girvan_newman(adj_matrix : numpy.ndarray, n : int = None) -> list

Implementation of the Girvan-Newman algorithm, from Community structure in social and biological networks. This algorithm iteratively removes edges to create more connected components. Each component is considered a community, and the algorithm stops removing edges when no more gains in modularity can be made. Edges with the highest betweenness centralities (i.e. those that lie between many pairs of nodes) are removed. Formally, edge betweenness centrality is defined as:

where

  • Οƒ(i,j) is the number of shortest paths from node i to j
  • Οƒ(i,j|e) is the number of shortest paths that pass through edge e

The Girvan-Newman algorithm runs in O(m2n) time, where m is the number of edges in the graph and n is the number of nodes.

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph
    • If your graph is weighted, then the weights need to be transformed into distances, since that's how they'll be interpreted when searching for shortest paths. One way to do this is to simply take the inverse of each weight.
  • n (int or None, optional (default=None)): Terminates the search once this number of communities is detected; if None, then the algorithm will behave normally and terminate once modularity is maximized

Example Usage:

from communities.algorithms import girvan_newman

adj_matrix = [...]
communities, _ = girvan_newman(adj_matrix)

Hierarchical clustering

hierarchical_clustering(adj_matrix : numpy.ndarray, metric : str = "cosine", linkage : str = "single", n : int = None) -> list

Implementation of a bottom-up, hierarchical clustering algorithm. Each node starts in its own community. Then, the most similar pairs of communities are merged as the hierarchy is built up. Communities are merged until no further gains in modularity can be made.

There are multiple schemes for measuring the similarity between two communities, C1 and C2:

  • Single-linkage: min({sim(i, j) | i ∊ C1, j ∊ C2})
  • Complete-linkage: max({sim(i, j) | i ∊ C1, j ∊ C2})
  • Mean-linkage: mean({sim(i, j) | i ∊ C1, j ∊ C2})

where sim(i, j) is the similarity between nodes i and j, defined as either the cosine similarity or inverse Euclidean distance between their row vectors in the adjacency matrix, Ai and Aj.

This algorithm runs in O(n3) time, where n is the number of nodes in the graph.

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph
  • metric (str, optional (default="cosine")): Scheme for measuring node similarity; options are "cosine", for cosine similarity, or "euclidean", for inverse Euclidean distance
  • linkage (str, optional (default="single")): Scheme for measuring community similarity; options are "single", "complete", and "mean"
  • n (int or None, optional (default=None)): Terminates the search once this number of communities is detected; if None, then the algorithm will behave normally and terminate once modularity is maximized

Example Usage:

from communities.algorithms import hierarchical_clustering

adj_matrix = [...]
communities = hierarchical_clustering(adj_matrix, metric="euclidean", linkage="complete")

Spectral clustering

spectral_clustering(adj_matrix : numpy.ndarray, k : int) -> list

Implementation of a spectral clustering algorithm. This type of algorithm assumes the eigenvalues of the adjacency matrix hold information about community structure. Here's how it works:

  1. Compute the Laplacian matrix, L = D - A, where A is the adjacency matrix and D is the diagonal matrix
  2. Compute the k smallest eigenvectors of L, skipping the first eigenvector
  3. Create a matrix V containing eigenvectors v1, v2, ... vn as columns
  4. Cluster the rows in V using k-means into k communities

This algorithm is NP-hard.

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph
  • k (int): Number of communities to cluster nodes into

Example Usage:

from communities.algorithms import spectral_clustering

adj_matrix = [...]
communities = spectral_clustering(adj_matrix, k=5)

Bron-Kerbosch algorithm

bron_kerbosch(adj_matrix : numpy.ndarray, pivot : bool = False) -> list

Implementation of the Bron-Kerbosch algorithm for maximal clique detection. A maximal clique in a graph is a subset of nodes that forms a complete graph and would no longer be complete if any other node was added to the subset. Treating maximal cliques as communities is reasonable, as cliques are the most densely connected groups of nodes in a graph. Because a node can be a member of more than one clique, this algorithm will sometimes identify overlapping communities.

If your input graph has less than 3n/3 maximal cliques, then this algorithm runs in O(3n/3) time (assuming pivot=True).

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph
    • Note that this algorithm treats the graph as unweighted
  • pivot (bool, optional (default=False)): If True, the pivot variant of the algorithm (described here) will be used
    • This will make the algorithm more efficient if your graph has several non-maximal cliques

Example Usage:

from communities.algorithms import bron_kerbosch

adj_matrix = [...]
communities = bron_kerbosch(adj_matrix, pivot=True)

Visualization

Plot communities

draw_communities(adj_matrix : numpy.ndarray, communities : list, dark : bool = False, filename : str = None, seed : int = 1)

Visualize your graph such that nodes are grouped into their communities and color-coded.

Returns a matplotlib.axes.Axes representing the plot.

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph
  • dark (bool, optional (default=False)): If True, the plot will have a dark background and color scheme, else it will have a light color scheme
  • filename (str or None, optional (default=None)): If you want to save the plot as a PNG, filename is the path of the file to save it as; set to None to display the plot interactively
  • dpi (int or None, optional (default=None)): Dots per inch (controls the resolution of the image)
  • seed (int, optional (default=2)): Random seed

Example Usage:

from communities.algorithms import louvain_method
from communities.visualization import draw_communities

adj_matrix = [...]
communities, frames = louvain_method(adj_matrix)

draw_communities(adj_matrix, communities)

Animate the Louvain method

louvain_animation(adj_matrix : numpy.ndarray, frames : list, dark : bool = False, duration : int = 15, filename : str = None, dpi : int = None, seed : int = 2)

Use this to animate the application of the Louvain method to your graph. In this animation, the color of each node represents the community it's assigned to, and nodes in the same community are clustered together. Each step of the animation will show a node changing color (i.e. being assigned to a different community) and being moved to a new cluster, and the corresponding update to the graph's modularity.

This function returns a matplotlib.animation.FuncAnimation object representing the animation.

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph
  • frames (list): List of dictionaries representing each iteration of the algorithm
    • Each dictionary has two keys: "C", which holds a node-to-community lookup table, and "Q", the modularity value of the graph
    • This list of dictionaries is the second return value of the louvain_method
  • dark (bool, optional (default=False)): If True, the animation will have a dark background and color scheme, else it will have a light color scheme
  • duration (int, optional (default=15)): The desired duration of the animation in seconds
  • filename (str or None, optional (default=None)): If you want to save the animation as a GIF, filename is the path of the file to save it as; set to None to display the animation as an interactive plot
  • dpi (int or None, optional (default=None)): Dots per inch (controls the resolution of the animation)
  • seed (int, optional (default=2)): Random seed

Example Usage:

from communities.algorithms import louvain_method
from communities.visualization import louvain_animation

adj_matrix = [...]
communities, frames = louvain_method(adj_matrix)

louvain_animation(adj_matrix, frames)

Demo

Utilities

Inter-community adjacency matrix

intercommunity_matrix(adj_matrix : numpy.ndarray, communities : list, aggr : Callable = sum) -> numpy.ndarray

Creates an inter-community adjacency matrix. Each node in this matrix represents a community in communities, and each edge between nodes i and j is created by aggregating (e.g. summing) the weights of edges between nodes in communities[i] and nodes in communities[j].

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of the graph from which communities were extracted
  • communities (list): List of communities
  • aggr (Callable, optional (default=sum)): Function that takes a list of inter-community edge weights and combines them into a single edge weight

Example Usage:

from statistics import mean
from communities.algorithms import louvain_method
from communities.utilities import intercommunity_matrix

adj_matrix = [...]
communities = louvain_method(adj_matrix)
intercomm_adj_matrix = intercommunity_matrix(adj_matrix, communities, mean)

Graph Laplacian

laplacian_matrix(adj_matrix : numpy.ndarray) -> numpy.ndarray

Computes the graph Laplacian. This matrix is used in the spectral_clustering algorithm, and is generally useful for revealing properties of a graph. It is defined as L = D - A, where A is the adjacency matrix of the graph, and D is the degree matrix, defined as:

where wik is the edge weight between a node i and its neighbor k.

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph

Example Usage:

from communities.utilities import laplacian_matrix

adj_matrix = [...]
L = laplacian_matrix(adj_matrix)

Modularity matrix

modularity_matrix(adj_matrix : numpy.ndarray) -> numpy.ndarray

Computes the modularity matrix for a graph. The modularity matrix is defined as:

where

  • Aij is the weight of the edge between nodes i and j
  • ki and kj are the sum of the weights of the edges attached to nodes i and j, respectively
  • m is the sum of all of the edge weights in the graph

Parameters:

  • adj_matrix (numpy.ndarray): Adjacency matrix representation of your graph

Modularity

modularity(mod_matrix : numpy.ndarray, communities : list) -> float

Computes modularity of a partitioned graph. Modularity is defined as:

where

  • Aij is the weight of the edge between nodes i and j
  • ki and kj are the sum of the weights of the edges attached to nodes i and j, respectively
  • m is the sum of all of the edge weights in the graph
  • ci and cj are the communities of the nodes
  • Ξ΄ is the Kronecker delta function (Ξ΄(x, y) = 1 if x = y, 0 otherwise)

Parameters:

  • mod_matrix (numpy.ndarray): Modularity matrix computed from the adjacency matrix representation of your graph
  • communities (list): List of (non-overlapping) communities identified in the graph

Example Usage:

from communities.algorithms import louvain_method
from communities.utilities import modularity_matrix, modularity

adj_matrix = [...]
communities = louvain_method(adj_matrix)

mod_matrix = modularity_matrix(adj_matrix)
Q = modularity(mod_matrix, communities)

Authors

communities was created by Jonathan Shobrook with the help of Paul C. Bogdan as part of our research in the Dolcos Lab at the Beckman Institute for Advanced Science and Technology.

More Repositories

1

rebound

Command-line tool that instantly fetches Stack Overflow results when an exception is thrown
Python
4,077
star
2

adrenaline

Instant answers to any programming question
3,736
star
3

BitVision

Terminal dashboard for trading Bitcoin, predicting price movements, and losing all your money
JavaScript
1,190
star
4

stackexplain

Explain your error message with ChatGPT
Python
510
star
5

sequitur

Library of autoencoders for sequential data
Python
406
star
6

ChatOverflow

AI-generated answers to every coding question
JavaScript
328
star
7

statcode

Man pages for HTTP status codes
Python
312
star
8

openlimit

Maximize your usage of OpenAI models without hitting rate limits
Python
124
star
9

SmarterReply

Chrome extension for creating custom Smart Replies in Gmail
JavaScript
43
star
10

densify

Data augmentation algorithm for point clouds
Python
19
star
11

git-pull

Parallelized web scraper for Github
Python
16
star
12

CommunityNet

Hierarchical GNN for graph datasets with community structure
Python
12
star
13

saplings

Analyze usage patterns of imported modules in a Python program
Python
11
star
14

SeqConv

Graph convolutional operator that uses a LSTM as a filter
Python
9
star
15

BTC-Mining-Calculator

Simple command-line tool for predicting the amount of Bitcoin your device can mine in the next 24hrs
Python
8
star
16

gnn-dtsp

MATH 490 Final Project: Approximating solutions to the decision variant of the TSP with Graph Neural Networks
HTML
7
star
17

overcast

Desktop app that employs end-to-end encryption with forward secrecy for FB Messenger
JavaScript
7
star
18

DeepFCN

Deep learning tool for predicting individual differences (e.g. diagnostic status, IQ, etc.) from brain networks
Python
7
star
19

MatrixConv

PyTorch implementation of a GNN with a CNN filter
Python
6
star
20

neuropipe

Easy scaffolding for machine learning pipelines in Scikit-Learn
Python
6
star
21

TypeSense

Chrome extension that analyzes a Messenger conversation's sentiment in real-time
JavaScript
6
star
22

mvpa

Multivoxel pattern analysis (MVPA) tool for fMRI data
Python
4
star
23

tabber

Chrome extension for saving (and organizing) interesting FB messages, i.e. Pocket for Messenger.com
JavaScript
4
star
24

topigraph

A simple graph-based topic modeling algorithm
Python
3
star
25

PyReserve

Generate a project template and reserve a name on PyPi with one command
Python
3
star
26

dasher-landing-page

Dasher Software's prelaunch landing page
CSS
2
star
27

Equaliser

Automated unit testing for IEquatable objects
C#
2
star
28

excusabot

Mobile app that auto-notifies your company's Slack channel when you're running late for work (made for the ROSS hackathon)
JavaScript
2
star
29

adrenaline-vscode

2
star
30

outgraph

Outlier detection tool for graph datasets
Python
1
star
31

Mirror

Front-end for a chrome extension built at MHacks VI
CSS
1
star
32

enumerast

Algorithm that enumerates all possible execution paths in a Python AST
1
star
33

personal-site

CSS
1
star
34

Course-Checker

A script that scrapes the status of any given UIUC course
Python
1
star
35

Sediment

Tutorial project that uses linear regression to predict a wine's quality given its chemical properties
Python
1
star
36

test-repository

This is for testing the reindexing process on Adrenaline
1
star
37

University-Infographic

Infographic website demonstrating the growing college tuition bubble
CSS
1
star
38

Overcast-Website

Coming soon page for Overcast, an encrypted messaging app
CSS
1
star
39

coinhopper.io

An FPGA mining interface that dynamically mines cryptocurrencies based on each coin's predicted yield
Python
1
star
40

Vacation-Site

Landing page for a vacation rental located in Sanibel, Florida
JavaScript
1
star
41

ChatOverflow-site

Website for the ChatOverflow browser plugin
JavaScript
1
star