  • Stars: 1,996
  • Rank: 23,210 (Top 0.5 %)
  • Language: JavaScript
  • License: MIT License
  • Created about 10 years ago
  • Updated 2 months ago

Repository Details

Search and browse documents and data; find the people and companies you look for.

Truth cannot penetrate a closed mind. If all places in the universe are in the Aleph, then all stars, all lamps, all sources of light are in it, too.

—The Aleph, Jorge Luis Borges

Aleph is a tool for indexing large amounts of both documents (PDF, Word, HTML) and structured (CSV, XLS, SQL) data for easy browsing and search. It is built with investigative reporting as a primary use case. Aleph allows cross-referencing mentions of well-known entities (such as people and companies) against watchlists, e.g. from prior research or public datasets.

For further details on the software, how to use it, install it or manage data imports, please check the documentation at:

Support

Aleph is used and developed by multiple organisations and interested individuals. If you're interested in participating in this process, please read the support policy (SUPPORT.md), the contribution rules (CONTRIBUTING.md), and the code of conduct (CODE_OF_CONDUCT.md) and then get in touch:

Release process

If you are interested in, or have been tasked with, releasing a new version of Aleph, follow these steps:

Overview

The basic process for releasing Aleph is this:

  1. Check internal libraries for updates and merge them. Release our libraries in the following order:
     1. servicelayer
     2. followthemoney
     3. ingest-file
     4. react-ftm
  2. Ensure that all libraries for a release are up to date in aleph and merged to the develop branch.
  3. Ensure that all features and bugfixes are merged into develop and that all builds are passing.
  4. Ensure that the CHANGELOG.md file is up to date on the develop branch. Add information as required.
  5. Create an RC release of Aleph.
  6. Test and verify the RC. Perform further RC releases as required.
  7. Merge all changes to main.
  8. Create a final version of Aleph.

As far as possible, apply the rules of semantic versioning when determining the type of release to perform.
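
As a rough illustration of the semantic versioning rule, the sketch below maps the kind of change in a release to the part you would pass to bump2version. The function name bump_part and the change labels (breaking, feature, fix) are hypothetical names chosen for this example; they are not part of Aleph or bump2version.

```shell
#!/bin/sh
# Map the kind of change to a version part, following semantic versioning:
#   incompatible change        -> major
#   backwards-compatible feature -> minor
#   backwards-compatible fix     -> patch
# The labels "breaking", "feature" and "fix" are hypothetical.
bump_part() {
  case "$1" in
    breaking) echo major ;;
    feature)  echo minor ;;
    fix)      echo patch ;;
    *)        echo "unknown change type: $1" >&2; return 1 ;;
  esac
}

# Example: a release that only adds a backwards-compatible feature.
bump_part feature
```

When a release mixes several kinds of change, the most significant one decides the part.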

Technical process

RC releases

If you need to perform an RC release of Aleph, follow these steps:

  1. Ensure that the CHANGELOG is up to date on the develop branch and that all outstanding PRs have been merged.
  2. From the develop branch, run bump2version with the relevant part (major, minor or patch). This creates an x.x.x-rc1 version of Aleph.
  3. Push the tags to the remote with git push --tags.
  4. Push the version bump with git push.
  5. If there are problems with the RC, fix them and run bump2version build to generate a new RC release.
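
The RC steps above can be sketched as a small shell script. This is only a sketch under assumptions: the helper names run, release_rc and next_rc are hypothetical, the initial git checkout develop and git pull are added for safety and are not part of the listed steps, and the script prints commands instead of executing them unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: each command is printed rather than executed.
# Set DRY_RUN=0 to run the commands for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# release_rc PART, where PART is major, minor or patch (hypothetical helper).
# Stops at the first command that fails.
release_rc() {
  part="$1"
  case "$part" in
    major|minor|patch) ;;
    *) echo "usage: release_rc major|minor|patch" >&2; return 1 ;;
  esac
  run git checkout develop &&   # assumption: make sure we are on develop
  run git pull &&               # assumption: start from the latest develop
  run bump2version "$part" &&   # creates the x.x.x-rc1 version
  run git push --tags &&
  run git push
}

# next_rc: publish a further RC after fixing problems with the previous one.
next_rc() {
  run bump2version build &&
  run git push --tags &&
  run git push
}

# Example: preview the commands for a minor RC release.
release_rc minor
```

Running the script as-is only prints the command list, which makes it easy to review before setting DRY_RUN=0.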

Major, minor, patch releases

  1. Switch to main and pull from the remote.
  2. If not already done, merge develop into main.
  3. Update translations using make translate.
  4. If you get npm errors, go into the ui folder and run npm install.
  5. Commit the translations to main and push to the remote.
  6. Run bump2version --verbose --sign-tags release. Note that bump2version may not show any changes when you run it, but the bump is still applied (check git log to confirm).
  7. Push the tags to the remote with git push --tags.
  8. Push the version bump to the remote with git push.
  9. Merge main back into develop. This is not strictly part of the release process, but it is a good time to do it so that the new version numbers appear in develop as well.
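
For orientation, the final-release steps above might look like this as a script. It is a sketch, not the project's official tooling: the helper names run and release_final, the commit message, the use of git commit -a, and the final git push of develop are assumptions made for this example. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: each command is printed rather than executed.
# Set DRY_RUN=0 to run the commands for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# release_final (hypothetical helper): walk through the final release steps.
# Stops at the first command that fails, for example when there are no
# translation changes to commit.
release_final() {
  run git checkout main &&
  run git pull &&
  run git merge develop &&
  run make translate &&
  # Assumption: translation updates only touch tracked files.
  run git commit -a -m "Update translations" &&
  run git push &&
  run bump2version --verbose --sign-tags release &&
  run git push --tags &&
  run git push &&
  # Bring the new version numbers back into develop.
  run git checkout develop &&
  run git merge main &&
  run git push
}

# Example: preview the full command sequence.
release_final
```

If make translate fails with npm errors, run npm install in the ui folder as described in step 4 and start again.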

More Repositories

  1. memorious (Python, 311 stars): Lightweight web scraping toolkit for documents and structured data.
  2. followthemoney (Python, 207 stars): Data model and processing tools for investigative entity data.
  3. fingerprints (Python, 142 stars): Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.
  4. cronodump (Python, 70 stars): A Cronos database converter.
  5. countrynames (Python, 65 stars): Utility library to turn country names into ISO two-letter codes.
  6. ingest-file (Python, 54 stars): Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data.
  7. datadesktop (TypeScript, 50 stars): DEPRECATED. Desktop graph visualization application.
  8. pdflib (Python, 42 stars): Binary Python bindings for poppler utils for content extraction.
  9. synonames (Python, 33 stars): Trying to generate name synonyms from wikidata.
  10. alephclient (Python, 27 stars): API client for Aleph, supports bulk entity and document upload.
  11. pantomime (Python, 12 stars): Python library for MIME type parsing, normalisation and grouping.
  12. offshoreleaks (Python, 12 stars): Converter for ICIJ Offshore Leaks data into FollowTheMoney format.
  13. followthemoney-store (Python, 10 stars): Fragment storage/database layer for FollowTheMoney entities.
  14. react-ftm (TypeScript, 10 stars): React UI component library for aleph/followthemoney.
  15. languagecodes (Python, 10 stars): A Python helper library to convert between ISO 639 two- and three-letter codes.
  16. countrytagger (Python, 8 stars): Extract names of places from text and determine which country they may refer to.
  17. servicelayer (Python, 7 stars): Common interface definitions for aleph toolkit services and applications.
  18. followthemoney-ocds (Python, 7 stars): Import data formatted as OpenContracting Data Standard (OCDS) objects into FollowTheMoney.
  19. panama (Python, 6 stars): Parser for a 2008 scrape of the Panama companies registry.
  20. docs (6 stars): GitHub mirror of the GitBook documentation.
  21. followthemoney-predict (Jupyter Notebook, 5 stars): Experiments with FtM record linkage.
  22. alephr (R, 4 stars): R package wrapper for Aleph API.
  23. translate-service (Python, 4 stars): Demo: document processing service for automated translation.
  24. example-personadeinteres (Python, 4 stars): Example how to load mixed document/entity graphs to Aleph.
  25. aleph-elasticsearch (Shell, 3 stars): Custom ElasticSearch configuration for Aleph.
  26. followthemoney-typepredict (Python, 3 stars): Predict ftm types for string input data.
  27. followthemoney-compare (Jupyter Notebook, 2 stars)
  28. document-categorization (Jupyter Notebook, 1 star): DSSG document categorization repository.
  29. followthemoney-graph (Python, 1 star)