LitStudy: Using the power of Python to automate scientific literature analysis from the comfort of a Jupyter notebook

LitStudy

LitStudy is a Python package that enables analysis of scientific literature from the comfort of a Jupyter notebook. It provides the ability to select scientific publications and study their metadata through the use of visualizations, network analysis, and natural language processing.

In essence, this package offers five main features:

  • Extract metadata from scientific documents sourced from various locations. The data is presented in a standardized interface, allowing for the combination of data from different sources.
  • Filter, select, deduplicate, and annotate collections of documents.
  • Compute and plot general statistics for document sets, such as statistics on authors, venues, and publication years.
  • Generate and plot various bibliographic networks as interactive visualizations.
  • Discover popular topics automatically using natural language processing (NLP).
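
To make the features above more concrete, here is a minimal standard-library sketch of three of the underlying ideas: combining records from two sources, deduplicating them, and deriving per-year counts and co-authorship edges. It does not use LitStudy's API. The `Doc` record, the helper functions, and the sample data are all hypothetical and exist only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Doc:
    """Hypothetical standardized record; not LitStudy's own document class."""
    title: str
    authors: tuple
    year: int


def normalize(title):
    """Lowercase and drop non-alphanumerics so near-identical titles compare equal."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def deduplicate(docs):
    """Keep the first record seen for each normalized title."""
    seen = {}
    for doc in docs:
        seen.setdefault(normalize(doc.title), doc)
    return list(seen.values())


def year_counts(docs):
    """Count documents per publication year."""
    return Counter(doc.year for doc in docs)


def coauthor_edges(docs):
    """Count how often each pair of authors appears on the same document."""
    edges = Counter()
    for doc in docs:
        for pair in combinations(sorted(set(doc.authors)), 2):
            edges[pair] += 1
    return edges


# Made-up sample data standing in for two different sources.
source_a = [Doc("A Survey of GPUs", ("Ada", "Bo"), 2020)]
source_b = [
    Doc("a survey of GPUs.", ("Ada", "Bo"), 2020),
    Doc("Topic Models", ("Ada", "Cy"), 2021),
]

docs = deduplicate(source_a + source_b)
print(len(docs))
print(year_counts(docs))
print(coauthor_edges(docs))
```

Matching on a normalized title is only one possible deduplication rule. A real pipeline would more likely prefer a stable identifier such as a DOI when one is available.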

Frequently Asked Questions

If you have any questions or run into an error, see the Frequently Asked Questions section of the documentation. If your question or error is not on the list, please check the GitHub issue tracker for a similar issue or create a new issue.

Supported Sources

LitStudy supports the following data sources. The table below lists which metadata is fully (✓) or partially (*) provided by each source.

Name Title Authors Venue Abstract Citations References
Scopus ✓ ✓ ✓ ✓ ✓ ✓
SemanticScholar ✓ ✓ ✓ ✓ * (count only) ✓
CrossRef ✓ ✓ ✓ ✓ * (count only) ✓
DBLP ✓ ✓ ✓
arXiv ✓ ✓ ✓
IEEE Xplore ✓ ✓ ✓ ✓ * (count only)
Springer Link ✓ ✓ ✓ ✓ * (count only)
CSV file ✓ ✓ ✓ ✓
bibtex file ✓ ✓ ✓ ✓
RIS file ✓ ✓ ✓ ✓

Example

An example notebook is available in notebooks/example.ipynb and here.

Installation Guide

LitStudy is available on PyPI! A full installation guide is available here.

pip install litstudy

Or install the latest development version directly from GitHub:

pip install git+https://github.com/NLeSC/litstudy
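
After installing, one way to confirm that the package is visible to your Python environment is to query its installed version. The sketch below uses only the standard library and assumes Python 3.8 or newer, where `importlib.metadata` is available. The helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("litstudy")
print("litstudy", found if found else "is not installed")
```

Run it in the same environment (or Jupyter kernel) in which you ran pip, since each environment has its own set of installed packages.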

Documentation

Documentation is available here.

Requirements

The package has been tested with Python 3.7. Required packages are listed in requirements.txt.

litstudy supports several data sources. Some of these sources (such as Semantic Scholar, CrossRef, and arXiv) are openly available. However, to access the Scopus API, you (or your institute) need a Scopus subscription, and you must request an Elsevier Developer API key (see Elsevier Developers). For more information, see the guide by pybliometrics.

License

Apache 2.0. See LICENSE.

Change log

See CHANGELOG.md.

Contributing

See CONTRIBUTING.md.

Citation

If you use LitStudy in your work, please cite the following publication:

S. Heldens, A. Sclocco, H. Dreuning, B. van Werkhoven, P. Hijma, J. Maassen & R.V. van Nieuwpoort (2022), "litstudy: A Python package for literature reviews", SoftwareX 20

As BibTeX:

@article{litstudy,
    title = {litstudy: A Python package for literature reviews},
    journal = {SoftwareX},
    volume = {20},
    pages = {101207},
    year = {2022},
    issn = {2352-7110},
    doi = {https://doi.org/10.1016/j.softx.2022.101207},
    url = {https://www.sciencedirect.com/science/article/pii/S235271102200125X},
    author = {S. Heldens and A. Sclocco and H. Dreuning and B. {van Werkhoven} and P. Hijma and J. Maassen and R. V. {van Nieuwpoort}},
}

Don't forget to check out these other amazing software packages!

  • ScientoPy: Open-source Python based scientometric analysis tool.
  • pybliometrics: API-Wrapper to access Scopus.
  • ASReview: Active learning for systematic reviews.
  • metaknowledge: Python library for doing bibliometric and network analysis in science.
  • tethne: Python module for bibliographic network analysis.
  • VOSviewer: Software tool for constructing and visualizing bibliometric networks.
