SimFin Tutorials
Introduction
SimFin is a database with financial data such as Income Statements, Balance Sheets and Cash Flow Statements, along with a simple Python API for downloading and using the data. These tutorials show how to use the SimFin API and data.
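As a rough idea of what using the API looks like, here is a minimal sketch that loads annual Income Statements for the US market. It is an illustration rather than an excerpt from the tutorials: the helper name `load_annual_income` is hypothetical, the API key and data directory are placeholders you supply yourself, and the `simfin` package must already be installed.

```python
def load_annual_income(api_key, data_dir="~/simfin_data/"):
    """Download (or load from disk) annual US Income Statements."""
    # Imported here so the sketch can be read without simfin installed.
    import simfin as sf

    # Placeholder values: pass your own API key and download directory.
    sf.set_api_key(api_key)
    sf.set_data_dir(data_dir)

    # Returns a Pandas DataFrame with the Income Statements.
    return sf.load_income(variant="annual", market="us")
```

The Basics tutorial covers the available datasets and options in detail.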
Videos
There is a video on YouTube with an overview of these tutorials, and another video on how to backtest and optimize a stock-screener based on Tutorial 7.
Tutorials
- Basics (Notebook) (Google Colab)
- Resampling (Notebook) (Google Colab)
- Growth & Returns (Notebook) (Google Colab)
- Signals (Notebook) (Google Colab)
- Data Hubs (Notebook) (Google Colab)
- Performance Tips (Notebook) (Google Colab)
- Stock Screener (Notebook) (Google Colab)
- Statistical Analysis (Notebook) (Google Colab)
- Machine Learning (Notebook) (Google Colab)
- Neural Networks (Notebook) (Google Colab)
There is also a collection of small recipes (Notebook) (Google Colab)
Downloading
If you want to run these tutorials on your own computer, then it is recommended that you download the whole repository from GitHub, instead of just downloading the individual Python Notebooks.
Git
The easiest way to download and install this is by using git from the command-line:
git clone https://github.com/simfin/simfin-tutorials.git
This creates the directory simfin-tutorials and downloads all the files to it.
This also makes it easy to update the files, simply by executing this command inside that directory:
git pull
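If you would rather script this step, the clone-or-update choice can be sketched in Python. This is a minimal sketch and not part of the repository: the helper names `git_command` and `sync_repo` are hypothetical, and it assumes that git is available on your PATH.

```python
import os
import subprocess

REPO_URL = "https://github.com/simfin/simfin-tutorials.git"


def git_command(repo_dir, url=REPO_URL):
    """Return the git command that creates or updates repo_dir."""
    if os.path.isdir(os.path.join(repo_dir, ".git")):
        # An existing clone was found: update it in place.
        return ["git", "-C", repo_dir, "pull"]
    # No clone yet: download the repository into repo_dir.
    return ["git", "clone", url, repo_dir]


def sync_repo(repo_dir="simfin-tutorials"):
    """Run the command; raises CalledProcessError if git fails."""
    subprocess.run(git_command(repo_dir), check=True)
```

Calling `sync_repo()` the first time clones the repository, and calling it again later pulls the latest changes.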
Zip-File
You can also download the contents of the GitHub repository as a Zip-file and extract it manually.
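The extraction step can also be done from Python with the standard library. The sketch below assumes you have already downloaded the Zip-file to disk; the helper name `extract_archive` and its arguments are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract zip_path into dest_dir and return the top-level names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Collect the first path component of every archive member.
        return sorted({name.split("/")[0] for name in archive.namelist()})
```

The returned names tell you which directory to change into afterwards. Unlike a git clone, an extracted Zip-file cannot be updated with git pull.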
Installation
If you want to run these tutorials on your own computer, then it is best to use a virtual environment when installing the required packages, so you can easily delete the environment again. You write the following in a Linux terminal:
virtualenv simfin-env
Or you can use Anaconda instead of a virtualenv:
conda create --name simfin-env python=3
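Then you switch to the virtual environment and install the required packages. For a virtualenv created in the current directory:
source simfin-env/bin/activate
Or for Anaconda (older versions use source activate simfin-env):
conda activate simfin-env
Then install the packages from the simfin-tutorials directory:
pip install -r requirements.txt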
When you are done working on the project you can deactivate the virtualenv:
deactivate
Or, if you used Anaconda (older versions use source deactivate):
conda deactivate
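Before installing packages it can be worth checking that an environment is actually active, so the packages do not end up in your system-wide Python. The following is a minimal sketch; the function name `active_environment` is hypothetical, and the Anaconda check assumes that conda sets the `CONDA_DEFAULT_ENV` environment variable on activation.

```python
import os
import sys


def active_environment():
    """Return a short description of the active environment, or None."""
    # Inside a virtualenv, sys.prefix points at the environment while
    # sys.base_prefix points at the base Python installation.
    # Old virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    if sys.prefix != base_prefix or hasattr(sys, "real_prefix"):
        return "virtualenv: " + sys.prefix

    # Assumption: conda exports CONDA_DEFAULT_ENV when an env is active.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return "conda: " + conda_env

    return None
```

If this returns None, or names an environment other than simfin-env, activate the right environment before running pip.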
How To Run
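Once you have installed the required Python packages in a virtual environment, you activate that environment and run the following commands from the simfin-tutorials directory to view and edit the Notebooks. If you use Anaconda, activate the environment with conda activate simfin-env instead, and if you created the virtualenv in another directory, adjust the path accordingly:
source simfin-env/bin/activate
jupyter notebook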
Run in Google Colab
If you do not want to install anything on your own computer, then the Notebooks can be viewed, edited and run entirely on the internet by using Google Colab.
You can click the "Google Colab" link next to each of the tutorials listed above. You can view the Notebook on Colab without an account, but in order to run it you need to log in with your Google account.
All the required Python packages should already be installed on Google Colab, except for simfin which you can install by executing the following command at the top of the Notebook:
!pip install simfin
If that is insufficient, then you can clone this entire GitHub repository to your Google Colab account, and execute the following commands at the top of the Notebook, to install all requirements:
# Clone the repository from GitHub to Google Colab's temporary drive.
import os
work_dir = "/content/simfin-tutorials/"
if not os.path.exists(work_dir):
    !git clone https://github.com/simfin/simfin-tutorials.git
os.chdir(work_dir)
# Install the required Python packages.
!pip install -r requirements.txt
Note that you will need to run this every time you log in to Google Colab.
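To decide whether installing simfin alone is enough, you can check which packages are missing from the current session. This is a minimal sketch using only the standard library; the function name `missing_packages` is hypothetical, and the names you pass must be import names, which sometimes differ from the names used with pip.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names that cannot be found here."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names
            if importlib.util.find_spec(name) is None]
```

For example, `missing_packages(["simfin", "pandas"])` returns an empty list when both packages are available, and otherwise lists the ones that still need to be installed.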
Testing
All the Notebooks can be run automatically and tested for errors. This is particularly useful for developers who are making changes to the simfin package, because it complements the unit-tests and data-tests with more realistic use-cases.
First you need to install nbval:
pip install nbval
Then you can execute all the Notebooks and test them for errors by running the following command from the directory where the Notebooks are located:
pytest --nbval-lax -v
Note that this will only test for errors and exceptions. It will not test whether the new output matches the old output found in the Notebooks, because the datasets are continually updated.
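To make the idea of testing only for errors concrete, here is a simplified sketch in plain Python. It is an illustration of the concept and not how nbval is implemented: the function name `failing_cells` is hypothetical, it reads the Notebook file as JSON, and it cannot run Notebook-specific syntax such as lines starting with `!`, which it would report as errors.

```python
import json


def failing_cells(notebook_path):
    """Run the code cells of a Notebook and return (index, error) pairs."""
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)

    namespace = {}  # shared by all cells, like a Notebook session
    failures = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)  # source may be stored as lines
        try:
            exec(source, namespace)
        except Exception as error:
            # Only the error is recorded. The outputs stored in the
            # Notebook are never compared with the new results.
            failures.append((index, repr(error)))
    return failures
```

A Notebook passes this check when the returned list is empty, regardless of whether its stored outputs are out of date.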
License (MIT)
This is published under the MIT License, which allows very broad use for both academic and commercial purposes.
You are very welcome to modify and use this source-code in your own project. Please keep a link to the original repository.