  • Stars: 125
  • Rank: 279,750 (top 6%)
  • Language: Python
  • Created about 10 years ago
  • Updated about 1 year ago

Repository Details

Simple example of using a Naive Bayes classifier to classify entries in bank statements

BankClassify - automatically classify your bank statement entries

Note: This is not 'finished' software. I use it for dealing with my bank statements, but it is not 'production-ready' and may crash or do strange things. It is also set up for my particular usage, so may not work for you. However, I hope it will be a useful resource.

This code will classify each entry in your bank statement into categories such as 'Supermarket', 'Petrol', 'Eating Out' etc. It learns from previously classified data and from the corrections you make when it guesses a category incorrectly, so its performance improves over time.
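
To illustrate the learn-and-correct idea described above, here is a tiny multinomial Naive Bayes classifier written from scratch. It is only a sketch and is not the code BankClassify itself runs: the class name TinyNaiveBayes, the whitespace word-splitting and the sample transactions are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over the words of a transaction description."""

    def __init__(self):
        self.category_counts = Counter()         # entries seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def _words(description):
        # Assumption: splitting on whitespace is good enough for a demo.
        return description.lower().split()

    def train(self, description, category):
        """Add one labelled entry (an old classification or a new correction)."""
        self.category_counts[category] += 1
        for word in self._words(description):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def guess(self, description):
        """Return the most probable category, or None if nothing was learned."""
        total = sum(self.category_counts.values())
        best_category, best_score = None, float("-inf")
        for category, count in self.category_counts.items():
            # log P(category) + sum of log P(word | category), Laplace-smoothed
            score = math.log(count / total)
            words_in_category = sum(self.word_counts[category].values())
            denominator = words_in_category + len(self.vocabulary)
            for word in self._words(description):
                seen = self.word_counts[category][word]
                score += math.log((seen + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


classifier = TinyNaiveBayes()
classifier.train("TESCO STORES 2041", "Supermarket")
classifier.train("SAINSBURYS S/MKT", "Supermarket")
classifier.train("SHELL FUEL STATION", "Petrol")
first_guess = classifier.guess("TESCO PETROL STATION")

# Feeding a correction back in as training data changes later guesses.
classifier.train("TESCO PETROL STATION", "Petrol")
second_guess = classifier.guess("TESCO PETROL STATION")
```

With so little training data the first guess is driven mostly by which category has been seen more often; the correction adds evidence for the other category, which is the behaviour the paragraph above describes.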

How to use

  1. Install the required libraries: pip install -r requirements.txt

  2. Run the code in example.py as a demonstration. This will interactively classify the example bank statement data in Statement_Example.txt and save the results in AllData.csv. In the interactive classification you will be presented with a list of categories (with ID numbers), the details of a transaction, and a guessed category. You have three choices:

    • To accept the guessed category, just press Enter
    • To correct the classifier to a category that is in the list shown, enter the ID number of the category and press Enter
    • To add a new category, type the name of the category and press Enter
  3. Examine the output in AllData.csv manually, or run bc._prep_for_analysis() and look at bc.in and bc.out for incomings and outgoings respectively. You will see there is a cat column with the category in it.
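
The three choices offered during interactive classification in step 2 can be sketched as a small function. This is an illustration of the rules only, not the tool's actual implementation: the function name resolve_choice is hypothetical, and it assumes the ID shown to the user is the category's position in the list, counting from 1.

```python
def resolve_choice(user_input, guessed_category, categories):
    """Turn what the user typed into a category name.

    Assumption for this sketch: a category's ID is its 1-based position
    in the `categories` list.
    """
    text = user_input.strip()
    if text == "":
        # Plain Enter: accept the classifier's guess.
        return guessed_category
    if text.isdecimal():
        # A number: pick the existing category with that ID.
        index = int(text) - 1
        if 0 <= index < len(categories):
            return categories[index]
        raise ValueError(f"No category with ID {text}")
    # Anything else: treat it as the name of a new category.
    categories.append(text)
    return text


categories = ["Supermarket", "Petrol", "Eating Out"]
accepted = resolve_choice("", "Petrol", categories)            # just Enter
corrected = resolve_choice("3", "Petrol", categories)          # ID from the list
added = resolve_choice("Insurance", "Petrol", categories)      # new category
```

Raising an error for an unknown ID is a design choice made for this sketch; an interactive tool might instead ask again.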

To use it with your own data:

  • If you use Santander UK as your bank: just run bc.add_data(filename) with the filename of your downloaded statement file. Delete AllData.csv first though, or the example data will be used as part of the training data.
  • If you use another bank: write your own function to read in the statement data from your bank. It must return a pandas DataFrame with the columns date, desc and amount. Add this function to the BankClassify class and call it instead of _read_santander_file.
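
As a rough guide to what such a reader function might look like, here is a sketch for an imaginary bank. Everything bank-specific in it is an assumption: the function names parse_my_bank_csv and read_my_bank_file, the input column names Date, Description and Amount, and the day/month/year date format are made up and will need changing to match your own statement files.

```python
import csv
import io
from datetime import datetime


def parse_my_bank_csv(text):
    """Convert CSV text from a hypothetical bank into date/desc/amount rows."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            # Assumed input columns: Date (dd/mm/yyyy), Description, Amount
            "date": datetime.strptime(record["Date"], "%d/%m/%Y"),
            "desc": record["Description"].strip(),
            "amount": float(record["Amount"]),
        })
    return rows


def read_my_bank_file(filename):
    """Return a pandas DataFrame with the columns date, desc and amount."""
    import pandas as pd  # imported here so the parser can be tried without pandas

    with open(filename, newline="") as f:
        rows = parse_my_bank_csv(f.read())
    return pd.DataFrame(rows, columns=["date", "desc", "amount"])


sample = (
    "Date,Description,Amount\n"
    "03/01/2024,CARD PAYMENT TO TESCO,-23.50\n"
    "05/01/2024,SALARY ACME LTD,1500.00\n"
)
parsed = parse_my_bank_csv(sample)
```

Keeping the parsing separate from the DataFrame construction makes it easy to check the parsing on a few pasted lines before pointing it at a real statement.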

Known issues

For Barclays statements, the CSV file sometimes contains extra commas within the 'memo' (transaction description) column. You can either patch your data manually before running the tool, or accept that the work-around currently implemented may discard the part of the description that follows the comma.
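
If you prefer to patch the data yourself, one possible approach is to split each line a fixed number of times so that any extra commas stay inside the last field. This is not the work-around the tool implements. It assumes that the memo is the final column, that the file has six columns, and that no earlier column contains a comma; none of these is confirmed here, so check them against your own file. The function name split_barclays_line and the sample line are invented for illustration.

```python
def split_barclays_line(line, n_columns=6):
    """Split a line into n_columns fields, keeping extra commas in the last one.

    Assumes the memo is the final column and that no earlier column
    contains a comma.
    """
    # maxsplit = n_columns - 1 stops splitting once the memo field is reached.
    return line.rstrip("\r\n").split(",", n_columns - 1)


# Invented sample line: five leading fields, then a memo with two commas.
line = " ,02/01/2024,20-00-00 12345678,-12.99,PAYMENT,ACME LTD, LONDON, REF 123\n"
fields = split_barclays_line(line)
memo = fields[-1]
```

Here memo keeps the whole description, commas included, instead of being cut at the first comma.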
