  • Stars: 169
  • Rank 224,453 (Top 5%)
  • Language: Jupyter Notebook
  • Created almost 6 years ago
  • Updated over 1 year ago

Repository Details

ICR - Automated and Intelligent Company Report Built in Python (by @firmai)

README

For a sample version of the report (web app), see FirmAI Report.

This report endeavours to provide ratings of four corporate dimensions: employees, customers, shareholders and management, as benchmarked against competitors. It also shows the change in ratings over time. The competitors are automatically identified from the data using statistical distance metrics.
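The README does not say which distance metrics or features are used to identify competitors. As a minimal sketch of the general idea, assuming each firm is described by a small numeric feature vector, the example below standardises the features and ranks other firms by Euclidean distance. The function names, feature choices and values are hypothetical and are not taken from the repository.

```python
import math
import statistics

def standardise(rows):
    """Z-score each column so no single metric dominates the distance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    return [
        [(v - m) / s for v, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

def closest_competitors(features, target, k=2):
    """Return the k firms nearest to `target` by Euclidean distance."""
    names = list(features)
    scaled = dict(zip(names, standardise([features[n] for n in names])))
    distances = {
        name: math.dist(scaled[target], vec)
        for name, vec in scaled.items()
        if name != target
    }
    return sorted(distances, key=distances.get)[:k]

# Made-up illustrative values: (average rating, number of locations, price level)
firms = {
    "Firm A": (4.1, 120, 2.0),
    "Firm B": (4.0, 110, 2.0),
    "Firm C": (3.2, 900, 1.0),
    "Firm D": (4.3, 150, 2.0),
}
print(closest_competitors(firms, "Firm A"))
```

Standardising first matters because raw features such as location counts would otherwise swamp ratings measured on a five-point scale.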

This report consists of Programmatic Competitor Analysis, NLP Sentiment Analysis, NLP Summarisation, ML Time Series and Cross-Section Prediction (Valuation, Closures, Geographic Opportunity), Employee Growth and Qualifications Measures, Location Ratings, Rating Growth, Social Media Analytics, Compensation Satisfaction Analysis, Interview Analysis, Product Analysis and Financial PCA. It is my hope that this report, its analysis, the generated data and the scraping scripts (in the functionality folder) will benefit smaller firms that do not necessarily have access to this technology stack.

Description

The report is built out of a Dash example. It is fully automated and updates on a monthly basis. It allows companies to study multiple competitors and company locations without strenuous user input. It is the first interactive report of its kind. It is in PDF style, making it easily digestible and also easy to print for meetings.

All information is extracted from the public domain using modern programming tools. This report uses state-of-the-art machine learning and natural language processing techniques for deep sentiment analysis and prediction tasks. The report analyses a company along four dimensions: employees, customers, shareholders (owners) and management. Information is gathered from numerous online sources, the majority of which do not sit behind paywalls. This report serves the following functions.

  • Identify the overall sentiment of your firm on the aforementioned dimensions.
  • Identify the extent to which your firm is currently under- or overvalued as per qualitative and quantitative metrics using machine learning.
  • Compare the valuation of your firm against that of close competitors, and programmatically identify close competitors.
  • Get an overview as to which locations are the most and least at risk of closing using inbuilt machine learning tools.
  • Get to understand the different attributes leading to higher customer satisfaction.
  • Get an indication as to how well the company has done by following various metrics over time.
  • Gain a deeper insight into how your employee and management cohort compares against industry benchmarks.
  • Isolate competitor firms using five different algorithmic benchmarks.
  • Identify the relationship between firm value and three machine learning satisfaction ratings (employee, customer and manager satisfaction).
  • Identify the top employment regions historically and more recently by analysing open job locations.
  • Look at different positive and negative sentiment summaries from employees and customers as identified with natural language processing tools.
  • Get to know the composition of employees such as their level of qualifications, skill and their hierarchical position across different benchmarks.
  • Identify the level of employee growth among competitors.
  • Understand employees' level of satisfaction with their compensation packages.
  • Survey the surroundings to understand the geographic competitiveness.
  • Explore the difference in ratings across states and counties.
  • Get an understanding of the sentiment as it relates to different categories.
  • Identify some of the key financial metrics and patterns leading to company success.
  • Compare competitors' website and social media stats.
  • Get an understanding of each firm's online footprint and how it changes over time.
  • Get an overall rating of the firm at present and historically to gauge possible future rating changes.
  • Gain a better understanding of customers both locally and nationally.
  • Obtain a better understanding of the interview process and other details.
  • Identify competitors' top products and categorical prices.
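
One item in the list above concerns the relationship between firm value and the three satisfaction ratings (employee, customer and manager). The README does not state how that relationship is measured. A simple starting point, shown here as an assumption rather than as the report's method, is a Pearson correlation per rating. All values below are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative values, one entry per firm (hypothetical units).
firm_value = [1.2, 2.5, 3.1, 4.8, 5.0]
ratings = {
    "employee": [3.1, 3.4, 3.6, 4.2, 4.4],
    "customer": [3.9, 3.7, 4.0, 3.8, 4.1],
    "manager": [4.5, 4.0, 3.8, 3.2, 3.0],
}

# Correlate each satisfaction rating with firm value.
correlations = {name: pearson(values, firm_value) for name, values in ratings.items()}
for name, r in correlations.items():
    print(f"{name}: {r:+.2f}")
```

A correlation only describes linear association across firms; it says nothing about whether higher satisfaction causes higher value.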

Development

The report will grow dynamically over time and eventually become more prescriptive in nature.

  • In the future the report would attempt to predict prospective revenue and identify the portion of revenue generated from each location.
  • Furthermore, the overall level of firm financial health would be estimated using machine learning techniques.
  • A further procedure would include the analysis of firm financial filings and financial statement readability along with anomaly detection.
  • A further 30 novel databases are to be compiled to estimate the level of corporate social responsibility of each firm.
  • Finally, the creation of an improved valuation model for firms that are not publicly traded and the addition of causal analysis.
  • Any additional forms of analysis as requested by the client. A more granular exploration would likely require internal data.

Running Your Own

  • Download the repository.
  • Install the dependencies listed in requirements.txt.
  • Run the scrapers with setup.py (only if you want to generate new data).
  • Run main.py.
  • Note: the repository is large (4 GB); it already contains data.

More Repositories

1

industry-machine-learning

A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
Jupyter Notebook
6,804
star
2

financial-machine-learning

A curated list of practical financial machine learning tools and applications.
Python
3,354
star
3

machine-learning-asset-management

Machine Learning in Asset Management (by @firmai)
Jupyter Notebook
1,534
star
4

awesome-google-colab

Google Colaboratory Notebooks and Repositories (by @firmai)
Jupyter Notebook
1,385
star
5

data-science-career

Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
790
star
6

business-machine-learning

A curated list of practical business machine learning (BML) and business data science (BDS) applications for Accounting, Customer, Employee, Legal, Management and Operations (by @firmai)
Jupyter Notebook
709
star
7

pandapy

PandaPy has the speed of NumPy and the usability of Pandas 10x to 50x faster (by @firmai)
Python
541
star
8

deltapy

DeltaPy - Tabular Data Augmentation (by @firmai)
Jupyter Notebook
493
star
9

atspy

AtsPy: Automated Time Series Models in Python (by @firmai)
Python
479
star
10

pandasvault

Advanced Pandas Vault - Utilities, Functions and Snippets (by @firmai).
Python
389
star
11

python-business-analytics

Python solutions to solve practical business problems.
Jupyter Notebook
375
star
12

datagene

DataGene - Identify How Similar TS Datasets Are to One Another (by @firmai)
Jupyter Notebook
184
star
13

tsgan

Time-series Generative Adversarial Networks (fork from the ML-AIM research group on Bitbucket)
Python
105
star
14

mtss-gan

MTSS-GAN: Multivariate Time Series Simulation with Generative Adversarial Networks (by @firmai)
80
star
15

techniques

Jupyter Notebook and Python business intelligence tools and techniques. [Raw upload]
Jupyter Notebook
76
star
16

scrapers

Scrapers from a project in 2018. Yelp, Spyfu, Similarweb, Morningstar, Linkedin, Instagram, Inside, Glassdoor, Facebook, Eat24, Doordash, Angellist.
Python
72
star
17

ml-fairness-framework

FairPut - Machine Learning Fairness Framework with LightGBM - Explainability, Robustness, Fairness (by @firmai)
Jupyter Notebook
64
star
18

business-analytics-and-mathematics-python-book

Advanced Business Analytics and Mathematics with Python (by @firmai)
56
star
19

business-datasets

A selection of business datasets
15
star
20

firmai.github.io

Open Business Analytics and Data Science Research
JavaScript
15
star
21

python-for-finance

Jupyter Notebook
13
star
22

business-machine-learning-vendors

A directory of the top business machine learning vendors
13
star
23

financial-machine-learning-regulation

A look at regulatory challenges and recommendations in the age of AI. Investigating topics like monopoly formation, machine learning auditability, bias mitigation strategies and automated regulatory monitoring.
11
star
24

firmai

9
star
25

reddit-data-science-project-ideas

Reddit Data Science Project Ideas
8
star
26

simple-machine-learning-glossary

Simple Machine Learning and Data Science Definitions without Copyright
7
star
27

xaib

XAIB - Explainable AI in Business
Jupyter Notebook
5
star
28

quant-finance-seminars

Weekly Quant Finance Seminars
5
star
29

numfin

Numpy for Finance Examples
5
star
30

tflm

Advanced Transformations and Interactions for Linear Models using Hybrid Machine Learning Models and SHapley Additive exPlanations
Python
5
star
31

google-colab-website

FirmAI Labs - World's First Google Colab Website
4
star
32

random-assets

Jupyter Notebook
4
star
33

tabular-data-generators

A Collection of Cross-Sectional and Time-Series Generators
4
star
34

firmai_analytics

Website
HTML
3
star
35

datastat

Dataset Statistics to Compare Real or Training Data with Generated or Test Data
3
star
36

financial-pde-discovery

Financial PDE Discovery using Machine Learning
3
star
37

experimental-statistics

A repository of experimental statistical techniques that improve on "well-accepted" solutions.
3
star
38

demo-ml-ai-invest

2
star
39

numfy

Fast Vectorised NumPy Functions for Finance
2
star
40

ffood

FFOOD - Framework for Feature and Observation Outlier Detection using ML-based Residual Analysis Methods
Python
2
star
41

random-assets-two

2
star
42

bit

Forked template from Christoph Molnar, testing out website integration
HTML
1
star
43

sov.ai

AI Asset Management Research
1
star
44

admin

1
star
45

contributor

Medium Contributor Guidelines
1
star
46

plotsfinml

HTML
1
star
47

fairdata

A Python package that implements model-agnostic pre- and post-processing to mitigate unfairness in machine learning prediction
1
star