Edd Webster Football Analytics
A space for football analytics projects by Edd Webster, including a curated list of publicly available resources published by the football analytics community.
👋 About This Repository and Author
The README of this repository is a concise resources guide of learning materials, data sources, libraries, papers, blogs, etc., created by all those who have contributed to the open source football analytics community. This GitHub repository and resources list is always a work in progress, with new resources added semi-regularly. If you feel there are any resources that I've missed, please feel free to create a pull request or send me a message on the links above and I'll get back to you as quickly as I can!
If you like the repo, please feel free to give it a ⭐ (top right). Cheers!
For more information about this repository and the author, see the following:
📖 Table of Contents
- About This Repository and Author
- Table of Contents
- Prerequisites
- Repository Structure
- Notebooks
- Data Visualisation and Tableau
- Resources
- Other Resources Guides
- Getting Started with Football Analytics
- Data
- Data Sources
- Event data
- Tracking data
- Broadcast Tracking data
- Aggregated Player/Team Performance data
- Team Rating data
- Physical data
- Results and Matchsheet data
- Financial, Valuation, and Transfer data
- Odds, Betting, and Predictions data
- Plotting tools
- Reference data
- Miscellaneous data
- Documentation
- Data Companies and Types
- Tutorials
- Libraries
- GitHub Repositories
- Apps
- Data Visualisation Resources and Tools
- Written Pieces
- Video
- YouTube Playlists
- YouTube Channels
- Video Analysis
- Webinars and Lectures
- Ted Talks
- Documentaries
- Match Highlights
- Other
- Podcasts
- Notable Figures and Twitter Accounts
- Events and Conferences
- Competitions
- Courses
- Jobs
- Discord / Slack Groups
- Key Concepts
- History of Football Analytics
- Expected Goals (xG) Modeling
- Web Scraping Football Data
- Tracking Data
- Pitch Control Modeling
- Passing Networks
- Possession Value (PV) Frameworks
- General
- Expected Threat (xT)
- Valuing Actions by Estimating Probabilities (VAEP)
- Goals Added (g+)
- On-Ball Value (OBV)
- Dixon Coles Modeling
- Player Similarity and Style Analysis
- Team Playing Style Analysis
- Player Rating
- Reinforcement Learning for Football Simulation
- Set Pieces
- Radars
- Recruitment Analysis
- Player Valuation Modeling
- Quantifying Relative Club and League Strength
- Tactics
- Game Win Probability Modelling
- Goalkeeper Analysis
- Citations
- Contributing
- Star Tracker
- Acknowledgements
🍴 Prerequisites
The only prerequisites for using this GitHub repo are a computer, an internet connection, and the desire to learn more about football analytics.
The code in this GitHub repository is written in Python and uses the open-source libraries listed below. Python, R, and most of these libraries can be obtained by downloading and installing Anaconda. Step-by-step guides to do this can be found for Windows here and Mac here, as well as in the Anaconda documentation itself here.
🌵 Repository Structure
The contents of this GitHub repository are organised as follows:
eddwebster/football_analytics
.
│
├── dashboards
│
├── data
│ ├── capology
│ ├── elo
│ ├── export
│ ├── fbref
│ ├── fifa
│ ├── guardian
│ ├── metrica-sports
│ ├── opta
│ ├── reference
│ ├── sb
│ ├── shots
│ ├── stats-perform
│ ├── stratabet
│ ├── tm
│ ├── touchline-analytics
│ ├── twenty-first-group
│ ├── understat
│ └── wyscout
│
├── docs
│ ├── centre-circle
│ ├── metrica-sports
│ ├── opta
│ ├── sb
│ ├── shots
│ ├── stratabet
│ └── wyscout
│
├── gif
│ └── fig
│
├── img
│ ├── club_badges
│ ├── eddwebster
│ ├── fig
│ ├── logos
│ ├── pitches
│ └── vizpiration
│
├── notebooks
│ │
│ ├── 1_data_scraping
│ │ ├── Capology Player Salary Web Scraping.ipynb
│ │ ├── FBref Player Stats Web Scraping.ipynb
│ │ └── TransferMarkt Player Bio and Status Web Scraping.ipynb
│ │
│ ├── 2_data_parsing
│ │ ├── ELO Team Ratings Data Parsing.ipynb
│ │ ├── StatsBomb Data Parsing.ipynb
│ │ └── Wyscout Data Parsing.ipynb
│ │
│ ├── 3_data_engineering
│ │ ├── Capology Player Salary Data Engineering.ipynb
│ │ ├── Centre Circle Opta CPL Data Engineering.ipynb
│ │ ├── FBref Player Stats Data Engineering.ipynb
│ │ ├── Opta #mcfcanalytics PL 2011-2012.ipynb
│ │ ├── StatsBomb Data Engineering.ipynb
│ │ ├── StrataBet Data Engineering.ipynb
│ │ ├── The Guardian Player Recorded Transfer Fees Data Engineering.ipynb
│ │ ├── TransferMarkt Historical Market Value Data Engineering.ipynb
│ │ ├── TransferMarkt Player Bio and Status Data Engineering.ipynb
│ │ ├── TransferMarkt Player Recorded Transfer Fees Data Engineering.ipynb
│ │ ├── Understat Data Engineering.ipynb
│ │ └── Wyscout Data Engineering.ipynb
│ │
│ ├── 4_data_unification
│ │ └── Unification of Aggregated Seasonal Football Datasets.ipynb
│ │
│ ├── 5_data_analysis_and_projects
│ │ │
│ │ ├── player_similarity_and_clustering
│ │ │ └── PCA and K-Means Clustering of 'Piqué-like' Defenders.ipynb
│ │ │
│ │ ├── tracking_data
│ │ │ ├── metrica_sports
│ │ │ │ └── Metrica Tracking Data EDA.ipynb
│ │ │ │
│ │ │ └── signality
│ │ │ ├── Signality Tracking Data Engineering.ipynb
│ │ │ └── Signality Tracking Data EDA.ipynb
│ │ │
│ │ └── xg_modeling
│ │ │ │
│ │ │ ├── shots_dataset
│ │ │ │ │
│ │ │ │ ├── chance_quality_modelling
│ │ │ │ │ ├── 1) Logistic Regression Expected Goals Model.ipynb
│ │ │ │ │ └── 2) XGBoost Expected Goals Model.ipynb
│ │ │ │ │
│ │ │ │ └── metrica-sports
│ │ │ │ └── Metrica Sports.ipynb
│ │ │ │
│ │ │ └── opta_dataset
│ │ │ └── Training of an Expected Goals Model Using Opta Event Data.ipynb
│ │ │
│ └── 6_data_visualisation
│
├── research
│ ├── papers
│ └── slides
│
├── scripts
│
├── spreadsheets
│
└── video
📔 Notebooks
The code in this repository is mostly written in Jupyter notebooks or Python scripts, organised in the following workflow:
- Webscraping
- Data Parsing
- Data Engineering
- Data Unification
- Data Analysis - projects include working with Tracking data, constructing VAEP models (as introduced by SciSports), building xG models using Logistic Regression, Random Forests and Gradient Boosted Decision Tree algorithms such as XGBoost, and analysing player similarity using PCA and K-Means clustering.
📊 Data Visualisation and Tableau Dashboards
For Tableau dashboards produced using the data engineered in the notebooks in this repository, please see my Tableau Public profile: public.tableau.com/profile/edd.webster.
Example Tableau dashboards:
- 2018 FIFA Men's World Cup
- FA WSL
- ‘Big 5’ European leagues
- EFL
- StrataBet Chance creation
- Opta #mcfcanalytics (see #mcfcanalytics).
📑 Resources
🔖 Other Football Analytics Resources Guides
Credit to the following resources that were all used to plug gaps in this resources guide once it was published:
- analytics-handbook GitHub repo by Devin Pleuler - a GitHub repo for getting started in soccer analytics
- awesome-football by football.db (Gerald Bauer) - a collection of awesome football datasets
- awesome-football-analytics by Diego Pastor
- awesome-soccer-analytics by Matias Mascioto
- guideR by Dom Samangy - a Google spreadsheet with 200+ R resources, 100+ Python tutorials, 30+ packages, 25+ accounts to follow, 10 cheatsheets, and several free books & blogs. GitHub repo [link]
- Jan Van Haaren's Soccer Analytics Reviews:
- soccer-analytics-resources GitHub repo by Jan Van Haaren
🏃 Getting Started with Football Analytics
Good resources for those new to the use of data in football:
- Articles and blog posts:
- Getting into Sports Analytics and Getting into Sports Analytics 2.0 by Sam Gregory
- What do you need to learn to work in football analytics? by David Sumpter for Barça Innovation Hub
- Getting Into Scouting by Luke Griffin
- You Want to be a Performance Analyst? by Rob Carroll
- An Introduction to Soccer Analytics by John Muller
- Introduction to Analytics in...Soccer by Valentin Stolbunov
- [Sports Analytics Advice](https://linktr.ee/sportsanalyticsadvice) by Jan Van Haaren
- Some of the useful resources in Football Analytics
- Soccer Analytics 101 by Kevin Minkus (using Web Archive)
- A Career in Football Analytics blog posts by Benoit Pimpaud. Check out his Substack newsletter From An Engineer Sight. See also the accompanying Twitter thread by Jan Van Haaren that discusses these posts [link]
- Football Reference 101 — Finding your way through a gold mine by Ninad Barbadikar
- Mikhail Zhilkin: How to hire your first data scientist by Training Ground Guru
- Gerard Moore on the "challenging but extremely rewarding" life of a professional football analyst for Twenty3
- How to get started in data and the football industry by Liam Henshaw
- How to get into football analysis by La Notice
- Getting Started with Football Analytics by OddAlerts
- Want to Learn Football Analytics? by Irfan Alghani Khalid
- How to get a job in Sports Analysis... by Chris Gill
- 7 Easy Steps to Get Started in Football Data & Analytics by Jobs in Football
- 11 tips to get started in the Football industry by Jobs in Football
- A Friendly Introduction to FPL Analytics by Sertalp B. Çay
- GitHub repositories:
- Twitter threads:
- Measureables (Brendan Kent)'s Sports Analytics 101 unrolled Twitter thread [link]:
- Tom Worville's Twitter thread
- Will Spearman's Twitter thread
- Jan Van Haaren's Twitter thread for free, open-source software libraries for computing and visualising advanced soccer analytics metrics
- Measureables (Brendan Kent)'s Twitter thread for resources for learning to code in the context of sports analytics [link]
- Sancho Quinn's unrolled Twitter thread for learning more about video/performance analysis [link]
- Ninad Barbadikar's 'big football analytics' Twitter thread for getting started with football analytics [link]
- McKay Johns's Twitter threads for the best resources in football analytics [link] and [link]
- Joe Gallagher's Twitter thread for the best resources to get started [link]
- Sam Goldberg's Twitter thread for "lessons American Soccer Analysis wish we knew prior to working in sports analytics." [link]
- Floris Goes-Smit's Tweets:
- Mathew Barlowe's Twitter thread for "how to get into the sports analytics industry" [link]
- Aaron Moniz's Tweet and responses [link]
- LinkedIn Posts:
- WHERE TO LEARN FOOTBALL ANALYTICS? by Irfan Alghani Khalid
- The following LinkedIn posts by Hadi Sotudeh:
- Videos:
- Friends of Tracking videos:
- How to become a football data scientist with Pascal Bauer, Javier Fernández, Sudarshan 'Suds' Gopaladesikan, Fran Peralta, and David Sumpter
- Tools for getting started in football analytics. talk for Friends of Tracking with David Sumpter, Laurie Shaw, Pascal Bauer, Sudarshan 'Suds' Gopaladesikan and Fran Peralta
- What do data analysts and data scientists do at a football club? talk for Friends of Tracking with David Sumpter, Ashwin Raman, Hannah Roberts, Sam Gregory, and Rob Suddaby
- HANIC Panel "How to get into Sports Analytics & Media + Analytics" with Alison Lukan, Sarah Bailey, Harman Dayal, Asmae Toumi, and Mike Johnson
- Careers in Sports Analytics
- Chris Gill's Sports Analysis YouTube Channel, including videos for Writing the perfect CV, How to get a job in sports analysis, and LinkedIn tips, amongst other videos added regularly
- Glossaries:
- The Athletic’s football analytics glossary: explaining xG, PPDA, field tilt and how to use them by Mark Carey and Tom Worville (requires subscription)
- Stat Glossary by Ashwin Raman
- Football Analytics Glossary by Ashwin Raman and Mark Thompson
- Expected goals, expected assists, pressures, carries, high turnovers and more | Advanced stats explained by Sky Sports Football
- Podcasts:
- Fanalytics podcast with Mike Lewis - Getting Your Foot in the Door with Sean Steffen
- What is sports analytics? episode of the Measureables podcast by Measureables (Brendan Kent)
💾 Data
ℹ️ Data Sources
Publicly available data sources and datasets relating to football, including Tracking data, Event data, aggregated player performance data, detailed match statistics, injury records, transfer values, and more.
Data sources that have been used in the code and analysis in this repository can be found in the data subfolder of this repository or in Google Drive (due to GitHub's 100 MB file limit) [link]. However, the code in this repository should enable you to scrape, parse, and engineer the datasets used for the analysis and visualisations featured.
To learn more about the different types of data available, such as Event and Tracking data, see the "Where can I get data?" section of Devin Pleuler's soccer_analytics_handbook [link].
For a quick primer of the free football data resources available, see the following Twitter thread by James Nalton [link].
Event data
Event Data is labelled data for each on-the-ball event that takes place during a game. The data is manually collected from television footage. To learn more about the data collection, see the following video [link].
Each match of event data has around 2-3 thousand individual events (rows), depending on the provider.
The main providers of this data are StatsBomb, Stats Perform (formerly Opta), and Wyscout.
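To make the one-row-per-event structure described above concrete, here is a minimal sketch of a few event records in Python. The field names (`minute`, `team`, `player`, `event_type`, `x`, `y`) and all values are illustrative assumptions, not any provider's actual schema; each provider publishes its own specification.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One on-the-ball event, i.e. one row of event data (hypothetical schema)."""
    minute: int
    team: str
    player: str
    event_type: str  # e.g. "pass", "shot", "tackle"
    x: float         # pitch coordinates; units and origin vary by provider
    y: float


# Made-up rows for illustration only
events = [
    Event(1, "Home", "Player A", "pass", 50.0, 34.0),
    Event(1, "Home", "Player B", "pass", 62.5, 20.0),
    Event(2, "Away", "Player C", "tackle", 40.0, 22.0),
    Event(3, "Home", "Player B", "shot", 94.0, 36.0),
]

# Count events per type, a typical first aggregation on event data
events_per_type = Counter(e.event_type for e in events)
print(events_per_type)
```

A real match would contain a few thousand such rows rather than four, and most providers attach many more attributes (outcome, body part, end location, and so on) to each event.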
Name | Comments | Source / method(s) to get the data |
---|---|---|
StatsBomb Open Data | | StatsBomb Open Data GitHub Repo |
StrataData by StrataBet | Chance shooting data provided | No longer made available (since 2018), however, it can be found in GitHub repos of old analysis (including this one) [link]. |
Soccer Video and Player Position Dataset | Dataset of elite soccer player movements and corresponding videos, made available by the University of Oslo. See the accompanying paper [link] | [Link] (appears to no longer be working) |
Opta | Event data for 20+ leagues including the 'Big 5' European leagues, some of which go back to the 09/10 season | Data available through scraping WhoScored? Match Centre through the following methods: |
Opta (11/12 sample dataset) | Match-by-match aggregated player performance data for the 11/12 season and F24 Event data for an 11/12 match of Manchester City vs. Bolton Wanderers as part of the #mcfcanalytics initiative | No longer made available (since 2012); however, it can be found in GitHub repos of old analysis (including this one). |
Understat | Shooting and meta data including xG values for the 'Big 5' European leagues and Russian Premier League | This data can be accessed through the following: |
Wyscout | Event data for the 17/18 season for the 'Big 5' European leagues, Euro 2016 Championship, and 2018 World Cup made available by Luca Pappalardo, Alessio Rossi, and Paolo Cintia. See their paper A public data set of spatio-temporal match events in soccer competitions. | Figshare |
Tracking data
Tracking Data records the x and y coordinates of every player on the field, as well as the ball, a number of times per second (usually 10-25). For this reason, the dataset is quite large, much larger than event data at around 2-3 million rows per game.
The data is collected by cameras installed in a stadium and is therefore not widely available, with teams usually only having access to the data in their own league.
The main providers of this data are Second Spectrum, STATS Perform, Metrica Sports, and Signality.
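As a rough check on the 2-3 million rows figure above, the sketch below counts rows for a long-format table with one row per tracked object per frame. It assumes 22 players plus the ball, exactly 90 minutes of play with no stoppage time, and no substitutes or dropped frames; real datasets will differ.

```python
def tracking_rows(frame_rate_hz: int, match_minutes: int = 90, tracked_objects: int = 23) -> int:
    """Rows in a long-format tracking table: one row per object per frame.

    tracked_objects defaults to 22 players + 1 ball (an assumption).
    """
    frames = frame_rate_hz * match_minutes * 60
    return frames * tracked_objects


for hz in (10, 25):
    print(f"{hz} Hz: {tracking_rows(hz):,} rows")
# 10 Hz: 1,242,000 rows
# 25 Hz: 3,105,000 rows
```

Under these assumptions, a 25 Hz feed lands at roughly 3 million rows per match, about a thousand times the size of the corresponding event data.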
Name | Comments | Source / method(s) to get the data |
---|---|---|
Last Row Tracking-like data by Ricardo Tavares | Tracking-like data collected by Ricardo Tavares. See the Liverpool Analytics Challenge for which this data was used (winners discussed on Friends of Tracking [link]). | GitHub repo |
Metrica Sports Sample Tracking and corresponding Event data | Three sample matches of synced event and tracking data. For code to work with this data including Pitch Control modelling, see the LaurieOnTracking GitHub repo by Laurie Shaw and the corresponding Friends of Tracking tutorials. | GitHub repo |
Signality Tracking data | Three matches of tracking data from the Allsvenskan - Hammarby vs. IF Elfsborg (22/07/2019), Hammarby 5 vs. 1 Örebro (30/09/2019), and Hammarby vs. Malmö FF (20/10/2019). | This data was made available as part of the 2020 Mathematical Modelling of Football course. The password to download the data is not publicly available, but can be found in the Uppsala Mathematical Modelling of Football Slack group [link]. For access, contact Novosom Salvador via Twitter or email, or feel free to contact me. Note that the 2nd half of the Hammarby-Örebro match is incomplete. |
Broadcast Tracking data
Broadcast Tracking data is collected from broadcast footage using computer vision techniques. Unlike in-stadium tracking data, the dataset is incomplete, because players who are out of shot of the broadcast footage are missing. However, the data is much cheaper to collect and covers far more leagues, which is extremely useful for tasks such as recruitment analysis.
The main providers of this data are SkillCorner and Sportlogiq.
Name | Comments | Source / method(s) to get the data |
---|---|---|
SkillCorner broadcast Tracking data | 9 matches of broadcast tracking data, including matches from 2019/2020 for the league champions and runners up in English Premier League, French L1, Spanish LaLiga, Italian Serie A and German Bundesliga. To find out more about broadcast tracking data and its use cases, see the following Medium article [link]. | GitHub repo |
Aggregated Player/Team Performance data
Name | Comments | Source / method(s) to get the data |
---|---|---|
DAVIES modelling data | Estimated player evaluation data by Sam Goldberg and Mike Imburgio for American Soccer Analysis. To learn more about DAVIES, see the following blog post [link]. | Shiny App |
FBref season-on-season aggregated player performance data provided by StatsPerform | Aggregated player performance data for the following competitions: | Note: there was a change in the data provider used by FBref for their statistics in October 2022, from StatsBomb to StatsPerform. Therefore, the following scraping code is split into current working solutions and archived solutions: |
Stats Perform and Centre Circle Canadian Premier League data | Aggregated player performance data | Google Drive |
Team Rating data
Name | Comments | Source / method(s) to get the data |
---|---|---|
Elo club rankings | Elo ratings for club football based on past results to allow for estimation of each club's strength, allowing predictions for the future. | Data available through: |
Euro Club Index | Ranking of the football teams in the highest division of all European countries, that shows their relative playing strengths at a given point in time, and the development of playing strengths in time. To see more about the methodology used to calculate these rankings, see the following page [link] | Link |
FiveThirtyEight Club Ranking | Global Club Soccer Rankings. How 637 international club teams compare by Soccer Power Index | Data available through: |
Opta Power Rankings | Opta Power Rankings | Data available through: |
UEFA Club Coefficients | UEFA club coefficient rankings based on the results of all European clubs in UEFA club competition. | Data available through: |
World Football / Soccer Clubs Ranking | Club ranking website | Link |
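To illustrate how Elo-style ratings such as those listed above turn past results into predictions, here is a minimal sketch of the textbook Elo expected-score and update formulas. This is the generic Elo method, not the exact procedure of any source in the table; football rating sites typically add adjustments such as home advantage and goal-difference weighting, and the `k=20` update factor is an assumption chosen for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of team A against team B (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both teams' new ratings after a match.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for a B win.
    k is an assumed update factor; real systems tune or vary it.
    """
    delta = k * (score_a - elo_expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical ratings: a 200-point favourite is expected to score about 0.76
print(round(elo_expected_score(1800, 1600), 2))  # 0.76

# If the match ends in a draw, the favourite loses points and the underdog gains them
print(elo_update(1800, 1600, score_a=0.5))
```

Because the points one team gains are exactly those the other loses, the sum of the two ratings is unchanged by each update.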
Physical data
Name | Comments | Source / method(s) to get the data |
---|---|---|
Bundesliga physical data | Bundesliga player stats, powered by AWS | Link (not scraped into a CSV) |
Results and Match Sheet data
Name | Comments | Source / method(s) to get the data |
---|---|---|
2018 FIFA World Cup Rosters | Goals, caps, club, and date of birth for players on 2018 FIFA World Cup rosters. Source: data.world | Excel |
engsoccerdata | English and European soccer results 1871-2017 | GitHub repo |
FIFA World Cup Match Results | Matchups and results of FIFA World Cup matches from 1930 - 2014. Source: data.world | Excel |
FotMob | Dataset including team and player stats including xG and post-shot xG. | This data can be scraped using: |
Football Lineups | A database of teams tactics and formations crowdsourced by the users. | Link |
international_results | Repository of 44,353 results of international football matches, from the very first official match in 1872 up to 2022. | GitHub repo |
smarterscout | Scouting and player rating information platform for evaluating the performance of football players around the world. The platform was developed by Dan Altman at North Yard Analytics to assess players' contributions to winning, their playing style, and their skill level. Note: this is a subscription service. | Link |
SofaScore | Live scores, lineups, standings, heatmaps, and basic teams, coaches and player data | Link |
Soccerway | Match sheet data | Link |
Financial, Valuation, and Transfer data
Name | Comments | Source / method(s) to get the data |
---|---|---|
Capology | Player salaries | See the Capology Player Salary Web Scraping notebook for Python code to scrape Capology data or access saved CSV files in data subfolder |
KPMG Football Benchmark | player valuation data | |
The Price of Football Master Spreadsheet | data from the finance/business aspect of football by Kieran Maguire | Link |
spotrac | Player contracts, salaries, and transfer information for the Premier League, MLS, and NWSL | |
Transfermarkt | Player bio, contractual, and estimated value data | This data can be accessed through the following: |
Guardian Player Transfer data | Collated by Tom Worville (see Tweet [link]) | GitHub |
Odds, Betting, and Predictions data
Name | Comments | Source / method(s) to get the data |
---|---|---|
BetExplorer | odds data | Link |
FiveThirtyEight Soccer Predictions database | football prediction data | Link |
Football-Data.co.uk | free bets and football betting, historical football results and a betting odds archive, live scores, odds comparison, betting advice and betting articles | Link |
International football results from 1872 to 2020 | an up-to-date dataset of over 40,000 international football results by Mart Jürisoo | Link |
Plotting Tools
See Mark Wilkins's Twitter thread for more about how to plot your own event data [link]:
- Football (soccer) pitch tracker by John Burn-Murdoch
- Expected Goals Event Logger by Ben Torvaney
- Chalkboard by Neil Charles
Reference data
Name | Comments | Source / method(s) to get the data |
---|---|---|
xT grid | League-wide Expected Threat (xT) values from the 2017-18 Premier League season (12x8 grid) determined by Karun Singh. For more information about xT, see Karun's blog post [link] | Link |
EPV grid | Grid of Expected Possession Values determined by Laurie Shaw. See the following lecture for more information [link] | Link |
Zones of a pitch | Breakdown of a pitch into zones, for use with visualisation. Created by Rob Carroll | Link |
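A value grid such as the 12x8 xT grid above is commonly applied by mapping the start and end location of a ball-moving action to grid cells and taking the difference in cell values. The sketch below shows that lookup. The grid values are placeholders that simply increase towards the opponent's goal (not Karun Singh's published values), and the 105 x 68 metre pitch and left-to-right attacking direction are assumptions.

```python
N_COLS, N_ROWS = 12, 8                   # grid shape stated in the table above
PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # assumed pitch size in metres

# Placeholder values only: they grow towards the opponent's goal (right-hand side).
# Replace with a real grid loaded from the source linked in the table.
placeholder_grid = [[(col + 1) / 100 for col in range(N_COLS)] for _ in range(N_ROWS)]


def cell_index(x: float, y: float) -> tuple:
    """Map pitch coordinates in metres to a (row, col) grid cell, clamped to the grid."""
    col = min(max(int(x / PITCH_LENGTH * N_COLS), 0), N_COLS - 1)
    row = min(max(int(y / PITCH_WIDTH * N_ROWS), 0), N_ROWS - 1)
    return row, col


def action_value(grid, start, end) -> float:
    """Value added by moving the ball from start to end: grid[end] - grid[start]."""
    r0, c0 = cell_index(*start)
    r1, c1 = cell_index(*end)
    return grid[r1][c1] - grid[r0][c0]


# Hypothetical forward pass from (30 m, 34 m) to (80 m, 40 m)
forward_pass = action_value(placeholder_grid, start=(30.0, 34.0), end=(80.0, 40.0))
print(round(forward_pass, 2))  # 0.06 with the placeholder grid
```

With a real grid, summing these differences over a player's successful passes and carries gives a simple per-player threat total.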
Miscellaneous Data
Name | Comments | Source / method(s) to get the data |
---|---|---|
awesome-football ⭐ by football.db (Gerald Bauer) | A collection of awesome football (national teams, clubs, match schedules, players, stadiums, etc.) datasets | GitHub repo |
Data Hub Football data | | Link |
European Soccer Database | 25k+ matches, players & teams attributes for European Professional Football | Link |
FIFA 15-22 player rating data | Scraped from SoFIFA by Stefano Leone | Link |
FIFA 18 Player Ratings | 17k+ players, 70+ attributes extracted from FIFA 18, provided by sofifa | Link |
FootballData | "A hodgepodge of JSON and CSV Football data" | GitHub |
footballcsv | Historical soccer results in CSV format | Link |
football.db | A free and open public domain football database & schema for use in any (programming) language (e.g. uses plain datasets) | Link |
Football xG | | Link |
Guide to Football/Soccer data and APIs by Joe Kampschmid | | Link |
My Football Facts | | Link |
Physio Room | | Link |
PlusMinusData | play by play data from espn.com | Link |
Rec.Sport.Soccer Statistics Foundation | Historical league tables and football results | Link |
RoboCup Soccer Simulator | RoboCup Soccer Simulator Data | Link |
Squawka | | Link |
Stat Bunker | | Link |
Tableau data resources | including sports data | Link |
Transfer League | | Link |
Twelve Football | | Link |
wosostats | Women's soccer data from around the world | Link |
📄 Documentation
All documentation is saved locally in the documentation subfolder, including:
Data Types and Companies
Data Providers
- DataFactory
- InStat
- K-Sport
- Opta Sports
- smarterscout
- Sportlogiq
- Sportradar
- Stats Perform
- StatsBomb
- StrataBet (now defunct)
- Transfermarkt
- understat
- WhoScored? (data provided by Opta Sports)
- Wyscout
Tracking
- Catapult
- ChyronHego
- Metrica Sports
- Second Spectrum
- Signality
- SkillCorner
- STATS SportVU
- Kinexon
- Oliver
Video / Performance Analysis
- dataFootball
- ERIC Sports
- Futbolytics
- hudl
- LBi Dynasty
- LongoMatch
- MEDIACOACH
- nacsport
- Olocip
- SICO
- Wise
Consultancy / Service Providers
🧑🎓 Tutorials
Python
- Soccermatics course taught by David Sumpter, a comprehensive education on how to work with football data.
- Friends of Tracking YouTube channel [link] and the Mathematical Modelling of Football course by Uppsala University [link], taught by David Sumpter. The GitHub repo with all code featured can be found at the following [link]. Lectures of note include:
- Laurie Shaw's Metrica Sports Tracking data series for Friends of Tracking - Introduction, Measuring Physical Performance, Pitch Control modelling, and Valuing Actions. See the following for code [link]
- Lotte Bransen and Jan Van Haaren's 'Valuating Actions in Football' series - Valuing Actions in Football: Introduction, Valuing Actions in Football 1: From Wyscout Data to Rating Players, Valuing Actions in Football 2: Generating Features, Valuing Actions in Football 3: Training Machine Learning Models, and Valuing Actions in Football 4: Analyzing Models and Results. See the following for code [link]
- David Sumpter's Expected Goals webinars - How to Build An Expected Goals Model 1: Data and Model, How to Build An Expected Goals Model 2: Statistical fitting, and The Ultimate Guide to Expected Goals. See the following for code 3xGModel, 4LinearRegression, 5xGModelFit.py, and 6MeasuresOfFit
- Peter McKeever's 'Good practice in data visualisation' webinar. See the following for code [link]
- Sergio Llana's step-by-step guide for creating Passing Networks [link]
- Luca Pappalardo and Paolo Cintia's step-by-step guide to exploring the Wyscout Event data - Video 1 and Video 2. See their paper A public data set of spatio-temporal match events in soccer competitions.
- Soccer Analytics Handbook by Devin Pleuler. See tutorial notebooks (also available in Google Colab): 1. Data Extraction & Transformation, 2. Linear Regression, 3. Logistic Regression, 4. Clustering, 5. Database Population & Querying, 7. Data Visualization, 8. Non-Negative Matrix, 9. Pitch Dominance, 10. Convolutional Neural Networks
- FC Python's Python tutorials [link]
- McKay Johns YouTube channel
- soccer_analytics by Kraus Clemens - a Python project that facilitates the starting point for analytics
- Son of a Corner's matplotlib and Python tutorials [link]. If you like their work, consider supporting by signing up as a member [link].
- xG Rolling Charts - a matplotlib tutorial
- Effective Bar Charts - a matplotlib tutorial
- Tiled Shot Maps - a matplotlib tutorial
- Figuring Figures Out, Part 1 - a matplotlib tutorial
- Figuring Figures Out, Part 2 - a matplotlib tutorial
- Beautiful Tables - a matplotlib tutorial
- An Intro to Web Scraping Efficiently - a Python tutorial
- An Introduction to Monte Carlo Simulation - a Python Tutorial
- DataViz, Python, and matplotlib tutorials by Peter McKeever [link] - his website is currently in redevelopment, with many of the old tutorials not currently available (28/02/2021). Check out his revamped How to Draw a Football Pitch tutorial
- Get Goalside's Python & FBref data tutorials by Mark Thompson. Check out his blog Get Goalside!
- Python for Fantasy Football series by Fantasy Futopia (Thomas Whelan). This series covers the basics of working with data in Python, working with APIs and parsing StatsBomb JSON data, scraping data using Beautiful Soup and Selenium, and Machine Learning with scikit-learn and XGBoost. See GitHub repo for all code [link]
- Football Data Visualizations - Passing Networks by Karol Działowski - a great blog post on how to create passing networks from first principles, using Opta Event data acquired from WhoScored. This data is then visualised using matplotlib.
- Tech how-to: build your own Expected Goals model by Jan Van Haaren and SciSports. See the Bitbucket repository and GitHub for all code [Bitbucket] and [GitHub]
- Football-Analytics-With-Python by Anmol Durgapal
- Training Ground Guru Python Masterclasses:
- Introduction To Python Masterclass by Jamie Dos Anjos (FC Python)
- Python Match Analysis Masterclass by Jamie Dos Anjos (FC Python)
R
- FCrSTATS tutorials [link]
- Mark Wilkins's BiscuitChaserFC blog. See his Twitter thread of R tutorials [link]. Tutorials include: Shot Maps In R With StatsBomb Data, [Getting Started with StatsBombData in R](https://biscuitchaserfc.substack.com/p/getting-started-with-statsbomb-data), Understat Meta/Shot Data, FBref - Pressures
- StatsBomb R Guide 2.0 by Euan Dewar
- Sudarshan Gopaladesikan's R series for Friends of Tracking - Getting Started with R + StatsBomb | Analyzing Squad Rotation & Clustering Passes and creating interactive shot maps - Part 1/3, Part 2/3 (I believe there is no Part 3 currently). See the following for code [link]
- Creating a pass flow graph in R by Abhishek Mishra.
Tableau
Check out the Tableau for Sports Discord server organised by Ninad Barbadikar, to interact with a community of Tableau developers
For a YouTube playlist of Tableau-football videos and tutorials that I have collated from various sources including the Tableau Football User Group, Rob Carroll, Tom Goodall, and Ninad Barbadikar, see the following [link].
- Tableau Football User Group - featuring Eva Murray, Oscar Hall, James Smith, Rob Carroll, Tom Goodall, Ravi Mistry, Adam Cook, Hannah Roberts, Chris Baker, Rusty Parker, Ruud van Elk, Johannes Riegger, and Sébastien Coustou
- March 2020, part 1
- March 2020, part 2 with Tom Goodall
- May 2020
- July 2020
- December 2020
- March 2021
- September 2021
- November 2022 with Sébastien Coustou
- Tableau for Sport by Rob Carroll - completely free tutorials for using football data in Tableau, including creating shot maps, pass maps, pass matrices, and xG race-chart timelines. See also his YouTube playlist [link]
- Tom Goodall's Tactics, Training & Tableau: Football Tableau User Group. Check out his Football Tableau training courses [link]. See also his unrolled Twitter thread on how he uses Tableau to create an opposition report for Burton vs. Gillingham on 9th January 2021 [link]
- Training Ground Guru Tableau Masterclass by Tom Goodall
- [Visually Analysing Direct Set Pieces in Football using StatsBombData, R and Tableau](https://www.biztory.com/blog/visually-analysing-direct-set-pieces-in-football-using-statsbomb-data-r-and-tableau) by James Smith
- CJ Mayes's Tableau blog, with posts including how to make a Radial Tournament Bracket, Understat data in Tableau, and Player Metrics Pizza plots
- Tableau Tunnel series by Ninad Barbadikar. Check out his Twitter thread [link] and his YouTube channel [link]:
- Medium blog posts by Sagnik Das - Tableau Guide #1: Making Shot Maps, Tableau Guide #2: Making Pass Maps, Tableau Guide #3: Convex Hulls, Tableau Guide #4 : Football Radars
- Medium blog posts by Rahul Iyer - Guide to Creating Passing Networks in Tableau , Guide to Creating Pass Sonars in Tableau, Guide to Creating Hexagonal Shot Maps in Tableau
- A Guide to Player Comparison Bar Graphs (And How I Make Them) by Ashwin Raman
- Creating a Shot Map by James Vaughan
- How to create Football Pitches/Goals as Backgrounds in Tableau by James Smith
- Creating Waffle Charts in Tableau by Harsh Krishna
- Exporting your pass flow map to Tableau by Abhishek Mishra
- Plotting Starting XIs in Tableau by Marton Balla
- Tableau Public profiles of note (not exhaustive by any means): Ashwin Raman, Brian Prestidge, Carlon Carpenter, CJ Mayes, Eva Murray, Foot en Stats, James Smith, James Vaughan - see his Twitter thread of projects [link], Mark Carey, Matt Trevillion, Ninad Barbadikar - see his Tableau Tunnel series, Oscar Hall, Paul Riley, Peter McKeever, Rahul Iyer, Ravi Mistry, Rob Carroll, Rob Suddaby, Sathish Prasad V.T - see his Canadian Premier League post-match reports [link], Sancho Quinn, Sushruta Nandy, Tom Worville
PowerBI
For a YouTube playlist of Power BI-football videos and tutorials that I have collated from various sources including Futbol AnalysR and PowerBI for Sports, see the following [link].
- Futbol AnalysR by Josh Trewin - for PowerBI tutorials. Check out his website [link]
- PowerBI for Sports by Roberto D'Onofrio Rondón
- Training Ground Guru PowerBI Masterclass by Harriet Eastham
SQL
Excel
PowerPoint
- Maram AlBaharna's Medium post - Yes, Powerpoint: xG Trend Line
- Luke Griffin's pitch graphics - slides. Drop him a donation via PayPal if you're using his work [link]. See original Tweet [link]
- Tony Bambrick's short video describing the process of creating an animated tactics board using PowerPoint. See original Tweet [link].
🏛️ Libraries
GitHub libraries that are considered to be 'Top rated' are those with 50 or more stars (at the time of writing) and have been indicated with a star emoji (⭐).
For a full list of Football Analytics GitHub repositories and libraries, see the following list on GitHub [link].
Python
- codeball by Metrica Sports - data-driven tactical and video analysis of soccer games
- Football Packing - a Python package to calculate packing rate for a given pass in football by Samira Kumar. This is a variation of the metric created by Impect
- kloppy ⭐ - a package for standardising tracking and event data by Koen Vossen and Jan Van Haaren. See the YouTube tutorial [link]
- floodlight by floodlight-sports - package for streamlined analysis of sports data. It is designed with a clear focus on scientific computing and built upon popular libraries such as numpy or pandas. See the following documentation [link]
- matplotsoccer - a Python library for visualising soccer event data by Tom Decroos
- mplsoccer ⭐ - a Python library for plotting football pitches in matplotlib by Andrew Rowlinson
- narya ⭐ - API that allows you to track soccer players from camera inputs and evaluate them with an Expected Discounted Goal (EDG) Agent. See the Evaluating Soccer Player paper by Paul Garnier and Théophane Gregoir
- northpitch - a Python football plotting library that sits on top of matplotlib by Devin Pleuler
- PySport including PySport Soccer - collection of open-source sport packages, including many of those mentioned in this section, by Koen Vossen
- PyWaffle - an open source, MIT-licensed Python package for plotting waffle charts by Peter McKeever
- ScraperFC ⭐ by Owen Seymour - a Python package to scrape data from FiveThirtyEight, FBref, Understat, Club Elo, Capology and TransferMarkt. Previously scraped Opta event data through the WhoScored? match center (functionality now removed, but see old versions and GitHub repos to find this code)
- Scrape-FBref-data ⭐ by Parth Athale, which in turn was updated from Christopher Martin's repository - Python library to scrape FBref data
- statsbombapi - a Python API wrapper and dataclasses for StatsBomb data
- statsbombpy - a Python library written by Francisco Goitia to access StatsBomb data
- statsbomb-parser ⭐ by Imran Khan - Python library to convert StatsBomb's JSON data into easy-to-use CSV format
- socceraction ⭐ - a Python library for valuing the individual actions performed by soccer players. Includes an Expected Threat (xT) implementation by Tom Decroos et al.
- soccer_xg ⭐ by ML KU Leuven - a Python package for training and analyzing expected goals (xG) models in football
- soccerdata ⭐ - scrape soccer data from Club Elo, ESPN, FBref, FiveThirtyEight, Football-Data.co.uk, SoFIFA and WhoScored by Pieter Robberechts
- soccerplots ⭐ - a Python package that can be used for making visualisations for football analytics by Anmol Durgapal. Now part of the mplsoccer package
- sync.soccer ⭐ by Marek Kwiatkowski - a Python package to synchronise football datasets, so that an event in one dataset is matched to the corresponding event or snapshot in the other. This repository contains an implementation that aligns Opta's (now Stats Perform) F24 feeds to ChyronHego's Tracab files. See the following blog post for methodology [link]
- tmscrape by danzn1 - a Python TransferMarkt webscraper
- tyrone_mings ⭐ by FCrSTATS - a Python TransferMarkt webscraper
- understat ⭐ by Amos Bastian - an asynchronous Python package for scraping Understat shooting and player metadata
R
- ggsoccer ⭐ by Ben Torvaney - a soccer visualisation library in R
- ggshakeR ⭐ by Abhishek Mishra - an analysis and visualisation R package that works with publicly available soccer data. See the following documentation [link]
- StatsBombR ⭐ - an R package to easily stream StatsBomb data into R, either from the API using your login credentials or free of charge from the Open Data GitHub repository
- soccerAnimate ⭐ - an R package to create 2D animations of soccer tracking data
- soccermatics ⭐ by Joe Gallagher - an R package for the visualisation and analysis of soccer tracking and event data
- worldfootballR ⭐ by Jason Zivkovic - an R package for extracting world football (soccer) data from FBref, TransferMarkt, Understat and fotmob (see guide on how to use this package [link])
- understatr ⭐ by ewenme - an R package to scrape Understat shooting and player metadata
GitHub Repositories
The following GitHub repositories are either repos that I have found and recommend or are publicly available analytics work in the subject of football with at least 5 stars on GitHub (at the time of writing).
GitHub repositories that are considered to be 'Top Rated' are those with 50 or more stars (again, at the time of writing) and have been indicated with a star emoji (⭐).
For a full list of Football Analytics GitHub repositories and libraries, see the following list on GitHub [link].
Python
- Action-Density by Eliot McKinley - create action density plots from StatsBomb event data
- analytics-handbook ⭐ by Devin Pleuler - getting started with soccer analytics
- applied-examples by Devin Pleuler - applied soccer analytics
- ASA-Win-Probability-Model by Abhishek Sharma - implementing ASA's Win Probability Model
- balaban by Will Thompson - a small Python package for estimating & plotting Bayesian hierarchical models for player-level football data
- Big-Data-Cup-2021 by Big Data Cup - Big Data Cup 2022: Powered by Stathletes (Hockey but lots of good tracking data analysis). See the competition webpage [link]
- AIrsenal ⭐ by the Alan Turing Institute - a package for using machine learning to pick a Fantasy Premier League team
- awesome-football by football.db (Gerald Bauer) - a collection of awesome football (national teams, clubs, match schedules, players, stadiums, etc.) datasets
- baller2vec by Michael A. Alcorn - a multi-entity Transformer for multi-agent spatiotemporal modeling
- betdaq by Rory Cole - Python wrapper for Betdaq API
- BirdsPyView ⭐ by Ricardo Tavares - a streamlit app to convert images to top-down view and get coordinates of objects, built for football data collection
- chord-uefa-ec by Guy Abel - visualising bilateral links between Euro squads and players' clubs. See the accompanying blog post [link]
- CodaBonito by Aditya Kothari (The Come On Man) - functions to aid football / soccer analysis
- corner_stats by Andrey Hesussavas - statistical project on soccer's corners
- DataVizTutorial by @shreyas7kha - a tutorial on how to make dashboards with matplotlib using data from FBref and Understat
- data ⭐ by FiveThirtyEight - data and code behind the articles and graphics at FiveThirtyEight
- d3-soccer by Pieter Robberechts - a D3 plugin for visualizing event stream soccer data
- elm-soccer-tracker by Ben Torvaney - track xy coordinates of events on a soccer pitch
- expected_goals_deep_dive by Andrew Puopolo
- expected-goals-thesis ⭐ by Andrew Rowlinson - a repository for analysis on Expected Goals using StatsBomb and Wyscout data
- Fantasy-Premier-League ⭐ by Vaastav Anand - creates a .csv file of all players in the English Premier League with their respective team and total fantasy points
- FantasyPremierLeague.py ⭐ by Guy Daher - football statistics for your mini leagues
- FBref_EPL by Christopher Martin - scrape player and team data from FBref (not updated since the switch from StatsBomb to Stats Perform data)
- FC-Python-Tutorials by Jamie Dos Anjos (FC Python) - collection of tutorials and resources from FC Python
- fifa-FUT-Data by Ali Kafagy - web-scraping script that writes the data of all players from FutHead and FutBin to a CSV file or a DB
- fifa-world-cup-2022-prediction ⭐ by Frank Andrade
- Friends-of-Tracking-Data-FoTD
- footballcsv - historical soccer results in CSV format
- football.json ⭐ by football.db (Gerald Bauer) - free open public domain football data in JSON including English Premier League, Bundesliga, Primera División, Serie A and more
- football-analytics ⭐ by Edd Webster - THIS GITHUB REPO! A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster), including a curated list of publicly available resources published by the football analytics community
- Football-Analytics by Daniel Azevedo - repository that explores some concepts of football analytics using event data and tracking data from different sources
- Football-Analytics-With-Python by Anmol Durgapal
- football-crunching ⭐ by Ricardo Tavares - analysis and datasets about football (soccer). See the accompanying Medium posts [link]
- football_data_analysis by xzl524 - use math and data to understand football
- football-data-analytics ⭐ by Jake Kolliari
- FootballData ⭐ by Joe Kampschmid - a hodgepodge of JSON and CSV Football/Soccer data
- footballdata by skagr - a collection of wrappers over football data from various websites / APIs
- football_data by Andre Brener - extract, analyze and visualize data for football teams and player performances
- football-datasets by Open Data - major Europe leagues data (England, Spain, Italy, Germany and France)
- football-machine-learning by Hugo Mathien - machine learning scripts using the Kaggle European Soccer Database [link]
- Football-packing by Samira Kumar - find the packing rate in football (soccer)
- Football_Prediction_Project ⭐ by Matt Haythornthwaite - project that pulls past game data from api-football and uses these statistics to predict the outcome of future Premier League matches through machine learning
- football-predictor ⭐ by Andrew Carter - using a Deep Neural Network (DNN) to predict the results of Premier League football matches. See the accompanying article A Beginners Guide to Beating the Bookmakers with TensorFlow
- football-tracking-data-from-TV-broadcast by Mohamed Youssef - get football tracking data from TV broadcast using YOLOv5
- Football-Visualisations by casualfantasyfootballer - repository for program files used to produce football analysis visualisations
- fot-valuing-actions ⭐ by Jan Van Haaren and Lotte Bransen - presentations and tutorials that demonstrate how to value on-the-ball actions in football, as featured on Friends of Tracking
- fpl ⭐ by Amos Bastian - an asynchronous Python wrapper for the Fantasy Premier League API
- fpl-ai by Saheed Ayanniyi - a machine learning system that predicts FPL points of players
- FPLbot ⭐ by Amos Bastian - a bot made for /r/FantasyPL
- FPL-Optimization-Tools ⭐ by Sertalp B. Çay - a collection of tutorials and recipes to use optimization for winning Fantasy Premier League
- fpl-optimiser ⭐ by Ben Torvaney - optimise FPL squads
- friends-of-tracking-viz-lecture by Peter McKeever - repo to hold pdfs, notebooks, and data for the 'Good practice in data visualisation' webinar for Friends of Tracking
- FutbolAnalysis by Enrique Gudino De Grote - code Enrique uses to create his visuals
- Get-Goalside-newsletter-public-code by Mark Thompson - code relating to Get Goalside newsletter posts
- google-football-pytorch by Tianhong Dai - a PyTorch implementation of Google Research Football
- Google Research Football - an RL environment based on the open-source game Gameplay Football, created by the Google Brain team for research purposes
- how-to-expected-goals by Jan Van Haaren and SciSports - repository for a how-to on training an expected-goals model for football. See the accompanying article: 'Tech how-to: build your own Expected Goals model' and Bitbucket repository [link]
- highlight_text ⭐ by @danzn1 - functions to plot text with highlighted substrings in matplotlib
- itscalledsoccer by American Soccer Analysis - Python package that wraps the ASA API
- LaurieOnTracking ⭐ by Laurie Shaw - Python code for working with Metrica tracking data
- Last-Row ⭐ by Ricardo Tavares - Last Row tracking data and code
- LearningFromThePros by Matt Wear - Learning From the Pros: Extracting Professional Goalkeeper Technique from Broadcast Footage
- matplotlib-tutorials by Son of a Corner - the Jupyter notebooks behind the Son of a Corner matplotlib tutorials [link]
- matplotsoccer ⭐ by Tom Decroos - package to visualize soccer data
- MainZone by Asian Football Analysis Zone (Ben Griffis) - standalone code, follow-along tutorials for analysis of Asian football
- Mathematical-Modelling-of-Football-Assignments by Christian Gilson - contains code and content output for assignments for the Uppsala University Mathematical Modelling of Football course
- mapping-match-events-in-python ⭐ by Luca Pappalardo, Alessio Rossi, and Paolo Cintia - code for working with and plotting Wyscout data as featured on Friends of Tracking. See the paper: A public data set of spatio-temporal match events in soccer competitions
- matchbook by Rory Cole - Python wrapper for Matchbook API
- MSc-Applied-Statistics-Project-Code by Christian Gilson - contains code and content output for the thesis: A Systematic Approach to Strategic Football Team Insights & Player Recruitment Analysis
- Metrica-pitch-control by Will Thompson - a Python implementation of Javier Fernández and Luke Bornn's Pitch Control model from their paper Wide Open Spaces: A statistical technique for measuring space creation in professional soccer (2018) and Will Spearman's Pitch Control model from his paper Beyond Expected Goals (2018). The respective Google Colab notebooks are available [link] and [link]
- mezzala by Ben Torvaney - models for estimating football (soccer) team-strength
- monderian-soccer-art by Devin Pleuler - Python notebook for procedurally generating Mondrian-style soccer fields
- mpl-footy by Abhishek Sharma - gallery for typical football plots created using matplotlib. See the following website [link]
- octopy by Octosport - Python implementation of various soccer/football analytics methods such as Poisson goals prediction, Shin method, machine learning prediction. This is a companion Python module for the Octosport Medium blog [link]
- Optical_Player_Tracking_System_For_Soccer by Pranav Nagarajan - a new and non-invasive method of tracking players in a soccer match
- outliers-football by Andrew Rowlinson - identify young outliers in football
- Pass-Flow - create animated flow velocity fields using passing data by Open Goal App
- passmaps by Abhishek Sharma - creating simple passmaps using StatsBomb's data
- passing-networks-in-python ⭐ by Sergio Llana - repository for building customisable passing networks with matplotlib, as featured on Friends of Tracking. The code is prepared to use both eventing (StatsBomb) and tracking data (Metrica Sports)
- PCA_Player_Finder by Parth Athale
- penalty by Martin Eastwood - example code from www.pena.lt/y
- penaltyblog by Martin Eastwood - a package that contains code from http://pena.lt/y/blog for working with football data
- pitchly - Python Plotly wrapper for simple football plots by Vinay Warrier
- pinnacle by Rory Cole - Python wrapper for Pinnacle Sports API
- player-chemistry by Jan Van Haaren and Lotte Bransen - repository for the 'Player Chemistry: Striving for a Perfectly Balanced Soccer Team' paper. See the accompanying presentation at Sloan 2020 [link]
- plottable by @danzn1 - most pretty & lovely tables with matplotlib
- playerank ⭐ by Paolo Cintia and Luca Pappalardo - a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players
- PlayerDetection ⭐ by Kanan Vyas - player detection and ball detection in football matches using image processing (OpenCV)
- predict-soccer-ball-location by Anar Amirli - "Predicting Ball Location From Optical Tracking Data" - contains data analysis, model development and testing
- Predicting-Football-Player-Transfer-Values by Sanjit Varma - assessing how well players' on-field performance metrics can be used to predict their transfer values
- premier_league by Matt Murray - data and regressions on Premier League teams from 2000-01 through to 2016-17
- PyFootballPitch by @danzn1 - functions to draw a football pitch in various available styles for matplotlib and bokeh
- pysoccer by PlayeRank Sports Analytics - Pysoccer offers a standardized data model designed to make data-driven soccer analytics easy. See the following documentation [link]
- python-for-fantasy-football ⭐ by Fantasy Futopia (Thomas Whelan) - supplementary materials for the Python for Fantasy Football blog series
- quick-starter by football.db (Gerald Bauer) - football.db quick starter datafile templates for worldcup.db, euro.db, england.db, etc. - build your own football.db with the sportdb command line tool
- research by Devin Pleuler - see the Reframing Post-Shot xG markdown post
- rl-bot-football by Chintan Trivedi - an RL agent for the Google Football environment
- scraper-whoscored by Ramis Lao - WhoScored Scraper
- scraping-understat-dataset ⭐ by Douglas - a repository with scraping code and soccer dataset from understat.com
- Scrape-Whoscored-Event-Data by Ali Hasan Khan - get match event data from whoscored.com
- Skillcorner_Opendata_Match_Analysis by @danzn1 - overview of resources for analyzing the games, working with the data and showcasing applications of the broadcast tracking data
- sn-calibration by SoccerNet - repository containing all the code necessary to get started on the SoccerNet Camera Calibration challenge. This repository also contains benchmark methods
- sn-reid by SoccerNet - repository containing all the code necessary to get started on the SoccerNet Re-Identification challenge. This repository also contains benchmark methods
- sn-spotting by SoccerNet - repository containing all the code necessary to get started on the SoccerNet Action Spotting challenge. This repository also contains several benchmark methods
- sn-tracking by SoccerNet - repository containing all the code necessary to get started on the SoccerNet Tracking challenge. This repository also contains benchmark methods to get started
- soccer_analytics ⭐ by Kraus Clemens - a Python project that provides a starting point for analytics
- soccer-analytics-handbook ⭐ by Devin Pleuler - getting started with soccer analytics
- Soccer-Analyses ⭐ by Ben Griffis - code to create football analytics visuals
- soccerapi ⭐ by @S1M0N38 - an unambitious soccer odds scraper
- soccercpd by Hyunsung Kim - source code for "SoccerCPD: Formation and Role Change-Point Detection in Soccer Matches Using Spatiotemporal Tracking Data"
- SoccerNet-v3 by SoccerNet - repository containing a generic dataloader for the SoccerNet-v3 annotations and data. It loads the images at any chosen resolution and parses the JSON annotation files to retrieve the bounding boxes, lines and correspondences between bounding boxes in a ready-to-use format
- soccermix - a soft clustering technique based on mixture models that decomposes event stream data into a number of prototypical actions of a specific type, location, and direction by Tom Decroos and ML-KULeuven
- sportsipy by Robert Clark - a free sports API written for Python
- SportsBook ⭐ by David Bristoll - a sports data scraping and analysis tool
- sportsdataverse-py by SportsDataverse - SportsDataverse Python package
- sport ⭐ by Piotr Skalski - examples of computer vision usage in sports
- sportypy by SportsDataverse - Python package for drawing regulation playing surfaces for several sports
- soccerstan by Ben Torvaney - reproduction of football models in Stan
- SoFIFA ⭐ by Diogo Dantas - a SoFIFA webcrawler and machine learning prediction
- soccerdata by Bill Mill - a collection of soccer results
- Soccer-Players-Tracking by Parsa Samadnejad - classifying and tracking players of a soccer exhibition based on recorded videos with different points of view
- SoccerTrack ⭐ by Atom Scott - a dataset and tracking algorithm for soccer with fish-eye and drone videos
- soc-viz-of-the-week ⭐ by Son of a Corner - viz of the week
- socplot ⭐ by StatsBomb - a Python package that helps you visualize StatsBomb football data
- Soccermatics by David Sumpter - repo dedicated to people getting started with Python using the concepts derived from the book Soccermatics
- SoccermaticsForPython ⭐ - a template for creating a course with readthedocs
- soccer_analytics by Kraus Clemens - a Python project that aims to be a starting point for analytics projects in soccer, including EDA of event data, goal kick analysis, passing analysis, xG modelling, and an introduction to tracking data
- socceraction by ML-KULeuven - convert soccer event stream data to SPADL and value player actions using VAEP or xT
- statsbombapi by Ben Torvaney - an extendable StatsBomb API wrapper for data-pipelines
- statsbomb-messi by Benjamin Larrousse - deep dive into StatsBomb's Messi data biography
- statsbombpy ⭐ by StatsBomb - a Python package to easily stream StatsBomb data into Python using your login credentials for the API or free data from our GitHub page. API access is for paying customers only
- themepy ⭐ by Peter McKeever - an open source theme selector for matplotlib
- tracking_tagger by @danzn1. See app [link]
- transfermarkt-datasets ⭐ by David Cereijo - extract, prepare and publish Transfermarkt datasets from the transfermarkt-scraper. See the following documentation [link]
- transfermarkt-scraper by David Cereijo - collects data from TransferMarkt. See the following documentation [link]
- xG_Model_Workflow by Ian Dragulet - a comprehensive guide to explaining, creating and using an xG model
- xGils by Christian Gilson - the xGils package
- understat-db by Ben Torvaney - a project to scrape data from Understat and store it in a Postgres database
- Valuing actions in football by Lotte Bransen and Jan Van Haaren of SciSports
- wc-explorer-dash by @hkhare42 - Dash app visualizing StatsBomb's FIFA World Cup 2018 data
- weighted_voronoi by Erika Munoz - weighted Voronoi for soccer
- Whoscored by Brian Lan - scrape data from whoscored.com and draw insights from data
- whoscoredscraper by Canggih Puspo Wibowo - modules to scrape football data from WhoScored
- wingback by Ben Torvaney - backtesting team-strength models
- wyscoutapi by Ben Torvaney - an API client for the Wyscout API (v2 & v3) for Python
- wyscout-soccer-match-event-dataset by Koen Vossen - repository containing the Wyscout data described in the 'A public data set of spatio-temporal match events in soccer competitions' paper, but processed to the regular Wyscout form. In this form it can be loaded by libraries like kloppy
- youtube-videos by McKay Johns - code used in his YouTube Videos [link]
R
- asa-shiny-app by American Soccer Analysis - American Soccer Analysis interactive application, built with Shiny. See the Shiny App at the following [link]
- betScrapeR ⭐ by David Sheehan - R package to scrape live sports betting odds
- club-rankings by Tony ElHabr - historical daily Opta Power Rankings and FiveThirtyEight Global Club Soccer Rankings
- Decomposition-of-Expected-Goal-Models by Mustafa Çavuş - repository containing the supplemental materials of the paper: "Decomposition of Expected Goal Models: Aggregated SHAP Values for Analyzing Scoring Potential of Player/Team"
- eLPAR-soccer by Konstantinos Pelechrinis - repository that includes all the code and data used for developing the expected league points above replacement for soccer as described in "Positional Value in Soccer: Expected League Points Added..."
- EPVDemo by Dan Cervone - demo of NBA Expected Possession Value model
- english-premier-league-datasets-for-10-seasons by Tara Nguyen - clean datasets for 10 seasons of the English Premier League, including league tables, match stats, and head-to-head performances
- engsoccerdata ⭐ by James Curley - English and European soccer results 1871-2022
- Expected-Goals-Model by Kuba Michalczyk - example Expected Goals (xG) model
- footBayes by Leonardo Egidi - an R package for many football models
- Football-Analytics ⭐ by Sezer Unar - code to create football analytics visualisations including heat maps, pass solar plots, venn diagrams, xG lollipop charts, and more
- FoundationsInR ⭐ by Sudarshan Golaladesikan - getting started with R using the StatsBomb dataset
- footballdatr by Ben Torvaney - a package to fetch data from football-data.co.uk
- football-data by David Schoch - football (soccer) datasets
- fcscrapR by Ron Yurko - package to scrape soccer commentary and statistics from ESPN
- fplscrapR ⭐ by Rasmus Christensen - a package that enables those interested in Fantasy Premier League to perform detailed data analysis of the game, using the FPL's JSON API. The fplscrapR functions help R users collect and parse data from the Official Fantasy Premier League website
- goalmodel ⭐ by Jonas - package that lets you build prediction models for the number of goals scored in sport games
- moneyline-team-strengths by Ben Torvaney - code, analysis, and data for estimating team strengths from odds data
- PassSonar ⭐ by Eliot McKinley - example code to produce PassSonars from event data
- passing-networks by Dato Fútbol (Ismael Gómez Schmidt) - a couple of functions to create customized passing networks with event data by StatsBomb and tracking data by Metrica Sports
- PlayerFinishingOverviewShiny by Harsh Krishna - a Shiny app that creates xG based finishing reports for soccer players using data from Understat. See the Shiny App at the following [link]
- regista ⭐ by Ben Torvaney - an R package for soccer modelling
- rfutbin by Daniel Redondo Sánchez - package to get prices, stats and more information for FIFA Ultimate Team players in Futbin
- Rteta by Robert Hickman - public analytics toolbox in R
- sports_viz by Tony ElHabr - #rstats code and plots for visualisations that Tony posts on Twitter
- soccerAnimate by Dato Fútbol (Ismael Gómez Schmidt) - R package to create 2D animations of soccer tracking data
- soccer-analytics-library by Lars Maurath - a collection of soccer analytics research. See the Shiny App at the following [link]
- soccercolleagues by Matt Dray - find footballers' common team mates. See the following documentation [link]
- soccermatics by Joe Gallagher - tools for visualisation and analysis of soccer tracking and event data
- soccer_ggplots ⭐ by Ryo Nakagawara - soccer/football analytics blog posts & data viz from the World Cup, Premier League, Copa America, and beyond. Using ggplot2, ggsoccer, & more
- SportsAnalytics by CRAN Task Views and maintained by Benjamin Baumer, Quang Nguyen, and Gregory Matthews - a list of packages useful for sports analytics
- sportyR ⭐ by Saiem Gilani - package for drawing regulation playing surfaces for several sports
- SBpitch by FC rSTATS - create a pitch plot ready for StatsBomb data
- statsbomb-bayesian-shooting by Marek Kwiatkowski - Bayesian estimation of the finishing skill of football players
- sportsdatascience by Sudarshan Golaladesikan - code to draw pitches and transform and visualise StatsBomb open data
- Tracking-Data by Kuba Michalczyk - package to draw a convex hull from tracking data
- transfers ⭐ - GitHub repo for European football clubs' player transfers from 1992/93-2021/22 (as per TransferMarkt) by ewenme
- worldfootballR_data by Jason Zivkovic - project holding various data for the worldfootballR R package (see guide on how to use this package [link])
- world_cup_2022 by Luke Benz - modeling and simulations for the 2022 FIFA World Cup
- xg-model ⭐ by Dato Fútbol (Ismael Gómez Schmidt) - an example of how to create an xG model using R and Wyscout event data. See the accompanying article - Fitting your own football xG model
- xThreatR by Grant Rhines - an implementation of Karun Singh's Expected Threat with R
Other Languages
- fotmob ⭐ by Brian Greenwood - a JS/TS wrapper around the unofficial FotMob API
- mobfot by Brian Greenwood - a Python wrapper around the unofficial FotMob API
- football_manager_api by YJ Kim - an unofficial API for Player Data in Football Manager
- random_stuff by Abhishek Sharma - a place to store and share random projects/snippets of code/vizzes by Abhishek Sharma
- Statball by Somdeep Dey - football soccer stats analyser from top 5 european leagues with data obtained by web scraping from FBref and Statsbomb
- soccer_ha_covid2 by Luke Benz - repository that contains the data and code used in a manuscript "Estimating the change in soccer's home advantage during the Covid-19 pandemic using bivariate Poisson regression", by Luke Benz and Michael Lopez
No Language Specified
- awesome-soccer-analytics ⭐ by Matias Mascioto - a curated list of awesome resources related to Soccer Analytics
- awesome-football-analytics ⭐ by Diego Pastor - a curated list of football analytics awesome resources, articles, books and more
- coordinateFC by FC rSTATS - a shot at coordinating open source football analytics builders to work towards common standards with interoperability as the goal
- deutschland by football.db (Gerald Bauer) - football data for Deutschland (Germany) incl. Bundesliga, 2. Bundesliga, etc.
- epl-fantasy-geek ⭐ - English Premier League 2017-18 Fantasy stats
- england ⭐ by football.db (Gerald Bauer) - football data for England (and Wales) incl. English Premier League, The Football League (Championship, League One, League Two), Football Conference etc.
- espana ⭐ by football.db (Gerald Bauer) - football data for España (Spain) incl. Primera División (La Liga), Segunda División etc.
- euro by football.db (Gerald Bauer) - free open public domain football data (euro.db) for Euro 2008, Euro 2012, Euro 2016, Euro 2020 (2021), etc.
- europe-champions-league by football.db (Gerald Bauer) - football data for European Champions League (incl. European Cup / European Champion Clubs' Cup)
- FIFAWorldCup by S Anand - FIFA World Cup data including teams data, squad formations, clubs dominance
- fifadata by Pratap Vardhan - FIFA data
- football-data-collection ⭐ by Hugo Mathien - web scraper used to create the Kaggle European Soccer database [link]
- FootballData by Joe Kampschmid
- football-logos by Luuk Hopman - all logos of teams in the top 20 European leagues. Season 2022/2023
- football-graphs by Rodolfo Mói - graphs and passing networks in football
- guideR by Dom Samangy - repository for the Google spreadsheet with 200+ R resources, 100+ Python tutorials, 30+ packages, 25+ accounts to follow, 10 cheatsheets, and several free books & blogs. Google spreadsheet [link]
- italy by football.db (Gerald Bauer) - free open public domain football data (football.db) for Italy / Europe - Serie A etc.
- international_results ⭐ by Mart Jürisoo - 44,353 results of international football matches starting from the very first official match in 1872 up to 2022
- league-starter by football.db (Gerald Bauer) - football.db league quick starter sample - start your own leagues & cups
- livesoccertv-parser by Pablo Varela - parse soccer games data from https://livesoccertv.com
- opendata ⭐ by SkillCorner
- open-fpl by Narudom Techaval - open-source Fantasy Premier League tools
- players by football.db (Gerald Bauer) - free open public domain football data (football.db) for players (goalkeepers, defenders, midfielders, forwards)
- pinnacleapi-documentation ⭐ by Pinnacle - Pinnacle API Documentation
- sample-data ⭐ by Metrica Sports
- shot-plotter by An Nguyen - web application for plotting events on a sport's playing area with a single click, while keeping track of any other details. Supports download and upload of .csv files
- stadiums by football.db (Gerald Bauer) - free open public domain football stadium data
- soccer ⭐ by Christopher D. Long - soccer analytics datasets
- soccer-analytics-resources ⭐ by Jan Van Haaren
- Sports_Data_Reference ⭐ by Meyappan Subbaiah
- Stats_in_Sports_2021 by Zachary Binney - materials for the Statistics in Sports class for first-year undergrads
- theFPLkiwi by The FPL Kiwi - Kiwi shared stats/resources
- worldcup.json ⭐ by football.db (Gerald Bauer) - free open public domain football data for the world cups in JSON incl. Qatar 2022, Russia 2018 and more
- worldcup ⭐ by https://twitter.com/joshfjelstul - a comprehensive database on the FIFA World Cup
Apps
- ALPHONSO 2.0 by Sam Goldberg and Mike Imburgio for American Soccer Analysis
- Football Slices by Football Slices (DyslexicDdue) (now offline)
- Player Finishing Overview by Harsh Krishna, an app that generates a dashboard of visualisations that can be useful in getting an overview of a football player's finishing ability. See the accompanying Twitter thread [link]
- Player Replacement Shortlist Generator by Hugh Klein. See the accompanying Twitter thread [link]
- Pizza Vizz App by Johnny Vizz. See the accompanying blog post [link] and Twitter thread [link] to access the Vizz App (subscription only)
- Statsbomb-Json-Parse by Rob Carroll. A small app that lets you input a StatsBomb JSON file and get a CSV file back (you need to create a free account to run it). For a video explainer, see the following [link]
- Scouting Tool by Renzo Cammi - a scouting tool created with Streamlit from StatsBomb data via FBref, that lets you filter player stats from the 'Big 5' European Leagues
- Soccer Analytics Library by Lars Maurath
- Tracking Tagger by @danzn1. See GitHub repo [link]
- Twelve Football
- YouTubeCoder - event video tagging by Jamie Dos Anjos (FC Python)
📊 Data Visualisation Resources and Tools
Resources to aid data visualisation:
Vizpiration
Check out the vizpiration subfolder in the img folder, for examples of visualisations created by analysts in the community.
- Son of a Corner's Viz of the week. See the following GitHub repo [link] for code to create these visuals
Tutorials
- Son of a Corner's matplotlib and Python tutorials [link]
- How to create Football Pitches/Goals as Backgrounds in Tableau by James Smith. Download his pitch and goal templates here
- Peter McKeever's 'Good practice in data visualisation' webinar for Friends of Tracking. See the following for code [link]
- Football Data Visualizations - Passing Networks by Karol Działowski - a great blog post on how to create passing networks from first principles, using Opta Event data acquired from WhoScored. This data is then visualised using matplotlib.
- John Burn-Murdoch's Data visualisation is about words webinar for Friends of Tracking
Repos and libraries
- ggsoccer ⭐ by Ben Torvaney - a soccer visualisation library in R
- ggshakeR ⭐ by Abhishek Mishra - an analysis and visualisation R package that works with publicly available soccer data. See the following documentation [link]
- football-graphs by Rodolfo Mói - Graphs and passing networks in football
- football-logos by Luuk Hopman - all logos of teams in the top 20 European leagues. Season 2022/2023
- matplotsoccer ⭐ - a Python library for visualising soccer event data by Tom Decroos
- mplsoccer ⭐ - a Python library for plotting football pitches in matplotlib by Andrew Rowlinson
- matplotlib-tutorials by Son of a Corner - the Jupyter notebooks behind the Son of a Corner matplotlib tutorials
- matplotsoccer ⭐ by Tom Decroos - package to visualize soccer data
- monderian-soccer-art by Devin Pleuler - Python notebook for procedurally generating Mondrian-style soccer fields
- mpl-footy by Abhishek Sharma - gallery for typical football plots created using matplotlib
- soc-viz-of-the-week ⭐ by Son of a Corner - viz of the week repository
- PyWaffle - an open source, MIT-licensed Python package for plotting waffle charts by Peter McKeever
- plottable by @danzn1 - most pretty & lovely tables with matplotlib
- soccerplots ⭐ - a Python package that can be used for making visualisations for football analytics by Anmol Durgapal. Now part of the mplsoccer package
- youtube-videos by McKay Johns - code used in his YouTube Videos [link]
Resources
- For club badges for the 'Big 5' European leagues and English leagues, see the club_badges subfolder of this GitHub repository. See also the Club crests put together by Ninad Barbadikar that is available for download.
- matplotlib for Football - gallery for Typical Football Plots created using matplotlib by Abhishek Sharma. See his Twitter thread [link] and GitHub repository [link]
- PL 21-22 player images by Karan Popli
- StatsBomb media pack
- URLs of images of all first team players from the Premier League website by Alfred - see CSV
- Team colour codes, for the HEX, RGB, and HSL colours of top flight football teams
- Pitch templates, put together by Tony Bambrick (see tweet [link])
- Luke Griffin's pitch graphics - slides. Drop him a donation via PayPal if you're using his work [link]. See original Tweet [link]
Tweets
- Peter McKeever's Twitter thread about data viz [link].
✒️ Written Pieces
Blogs
Highly Rated and Recommended Pieces
Many of these blog posts are recommended in Sam Gregory's Best Football Analytics Pieces piece and Tom Worville's “What’s the best Football Analytics piece you’ve ever read?”, both articles now a few years old. This section is very subjective so if I've missed anything obvious, apologies.
- Assessing The Performance of Premier League Goalscorers by Sam Green
- Counting Across Borders by Ben Torvaney
- Is Soccer Wrong About Long Shots? by John Muller
- Where Goals Come From by Jamon Moore and Carlon Carpenter
- Defending Your Patch by Thom Lawrence
- The DePO Models: Bringing Moneyball to Professional Soccer by Sam Goldberg and Mike Imburgio
- Using Data to Analyse Team Formations by Laurie Shaw
- Structure in football: putting formations into context by Laurie Shaw
- Inside Arsenal’s Attack: In-Depth Analysis Of Arteta’s Problems & Possible Solutions by Ashwin Raman
- Premier League Projections and New Expected Goals by Michael Caley
- Introducing Passing Combinations by Piotr Wawrzynów
- Pass Footedness in the Premier League by James Yorke
- Messi Walks Better Than Most Players Run by Bobby Gardiner
- Introducing Expected Goals on Target (xGoT) by Jonny Whitmore
- Tools for tiny teams by Ben Torvaney
- Anatomy of a Shot by Thom Lawrence
- Soccer Analytics 101 by Kevin Minkus
- An Introduction to Soccer Analytics by John Muller
- Valuing On-the-Ball Actions in Soccer: A Critical Comparison of xT and VAEP by Jesse Davis, Tom Decroos, Pieter Robberechts, Maaike Van Roy
- Game of Throw-Ins by Eliot McKinley
- Expected Threat by Karun Singh. See also Karun's Twitter thread, also available unrolled [link], for the many resources around this topic, including: Episode 19 of The Football Fanalytics Podcast, Karun's StatsBomb conference presentation [link] and slides [link], Rob Hickman's StatsBomb conference presentation where he extended xT to take defensive risk into account [link], Last Row View (Ricardo Tavares)'s blog post for evaluating off-the-ball player movements by combining xT and tracking data, and Karun's xT values as a 12x8 grid to download as a JSON file [link]
- Lionel Messi’s ten stages of greatness by Michael Cox and Tom Worville
- Passing Out at the Back by Will Gürpinar-Morgan
- The 10 Commandments of Football Analytics by Tom Worville
- Borussia Dortmund - What's gone wrong? by Colin Trainor for StatsBomb
- Breaking Down Set Pieces: Picks, Packs, Stacks and More by Euan Dewar
- Data Based Coaching: How to Incorporate Data-Driven Decision into Your Coaching Workflow by Kieran Doyle
- Coaches Reward Goalscorers. But Should They? by Eliot McKinley and John Muller
- Soccer Analysis Summary at Behind the Net by Hawerchuk
Blogs and Data Analytics Websites
The following list contains those blogs that are still maintained, as well as the original blogs from the OGs of football analytics.
For the football analytics blogs from 2009 and earlier, see the following Twitter thread from Tiotal Football [link].
- 11tegen11 by 11tegen (Sander IJtsma)
- 21st Club - blog posts available in hard-copy form in their Changing the Conversation series
- 2+2=11 by Will Gürpinar-Morgan
- 5 Added Minutes by Omar Chaudhuri (last updated 03/09/2016)
- 8 Yards 8 Feet by Simon Lock
- Abel Lorincz by Abel Lorincz
- Abhishek Amol Mishra's Medium blog - check out his Learning Machines With Me. series
- Absolute Unit
- All Things Football
- Alex Rathke by Alex Rathke
- American Soccer Analysis
- Analyse Football by Ravi Ramineni (last updated 06/04/2015)
- The Analyst by Stats Perform
- Analytics FC. For the blog, see [link]
- Attacking Center-back by JP Quinn
- Barça Innovation Hub
- Benoit Pimpaud's Medium blog
- BiscuitChaserFC by Mark Wilkins. See his Twitter thread of R tutorials [link]
- Bosemessi GitHub blog by Soumyajit Bose
- Brendan Kent. Check out his Sports Analytics 101 series
- Brisink by Jerome
- Carey Analytics by Mark Carey
- Danny Page's Medium blog
- Dato Fútbol by Dato Fútbol (Ismael Gómez Schmidt)
- davidfombella.github.io by David Fombella
- DeepxG by Thom Lawrence (last updated 29/11/2017)
- Differentgame by Paul Riley
- DTAI Sports Analytics Lab by KU Leuven, featuring posts from Jesse Davis, Pieter Robberechts, Maaike Van Roy, Lotte Bransen, Jan Van Haaren, Tom Decroos, and more
- The Economics of Sport
- EddWebster.com by Edd Webster
- EFL Numbers by EFL Numbers
- EightyFivePoints by Laurie Shaw
- Experimental 361 by Ben Mayhew
- FC Python by Jamie Dos Anjos (FC Python)
- FiveThirtyEight Sports
- Football Crunching by Ricardo Tavares
- Football Data Science by Dr. Garry Gelade
- Football Philosophy by Joost van der Leij
- Football Science by Michael C. Rumpf
- Football Whispers
- Futbol AnalysR by Josh Trewin
- The Futebolist by Ashwin Raman
- Get Goalside! by Mark Thompson
- The Harvard Sports Analysis Collective
- Liam Henshaw's Medium blog
- Hockey Graphs
- Hudl
- James W Grayson by James W Grayson
- Jan Van Haaren by Jan Van Haaren
- jogall.github.io by Joe Gallagher
- Karun Singh by Karun Singh
- kubamichalczyk.github.io by Kuba Michalczyk
- kwiatkowski.io by Marek Kwiatkowski
- The Last Man Analytics by The Last Man Analytics (Ciaran Grant)
- lufcdata by @LUFCDATA
- LukeBornn.com by Luke Bornn
- Mackay Analytics by Nils Mackay
- Mackinaw Stats by Mackinaw Stats
- Maram AlBaharna's Medium blog
- Mark's Notebook (Substack) and Mark's Notebook (Ghost) by Mark Thompson
- Mixed kNuts by Ted Knutson, including his pre-StatsBomb blog posts
- MRKT Insights with Tim Keech, Ram Srinivas, Matt Lawrence, Kevin Elphick, and Andy McGregor. Formerly Jay Socik
- Modern Fitba (currently archived)
- Nandy47 GitHub blog by Sagnik Das
- Ninad Barbadikar Medium blog by Ninad Barbadikar
- North Yard Analytics by Dan Altman
- openGoal by Charles William
- Opta Pro - old blogs removed but can be found using Wayback Machine
- patricklucey.com by Patrick Lucey
- Penal.lt/y by Martin Eastwood
- Piotr Wawrzynów – Football Analysis by Piotr Wawrzynów
- Phil Birnbaum's Blog by Phil Birnbaum
- The Power of Goals by Mark Taylor
- Proform AFC by Proform Analytics (Mladen Sormaz and Dan Nichol)
- Ravi Mistry's Medium blog
- robert-hickman.eu
- R by R(yo) by Ryo Nakagawara
- SaddlersStats
- Sam Gregory's Medium blog
- SciSports
- Sergi's Blog by Sergi_Lehkyi
- The Significant Game by Lars Maurath
- Soccermatics Medium blog by David Sumpter
- soccerNurds
- space space space by John Muller
- StatDNA (last updated 01/06/2011 before Arsenal bought the company)
- StatsBomb
- Stats Perform
- Stats and snakeoil by Ben Torvaney
- Tiago Estêvão's Medium blog by Tiago Estêvão
- Tony ElHabr's blog by Tony ElHabr
- Training Ground Guru. Check out their accompanying podcast [link]
- Tom Worville's Medium blog by Tom Worville (last updated 14/08/2017). Tom now writes for The Athletic [link]
- winningwithanalytics.com by Bill Gerrard
- Wooly Jumpers for Goal Posts by The Woolster
- Worville Analysis by Tom Worville
- Wyscout
- x+football by Niklas Hemmer
- xG per Shot by Parthe Athale
- Zonal Marking by Michael Cox. Michael now writes for The Athletic [link].
📃 Papers
See the following subfolder of this GitHub repo for PDF copies of the papers listed below [link].
Many of the papers in this list were added after reading Jan Van Haaren's Soccer Analytics Reviews (2020, 2021, 2022). All credit to him for reading a paper a week and making his reviews publicly available. Give his reviews a read through if you haven't already done so!
The following Shiny App from Lars Maurath is a great tool for looking up publications [link].
See the following webpages of conference papers per year (where available):
- Opta (see talks [link]) - 2021, 2022
- MIT Sloan Sports Analytics Conference - 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022
- Machine Learning and Data Mining for Sports Analytics - 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022
- NESSIS - 2007, 2009, 2011, 2013, 2015, 2017, 2019, 2021
- StatsBomb - 2019, 2021, 2022
2022
- A study of Prediction models for football player valuations by quantifying statistical and economic attributes for the global transfer market (2022) by Dibyanshu Patnaik, Harsh Praharaj, Kartikeya Prakash, and Krishna Samdani
- Analyzing Passing Sequences for the Prediction of Goal-Scoring Opportunities (2022) by Conor McCarthy, Panagiotis Tampakis, Marco Chiarandini, Morten Bredsgaard Randers, Stefan Jänicke and Arthur Zimek
- Automated Detection of Complex Tactical Patterns in Football—Using Machine Learning Techniques to Identify Tactical Behavior (2022) by Pascal Bauer
- Automatic Event Detection in Football Using Tracking Data (2022) by Ferran Vidal-Codina, Nicolas Evans, Bahaeddine El Fakir and Johsan Billingham
- A framework for the analytical and visual interpretation of complex spatiotemporal dynamics in soccer (2022) by Javier Fernández
- Estimating transfer fees of professional footballers using advanced performance metrics and machine learning (2022) by Ian G. McHale and Benjamin Holmes
- A New Performance Metric For Player Evaluation Based On Causality (2022) by Alessandro Cecchin
- Beyond action valuation: A deep reinforcement learning framework for optimizing player decisions in soccer (2022) by Pegah Rahimian, Jan Van Haaren, Togzhan Abzhanova and Laszlo Toka
- Contextual Expected Threat (xT) using Spatial Event Data (2022) by Greg Everett, Ryan Beal, Tim Matthews, Tim Norman, Gopal Ramchurn
- Contextualised High-Intensity Running Profiles of Elite Football Players with Reference to General and Specialised Tactical Roles (2022) by Wonwoo Ju, Dominic Doran, Richard Hawkins, Mark Evans, Andy Laws and Paul Bradley
- Controlling Ball Progression in Soccer (2022) by Catherine Pfaff, Emily Hunter, Haozhi Hong, Daniel Forestell, Ari Fialkov, Zoey Drassinower and Timothy Chan
- Detection of tactical patterns using semi-supervised graph neural networks (2022) by Gabriel Anzer, Pascal Bauer, Ulf Brefeld, Dennis Faßmeyer
- The Determinants of Football Transfer Market Value: An Age of Financial Restraint by Thomas Preston
- Econometric Approach to Assessing the Transfer Fees and Values of Professional Football Players by Raffaele Poli, Roger Besson, and Loïc Ravenel
- “Estimated Player Impact” (EPI) Quantifying The Effects Of Individual Players On Football Actions Using Hierarchical Statistical Models (2022) by Tahmeed Tureen and Sigrid Olthof
- Evaluation of Creating Scoring Opportunities for Teammates in Soccer via Trajectory Prediction (2022) by Masakiyo Teranishi, Kazushi Tsutsui, Kazuya Takeda and Keisuke Fujii
- Evaluating Sports Analytics Models: Challenges, Approaches, and Lessons Learned (2022) by Jesse Davis, Lotte Bransen, Laurens Devos, Wannes Meert, Pieter Robberechts, Jan Van Haaren and Maaike Van Roy
- Expected passes (2022) by Gabriel Anzer and Pascal Bauer
- Generalized Action-based Ball Recovery Model using 360° data (2022) by Ricardo Furbino and Hugo Rios-Neto
- Identification of Factors Determining Market Value of the Most Valuable Football Players by Sebastian Majewski
- The influence of tactical and match context on player movement in football (2022) by Sam Gregory, Sam Robertson, Robert Aughey and Grant Duthie
- Is it worth the effort? Understanding and contextualizing physical metrics in soccer (2022) by Sergio Llana, Borja Burriel, Pau Madrero, Javier Fernández
- Let’s Penetrate the Defense: A Machine Learning Model for Prediction and Valuation of Penetrative Passes (2022) by Pegah Rahimian, Dayana Grayce da Silva Guerra Gomes, Fanni Berkovics and Laszlo Toka
- Looking Beyond the Past: Analyzing the Intrinsic Playing Style of Soccer Teams (2022) by Jeroen Clijmans, Maaike Van Roy and Jesse Davis
- A Machine Learning Ensembling Approach to Predicting Transfer Values by Ayse Elvan Aydemir, Tugba Taskaya Temizel, and Alptekin Temizel
- Machine Learning for Understanding and Predicting Injuries in Football (2022) by Aritra Majumdar, Rashid Bakirov, Dan Hodges, Suzanne Scott and Tim Rees
- Modelling the transfer prices of football players by Ivo Hendriks
- Multiagent off-screen behavior prediction in football (2022) by Shayegan Omidshafiei, Daniel Hennes, Marta Garnelo, Zhe Wang, Adria Recasens, Eugene Tarassov, Yi Yang, Romuald Elie, Jerome T. Connor, Paul Muller, Natalie Mackraz, Kris Cao, Pol Moreno, Pablo Sprechmann, Demis Hassabis, Ian Graham, William Spearman, Nicolas Heess, and Karl Tuyls
- Predicting Market Value of Football Players using Machine Learning Algorithms (2022) by Sidharrth Mahadevan
- Predict the Value of Football Players Using FIFA Video Game Data and Machine Learning Techniques by Mustafa A. Al-Asadi and Sakir Tasdemır
- Predicting Market Value of Soccer Players Using Linear Modeling Techniques by Yuan He
- Reinforcement Learning For Football Player Decision Making Analysis (2022) by Michael Pulis and Josef Bajada
- Qualitative Team Formation Analysis in Football: A Case Study of the 2018 FIFA World Cup (2022) by Jasper Beernaerts, Bernard De Baets, Matthieu Lenoir and Nico Van de Weghe
- Seq2Event: Learning the Language of Soccer Using Transformer-based Match Event Prediction. (2022) by Ian Simpson, Ryan Beal, Duncan Locke and Tim Norman
- Shot Analysis in Different Levels of German Football Using Expected Goals (2022) by Laurynas Raudonius and Thomas Seidl
- SoccerCPD: Formation and Role Change-Point Detection in Soccer Matches Using Spatiotemporal Tracking Data (2022) by Hyunsung Kim, Bit Kim, Dongwook Chung, Jinsung Yoon and Sang-Ki Ko
- Team-Builder: Toward More Effective Lineup Selection in Soccer (2022) by Anqi Cao, Ji Lan, Xiao Xie, Hongyu Chen, Xiaolong Zhang, Hui Zhang and Yingcai Wu
- Temporal Match Analysis and Recommending Substitutions in Live Soccer Games (2022) by Yuval Berman, Sajib Mistry, Joby Mathew and Aneesh Krishna
- Towards Expected Counter - Using Comprehensible Features to Predict Counterattacks (2022) by Henrik Rolf Biermann, Franz-Georg Wieland, Jens Timmer, Daniel Memmert and Ashwin Phatak
- Transfer Portal: Accurately Forecasting the Impact of a Player Transfer in Soccer (2022) by Daniel Dinsdale and Joe Gallagher
- The use of player tracking data to analyze defensive play in professional soccer - A scoping review (2022) by Leander Forcher, Stefan Altmann, Leon Forcher, Darko Jekauc, and Matthias Kempe
- un-xPass: Measuring Soccer Player’s Creativity (2022) by Pieter Robberechts, Maaike Van Roy and Jesse Davis
2021
- 6MapNet: Representing Soccer Players from Tracking Data by a Triplet Network (2021) by Hyunsung Kim, Jihun Kim, Dongwook Chung, Jonghyun Lee, Jinsung Yoon and Sang-Ki Ko
- A Bayesian Approach to In-Game Win Probability in Soccer (2021) by Pieter Robberechts, Jan Van Haaren, and Jesse Davis. See the accompanying blog [link]
- A Career in Football: What Is Behind an Outstanding Market Value? (2021) by Balázs Ács and László Toka
- A Copula-Based Hidden Markov Model for Classification of Tactics in Football (2021) by Marius Oetting. See accompanying NESSIS talk [link]
- A Framework for the Fine-Grained Evaluation of the Instantaneous Expected Value of Soccer Possessions (2021) by Javier Fernández, Luke Bornn and Daniel Cervone
- A Goal Scoring Probability Model for Shots Based on Synchronized Positional and Event Data in Football (Soccer) (2021) by Gabriel Anzer and Pascal Bauer
- A novel machine learning method for estimating football players’ value in the transfer market by Iman Behravan and Seyed Mohammad Razavi
- A Poisson Betting Model with a Kelly Criterion Element for European Soccer (2021) by Kushal Shah, James Hyman and Dominic Samangy
- A Risk-Reward Assessment of Passing Decisions: Comparison Between Positional Roles Using Tracking Data from Professional Men’s Soccer (2021) by Floris Goes, Edgar Schwarz, Marije Elferink-Gemser, Koen Lemmink and Michel Brink
- Analyzing Learned Markov Decision Processes using Model Checking for Providing Tactical Advice in Professional Soccer (2021) by Maaike Van Roy, Wen-Chi Yang, Luc De Raedt and Jesse Davis
- Anatomy of Receiving and Turning with the Ball (2021) by Soumyajit Bose and Manas Saraswat
- Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of Their Representations for Action Spotting (2021) by Anthony Cioppa, Adrien Deliege, Floriane Magera, Silvio Giancola, Olivier Barnich, Bernard Ghanem and Marc Van Droogenbroeck
- Combining Machine Learning and Human Experts to Predict Match Outcomes in Football: A Baseline Model (2021) by Ryan Beal, Stuart Middleton, Timothy Norman, Sarvapali Ramchurn
- Data-Driven Detection of Counterpressing in Professional Football (2021) by Pascal Bauer and Gabriel Anzer
- Determining the Phases of Play Using Graph Neural Network Embeddings (2021) by Juan Camilo Campos
- Econometric Approach to Assessing the Transfer Fees and Values of Professional Football Players (2022) by Raffaele Poli, Roger Besson, and Loïc Ravenel
- Evaluating Soccer Player: from Live Camera to Deep Reinforcement Learning (2021) by Paul Garnier and Théophane Gregoir. See the nayra library for code.
- Extended Model for Expected Threat in Soccer by Jirka Poropudas
- From Motor Control to Team Play in Simulated Humanoid Football (2021) by Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, Ali Eslami, Daniel Hennes, Wojciech Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan Tracey, Karl Tuyls, Thore Graepel and Nicolas Heess
- How Soccer Scouts Identify Talented Players (2021) by Tom Bergkamp, Wouter Frencken, Susan Niessen, Rob Meijer and Ruud den Hartigh
- Identifying and Evaluating the Efficiency of Each Player During the Pressing Phase Against an Opponent’s Controlled Build-Up Play (2021) by Caterina De Bacco
- Inferring the Strategy of Offensive and Defensive Play in Soccer with Inverse Reinforcement Learning (2021) by Pegah Rahimian and László Toka
- Learning Football Body-Orientation as a Matter of Classification (2021) by Adrià Arbués-Sangüesa, Adrián Martín, Paulino Granero, Coloma Ballester and Gloria Haro
- Leaving Goals on the Pitch: Evaluating Decision Making in Soccer (2021) by Maaike Van Roy, Pieter Robberechts, Wen-Chi Yang, Luc De Raedt, and Jesse Davis. See the accompanying blog post [link] and research poster [link]
- Making Offensive Play Predictable - Using a Graph Convolutional Network to Understand Defensive Performance in Soccer (2021) by Paul Power, Michael Stöckl, and Thomas Seidel for Opta Pro Forum 2021. See the accompanying talk on Vimeo [link]
- Measuring the Effectiveness of Pressing in Soccer by Simon Merckx, Pieter Robberechts, Yannick Euvrard and Jesse Davis
- Modelling Team Performance in Soccer Using Tactical Features Derived from Position Tracking Data (2021) by Floris Goes, Matthias Kempe, Jan van Norel and Koen Lemmink
- Optimally Disrupting Opponent Build-Ups (2021) by Maaike Van Roy, Pieter Robberechts and Jesse Davis
- Optimising Long-Term Outcomes using Real-World Fluent Objectives: An Application to Football (2021) by Ryan Beal, Georgios Chalkiadakis, Timothy Norman and Sarvapali Ramchurn
- Potential Penetrative Pass (P3) (2021) by Hadi Sotudeh
- Predicting Player Transfers in the Small World of Football (2021) by Roland Kovács and László Toka
- Similarity of Football Players Using Passing Sequences (2021) by Alberto Barbosa, Pedro Ribeiro and Inês Dutra
- SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos (2021) by Adrien Deliege, Anthony Cioppa, Silvio Giancola, Meisam Seikavandi, Jacob Dueholm, Kamal Nasrollahi, Bernard Ghanem, Thomas Moeslund and Marc Van Droogenbroeck
- Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts (2021) by Silvio Giancola and Bernard Ghanem
- The Interpretable Representation of Football Player Roles Based on Passing/Receiving Patterns by Arsalan Sattari, Ulf Johansson, Erik Wilderoth, Jasmin Jakupovic and Peter Larsson-Green
- The Origins of Goals in the German Bundesliga (2021) by Pascal Bauer, Gabriel Anzer and Ulf Brefeld
- The Quest for the Right Pass: Quantifying Players’ Decision Making (2021) by Borja Burriel and Javier Buldú
- What Happened Next? Using Deep Learning to Value Defensive Actions in Football Event-Data (2021) by Charbel Merhej, Ryan Beal, Sarvapali Ramchurn and Tim Matthews
- “Why Would I Trust Your Numbers?” On the Explainability of Expected Values in Soccer (2021) by Jan Van Haaren
- Women's football analyzed: interpretable expected goals models for women (2021) by Lotte Bransen and Jesse Davis.
2020
- Automatic Pass Annotation from Soccer Video Streams based on Object Detection and LSTM (2020) by Danilo Sorano, Fabio Carrara, Paolo Cintia, Fabrizio Falchi and Luca Pappalardo
- A Framework for the Fine-Grained Evaluation of the Instantaneous Expected Value of Soccer Possessions (2020) by Javier Fernández, Luke Bornn and Daniel Cervone
- A new look into Off-ball Scoring Opportunity: taking into account the continuous nature of the game (2020) by Hugo M. R. Rios-Neto, Wagner Meira Jr., Pedro O. S. Vaz-de-Melo
- Cracking the Black Box: Distilling Deep Sports Analytics (2020) by Xiangyu Sun, Jack Davis, Oliver Schulte and Guiliang Liu
- Deep Soccer Analytics: Learning an Action-Value Function for Evaluating Soccer Players (2020) by Guiliang Liu, Yudong Luo, Oliver Schulte and Tarak Kharrat
- Game Plan: What AI can do for Football, and What Football can do for AI (2020) by Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome Connor, Daniel Hennes, Ian Graham, Will Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adria Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, Ali Eslami, Mark Rowland, Andrew Jaegle, Remi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, and Demis Hassabis
- Google Research Football: A Novel Reinforcement Learning Environment (2020) by Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly. See the GitHub repo [link]
- Group Activity Detection From Trajectory and Video Data in Soccer (2020) by Ryan Sanford, Siavash Gorji, Luiz Hafemann, Bahareh Pourbabaee and Mehrsan Javan
- Interpretable Prediction of Goals in Soccer (2020) by Tom Decroos and Jesse Davis
- Inverse Reinforcement Learning for Team Sports: Valuing Actions and Players (2020) by Yudong Luo, Oliver Schulte and Pascal Poupart. See the code [link]
- Learning the Value of Teamwork to Form Efficient Teams (2020) by Ryan Beal, Narayan Changder, Timothy Norman, Sarvapali Ramchurn
- Player Chemistry: Striving for a Perfectly Balanced Soccer Team (2020) by Lotte Bransen. See the accompanying Friends of Tracking video tutorials [link] and chapter 4 of the Barça Innovation Hub Football Analytics 2021 publication, titled: 'How does context affect player performance in football?' by Lotte Bransen, Pieter Robberechts, Jesse Davis, Tom Decroos, and Jan Van Haaren [link]
- Ready Player Run: Off-ball run identification and classification (2020) by Sam Gregory
- The Right Place at the Right Time: Advanced Off-Ball Metrics for Exploiting an Opponent’s Spatial Weaknesses in Soccer (2020) by Sergio Llana, Pau Madrero and Javier Fernández
- Optimising Game Tactics for Football (2020) by Ryan Beal, Georgios Chalkiadakis, Timothy Norman and Sarvapali Ramchurn
- Routine Inspection: A Playbook for Corner Kicks (2020) by Laurie Shaw and Sudarshan 'Suds' Gopaladesikan. Accompanying talk - 2020 Harvard Sports Analytics Lab
- Seeing in to the future: using self-propelled particle models to aid player decision-making in soccer (2020) by Fran Peralta, Pablo Piñones Arce, David Sumpter and Javier Fernández
- SoccerMap: A Deep Learning Architecture for Visually-Interpretable Analysis in Soccer (2020) by Javier Fernández and Luke Bornn
- SoccerMix: Representing Soccer Actions with Mixture Models (2020) by Tom Decroos, Maaike Van Roy and Jesse Davis
- Soccer Analytics Meets Artificial Intelligence: Learning Value and Style from Soccer Event Stream Data (2020) by Tom Decroos
- The Tactics of Successful Attacks in Professional Association Football: Large-Scale Spatiotemporal Analysis of Dynamic Subgroups Using Position Tracking Data (2020) by Floris Goes, Michel Brink, Marije Elferink-Gemser, Matthias Kempe and Koen Lemmink
- Using Player’s Body-Orientation to Model Pass Feasibility in Soccer (2020) by Adrià Arbués-Sangüesa, Adrián Martín, Javier Fernández, Coloma Ballester and Gloria Haro
- Valuing On-the-Ball Actions in Soccer: A Critical Comparison of xT and VAEP (2020) by Maaike Van Roy, Pieter Robberechts, Tom Decroos and Jesse Davis
2019
- Actions Speak Louder Than Goals: Valuing Player Actions in Soccer (2019) by Tom Decroos, Lotte Bransen, Jan Van Haaren, and Jesse Davis. See the accompanying presentation at SIGKDD 2019 by Tom Decroos [link]
- Decomposing the Immeasurable Sport: A deep learning expected possession value framework for soccer (2019) by Javier Fernández, Luke Bornn, and Dan Cervone. Accompanying talks - SSAC19, StatsBomb conference
- Dynamic Analysis of Team Strategy in Professional Football (2019) by Laurie Shaw and Mark Glickman. Accompanying talks - NESSIS 2019, 2020 Google Sports Analytics Meetup
- Invalid Interpretation of Passing Sequence Data to Assess Team Performance in Football Repairing the Tarnished Legacy of Charles Reep (2019) by Richard Pollard
- Measuring soccer players’ contributions to chance creation by valuing their passes (2019) by Lotte Bransen, Jan Van Haaren, and Michel van de Velden.
- Modelling the Collective Movement of Football Players (2019) by Fran Peralta
- Player Vectors: Characterizing Soccer Players’ Playing Style from Match Event Streams (2019) by Tom Decroos and Jesse Davis.
2018
- Beyond Expected Goals (2018) by Will Spearman
- Chance involvement in goal scoring in football (2018) by Martin Lames
- Predicting football results using machine learning techniques (2018) by Corentin Herbinet
- Replaying the NBA (2018) by Luke Bornn
- Wide Open Spaces: A statistical technique for measuring space creation in professional soccer (2018) by Javier Fernandez and Luke Bornn
- Spatial analysis of shots in MLS: A model for expected goals and fractal dimensionality (2018) by Alexander Fairchild, Konstantinos Pelechrinis, Marios Kokkodis
- High-resolution shot capture reveals systematic biases and an improved method for shooter evaluation (2018) by Rachel Marty.
2017
- Beyond crowd judgments: Data-driven estimation of market value in association football by Oliver Müller, Alexander Simons, and Markus Weinmann
- Data-Driven Ghosting using Deep Imitation Learning (2017) by Hoang M. Le, Peter Carr, Yisong Yue, and Patrick Lucey
- Football Player’s Performance and Market Value by Miao He, Ricardo Cachucho, and Arno Knobbe
- “The Leicester City Fairytale?”: Utilizing New Soccer Analytics Tools to Compare Performance in the 15/16 & 16/17 EPL Seasons (2017) by Hector Ruiz, Paul Power, Xinyu Wei, and Patrick Lucey
- Physics-Based Modeling of Pass Probabilities in Soccer (2017) by Will Spearman, Austin Basye, Greg Dick, Ryan Hotovy, and Paul Pop
- Valuing passes in football using ball event data (2017) by Lotte Bransen
- Not all passes are created equal: objectively measuring the risk and reward of passes in soccer from tracking data (2017) by Paul Power, Hector Ruiz, Xinyu Wei, and Patrick Lucey. See Paul Power's talk [link] (downloadable MP4), and the webpage [link]
- Plus-Minus Player Ratings for Soccer (2017) by Tarak Kharrat, Javier Pena, and Ian McHale
- An examination of expected goals and shot efficiency in soccer (2017) by Alex Rathke
- Predicting goal probabilities for possessions in football (2017) by Nils Mackay.
2016
- Spatio-Temporal Analysis of Team Sports – A Survey (2016) by Joachim Gudmundsson and Michael Horton
- Valuing Individual Player Involvements in Norwegian Association Football (2016) by Olav Nørstebø, Vegard Rødseth Bjertnes, and Eirik Vabo
- Expected Goals in Soccer (2016) by Harm Eggels.
2015
- “Quality vs Quantity”: Improved Shot Prediction in Soccer using Strategic Features from Spatiotemporal Data (2015) by Patrick Lucey, Alina Bialkowski, Mathew Monfort, Peter Carr, and Iain Matthews
- Quantifying Shot Quality in the NBA by
- Soccer video and player position dataset (2015) by S. A. Pettersen, D. Johansen, H. Johansen, V. Berg-Johansen, V. R. Gaddam, A. Mortensen, R. Langseth, C. Griwodz, H. K. Stensland, and P. Halvorsen. See the accompanying webpage [link].
2014
- Large-Scale Analysis of Soccer Matches using Spatiotemporal Tracking Data (2014) by Alina Bialkowski, Patrick Lucey, Peter Carr, Yisong Yue, Sridha Sridharan, and Iain Matthews.
2011
- A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains (2011) by Sarah Rudd. Accompanying NESSIS talk on Metacafe [link]
- An Extension of the Pythagorean Expectation for Association Football (2011) by Howard Hamilton.
2002
- Charles Reep (1904-2002) pioneer of notational and performance analysis in football (2002) by Richard Pollard.
1997
- Modelling Association Football Scores and Inefficiencies in the Football Betting Market (1997) by Mark Dixon and Stuart Coles.
1971
- Skill and Chance in Ball Games (1971) by Charles Reep, Bernard Benjamin, and Richard Pollard.
Newsletters
- 21st Club
- Absolute Unit by Tiotal Football
- Analytics FC
- BiscuitchaserFC by Mark Wilkins
- The Chatalytics Newsletter by The Chatalytics Podcast
- Get Goalside! by Mark Thompson
- geom_mark
- GriffinFtbl by Luke Griffin
- Grace on Football by Grace Robertson
- From An Engineer Sight by Benoit Pimpaud
- KPMG Football Benchmark Newsletter - go to the home page and click on 'Registration' in the top-right corner
- Looks Good on Paper by Felix Pate
- Measurables by Brendan Kent
- No Grass in the Clouds by Ryan O'Hanlon
- Post Script Pod by Tiotal Football and John Muller
- Soccer Analytics Newsletter
- space space space by John Muller (this newsletter has now finished but catch John's work as a Senior Writer for The Athletic [link])
- Stats Perform
News Articles
- Professional footballers threaten data firms with GDPR legal action (12/10/2021) for BBC News by Nick Hartley
- Liverpool director of research hints at seven reasons for quiet summer transfer window (11/10/2021) for Liverpool Echo by Josh Williams
- 'We can make some valuable signings' - Liverpool director of research explains how transfer strategy really works (11/10/2021) for Liverpool Echo by Paul Gorst
- Introducing Manchester United's big new signing: a mathematician (08/10/2021) for The Telegraph by James Ducker
- England vs Germany will be settled by spreadsheets (29/06/2021) for Wired by Amit Katwala
- Now DeepMind is using AI to transform football (06/05/2021) for Wired by Andrew Powell
- Kevin De Bruyne uses data analysts to broker £83m Man City contract without agent (08/04/2021) by David McDonnell for The Mirror
- La extraña renovación de De Bruyne: sin agente y usando el 'big data' para calcular su salario (07/04/2021) for Marca
- From scouting players on sidelines to sofas – Meet the WyScout generation transforming football analytics (07/04/2021) by Pete Hall for iNews
- Meet Ram Srinivas, The Biggest Wes Hoolahan Fanatic In India (27/03/2021) by Fiachra Gallagher for Balls.ie
- Soccer-From blogging to the dressing room - the rise of the new analysts (25/03/2021) by Simon Evans for Reuters
- Premier League club Manchester City hire astrophysicists (24/03/2021) by Alfredo Relaño for AS
- Manchester City will have astrophysicists in their ranks in Marca
- It IS rocket science! Manchester City hire astrophysicists to their data analysis team in bid to move Premier League leaders further ahead of their rivals by Jack Gaughan (22/03/2021) for The Daily Mail
- [Liverpool sign up for StatsBomb 360: Ted Knutson explains why this stats revolution will change the game](https://www.skysports.com/football/news/11669/12248621/liverpool-sign-up-for-statsbomb-360-ted-knutson-explains-why-this-stats-revolution-will-change-the-game) (18/03/2021) by Adam Bate for Sky Sports News
- Data experts are becoming football's best signings (05/03/2021) by Justin Harper for BBC News
- How a Celtic blogger nurtured by Brendan Rodgers is now lifting Leicester City (27/02/2021) by Tom Roddy for The Times
- 17-Year-Old Man Lands Dream Job Of Getting Paid To Watch Football All Day by Adnan Riaz for Sport Bible
- Aged 17 and getting paid to watch football all day (04/02/2021) by Manish Pandey for BBC News
- Man City’s Big Winter Signing Is a Former Hedge Fund Brain (31/01/2021) by David Dellier and Adam Blenford for Bloomberg
- How data is pushing Twitter scouts and bloggers into football's big time (27/02/2021) by Paul MacInnes for The Guardian
- Revealed: expected goals being used in football's war against match-fixing (13/02/2021) by Sean Ingle for The Guardian
- 'What we do isn't rocket science': how Midtjylland started football's data revolution (25/10/2020) by Sean Ingle for The Guardian
- How a teenager from Bangalore became a performance analyst for Dundee United (23/12/2020) by Tim Wigmore for The Telegraph
- How the volunteers of data website Transfermarkt became influential players at European top football clubs (18/12/2020) by Pepihn Keppel and Tom Claessens
- Colin Trainor: from bigging up Klopp to the little details of the GAA (17/10/2020) by Kenny Archer for The Irish Times
- REVEALED: The data scientist, astrophysicist, chess champion, and doctor in theoretical physics who are behind Liverpool’s title-winning success… they may look a 'little nerdy' but this Fab Four prove it is rocket science! (27/06/2020) by Rob Draper and Adam Shafiq for The Daily Mail
- How analysts have used lockdown to unearth football’s next hidden gems (17/07/2020) by Dan Clark in The Times
- Behind the Badge: The physicist who leads Liverpool's data department (15/06/2020) by Sam Williams for LiverpoolFC.com
- How Soccer Scouting Has Changed, And Why It’s Never Going Back (15/05/2020) by Robert Kidd for Forbes
- ‘Expected threat’, ‘width per sequence’ – the statistical metrics you haven’t heard of (13/02/2020) by Dan Clark for The Times
- How Brentford flipped the script and staged a data revolution to become England’s smartest club (24/01/2020) by Sean Ingle for Talksport
- 'It's the boffins what won it!': Data experts plus Jurgen Klopp's charisma turn Liverpool into the kings of Europe (02/06/2019) by Joe Bernstein for The Mail on Sunday
- How Data (and Some Breathtaking Soccer) Brought Liverpool to the Cusp of Glory (22/05/2019) by Bruce Schoenfeld for The New York Times
- Brexit Could Drastically Change English Soccer (11/12/2018) by Laurie Shaw for FiveThirtyEight
- Soccer's Moneyball Moment: How Enhanced Analytics Are Changing The Game (19/11/2018) by Robert Kidd for Forbes
- 2018 World Cup: Prediction Time Up Against The Machine (13/06/2018) by Bobby McMahon for Forbes
- Home advantage, unconscious bias and the boisterous crowds who influence referees (23/04/2018) by Tim Wigmore for iNews
- The Premier League is losing its competitive balance – that should be cause for concern (02/02/2018) by Tim Wigmore for iNews
- Expected goals and Big Football Data: the statistics revolution that is here to stay (03/03/2017) by Paul MacInnes in The Guardian
- How computer analysts took over at Britain's top football clubs (09/03/2014) by Tim Lewis for The Observer
- How data analysis helps football clubs make better signings (01/11/2018) by John Burn-Murdoch for The FT
- What does 'Expected Goals' mean? Welcome to the new Opta stat you will be hearing a lot about this season (12/08/2017) by Mirror Football for The Irish Mirror
- A football revolution (17/07/2011) in The FT [pay wall]
- A working life: The quantitative analyst (11/06/2011) by Graham Snowdon for The Guardian.
📚 Books
The books listed below cover not only football but sports analytics in general.
See the following reading lists for book recommendations from other sports data scientists:
- Sports Analytics Reading List by Measurables (Brendan Kent), as part of his Sports Analytics 101 series
- Sports Analytics Books by Jan Van Haaren
The following use Amazon UK links where available and are not affiliate links.
- Moneyball: The Art of Winning an Unfair Game by Michael Lewis
- The Numbers Game by Chris Anderson and David Sally
- Football Hackers by Christoph Biermann
- Soccermatics by David Sumpter
- Soccernomics by Simon Kuper and Stefan Szymanski
- Net Gains: Inside the Beautiful Game's Analytics Revolution by Ryan O'Hanlon
- Expected Goals: The story of how data conquered football and changed the game forever by Rory Smith
- Money and Football: A Soccernomics Guide by Simon Kuper and Stefan Szymanski
- Mathletics: How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football by Wayne Winston
- Data Analytics in Football by Daniel Memmert and Dominik Raabe
- How To Watch Football: 52 Rules for Understanding the Beautiful Game, On and Off the Pitch by Tifo
- Football Intelligence: A hands-on guide to maximising your investments in data by Mathieu Lacome and Eva Murray
- Changing the Conversation series by Twenty First Group
- Football Decoded: Using Match Analysis & Context to Interpret the Demands by Paul Bradley
- Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers by Ben Alamar
- Outside the Box by Duncan Alexander
- Opta World Football Infographics: The Beautiful Game in Brilliant Detail by Adrian Besley
- Zonal Marking: The Making of Modern European Football by Michael Cox
- The Mixer: The Story of Premier League Tactics, from Route One to False Nines by Michael Cox
- The Price of Football by Kieran Maguire. Check out The Price of Football Podcast with Kieran Maguire and Kevin Hunter Day
- Inverting the Pyramid by Jonathan Wilson
- Sprawlball: A Visual Tour of the New Era of the NBA by Kirk Goldsberry
- Numbers Don't Lie: New Adventures in Counting and What Counts in Basketball Analytics by Yago Colás
Magazines
📼 Video
YouTube Playlists
Custom Playlists Curated by Myself
The following is a series of playlists that I originally collated for my own viewing, but they may be useful to you:
- All Sports Analytics - a huge playlist of around 800 videos that includes anything at all to do with Sports Analytics and Data Science. Any video found related to the topic is here
- Football-specific Data Science lectures and seminars - presentations and seminars from conferences including StatsBomb, Opta, Sloan, and more. For links to recently unlisted Stats Perform (Opta) talks, see Ben Torvaney's Gist list [link]
- Football-specific Tableau tutorials
- Football-specific Power BI tutorials
- Football-specific Machine Learning
- Football-specific Data Viz
- Tracking data - all videos related to the topic of Tracking data, including presentations and tutorials
- Expected Goals - all videos on the topic of Expected Goals
Public Playlists
Playlists created by others
- Friends of Tracking Playlists:
- The Analytics (formerly Opta) Playlists
- McKay Johns Playlists:
- StatsBomb Playlists
- UTSPAN Seminar Series 2020 by UTSPAN
- 2020 Google Sports Analytics Meetup by Alok Pattani for Google Sports Analytics
- Carnegie Mellon Sports Analytics Conference
- Great Lakes Analytics in Sports Conferences:
- Short Videos on Soccer Analytics by Dan Altman
YouTube Channels
- 42 Analytics – for SSAC conferences
- Barça Innovation Hub (English and Spanish)
- Big Data Sports by David Fombella
- The Coaches’ Voice
- CMU Statistics
- Friends of Tracking with David Sumpter, Dr. Catherine Pfaff, Javier Fernández, Laurie Shaw, Sudarshan 'Suds' Gopaladesikan, Pascal Bauer, and Fran Peralta
- Football Player Ratings by Lars Magnus Hvattum
- Football Whispers
- Futbol AnalysR by Josh Trewin - for PowerBI tutorials
- Hadi Sotudeh by Hadi Sotudeh
- Mark Glickman – for NESSIS talks, uploaded to his personal channel. Old talks are available on his Metacafe channel. See the official website [link]
- McKay Johns's YouTube channel - for Python and Data Science tutorials
- Ninad Barbadikar's YouTube channel - for Tableau tutorials
- Opta - including Opta Pro Forum talks
- Planeta Data Fútbol (in Spanish) with Jesús Lagos and Miguel Ángel García
- SciSports
- StatsBomb - including StatsBombConference talks
- STATS Insights
- Tifo Football
Video Analysis
- Carlon Carpenter's Football Analytics repository of videos in Google Drive, featuring: Individual Concepts, General Football Tactics, specific clips for Men's football and Women's football, Tactical Camera Footage, and Training Sessions.
- Carlon Carpenter's Performance Analysis repository of videos in Google Drive, featuring: Analysis Sample Reports and Writing and Sportscode Materials.
- Coaching Video Content Google Drive repository by Michael Loftman
Webinars and Lectures
- Laurie Shaw's Metrica Sports Tracking data series for Friends of Tracking - Introduction, Measuring Physical Performance, Pitch Control modelling, and Valuing Actions. See the following for code [link]
- Lotte Bransen and Jan Van Haaren's 'Valuing Actions in Football' series for Friends of Tracking - Valuing Actions in Football: Introduction, Valuing Actions in Football 1: From Wyscout Data to Rating Players, Valuing Actions in Football 2: Generating Features, Valuing Actions in Football 3: Training Machine Learning Models, and Valuing Actions in Football 4: Analyzing Models and Results. See the following for code [link]
- David Sumpter's Expected Goals webinars for #FoT - How to Build An Expected Goals Model 1: Data and Model, How to Build An Expected Goals Model 2: Statistical fitting, and The Ultimate Guide to Expected Goals. See the following for code 3xGModel, 4LinearRegression, 5xGModelFit.py, and 6MeasuresOfFit
- Peter McKeever's 'Good practice in data visualisation' webinar for Friends of Tracking. See the following for code [link]
- StatsPerform AI in Sport series - Overview, AI in Basketball, AI In Soccer, and AI in Tennis
- Making Offensive Play Predictable by Paul Power, Michael Stöckl, and Thomas Seidel for Opta Pro Forum 2021
- Google Research Football by Piotr Stanczyk
- Will Spearman's masterclass in Pitch Control for Friends of Tracking
- How Tracking Data is Used in Football and What are the Future Challenges with Javier Fernández, Sudarshan 'Suds' Gopaladesikan, Laurie Shaw, Will Spearman and David Sumpter for Friends of Tracking
- Why Do Clubs Need to Embrace Analytics to Stay Competitive? with Vosse de Boode, David Sumpter, Adrien Tarascon and Javier Fernández for Barça Innovation Hub
- Valuing Actions in Football: Introduction with Lotte Bransen and Jan Van Haaren for Friends of Tracking
- Routine Inspection: Measuring Playbooks for Corner Kicks by Laurie Shaw and Sudarshan 'Suds' Gopaladesikan
- A Physics Based Measurement of Defensive Contributions (2021) by Aditya Kothari
- Enriching Event Data: A Semi-Supervised Augmentation Approach Using Location Information by Debangan Dey, Rahul Ghosal and Atanu Mitra
- Estimating the Change in Soccer… Home Advantage During the COVID-19 Pandemic by Luke Benz and Mike Lopez
- Identifying and Evaluating Strategies for Successfully Penetrating a High Opposition Press from Short Goal Kicks, Played Inside the Box, to Move the Ball into the Opposition Half by Vignesh Jayanth
- Pace and Power: Removing Unconscious Bias from Soccer Broadcasts by Sam Gregory
- Player Masks: Encoding Soccer Decision-Making Tendencies by Devin Pleuler
- Predictive Value of Off-Target Shots in Soccer by Ethan Baron
- Quantifying League-Independent Scoring Ability in Soccer by Daniel Daly-Grafstein. New England Symposium on Statistics in Sports. October 2021.
- The Statistics of Spin in Soccer by Jackson Weaver. New England Symposium on Statistics in Sports. October 2021.
- Volatility and Calculation of Risk-Adjusted Return in Football Scouting by Ola Lidmark Eriksson
- Tactical Insight Through Team Personas by David Perdomo Meza and Daniel Girela. See accompanying blog post [link]
- Training Ground Guru webinars
- Christmas Lectures 2019: How to Get Lucky with Hannah Fry. Small segment with Tim Waskett @ 27mins
- I’m in a Wide Open Space: Creating Opportunities at Set Pieces by Dan Barnett
- Long or Short? How the New Short Goal Kick Rule Is Impacting Football by Tom Worville
- Identifying and Evaluating Strategies to Break down a Low Block Defence by Vignesh Jayanth. See accompanying blog post [link]
- Seeing in to the Future: Modelling Football Player Movements by David Sumpter
- Learning Value and Style from Soccer Event Stream Data by Tom Decroos
- Marcelo Bielsa's infamous 'Spygate' PowerPoint presentation on Derby County [link]
- Tom Goodall's Tactics, Training & Tableau: Football Tableau User Group. Check out his Football Tableau training courses [link]
- Data Robot Opening Remarks & Keynote: Making Better Decisions, Faster with Brian Prestidge
- A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains by Sarah Rudd. Accompanying slides [link]
- Demystifying Tracking data Sportlogiq webinar by Sam Gregory and Devin Pleuler
- Data Analytics in Soccer by Dan Fradley
- How Hammarby create the mathematically perfect pressing game by David Sumpter
- Hudl Presents: Performance Analysis in 2020
- Self-Supervised Representations for Tracking Data by Karun Singh
- An American Analyst in London at SSAC 2019 with StatsBomb CEO Ted Knutson and Houston Rockets GM Daryl Morey
- Beyond the Baseline by Marek Kwiatkowski
- Some Things Aren't Shots by Thom Lawrence
- Beyond Save Percentage by Derrick Yam
- Expected goals demonstration by Sander Ijtsma
- Goals change games by Garry Gelade
- Expected goals by Dan Altman
Ted Talks
- What Football Analytics can Teach Successful Organisations by Rasmus Ankersen
- Soccermatics: how maths explains football by David Sumpter
- Changing the soccer transfer market with big data by Giels Brouwer
Documentaries
- The Numbers Game: How Data Is Changing Football - FourFourTwo Documentary
- How Stats Won Football: From Moneyball to FC Midtjylland – COPA90 Stories Documentary
Match Highlights
- Footballia - historical matches and highlights
Other
🔊 Podcasts
Below I've tried to include both dedicated Sports/Football Analytics podcasts and notable episodes of other podcasts that have analytical content or interviews. Spotify and YouTube links are used where available. All episodes mentioned below that are available on Spotify can be found in the following playlist (updated periodically): [link].
Football Analytics Podcasts
- All Stats Aren't We with Jon Mackenzie and Josh Hobbs (Leeds United Podcast)
- American Soccer Analysis
- Analytics FC Podcast - originally with Tom Worville and Sam Gregory, next with Jon MacKenzie, and now with Alex Stewart
- Big Data Sports (in Spanish) with Marcelo Gantman and Agustin Mario Gimenez
- Chatalytics Podcast by The Chatalytics Podcast
- Corridor of Uncertainty FPL Podcast by Simon and Jamie
- The Dan & Omar Show with Daniel Geey and Omar Chaudhuri
- Double Pivot Podcast
- Differentgame - The Football Analytics Podcast by Paul Riley and Richard Shephard
- Expected Value
- Fanalytics with Mike Lewis
- First Time Finish Podcast with Tom Underhill, Bence Bocsak, and Ninad Barbadikar
- The Football Fanalytics Podcast
- Football Today
- Laptop Gurus
- Looks Good on Paper podcast by Felix Pate
- Measurables Podcast by Brendan Kent
- MRKT Insights with Tim Keech, Ram Srinivas, Matt Lawrence, Kevin Elphick, and Andy McGregor. Formerly with Jay Socik
- Open Source Sports with Ron Yurko
- A Podcast About Tactics by Jon Mackenzie
- Post Script Podcast by Tiotal Football and John Muller
- The Price of Football Podcast with Kieran Maguire and Kevin Hunter Day. Check out The Price of Football book by Kieran Maguire.
- The Scouted Football Podcast
- smarterscout: The Why in Analytics by Dan Altman
- Squawka Talker Football Podcast
- SSAC by MIT Sloan Sports Analytics Conference
- StatsBomb
- The SV Podcast
- Target Scouting by Luke Griffin
- Tifo Podcast
- Training Ground Guru
- Three At The Back by Opta Pro
- Unofficial Partner Podcast
- Winning with Data by Gemini Sports Analytics
- xPodcast by Modern Fitba (Scottish football)
- Zonal Marking with Michael Cox, Mark Carey and Ali Maxwell. Previously Tom Worville
Notable Episodes (including non-football-data-specific podcasts)
- All Stats Aren't We:
- Analytics FC Podcast:
- The Beesotted Brentford Pride of West London Podcast
- Bet The Process
- Big Data Sports (Spanish) by Marcelo Gantman and Agustin Mario Gimenez:
- 87: No es Moneyball: es Brentford
- 66: Tres Libros Sobre Sports Analytics Más Allá De Moneyball
- 65: Métrica Sports: La máquina de entender el juego with Bruno Dagnino
- 56: STATS PERFORM: Cómo es el nuevo gigante de los datos del fútbol
- 47: Wyscout: 550 Mil Futbolistas "concentrados" En Un Software
- 35: Analistas: Los nuevos "cracks" del fútbol
- 33: Google + IA = Fútbol en Real Time
- Blood Red: The Liverpool FC Podcast
- Burn It All Down
- Campbell's Footballs by Dr. Grant Campbell
- Challengers Podcast:
- Expected goals (2016)
- The Conor J Show:
- ČT sport podcasty
- The Derby County BlogCast
- January window preview with Ram Srinivas (MRKT Insights)
- Economic Rockstar:
- ESPN Daily
- ExtraTime Radio: The Numbers Game Book Club
- Merritt Paulson names his Timbers Mount Rushmore, plus a little storytime | Exploring future of data in soccer featuring Devin Pleuler
- ExtraTime Radio: The Numbers Game Book Club featuring Devin Pleuler
- Expected Value
- Explore Explain with Andy Kirk:
- Fanalytics with Mike Lewis:
- Getting Your Foot in the Door with Sean Steffen
- Fell In Love With A Girl
- Sommerpause Special - FCSP and DCFC: A Different Kind of Football Panel including Dr. Stefan Szymanski on the panel
- Fluid Football
- Freakonomics by Stephen J. Dubner:
- Can Britain Get Its “Great” Back? (Ep. 393) featuring Dr. Ian Graham @ 41m25s
- The Football Analytics Shot by The Power Rank and Ed Feng (usually American football):
- Football CFB Podcast:
- The Football Collective Podcast:
- The Football Ramble
- The Football Pod:
- Football Today
- I Prefer not to Speak
- Infinite Football
- Inside The Newsroom
- Life...On Our Terms
- The Lowdown by Conor Walsh:
- The Lowe Post
- Measurables Podcast by Brendan Kent (football specific episodes):
- Stephanie Kovalchik, Senior Data Scientist at Zelus Analytics
- Simon Banoub, CMO at StatsBomb
- Rob Suddaby, First Team Data Analyst at Norwich City FC
- Ross Moses and Tyler Heaps, US Soccer Analytics and Research
- Mike Treacy, Chairman of Dundalk FC
- Sam Gregory, Data Analyst at Sportlogiq
- Resources
- Expected Goal Chain and Penalty Kick Analysis
- Elo and Tournament Projections
- Mariela Nisotaki, Technical Scout at Norwich City FC
- Expected Goals and Expected Assists
- Brendan Kent, Soccer Data Analyst at the Portland Timbers
- What is sports analytics?
- Men in Blazers:
- MLS Assist (a Total Soccer Show podcast):
- MLS Assist: Advanced soccer metrics explained, how MLS teams use data, and more with Eliot McKinley - Spotify
- The Modern Football Group Podcast
- Modern Soccer Coach Podcast with Gary Curneen:
- Motley Fool Money
- Motley Fool Money: 06.06.2014 with Stefan Szymanski
- New Books in Sports:
- Not The Top 20 Podcast:
- The Nutmegged Arena by The Nutmeg Assist:
- Open Source Sports with Ron Yurko
- Player Chemistry in Soccer with Lotte Bransen
- The Ornstein & Chapman Podcast with David Ornstein and Mark Chapman:
- Latest on the race to sign Erling Haaland and Tuchel's reaction to Chelsea horror show featuring Tom Worville
- Should football scrap transfer fees? with Daniel Geey and Stefan Szymanski
- Football Club Ownership: Data, Decisions & Competitive Edge with Simon Hallett
- Pacey Performance Podcast with Robert Pacey:
- #340 What is data science (and what isn't), data informed decision making with Sudarshan Gopaladesikan - Spotify and YouTube
- The PinkUn Norwich City Podcast:
- Pinnacle Podcast:
- Planet Fútbol with Grant Wahl
- The Pomp Podcast
- The Process with James Allcott:
- Purely Arsenal - Football Purists, an AFC podcast
- Rigo Plascencia Deportes, Entrevistas y más:
- The Scouted Football Podcast:
- Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas
- SempreMilan Podcast:
- Sports Tech Research Podcast:
- Soccer Player Development Podcast:
- Episode 12 with Rasmus Ankersen - YouTube
- Squawka Talker Football Podcast:
- State of the World:
- These Football Times
- The Tao of Sports Podcast – The Definitive Sports, Marketing, Business Industry News Podcast:
- This Football Life:
- Tifo Podcast:
- The Transfer Market & 21st Club with Omar Chaudhuri - Spotify and YouTube
- How Memphis Depay Used Data to Find His Next Club with Giels Brouwer - Spotify and YouTube
- How Do Football Clubs Actually Use Statistics? - YouTube
- JJ Bull: Tactical Analysis & Coaching Badges - Spotify and YouTube
- A Day in the Life Of: A Football Recruitment Analyst - Spotify and YouTube
- Liverpool: Pressing, xG Concerns, and Klopp’s Future - Spotify and YouTube
- Understanding Stats in Football with Nikos Overheul - Spotify and YouTube
- Steve Morison: Tactical Insight & Football Psychology - Spotify and YouTube
- Football Tactics with Michael Cox (Zonal Marking) - Spotify and YouTube
- Football, Tactics & History with Jonathan Wilson - Spotify and YouTube
- The Future of Stats: xG, xA - Spotify and YouTube
- The Totally Football Show with James Richardson
- 03/07/2019: Football Hackers with Christoph Biermann
- Total Soccer Show:
- #32 What is xG and why are advanced stats useful in soccer? - YouTube
- Soccer stats and analytics with Ted Knutson (in which Ted explains Expected Goals to Daryl) - YouTube
- Mike L. Goodman (@TheM_L_G) talks USMNT tactical options, EPL trends, Expected Goals - YouTube
- Everton Premier League preview: Mike L. Goodman talks Silva's style, Moise Kean, and replacing - YouTube
- Trademate Sports:
- UCN/USF Sport Management - Sports Business Podcast:
- Wharton Business Daily
- The Wharton Moneyball Post Game Podcast
- Wharton Moneyball: Soccer Analytics, the Women's World Cup & Cirque Du Soleil featuring Ted Knutson
- 6/13/18 Wharton Moneyball with Stefan Szymanski
- Where Others Won't by Cody Royle:
- View From The Byline
👨‍💻 Notable Figures and Twitter Accounts
- Training Ground Guru Staff Profiles
- Female (identifying) sports analysts by Dr. Catherine Pfaff
- 2020 Analytics Twitter Top 1,000 Power Rankings, calculated by Will Thomson. See the Twitter list created by Luton Town Analytics [link]
- Sports Analytics Twitter list by Jan Van Haaren
- Soccer People Twitter list by John Muller
- Football Analysts Twitter list by Colin Trainor
- Opta Staff Twitter list by Opta
- Football Analyst Community Rankings dashboard by Neil Charles
- Football data Analysts spreadsheet by Dan Altman (a few years old now but lists the OGs of football analytics)
- Introduction to Soccer Analytics – The Guys I Follow by Ted Knutson (an old 2013 bio of the guys that started the football analytics movement. Now out-of-date, but great if you want to know who helped influence football analytics).
🗓️ Events and Conferences
- OptaPro Analytics Forum
- StatsBomb Conference
- Barça Sports Tomorrow, Sports Analytics Summit, and Sports Technology Symposium
- MIT Sloan Sports Analytics Conference
- New England Symposium on Statistics in Sports (NESSIS)
- Carnegie Mellon Sports Analytics Conference
- CASSIS
- Tactical Insights 2020 Conference at King Power Stadium
- Artificial Intelligence in Team Sports (AITS) and [link]
- Machine Learning and Data Mining for Sports Analytics
- International Workshop on Computer Vision in Sports
- Google Sports Analytics Meetup.
- DFB Hackathon
- [Ohio State Sports Analytics Association Conference](http://org.osu.edu/sportsanalytics/)
- PSG Sports Analytics Challenge
- Football Data International Forum
- Global Training Camp
- Great Lakes Analytics Conference
- MathSport International
- Sports Analytics World Series
- Sportdata & Performance Forum.
Competitions
The following includes non-football competitions.
- NFL Big Data Bowl (American Football) - 2021 - annual
- Big Data Cup (Hockey) - annual
- Google Research Football with Manchester City F.C. - October 2020
- Liverpool Analytics Challenge (Football) - May 2020. Challenge used Last Row Tracking-like data kindly provided by Ricardo Tavares. For a full list of entries, see David Sumpter's Medium post [link], featuring the three eventual winners - Surya Kocherlakota, Theophane Gregoir and Paul Garnier, and Gabin Rolland (discussed on Friends of Tracking [link]).
Courses
- Mathematical Modelling of Football by Uppsala University
- StatsBomb Academy
- Sport Analytics and Technologies MSc at Loughborough University, taught by Donald Barron
- Football Analytics short course by StatsPerform with Birkbeck University
- Barça Innovation Hub
- Training Ground Guru Masterclasses:
- SQL Masterclass by Edd Webster
- Python webinars:
- Introduction To Python Masterclass by Jamie Dos Anjos (FC Python)
- Python Match Analysis Masterclass by Jamie Dos Anjos (FC Python)
- Tableau Masterclass by Tom Goodall
- PowerBI Masterclass by Harriet Eastham
- Big Data webinars:
- 2020 featuring Paul Neilson, Alex Kleyn, Arjav Trivedi, Geir Jordet, Ed Sulley, Austin Fuller, Paul Power, and Devin Pleuler
- 2021 featuring Mikkel Keldmann, James Young, Jan Van Haaren, David Perdomo, Jens Melvang, Luke Bornn, and Pedro Pereira
- 2022 featuring Rodrigo Picchioni, Mladen Sormaz, Tyler Heaps, Emily Angwin, Fabio Nevado Garrosa, Akhil Shah, James Young, Jonny Whitmore, and Lucy Rowland
- Individual Development Coaching webinars:
- Coach Paint Masterclass by João Nuno
- Scouting & Recruitment Masterclass featuring Rhys Carr, Mussell Martin, Saul Isaksson-Hurst, Matt Crocker, Leigh Bromby, Shayne Hall, and Austin Fuller
- Youth Development Masterclass featuring Nick Cox, Michael Hamilton, Per Mertesacker, Gregg Broughton, and Ged Roddy
- Recovery Webinar featuring Will Abbott, Marcus Hannon, Anna West, Tony Strudwick, and Scott Guyett
- The Future of the Game Masterclass featuring David Slemen, Sarah Murray, Lorraine O'Maley, Karen Bardsley, David Sumpter, Eddie Jones, Dave Reddin, Tom Vernon, and Ben Mackriell
💼 Jobs
For live job postings tracked by the community, check the Jobs channel of the Football in Numbers Discord server.
Clubs
The list of clubs is quite UK-centric. I would like to add more clubs but it takes a bit of time.
Premier League
- Arsenal
- Aston Villa
- Bournemouth
- Brentford
- Brighton & Hove Albion
- Burnley
- Chelsea
- Crystal Palace
- Everton
- Fulham
- Liverpool
- Luton Town
- Manchester City. See also the City Football Insights Twitter account
- Manchester United
- Newcastle United
- Nottingham Forest
- Sheffield United
- Tottenham Hotspur
- West Ham United
- Wolverhampton Wanderers
Championship
- Birmingham City
- Blackburn Rovers
- Bristol City
- Cardiff City
- Coventry City
- Huddersfield Town
- Hull City
- Ipswich Town
- Leeds United
- Leicester City. See also the LCFC Analysis Twitter account
- Middlesbrough
- Millwall
- Norwich City
- Plymouth Argyle
- Preston North End
- Queens Park Rangers
- Rotherham United
- Sheffield Wednesday
- Southampton
- Stoke City
- Sunderland
- Swansea City
- Watford
- West Bromwich Albion
League One
- Barnsley
- Blackpool
- Bolton Wanderers
- Bristol Rovers
- Burton Albion
- Cambridge United
- Carlisle United
- Charlton Athletic
- Cheltenham Town
- Derby County
- Exeter City
- Fleetwood Town
- Leyton Orient
- Lincoln City
- Northampton Town
- Oxford United
- Peterborough United
- Port Vale
- Portsmouth
- Reading
- Shrewsbury Town
- Stevenage
- Wigan Athletic
- Wycombe Wanderers
League Two
- Accrington Stanley
- AFC Wimbledon
- Barrow
- Bradford City
- Colchester United
- Crawley Town
- Crewe Alexandra
- Doncaster Rovers
- Forest Green Rovers
- Gillingham
- Grimsby Town
- Harrogate Town
- Mansfield Town
- Milton Keynes Dons
- Morecambe
- Newport County
- Notts County
- Salford City
- Stockport County
- Sutton United
- Swindon Town
- Tranmere Rovers
- Walsall
- Wrexham
Scottish Premier League
- Aberdeen
- Celtic
- Dundee
- Heart of Midlothian
- Hibernian
- Kilmarnock
- Livingston
- Motherwell
- Rangers
- Ross County
- St. Johnstone
- St. Mirren
Analytics Companies and Consultancies
- driblab
- Genius Sports and [link]
- Gracenote and [link]
- Football Radar
- Gemini Sports Analytics
- Global Sports
- Hudl
- Metrica Sports
- Opta
- Second Spectrum
- SciSports
- SkillCorner
- SRC FTBL
- StatsBomb
- Stats Perform and [link]
- Sportec Solutions
- Prospect Sporting Insights
- TwentyFirst Group
- Wyscout and [link]
- Zelus Analytics
Associations and Organisations
- English Football League
- EFL Trust
- The FA
- FIFA
- Football Association of Ireland
- Football Association of Wales
- Irish Football Association
- MLS Careers
- The Premier League
- Scottish FA
- Sport Scotland
- UEFA
- UK Sport
Betting Companies
Media
Job Boards
- Association of Professional Football Analysis
- Freelance Football Opps
- Football Careers
- The Football Scouts
- FutbolJobs
- Global Sports
- Jobs4Football
- JobsInFootball
- Sports Jobs UK
- Teamworkonline
- Training Ground Guru Job Board. See also the Training Ground Guru Twitter account
- Women in Football
Other Website Lists
- [The Ultimate List of Football Job Sites](https://analyisport.com/complete-list-of-football-job-websites/) by analyisport
Discord/Slack groups
- Football in Numbers Discord server organised by McKay Johns
- Uppsala Mathematical Modelling of Football Slack group organised by Novosom Salvador
- Tableau for Sports Discord server organised by Ninad Barbadikar
- Football Analysts Discord server organised by Carlon Carpenter
- Scouted Football Discord server.
🔑 Key Concepts
A focus on some of the key topics in football analytics. Most of the following resources feature above but are reorganised here by topic. This section is still very much a work in progress and may be missing resources mentioned above.
History of Football Analytics
- Charles Reep Wiki
- Analytics is older than you think: (re)introducing Charles Reep by Mark Thompson for his newsletter Get Goalside!
- The evolution of football data by Mark Thompson
- Goal Scoring in Association Football: Charles Reep by Keith Lyons
- The Charles Reep and Bernard Benjamin Paper 50 Years On (1) by Keith Lyons
- Bernard Benjamin profile by Keith Lyons
- Charles, Richard, Neil and Simon: the stories we craft by Keith Lyons
- Football’s Pioneer – The Charles Reep story by Rob Carroll
- Grim Reep by Barney Ronay
- History of Performance Analysis: The Controversial Pioneer Charles Reep by Guillermo Martinez Arastey
- The Soccer Analytics Revolution by Nathan Luzum and Michael Model
- How One Man’s Bad Math Helped Ruin Decades Of English Soccer by Joe Sykes and Neil Paine for FiveThirtyEight
- The History of Sports Analysis: The Man Who Ruined English Football by Duncan Ritchie
- No, seriously: what the heck is expected goals (xG)? by James Maw
- Don't Shoot the Messenger. The First Football Analyst Was a Pioneer 50 Years Ahead of His Time by Alan Campbell
- Papers by and about Charles Reep:
- Skill and Chance in Ball Games by Charles Reep, Bernard Benjamin and Richard Pollard
- Charles Reep (1904-2002): pioneer of notational and performance analysis in football by Richard Pollard
- Invalid Interpretation of Passing Sequence Data to Assess Team Performance in Football: Repairing the Tarnished Legacy of Charles Reep by Richard Pollard
- A Twitter thread of the original football analytics blogs from 2009 by Tiotal Football [link].
Expected Goals (xG) Modeling
Videos
For a playlist of Expected Goals related videos available on YouTube, see the following playlist I have created [link].
- What is xG? by Tifo Football
- Opta Expected Goals by The Analyst (formerly Opta)
- What are Expected Goals? by David Sumpter and Axel Pershagen
- Anatomy of a Goal by Numberphile (Brady Haran)
- How Did These Goals Go In? - We Explain How Goal Probability Works by the Bundesliga
- Soccer Analytics: Expected Goals by Dan Altman
- Anatomy of an Expected Goal by 11tegen (Sander IJtsma)
Webinars and Lectures
- David Sumpter's Expected Goals webinars for Friends of Tracking (see the following for code 3xGModel, 4LinearRegression, 5xGModelFit.py, and 6MeasuresOfFit):
- "Is Our Model Learning What We Think It Is?" Estimating the xG Impact of Actions in Football by Tom Decroos from the 2019 StatsBomb Innovation in Football Conference
- StatsBomb Data Launch - Beyond Naive xG by Ted Knutson.
Tutorials
- Tech how-to: build your own Expected Goals model by Jan Van Haaren and SciSports.
- Fitting your own football xG model by Dato Fútbol (Ismael Gómez Schmidt). See GitHub repo [link]
- Python for Fantasy Football series by Fantasy Futopia (Thomas Whelan). See the following posts:
- Expected Goals & Player Analysis by Gabriel Manfredi
- Building an Expected Goals Model in Python by Peter McKeever (using WayBackMachine)
- An xG Model for Everyone in 20 minutes (ish) by Football Fact Man (Paul Riley).
Notable Models
- Sam Green's xG model
- Michael Caley's xG model
- 11tegen (Sander IJtsma)'s xG model (using WayBackMachine).
Written Pieces
For a list of Expected Goals literature collated by Keith Lyons, see the following [link]
- xG explained by FBref
- What are expected Goals? by American Soccer Analysis
- David Sumpter's Expected Goals pieces:
- Michael Caley's Expected Goals pieces:
- Jesse Davis and Pieter Robberechts' Expected Goals pieces for KU Leuven
- Does xG really tell us everything about team performance? by Ben Torvaney
- Unexpected goals by Will Gürpinar-Morgan
- Great Expectations by Will Gürpinar-Morgan
- On single match expected goal totals by 2+2=11 (Will Gürpinar-Morgan)
- Martin Eastwood (Pena.lt/y)'s Expected Goals pieces [link]
- Expected Goals For All.
- Actual Goals Versus Expected Goals
- Expected Goals Updated
- Expected Goals: The Y Axis
- Expected Goals And Exponential Decay
- Expected Goals: Foot Shots Versus Headers
- Expected Goals And Support Vector Machines
- Expected Goals and Uncertainty
- Sharing xG Using Multi-touch Attribution Modelling.
- Garry Gelade's Expected Goals pieces:
- Expected Goals and Unexpected Goals (using WayBackMachine)
- Assessing Expected Goals Models. Part 1: Shots (using WayBackMachine)
- Assessing Expected Goals Models. Part 2: Anatomy of a Big Chance (using WayBackMachine)
- 11tegen (Sander IJtsma)'s Expected Goals pieces:
- A close look at my new Expected Goals Model (using WayBackMachine)
- The best predictor for future performance is Expected Goals (using WayBackMachine)
- Ted Knutson's Expected Goals pieces:
- Anatomy of a Shot by Thom Lawrence
- Modern Fitba's Expected Goal Guides Part 1 and Part 2 by Christian Wulff
- [How StatsBomb Data Helps Measure Counter-Pressing](https://statsbomb.com/2018/05/how-statsbomb-data-helps-measure-counter-pressing/) by Will Gürpinar-Morgan
- A Shooting Model – An Exp(G)lanation and Application by Paul Riley
- Introducing xGChain and xGBuildup by Thom Lawrence
- Introducing Expected Goals on Target (xGoT) by Jonny Whitmore
- Quantifying finishing skill by Marek Kwiatkowski
- The Dual Life of Expected Goals (Part 1) by Mike L. Goodman
- Many bad shots or one good shot? by Luis Husier
- Expected Goals Just Don’t Add Up — They Also Multiply. by Danny Page
- An analysis of different expected goals models by Benjamin Cronin
- Expected Goals 3.0 Methodology by Matthias Kullowatz
- A simple Expected Goals model by Cricket Savant
- How we calculate Expected Goals (xG) by Fantasy Football Fix
- Una mirada al Soccer Analytics usando R — Parte III by Dato Fútbol (Ismael Gómez Schmidt).
Libraries
- soccer-xg by Jesse Davis and Pieter Robberechts at KU Leuven.
GitHub Repositories
- Expected Goals Thesis by Andrew Rowlinson. See both his thesis [link] and the following notebooks:
- expected_goals_deep_dive by Andrew Puopolo. See the following notebooks:
- soccer_analytics by Kraus Clemens. See the following notebooks:
- xg-model by Dato Fútbol (Ismael Gómez Schmidt)
- xG_Model_Workflow by Ian Dragulet
Podcasts
- Expected Goals Extravaganza by The Double Pivot podcast
- Extreme nerding out over expected goals by The Double Pivot podcast
- Explaining xGChain, evaluating defensive midfielders and more - it's the Mailbag by The Double Pivot podcast
- Understanding why Burnley don't break expected goals by The Double Pivot podcast
- #1: What Did You Expect? - Spotify by The Football Fanalytics Podcast
- Expected Goals and Expected Assists by Measureables
- Advanced soccer metrics explained, how MLS teams use data, and more with Eliot McKinley by MLS Assist
- Rating players with expected goals from shot creation by smarterscout
- Expected goals from ball progression and tactical applications by smarterscout
- AVFC Extra #1 - xG, xA & PPG - The abbreviations of modern football explained by Claret & Blue podcast
- Episode 3 - xG 101, West Ham in Trouble? Norwich Doomed? by Differentgame
- Episode 5 - Shot Stoppers, xG at Corners, Building a Passing Model by Differentgame
- What is xG by For the Love of Paul McGrath podcast
- The Future of Stats: xG, xA - Spotify and YouTube by Tifo Podcast
- #56: Dominic Calvert-Lewin & Explaining Expected Goals - Spotify and YouTube by The Scouted Football Podcast.
Tweets
- The benefits of including fake data in an Expected Goals model [link]
- Twitter thread by Jernej Flisar on building an Expected Goals model trained with Logistic Regression on StatsBomb Event data, then using the model to predict Liverpool goals from Tracking-like data provided by Ricardo Tavares (Last Row View) for Friends of Tracking [link]. The model uses the SHAP library for feature interpretation.
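
To give a feel for the logistic-regression approach mentioned in the thread above, here is a minimal sketch in plain Python. It is not the thread's code: the shots are synthetic, the two features (distance and visible goal angle) and the data-generating coefficients are assumptions made up for illustration, and the fit uses hand-rolled gradient descent rather than a library such as scikit-learn. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping any real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_shots(n_shots, seed=0):
    """Generate made-up shots as (distance_m, angle_rad, is_goal) tuples."""
    rng = random.Random(seed)
    shots = []
    for _ in range(n_shots):
        distance = rng.uniform(1.0, 35.0)
        angle = rng.uniform(0.1, 1.5)
        # Assumed process (illustrative only): closer, wider-angle shots score more.
        p_goal = sigmoid(1.0 - 0.15 * distance + 1.0 * angle)
        shots.append((distance, angle, 1 if rng.random() < p_goal else 0))
    return shots


def features(distance, angle):
    """Intercept, distance rescaled to tens of metres, and angle."""
    return (1.0, distance / 10.0, angle)


def fit_xg_model(shots, learning_rate=0.5, epochs=400):
    """Fit logistic-regression weights by batch gradient descent on log-loss."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        gradient = [0.0, 0.0, 0.0]
        for distance, angle, is_goal in shots:
            x = features(distance, angle)
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - is_goal
            for j in range(3):
                gradient[j] += error * x[j]
        weights = [w - learning_rate * g / len(shots)
                   for w, g in zip(weights, gradient)]
    return weights


def predict_xg(weights, distance, angle):
    """Return the modelled probability that a shot becomes a goal."""
    x = features(distance, angle)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


if __name__ == "__main__":
    model = fit_xg_model(make_synthetic_shots(1000))
    print("close-range shot xG:", round(predict_xg(model, 6.0, 1.2), 2))
    print("long-range shot xG:", round(predict_xg(model, 30.0, 0.2), 2))
```

With real event data the feature set would normally be richer (body part, shot type, pressure and so on), which is where an interpretation tool like SHAP becomes useful.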
Web Scraping Football Data
Written Pieces
- Scraping Premier League Football Data with Python by Jamie Dos Anjos (FC Python)
- Football Data Visualizations - Passing Networks by Karol Działowski - a great blog post on how to create passing networks from first principles. The first part covers web scraping, specifically how to get Opta Event data from WhoScored. The second part shows how to create a passing network from this data using matplotlib.
- Python for Fantasy Football – Getting and Cleaning Data by Fantasy Futopia
- Intro to {polite} Web Scraping of Soccer Data with R! by Ryo Nakagawara
- Web Scraping Advanced Football Statistics by Sergi Lehkyi
- Web Scraping Football Data — Serverless Edition by Sergi Lehkyi
- How to Build a Football Dataset with Web Scraping by Otávio Simões Silveira
Videos
- How to scrape Understat for football data in Python with requests and BeautifulSoup by McKay Johns
- How to scrape SPORTS STATS websites with Python by John Watson Rooney
Libraries
- ScraperFC - a Python package to scrape data from FBref, Understat and FiveThirtyEight by Owen Seymour
- Scrape-FBref-data - a Python library to scrape StatsBomb data via FBref by Parthe Athale, which in turn was updated from Christopher Martin's repository
- tmscrape - a Python TransferMarkt webscraper by danzn1
- Tyrone Mings - a Python TransferMarkt webscraper by FCrSTATS
- worldfootballR - an R package to allow users to extract various world football results and player statistics data from FBref and valuations and transfer data from TransferMarkt.com by Jason Zivkovic (see guide on how to use this package [link])
- understat - a Python webscraper by Amos Bastian
- understatr - an R package to scrape data from Understat.
Tracking Data
- Laurie Shaw's Metrica Sports Tracking data series for Friends of Tracking - Introduction, Measuring Physical Performance, Pitch Control modelling, and Valuing Actions. See the following for code [link]
- How Tracking Data is Used in Football and What are the Future Challenges with Javier Fernández, Sudarshan 'Suds' Gopaladesikan, Laurie Shaw, Will Spearman and David Sumpter for Friends of Tracking
- Introduction to tracking data in football. by David Sumpter for Friends of Tracking
- Learning to Watch Football: Self-Supervised Representations for Tracking Data by Karun Singh. See accompanying blog post [link]
- On Tracking Data, the Nature of Soccer, and Allocation by Tiotal Football, as part of their Absolute Unit newsletter
- How Hoffenheim are helping to democratise tracking data by Training Ground Guru.
Pitch Control Modeling
Tutorials
Pitch Control modelling and Valuing Actions tutorials by Laurie Shaw as part of his Metrica Sports Tracking data series for Friends of Tracking. See the following for code [link]
GitHub Repositories
- Metrica-pitch-control by Will Thompson - a Python implementation of Javier Fernández and Luke Bornn's Pitch Control model from their paper Wide Open Spaces: A statistical technique for measuring space creation in professional soccer (2018) and Will Spearman's Pitch Control model from his paper Beyond Expected Goals (2018). The respective Google Colab notebooks are available [link] and [link]
Written Pieces
- Everything you need to know about 'pitch control' by Mark Thompson
- A Framework for the Fine-Grained Evaluation of the Instantaneous Expected Value of Soccer Possessions (2020) by Javier Fernández, Luke Bornn and Daniel Cervone
- Decomposing the Immeasurable Sport: A deep learning expected possession value framework for soccer (2019) by Javier Fernández, Luke Bornn, and Daniel Cervone. Accompanying talks - SSAC19, StatsBomb conference
- Beyond Expected Goals (2018) by Will Spearman
- Replaying the NBA (2018) by Luke Bornn
- Wide Open Spaces: A statistical technique for measuring space creation in professional soccer (2018) by Javier Fernández and Luke Bornn
- Physics-Based Modeling of Pass Probabilities in Soccer (2017) by Will Spearman, Austin Basye, Greg Dick, Ryan Hotovy, and Paul Pop
Video
- Will Spearman's masterclass in Pitch Control for Friends of Tracking
- How to create the mathematically perfect press using pitch control. by David Sumpter for Friends of Tracking.
Podcasts
Passing Networks
Written Pieces
Blogs
- How each Premier League team pass by John Muller for The Athletic
- Interactive Passing Networks by Karun Singh
- Explaining xGChain Passing Networks by Ted Knutson
Papers
- Using Network Science to Quantify the Identifiability of Football Teams by Javier M. Buldú, David Garrido
- The role of passing network indicators in modeling football outcomes: an application using Bayesian hierarchical models by Riccardo Ievoli, Aldo Gardini & Lucio Palazzo
Tutorials
- Football Data Visualizations - Passing Networks by Karol Działowski - a great blog post on how to create passing networks from first principles, using Opta Event data acquired from WhoScored. This data is then visualised using matplotlib
- Creating a Passmap in Python by Abhishek Sharma
- Football passing networks using R by Dato Fútbol (Ismael Gómez Schmidt)
- How to Render 3D Football Pass Network by Daniel Linke
- Plotting a passing network on a football pitch by Alberto Rodríguez Martín
- Medium blog post by Rahul Iyer - Guide to Creating Passing Networks in Tableau
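
As a rough illustration of what the tutorials above do under the hood, a passing network usually boils down to two aggregations over completed passes: the average location of each passer (the nodes) and the number of passes between each pair of players (the edges). The sketch below assumes a hypothetical list of `(passer, receiver, x, y)` tuples; real providers use their own field names and coordinate systems, and the plotting step (e.g. with matplotlib) is left out.

```python
from collections import Counter, defaultdict


def build_passing_network(passes):
    """Aggregate completed passes into node positions and edge weights.

    passes: iterable of (passer, receiver, x, y), where (x, y) is the
    location the pass was played from. This tuple layout is an assumption
    for illustration, not any provider's schema.
    """
    pair_counts = Counter()
    location_sums = defaultdict(lambda: [0.0, 0.0, 0])
    for passer, receiver, x, y in passes:
        pair_counts[(passer, receiver)] += 1
        sums = location_sums[passer]
        sums[0] += x
        sums[1] += y
        sums[2] += 1
    # Node position = mean location of the passes each player made.
    avg_positions = {
        player: (sum_x / n, sum_y / n)
        for player, (sum_x, sum_y, n) in location_sums.items()
    }
    return avg_positions, pair_counts


# Made-up passes between three hypothetical players.
example_passes = [
    ("A", "B", 10.0, 20.0),
    ("A", "B", 30.0, 40.0),
    ("B", "C", 50.0, 30.0),
    ("A", "C", 20.0, 30.0),
]

if __name__ == "__main__":
    positions, edges = build_passing_network(example_passes)
    print(positions)
    print(edges.most_common())
```

In practice the passes are usually filtered first (for example, up to the first substitution) so that the nodes reflect a stable line-up.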
Videos
- How to Create Football Pass Networks in Python by McKay Johns
- Creating passing networks with Barcelona data scientist by Sergio Llana, to build customizable passing networks with matplotlib for Friends of Tracking. The code is prepared to use both event (StatsBomb) and tracking data (Metrica Sports). See the corresponding GitHub repository - passing-networks-in-python
- Introduction to Soccer Pass Network Analysis with Python
Tweets
- Premier League Passing Networks by John Muller for The Athletic
- Pass network EPL 2019/20 by Cheuk Hei Ho
- Pass Network now with xT by Matt Trevillion
Possession Value (PV) Frameworks
General
- A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains (2011) by Sarah Rudd
- Attacking Contributions: Markov Models for Football by Derrick Yam for StatsBomb
- Introducing a Possession Value Framework by Nils Mackay for Stats Perform
- Expected Potential (xPo) by Aditya Kothari (The Come On Man)
- Deep Soccer Analytics: Learning an Action-Value Function for Evaluating Soccer Players by Guiliang Liu, Yudong Luo, Oliver Schulte, Tarak Kharrat
- Evolving Our Possession Value Framework by Jonny Whitmore for Stats Perform
- Why Possession Value Is Bollocks by Paul Riley.
Expected Threat (xT)
- Introducing Expected Threat (xT) by Karun Singh. Also available as an unrolled Twitter thread [link]. Check out Karun's Twitter thread for the many resources out there around this topic, including: Episode 19 of The Football Fanalytics Podcast, Karun's StatsBomb conference presentation [link] and slides [link], Rob Hickman's StatsBomb conference presentation where he extended xT to take defensive risk into account [link], Last Row View (Ricardo Tavares)'s blog post on evaluating off-the-ball player movements by combining xT and tracking data, and Karun's xT values as a 12x8 grid to download as a JSON file [link].
- Introducing ‘expected threat’ (or xT), the new metric on the block by Tom Worville
- Explaining Expected Threat by David Sumpter
- Football's New Stat - What is Expected Threat? by Tifo
- How to Calculate Expected Threat (xT) in Python by McKay Johns. See the corresponding GitHub repository [link] and Jupyter notebook [link]
- Implementing Expected Threat (xT) in Julia by Abhishek Sharma
Valuing Actions by Estimating Probabilities (VAEP)
- Lotte Bransen and Jan Van Haaren's 'Valuing Actions in Football' series for Friends of Tracking - Valuing Actions in Football: Introduction, Valuing Actions in Football 1: From Wyscout Data to Rating Players, Valuing Actions in Football 2: Generating Features, Valuing Actions in Football 3: Training Machine Learning Models, and Valuing Actions in Football 4: Analyzing Models and Results. See the following for code [link]
- STARSS: A Spatio-Temporal Action Rating System for Soccer by Tom Decroos, Jan Van Haaren, Vladimir Dzyuba, Jesse Davis
- Actions Speak Louder Than Goals: Valuing Player Actions in Soccer (V1) by Tom Decroos, Lotte Bransen, Jan Van Haaren, Jesse Davis
- Actions Speak Louder Than Goals: Valuing Player Actions in Soccer (V2) by Tom Decroos, Lotte Bransen, Jan Van Haaren, Jesse Davis.
Goals Added (g+)
- Goals Added: Introducing a New Way to Measure Soccer by John Muller for American Soccer Analysis
- The future of possession value models with David Sumpter, Catherine Pfaff, Matthias Kullowatz and Jernej Flisar for Friends of Tracking. The Goals Added (g+) model is focussed on in minutes 9-45 of the lecture.
On-Ball Value (OBV)
Dixon Coles Modeling
- Modelling Association Football Scores and Inefficiencies in the Football Betting Market (1997) by Mark Dixon and Stuart Coles
- Analysis of football prediction methods by William Brojanigo
- Predicting Football Results Using Python and the Dixon and Coles Model by Martin Eastwood
- Dixon Coles and xG: together at last by Ben Torvaney
- A generic Dixon-Coles model for estimating team strengths by Ben Torvaney
- Dixon Coles by Mathematical Football Predictions
- Dixon Coles Model by Philip Winchester
- Predicting Football Results With Statistical Modelling: Dixon-Coles and Time-Weighting by David Sheehan
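
For orientation before diving into the resources above, here is a minimal sketch of the scoreline probability at the heart of the Dixon-Coles model: two independent Poisson distributions for home and away goals, multiplied by a correction factor for the four low-scoring results (0-0, 0-1, 1-0, 1-1). The correction follows the form given in the 1997 paper; the expected-goal values and `rho` used in the example are made-up numbers, and the sketch leaves out parameter estimation and time-weighting, which the posts above cover.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def tau(home_goals, away_goals, lam, mu, rho):
    """Low-score dependence correction (lam: home mean, mu: away mean)."""
    if home_goals == 0 and away_goals == 0:
        return 1.0 - lam * mu * rho
    if home_goals == 0 and away_goals == 1:
        return 1.0 + lam * rho
    if home_goals == 1 and away_goals == 0:
        return 1.0 + mu * rho
    if home_goals == 1 and away_goals == 1:
        return 1.0 - rho
    return 1.0


def score_probability(home_goals, away_goals, lam, mu, rho):
    """Probability of a specific scoreline with the correction applied."""
    return (tau(home_goals, away_goals, lam, mu, rho)
            * poisson_pmf(home_goals, lam)
            * poisson_pmf(away_goals, mu))


def outcome_probabilities(lam, mu, rho, max_goals=10):
    """Sum scoreline probabilities into (home win, draw, away win)."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = score_probability(h, a, lam, mu, rho)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


if __name__ == "__main__":
    # Illustrative values only: 1.5 expected home goals, 1.1 away, rho = -0.1.
    print(outcome_probabilities(1.5, 1.1, -0.1))
```

Setting `rho` to zero recovers the plain independent Poisson model, which makes it easy to see how much the correction shifts the draw probability.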
Player Similarity and Style Analysis
Written Pieces
- The Seven Styles of Soccer by John Muller
- You Down With t-SNE? by Eliot McKinley and Cheuk Hei Ho for American Soccer Analysis
- Tweet to Clustering European Teams by Behaviors by Cheuk Hei Ho
- Defining Player Roles: How Every Player Contributes to Goals by Michael Imburgio for American Soccer Analysis
- The Relentless, the Chaotic, and the Bus Conductors by James McMahon
- Similar Player Tool by Niklas Hemmer
- The Bargain Bin Bielsa Machine by James McMahon
- Using Machine Learning To Find Players In Similar Roles In Scotland by Matt Rhein
- Comparing Players: Clustering and Style of Play by American Soccer Analysis
- Clustering Playing Styles in the Modern Day Fullback by Mark Carey and Mladen Sormaz
- Finding a replacement for Gerard Pique using Machine Learning by Malhar B.
Videos
Tutorials
- Grouping Soccer Players with Similar Skillsets in FIFA 20 by Jaemin Lee
- Clustering Football Players by Using FIFA 19 Data by Oğuz Can Yurteri
GitHub Repositories
Reinforcement Learning for Football Simulation
- Google Research Football: A Novel Reinforcement Learning Environment (2020) by Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly
- Google Research Football GitHub repo
- Google Research Football with Manchester City F.C. Kaggle Competition (ended October 2020)
- Karol Kurach - Google Research Football
- Karol Kurach (Google Brain) "Google Research Football: Learning to Play Football with Deep RL"
- Google Research Football by Piotr Stanczyk
- Google's AI Plays Football…For Science! by Two Minute Papers
Player Rating Modelling
Written Pieces
- WhoScored Ratings Explained
- Introducing Opta Player Ratings: Premier League Star Players in 2022-23
- Ten for exceptional: Flashscore introduces football player ratings into our stats universe
- Henshaw Analysis player ratings — methodology, discussion & examples by Liam Henshaw
- AW Role Scouting System: The Launch by Andy Watson. See video [link]
Podcasts
- How is data actually used to scout players? | Tifo Football Podcast Special with Dan Pelchen
- Tifo Talks: The role tactics play in data analysis with Dan Pelchen
Github Repos
Companies
- Traits Insights
Team Playing Style Analysis
Written Pieces
Papers
- Identifying Play Styles of Football Players Based on Match Event Data (2021) by Mark Riezebos
- Soccer Analytics Meets Artificial Intelligence: Learning Value and Style from Soccer Event Stream Data (2020) by Tom Decroos
- SoccerMix: Representing Soccer Actions with Mixture Models (2020) by Tom Decroos, Maaike Van Roy, and Jesse Davis
- Actions Speak Louder Than Goals: Valuing Player Actions in Soccer (2019) by Tom Decroos, Lotte Bransen, Jan Van Haaren, and Jesse Davis
- Player Vectors: Characterizing Soccer Players’ Playing Style from Match Event Streams (2019) by Tom Decroos and Jesse Davis (discussed in the DeepMind blog: Advancing sports analytics through AI research)
- Automatic Discovery of Tactics in Spatio-Temporal Soccer Match Data (2018) by Tom Decroos, Jan Van Haaren, and Jesse Davis
- Distinguishing Between Roles of Football Players in Play-by-play Match Event Data by Bart Aalbers and Jan Van Haaren
- Analysis of association football playing styles: An innovative method to cluster networks (2018) by Jacopo Diquigiovanni and Bruno Scarpa
- Predicting Soccer Highlights from Spatio-Temporal Match Event Streams (2017) by Tom Decroos, Vladimir Dzyuba, Jan Van Haaren, and Jesse Davis
- Game style in soccer: what is it and can we quantify it? by Adam Hewitt, Grace Greenham, and Kevin Norton
Blogs
- Stats Perform Playing Styles - An Introduction by Andy Cooper for Stats Perform
- Introducing Role Discovery: Generating Data-Driven Roles in Elite Professional Football
- How Does the Context of the Game Impact the Style of Play in Football Teams? by Carlos Lago Peñas for Barça Innovation Hub
- Comparing Players: Clustering and Style of Play by Sam Goldberg
- Player Roles: How to find the right type of player for your team? by SciSports
- SciSports 22 Player Roles
- Clustering Playing Styles in the Modern Day Full-Back by Mark Carey and Mladen Sormaz
Videos
- Learning Value and Style from Soccer Event Stream Data by Tom Decroos, as part of the Sports Analytics Lab at Harvard University and American Statistical Association Section on Statistics in Sports seminar series
- Presentation at SIGKDD 2019 | Actions Speak Louder than Goals: Valuing Player Actions in Soccer by Tom Decroos
- Making Offensive Play Predictable by Paul Power for the Opta Pro Forum 2021
- How to properly compare players by Paul Power for Training Ground Guru
- Measuring Style of Play in Football Using Statistics and Machine Learning by Xiaoyi Ji (Sia)
- Stats Perform Playing Styles playlist
GitHub Repositories
Set Pieces
Section created after seeing the following tweets and threads by Ashwin Raman ([link]) and Stuart Reid ([link])
- Dynamic Analysis of Team Strategy in Professional Football (2019) by Laurie Shaw and Mark Glickman. Accompanying talks - NESSIS 2019, 2020 Google Sports Analytics Meetup
- Breaking Down Set Pieces: Picks, Packs, Stacks and More by Euan Dewar
- Tactical Theory: Set-Pieces by István Beregi
- Set-Piece Analysis: A comprehensive guide to zonal marking from corners by Cameron Meighan. See all his pieces [link]
- Changing How the World Thinks About Set Pieces by Ted Knutson
- Set Pieces and Market Efficiency by Ted Knutson
- The Blades’ Sharpest Edge: A look at Sheffield United’s 17/18 Set Pieces. by Oli Walker
- Pieces by Marc Lamberts [link]
- Pieces by Stuart Reid [link].
Radars
- StatsBomb radar articles. For all articles, see the following: [link]
- [Understanding StatsBomb Radars](https://statsbomb.com/2021/07/understanding-statsbomb-radars/) by StatsBomb (16/07/2021)
- New Team, Same Numbers: How Transfers Do (And Don't) Change Player Output by Tim Keech (06/03/2019)
- Introducing Goalkeeper Radars by Ted Knutson (11/12/2018)
- Radar Wars
- [New Data, New StatsBomb Radars](https://statsbomb.com/2018/08/new-data-new-statsbomb-radars/) by Ted Knutson (03/08/2018)
- Revisiting Radars by Ted Knutson (18/05/2017)
- Understanding Football Radars For Mugs and Muggles by Ted Knutson (25/04/2016)
- Radar Wars - CASSIS Presentation Summer 2018
- Models for evaluating players part 2: Player radars by David Sumpter for Friends of Tracking
- Introducing Twenty3’s Dynamic Radars
- Radar Charts in mplsoccer
- soccerplots - a Python package that can be used for making visualizations for football analytics by Anmol Durgapal
- Building a Radar Plot in ggplot2 by FC rSTATS
Recruitment Analysis
- Gerard Moore uses the Event Lab to analyse centre-backs for recruitment by Gerard Moore for Twenty3
- [Using StatsBomb IQ For Player Recruitment: Centre Backs](https://statsbomb.com/2021/06/using-statsbomb-iq-for-player-recruitment-centre-backs/) by StatsBomb (05/06/2021)
- Recruitment & Analysis at Melbourne City: Optimising Key Processes using Data and Technology by Andy Cooper for Stats Perform
- Season Analysis & Summer Recruitment pieces
- 20/21:
- 21/22:
- Motherwell Summer 2021 Recruitment Plan by Greg Marshall (see tweet [link])
- Nottingham Forest Recruitment Plan Summer 2021 by Liam Henshaw (see tweet [link])
- Sheffield Wednesday Recruitment Plan for the 21/22 season by Owls Analytics (see tweet [link])
- Celtic F.C. Opposition Report by Liam Bailey
- 22/23 (see the following Twitter thread by Blues Breakdown for 19 summer recruitment plans for 11 clubs [link]):
- LUFC Squad and Recruitment Analysis Summer 2020 by Kris Hilliam
- Transfer Targets for Huddersfield Town by HTAFC STATTO
- Luton Town full-back analysis and possible targets by Oak Road Hatter
- Rochdale Summer Recruitment Plan 2022 by ROCHDALE AFC FAN PAGE
- Wolves Recruitment Plan - Summer 2022 by Dan Butler
- Queens Park Rangers 2022 Summer Recruitment Plan by Tom Ward
- Swindon Town Recruitment Plan Summer 2022 by Tyler
- West Ham United Recruitment Plan, Summer 2022 by Knees up Mother Brown (KUMB).com
- [Part 1 - The Defence](https://www.kumb.com/article.php?id=44341)
- [Part 1 - The Midfield](https://www.kumb.com/article.php?id=44343)
- Sunderland AFC Data Driven Recruitment Plan 22/23 by SAFC Data Analytics
- Ipswich Town Recruitment List - 22/23 by Luke Penning
- West Bromwich Albion Data Recruitment Plan Summer 2022 by Albion Analytics
- Watford’s Recruitment Plan 2022/23 by Aaron Bennis
- Nottingham Forest Summer Recruitment Plan 2022 by Liam Henshaw
- Notts County: Recruitment Plan by Tom Williams
- Ipswich Town 2021/22 Season Overview & 2022/23 Recruitment Plan by ITFC Analytics
- Ipswich Town 2022/23 Recruitment Proposal by Thomas Lane
- Swansea City Summer 2022 Recruitment Plan by ForeseeaBall
- SWFC Recruitment Plan 22/23 by Owls Analytics
- Sunderland AFC: The future of the first-team squad by Daniel Burrell
- Middlesbrough summer recruitment plan 22/23 by Luke
- BWFC Summer Transfer Targets by BWFC Analysis
Quantifying Relative Club and League Strength
Models
Financial
- TransferMarkt values of leagues - Europe, Asia
- Deloitte Football Money League (DFML)
Historical Match Results
- FiveThirtyEight
- Stats Perform’s Power Rankings - ranks 19,008 clubs across 391 domestic leagues, based on a system that scores the current strongest side 100 and the weakest zero. The ratings evolve each day based on the results of both an individual club and those of other teams within its own domestic league. Domestic and European fixtures are taken into account, as well as the strength of each league. Access is not available online but there is the occasional The Analyst article referring to the rankings - November 2021, October 2021 update
- Global Football Rankings by Ken Ackerson (Pear Apps), powered by FiveThirtyEight's Global Club Rankings
- UEFA Club Coefficients (official rankings). See also on the European Club Association (ECA) [link]
- UEFA European Cup Football Results and Qualification by Bert Kassies
- World Football Elo Ratings
- Club Elo Ratings - Methodology. See Episode 27: A Closer Looking into "Elo Ratings" by The Football Fanalytics Podcast
- Glicko Rating System by Mark Glickman
- Euro Club Index - Methodology
- World Football / Soccer Clubs Ranking by FootballDatabase
- the KA Club Rating - What are the KA Ratings? and Procedure
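
Several of the rating systems listed above are built on, or related to, the Elo update rule. As a rough orientation, the sketch below shows the textbook version only: an expected score from the rating difference, then a rating change proportional to how far the actual result was from that expectation. The K-factor of 20 and the example ratings are assumptions for illustration; each site documents its own adjustments in its methodology, and this sketch does not attempt to reproduce any of them.

```python
def expected_score(rating_a, rating_b):
    """Expected result for side A (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_home, rating_away, result_home, k=20.0):
    """Return updated (home, away) ratings after one match.

    result_home is 1.0 for a home win, 0.5 for a draw, 0.0 for an away win.
    The update is zero-sum: points gained by one side are lost by the other.
    """
    delta = k * (result_home - expected_score(rating_home, rating_away))
    return rating_home + delta, rating_away - delta


if __name__ == "__main__":
    # Hypothetical ratings: a favourite drawing at home loses a few points.
    print(update_elo(1700.0, 1500.0, 0.5))
```

Because the update is zero-sum, the average rating across a closed pool of teams stays constant, which is one reason cross-league comparisons need matches between leagues (such as European fixtures) to calibrate.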
Historical Statistical Player Performance
- Ben Torvaney's 2018 Opta Pro Forum talk 'Counting Across Borders' [link]. For slides, see Ben's blog post [link] or the Stats Perform blog via WayBackMachine [link], and for Ben's original submission, see [link]
- Tony ElHabr's Soccer League Strength post (see also [link])
- Aditya Kothari (The Come On Man) uses differences in VAEP values of players who transferred between different leagues [link]
Articles
- Which is the most physically demanding league? by Training Ground Guru
- Soccer Power Index explained by ESPN staff
- Examining FiveThirtyEight’s Soccer Power Index Ratings
- Man Utd, Barcelona, Liverpool: Data analysts rank the top 30 clubs in world football by Kobe Tong
Papers
- PlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach by Luca Pappalardo, Paolo Cintia, Paolo Ferragina, Emanuele Massucco, Dino Pedreschi, Fosca Giannotti.
- Ranking soccer teams on the basis of their current strength: A comparison of maximum likelihood approaches by Christophe Ley, Tom Van de Wiele, and Hans Van Eetvelde.
Videos
- Mladen Sormaz's StatsBomb 2021 talk Practical tools for ‘Bridging the gap’ (see @ 20m42s)
Data
- club-rankings by Tony ElHabr - historical daily Opta Power Rankings and FiveThirtyEight Global Club Soccer Rankings
Miscellaneous
- Tweets by AI Abacus [link] and [link]. They use a simple Dixon-Coles method, based on historic results going back 15 years, to build a hierarchy among teams in different leagues that may never have played each other.
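For readers unfamiliar with the Dixon-Coles method mentioned above, the sketch below shows its scoreline probability: two independent Poisson goal counts multiplied by a correction factor for low-scoring results. It is a minimal illustration only. The goal rates (1.6 and 1.1) and the dependence parameter `rho = -0.1` are made-up values, the function names are hypothetical, and fitting these parameters from historic results (as the tweets describe) is not shown.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def dixon_coles_tau(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles adjustment for the 0-0, 0-1, 1-0 and 1-1 scorelines."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def score_matrix(lam: float, mu: float, rho: float, max_goals: int = 10):
    """matrix[x][y] = P(home scores x, away scores y), truncated at max_goals."""
    return [
        [
            dixon_coles_tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
            for y in range(max_goals + 1)
        ]
        for x in range(max_goals + 1)
    ]


def outcome_probabilities(lam: float, mu: float, rho: float, max_goals: int = 10):
    """Collapse the score matrix into (home win, draw, away win) probabilities."""
    matrix = score_matrix(lam, mu, rho, max_goals)
    size = range(max_goals + 1)
    home = sum(matrix[x][y] for x in size for y in size if x > y)
    draw = sum(matrix[x][x] for x in size)
    away = sum(matrix[x][y] for x in size for y in size if x < y)
    return home, draw, away


if __name__ == "__main__":
    # Assumed example rates: home side expected 1.6 goals, away side 1.1.
    print(outcome_probabilities(lam=1.6, mu=1.1, rho=-0.1))
```

In a full model the two goal rates would be derived from fitted attack and defence strengths per team, which is what allows teams that have never met to be compared.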
Tactics
Counter Attacking
Articles
- On the anatomy of a counter-attack by Will Morgan. Also available at the following [link]
- Quantifying Player Contribution to Counter Attacks by Laurynas Raudonius. See his poster [link]
- Spotlight on: counter-attacks by The FA
- Evolution of Counterattacking by Adin Osmanbašić
- The Various Forms of Restdefences Part 2: Counterattacking
- Counter- or Gegenpressing
- Pressing, counterpressing, and counterattacking by Adin Osmanbašić. Also available at the following [link]
- Tactical Analysis: Defending Against the Counter Attack
- Stats Perform Playing Styles - An Introduction - see the 'Counter Attack' subsection
- Analysis: Leicester City Counter Attack
- The Importance of Counter-Attacking in Football by Max Bergmann
- Premier League Club Stats - Goals From Counter Attack
- InStat Sport Facebook post on counter attacking
Papers
- Quantifying the Value of Transitions in Soccer via Spatiotemporal Trajectory Clustering by Jennifer Hobbs, Paul Power, Long Sha, Hector Ruiz, Patrick Lucey
- Evaluating Football Player Actions During Counterattacks
- Counter attack detection with machine learning from log files of RoboCup simulation
Videos
- Why More Teams Should Counter-Attack | By The Numbers by Tifo
- Carlos Carvalhal • Fast attacks: counter-attacking to organised possession • CV Academy Session
- Dean Wright • Norwich City under-15 Counter-attacking • CV Academy coaching course
- Sit deep and play on the counter-attack | Football tactics | Nike Academy
- How to hit a team on the counter-attack | Soccer drill | Tactics | Nike Academy
- Tactics Explained: Tottenham's counter-attack
- How To Score The Perfect Counter-attack Goal? by Bundesliga
- Course - Counter Attacking Masterclass - Part 1 (HD)
- Course - Counter Attacking Masterclass - Part 2 (HD)
- Wenger's tips: Counter-attacking
- Fast As Lightning Counter-attacks | Premier League | Salah, Aguero, Martial
- Top 10 Counter Attack Goals RB Leipzig - Werner & Co. with Superfast Transitions by Bundesliga
- Top 10 Counter-Attacking Goals 2020/21 so far – Haaland, Gnabry & More by Bundesliga
- Alcacer, Reus & Co. - Top 10 Counter-Attack Goals 2018/19 So Far by Bundesliga
- Top 10 Counter-Attacking Goals - 2015/16 by Bundesliga
- Top 10 counter attack goals - including Lionel Messi v Arsenal
- The best counter-attacking team in Europe according to Pep Guardiola | Oh My Goal
Podcasts
Pressing
Articles
- What is Ball-Oriented Defending: How to defend, press and actively win the ball feat. Rangnick, Klopp & Nagelsmann
- Pressing, counterpressing, and counterattacking by Adin Osmanbašić. Also available at the following [link]
Videos
Counter Pressing
Articles
- Counterpressing variations
- Pressing, counterpressing, and counterattacking by Adin Osmanbašić. Also available at the following [link]
- The Question: is the counter-counter more crucial than the counterattack? by Jonathan Wilson
Papers
- Data-driven detection of counterpressing in professional football [2021] by Pascal Bauer and Gabriel Anzer.
Videos
Player Valuation Modeling
Example Models
- The DePO Models: Bringing Moneyball to Professional Soccer by Sam Goldberg and Mike Imburgio
- TransferRoom Expected Transfer Value (xTV) Webinar - webinar by Daniel Blades of TransferRoom, providing insight into the workings of xTV, a metric created by TransferRoom to enable a data-led measure of a player’s value in the transfer market. The webinar covers: What is xTV? How is it calculated? Does it differ from other valuation methods? How does xTV benchmark players? What is the reliability of xTV? How does xTV help clubs and agents? What are some real-life examples of xTV?
Example Methodologies
- TransferMarkt: Transfermarkt Market Value explained - How is it determined? See also the paper The Wisdom of Crowds and Transfer Market Values by Dennis Coates and Petr Parshakov
- Football Benchmark by KPMG: KPMG Methodology and limitations of published information
- SciSports: SciSports’ transfer fee prediction model
- CIES Football Observatory: Scientific evaluation of the transfer value of football players by Drs Raffaele Poli, Loïc Ravenel and Roger Besson
- [sportskeeda](https://www.sportskeeda.com/): Transfer Values - Calculation explained
- Football Transfer: Data & Algorithms
- The Transfer List - UEFA Market Value Calculator, CONMEBOL Market Value Calculator, CONCACAF Market Value Calculator, and Intra-MLS Market Value Calculator
Written Pieces Regarding the Topic of Player Valuation
Articles
- The Wisdom of the Crowd: Soccer takes the player valuations posted on the website Transfermarkt extremely seriously. It has never really stopped to ask where they come from by Rory Smith for The New York Times
- Soccer’s Confounding Calculation: What’s a Player Worth? by Rory Smith for The New York Times
- The DePO Models: Bringing Moneyball to Professional Soccer by Sam Goldberg and Mike Imburgio
- Transfer window analysed: Less spent, young players targeted and free agents have defined key moves by Tom Worville
- How do you value a player? by Stuart James
- How to value the modern footballer: Algorithms, cheat codes and the scientification of the transfer market by Lawrence Ostlere
- How the volunteers of data website Transfermarkt became influential players at European top football clubs by Pepijn Keppel and Tom Claessens
Blogs
- From Sessegnon to Sanchez: How to calculate the correct market salary for EPL players by Laurie Shaw
- Money Madness: The secrets of how a football player’s transfer value is calculated by Ash for Football Whispers
- Player Valuation: Putting Data to Work on Transfer Market Analysis by Football Benchmark
- How much do you value your favorite football star? by Johannes Post
- Knutson's Transfer Model Review by StatsBomb
- Analyzing and Identifying Markets for Transfers by Steven Marc Scott
- Determining Player Contracts Based on Player Values Over Time by Steven Marc Scott
- Analyse af Superliga tranfervinduet 2021 by Christian Rønsholt. See the following GitHub [link] for his project analysing the transfer windows in the Danish Superliga from 2010 to 2021
- FIFA considering using AI technology to calculate transfer fees by Paul Macdonald
- Market value of football players: what is important? by Callum Williams
- How Players are Valued
- Is It Possible to Predict Football Players’ Value by Burak Arslan
- Using machine learning to identify high-value football transfer targets by Jack Tattersall
- Predicting Transfer Fee of a Football Player by Yahya Yavuz
Papers
- Beyond crowd judgments: Data-driven estimation of market value in association football by Oliver Müller, Alexander Simons, and Markus Weinmann
- Football Player’s Performance and Market Value by Miao He, Ricardo Cachucho, and Arno Knobbe
- A novel machine learning method for estimating football players’ value in the transfer market by Iman Behravan and Seyed Mohammad Razavi
- Predicting player transfers in the small world of football by Roland Kovas and Laszlo Toka
- Predicting Market Value of Football Players using Machine Learning Algorithms by Sidharrth Mahadevan
- Estimating transfer fees of professional footballers using advanced performance metrics and machine learning by Ian G. McHale and Benjamin Holmes
- Predict the Value of Football Players Using FIFA Video Game Data and Machine Learning Techniques by Mustafa A. Al-Asadi and Sakir Tasdemır
- Econometric Approach to Assessing the Transfer Fees and Values of Professional Football Players by Raffaele Poli, Roger Besson, and Loïc Ravenel
- Identification of Factors Determining Market Value of the Most Valuable Football Players by Sebastian Majewski
- Predicting Market Value of Soccer Players Using Linear Modeling Techniques by Yuan He
- A Machine Learning Ensembling Approach to Predicting Transfer Values by Ayse Elvan Aydemir, Tugba Taskaya Temizel, and Alptekin Temizel
- The Determinants of Football Transfer Market Value: An Age of Financial Restraint by Thomas Preston
- Modelling the transfer prices of football players by Ivo Hendriks
- A study of Prediction models for football player valuations by quantifying statistical and economic attributes for the global transfer market by Dibyanshu Patnaik, Harsh Praharaj, Kartikeya Prakash, and Krishna Samdani
Code/Notebooks
- A Modelling Analysis of Transfer Fees from the 2019/20 Premier League Season by Callum Littler
- football_scout_ml R scripts by Jack Tattersall - 01_load_data.R, 02_train_xgboost.R, 03_xgboost_predict.R, randomforest.R. See the accompanying Medium post: Using machine learning to identify high-value football transfer targets
- PlayerValueEstimator by Burak Arslan. See the accompanying GitHub repo [link] and Medium Blog: Is It Possible to Predict Football Players’ Value
Slides
Tweets
Financial Data
Player Values
- Capology - a sports salaries platform
- KPMG Football Benchmark player valuation data
- spotrac - player contracts, salaries, and transfer information for the Premier League, MLS, and NWSL
- TransferMarkt player bio and fiscal data
  - This data can be scraped in the following ways:
    - Python: Tyrone Mings webscraper by FCrSTATS (I have submitted a pull request to fix issues with scraping bio-status data in this library; see my TransferMarkt scraping notebook for a version with the minor fixes needed to run the code)
    - R: worldfootballR package by Jason Zivkovic (see guide [link])
  - An extract of player data for 2010-2021 for the 'Big 5' European leagues has been made available by John Muller, see [link]
- Soccerway (transfer values)
Recorded Transfers
- transfers GitHub repo for European football clubs' player transfers from 1992/93-2020/21 (as per TransferMarkt) by ewenme
- Player Transfer Data collated by Tom Worville (see Tweet [link])
Other
- The Price of Football Master Spreadsheet - data from the finance/business aspect of football by Kieran Maguire
Relevant Packages/Repos
- Tyrone Mings TransferMarkt webscraper by FCrSTATS
- tmscrape TransferMarkt.de webscraper by znstrider
- worldfootballR package by Jason Zivkovic that includes TransferMarkt webscraping (see guide [link])
- football_scout_ml - a repo that uses machine learning to identify high-value targets in the football transfer market by Jack Tattersall. See the accompanying Medium Post [link]
- Predicting Football Player Transfer Values by Sanjit Varma
- DSbootcamp-Project2 by Burak Arslan. See the accompanying Medium Blog: Is It Possible to Predict Football Players’ Value
- Football_Transfers - GitHub repo for understanding trends in football transfers through an attempt to build a model that predicts the market value of players using Python
Miscellaneous
- Predicting Soccer Player Transfer Values by Sanjit Varma
Game Win Probability Modeling
- A Bayesian Approach to In-Game Win Probability by Jesse Davis, Pieter Robberechts, and Jan Van Haaren for DTAI Sports Analytics Lab. See paper [link]
- Who Will Win It? An In-game Win Probability Model for Football by Pieter Robberechts, Jan Van Haaren, and Jesse Davis for DTAI Sports Analytics Lab
- Explaining Live Win Probability (LWP) by Jonny Whitmore for The Analyst
- Opta's Live Win Probability Model on Amazon Prime Video by Alex Jennings for Stats Perform
- We Have a New Win Probability Model by Tyler Richardett for American Soccer Analysis
- Explained: What is ‘win probability’ and how does it work? by Alex Jennings for FourFourTwo
- Win probability Wiki
- FiveThirtyEight’s 2018 World Cup Predictions featuring win probability visualisations
Goalkeeper Analysis
- Anything by Dr. John Harrison. See his pieces on Goalkeeper.com and his analysis being used on Sky Sports' Monday Night Football [link]
- Profiling keepers with data by Victor Renaud
- Intro to Goalkeeper Analysis by Ted Knutson (08/11/2018)
- Articles by Paul Riley
- What’s a Goalkeeper Worth to His Team? (06/07/2012)
- Things to watch differently in the Premier League this season #2 – Goalkeepers (06/10/2012)
- Safe hands? Is your keeper performing as well as expected? (11/06/2014)
- Courtois and Pickford: The Tall and Short of Keeper Styles (13/08/2018)
- Turning Theory Into Practice: Paul Riley Meets Swedish National Goalkeeping Coach Maths Elfvendal (04/12/2018)
- The Unique (and Not so Unique) Challenges of Goalkeeping in Women's Soccer (16/11/2018)
Citations
Thanks to all those who have kindly written about or promoted this GitHub repository. See:
- Articles:
  - WHERE TO LEARN FOOTBALL ANALYTICS? by Irfan Alghani Khalid
- Social Media:
  - The following LinkedIn post by Hadi Sotudeh
  - The following LinkedIn post and Kaggle post by Ekrem Bayar
  - The following Tweet by Tom Worville. Check out his Twitter thread on getting started in football analytics
  - The following Tweet by Jan Van Haaren
  - The following Tweet by Joe Gallagher
  - The following Tweet by Hadi Sotudeh
  - The following Tweet by Ninad Barbadikar
  - The following Tweet by Tim Keller
  - The following Tweet by The Devil's DNA
  - The following Tweet by Brian McDonnell
  - The following Tweet by Panda_9097
- Blogs/Reddit:
  - Moneyball - Section Football Analytics by Alberto Riccardi
  - Pra quem gosta de análise de dados no futebol (For those who like data analysis in football) by Wagner Andrade
  - datasets on football or sports in general by nichtgefunden
  - Ask Anything Thread by MatchAnalyst
  - How do the top 5 leagues in Europe differ in overall style?
  - New York City FC vs New York Red Bulls Data Analysis Report by MatchAnalyst
  - Weekly Open Thread - General Discussion by messimisses
- GitHub Repositories:
Contributing
This GitHub repository and resources list is always a work in progress, with new resources added semi-regularly. If you feel there are any resources that I've missed, I'm always open to contributions! Please feel free to create a pull request or send me a message @ [email protected] or @eddwebster and I'll get back to you as quickly as I can!
If you're new to creating a pull request, please follow these steps (based on this)
- Create an account on GitHub if you do not already have one.
- Fork the project repository: click on the ‘Fork’ button near the top of the page. This creates a copy of the code under your GitHub user account. For more details on how to fork a repository see this guide.
- Clone your fork of the football_analytics repo from your GitHub account to your local disk:
$ git clone https://github.com/<github username>/football_analytics.git
$ cd football_analytics
- Create an environment with:
$ python3 -m venv my_env
or
$ python -m venv my_env
or with conda:
$ conda create -n my_env python=3
- Activate the environment:
$ source my_env/bin/activate
or with conda:
$ conda activate my_env
- Add the upstream remote. This saves a reference to the main football_analytics repository, which you can use to keep your repository synchronised with the latest changes:
$ git remote add upstream https://github.com/eddwebster/football_analytics.git
You should now have a copy of the football_analytics repository, with your git repository properly configured. The next steps describe the process of modifying code and submitting a pull request:
- Synchronise your master branch with the upstream master branch:
$ git checkout master
$ git pull upstream master
- Create a feature branch to hold your development changes:
$ git checkout -b my_change
and start making changes. Always use a feature branch. It’s good practice to never work on the master branch!
- Before you commit, ensure that git hooks are activated (PyCharm, for example, has an option to skip them). This can be done using pre-commit, as follows:
$ pre-commit install
- Develop the feature on your feature branch on your computer, using Git to do the version control. When you’re done editing, add changed files using git add and then git commit:
$ git add modified_files
$ git commit -m "my first football_analytics commit"
- Record your changes in Git, then push the changes to your GitHub account with:
$ git push -u origin my_change
Star History
Star history for the football_analytics repository.
Acknowledgements
- Soccer Analytics Handbook by Devin Pleuler
- Awesome Soccer Analytics by Matias Mascioto
- Jan Van Haaren's Soccer Analytics Reviews:
- Jan Van Haaren's soccer-analytics-resources GitHub repo
- awesome-readme repository by Matias Singers used to restyle this README
- Excel spreadsheet version of this README by Melanie Loeper [link]