Scalable-Software-Architecture
Collection of tech talks, papers and web links on Distributed Systems, Scalability and System Design.
Tech Talks
General Advice on System Design and Scalability
- Lecture - Scalability - Harvard Web Development, David Malan
- Building Software Systems At Google and Lessons Learned
- Scalable Internet Architectures - Theo Schlossnagle
- Seattle Conference on Scalability - Jeff Dean
- Best Practices for Scaling Web Apps
- Building a Scalable Architecture for Web Apps
- Web application architecture: The whole stack - Allen Holub
- Scalable Distributed Design
- Building Software at Google Scale Tech Talk
- Seattle Conference on Scalability: Scaling Google for Every User
- Seattle Conference on Scalability: Lessons In Building Scalable Systems
- Workers, Queues, and Cache
- Velocity 2012: Jay Parikh, "Building for a Billion Users"
- Building Large Systems at Google
- 3000 images per second - Henna Kermani - @Scale 2016
- Scaling to over 1,000,000 requests per second
- Jeff Dean: "Achieving Rapid Response Times in Large Online Services" Keynote - Velocity 2014
- Getting Things Done at Scale
- Scale-oriented Architecture with APIs
- You Won't Believe How the Biggest Sites Build Scalable and Resilient Systems!
- Distributed Patterns you should know by Eric Redmond
- GOTO 2012 • Runaway Complexity in Big Data Systems...and a Plan to Stop it • Nathan Marz
- Seattle Conference on Scalability: Abstractions for handling large datasets
- High Performance Web Sites and YSlow
Company/Product specific tech talks
- Seattle Conference on Scalability: YouTube Scalability
- How to answer design question: How do you design a twitter?
- Operations at Twitter: Scaling Beyond 100 Million Users
- How We've Scaled Dropbox
- Keynote: Twitter's search architecture
- Marco Cecconi - "The Architecture of StackOverflow"
- Scalability at YouTube
- Lessons of Scale at Facebook
- Scale at Facebook
- Flight Lightning - Scaling Twitter core infrastructure
- Scaling Instagram with Mike Krieger
- GOTO 2014 • Scaling Pinterest • Marty Weiner (InfoQ link)
- Real-Time Delivery Architecture at Twitter
- OSCON 2014: How Instagram.com Works; Pete Hunt
- Scaling the Data Infrastructure at Instagram
- O'Reilly MySQL CE 2011: Jeremy Cole, "Big and Small Data at @Twitter"
- O'Reilly Webcast: How Pinterest Architected and Built Their Sharded MySQL Datastore
- Timelines at Scale (Twitter)
- Architecture at Scale at ESPN
- Building Highly-resilient Systems at Pinterest
- Scaling Uber
- How Zoom works
- Scaling Engineering Culture at Twitter
- Keynote - Systems at Facebook Scale
- Scaling YouTube's Backend: The Vitess Trade-offs
- Hacker Way: Rethinking Web App Development at Facebook
- Data Platform Architecture, Evolution, and Philosophy at Netflix
- Structure, Personalization, Scale: A Deep Dive into LinkedIn Search
- Scaling Foursquare: From Check-ins to Recommendations
- How Netflix Leverages Multiple Regions to Increase Availability: An Active-Active Case Study
- That's 'Billion' with a 'B': Scaling to the Next Level at WhatsApp
- How Facebook Scales Big Data Systems
- Scaling Uber's Real-time Market Platform
- Software Development & Architecture @ LinkedIn
- Etsy Search: How We Index and Query 26 Million One-of-a-kind Items
- Scalability Lessons from eBay, Google, and Real-time Games
- How SoundCloud Uses Cassandra
- Service Architectures at Scale: Lessons from Google and eBay
- Solidifying the Cloud: How Google Backs up the Internet
- Real-Time Systems at Twitter
- Serving user intent: Facebook style notifications using HBase and Event streams
- Netflix's Distributed Computing Strategies: Optimistic Design for the Eventual Consistency Model
Distributed Computing
- Intro to Hadoop and MapReduce (Udacity)
- Introduction to Hadoop
- MapReduce Flow Chart
- Distributed Computing CS 61A UC Berkeley
- MapReduce CS 61A UC Berkeley
- Cluster Computing and MapReduce Lecture 1
- Cluster Computing and MapReduce Lecture 2
- Introducing Apache Hadoop: The Modern Data Operating System
Distributed Database/Large-Scale Storage
- Spanner: Google’s Globally-Distributed Database (Youtube link)
- Spanner - multi-version, globally-distributed, and synchronously-replicated database
- BigTable: A Distributed Structured Storage System (slides)
- Large-Scale Low-Latency Storage for the Social Network - Data@Scale
- Structured Data at Box: How We're Building for Scale
- F4 - Photo Storage at Facebook
- The Storage Technologies Behind Facebook Messages
- Cold Storage at Facebook
- Taking Storage for a Ride with Uber
- Zen: Pinterest's Graph Storage Service - @Scale 2014 - Data (With Slides)
- Storage Systems at a Rapidly Scaling Startup with a Small Team - Data@Scale
- f4: Facebook's Warm BLOB Storage System
Distributed graph processing
- Giraph
- Apache Giraph Large Scale Graph Processing On Hadoop
- Processing Over a Billion Edges on Apache Giraph
- Graph Search: The Power of Connected Data
- Using Graph Partitioning in Distributed Systems Design
- Let Me Graph That For You: Building a Graph Database Application
- GraphChi: Large-Scale Graph Computation on Just a PC
- PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Stream Processing
- Heron: Real-time Stream Data Processing at Twitter
- Samza in LinkedIn: How LinkedIn Processes Billions of Events Everyday in Real-time
- Mantis: Netflix's Event Stream Processing System
- High Throughput Stream Processing with ACID Guarantees
- Martin Kleppmann — Event Sourcing and Stream Processing at Scale
- ETE 2012 - Nathan Marz on Storm
- Cassandra NYC 2011: Nathan Marz - The Storm and Cassandra Realtime Computation Stack
API Design
- How To Design A Good API and Why it Matters
- How to Design Great APIs - Parse Developer Day 2013
- Google I/O 2010 - How Google builds APIs
- Designing a Beautiful REST+JSON API
Web Services and SOA
- Introduction to Service Design and Engineering - University of Trento, Italy
- REST+JSON API Design - Best Practices for Developers
- What is a Service Oriented Architecture?
- Webinar: Practical SOA for the Solution Architect
Caching
- Scaling Redis at Twitter
- Facebook and memcached - Tech Talk
- Scaling Memcache at Facebook
- How Netflix and reddit scale to handle massive demand
- An analysis of Facebook photo caching
NoSQL
- Introduction to NoSQL • Martin Fowler
- NoSQL Distilled to an hour by Martin Fowler
- NoSQL Distilled • Pramod Sadalage
- Tech Talk: Cassandra Data Modeling
- NoSQL Explained
- Big Data Architecture Patterns
- Graph Databases Exposed
Messaging
- Queue It! What Job Queues Can Do for You!
- Joydeep Sen Sarma - Messaging architecture at Facebook
- Messaging at Scale at Instagram
- Building a Distributed Data Ingestion System with RabbitMQ
- Scaling web applications with message queues - Lenz Gschwendtner
Object Oriented Analysis and Design
- Software Architecture & Design | Udacity
- OOSE: Software Dev Using UML and Java
- Eric Evans — Tackling Complexity in the Heart of Software
- Domain Driven Design
- Domain-Driven Design
- How You Can Architect and Develop Enterprise Mission-Critical Applications with Domain-Driven Design
- DDD: putting the model to work
- Eric Evans on DDD: Strategic Design
- Architecting and Implementing Domain-Driven Design Patterns with Microsoft .NET
- SOLID Design Patterns in C#
- Object Oriented Design
- Design Patterns Video Tutorial
- Object Oriented Design Interview Question: Design a Car Parking Lot.
- Google's Clean Code Talks
- NYC Tech Talk Series: How Google Backs Up the Internet
- Robert C Martin (Uncle Bob) - Clean Architecture and Design - 2012
- Robert C Martin - Clean Architecture and Design
- Robert C Martin - The Single Responsibility Principle
- Robert C Martin - Clean Architecture
- The S.O.L.I.D. Principles of OO and Agile Design - by Uncle Bob Martin
- Solid Principles by Uncle Bob Martin
- The Principles of Clean Architecture by Uncle Bob Martin
- Unleash Your Domain - Greg Young
Misc
- Differential Synchronization
- Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303) | AWS re:Invent 2013
- Deepak Agarwal: Recommender Systems - The Art and Science of Matching Items to Users
- Lecture 12 - Analyzing Big Data with Twitter: Recommender Systems by Alpa Jain
- Transactions across Datacenters (Slides)
- Finding the Needle in the Haystack - or - Troubleshooting Distributed Systems
- Finding the Needle in a Big Data Haystack
- Large scale image processing on the fly in 25ms with Google's first Network Engineer
- Bringing Push Notifications to the Mobile Web
Papers
General
- papers-we-love
- Google Research
- Facebook Research
- MIT PDOS
- Distributed Systems Reading List
- Hints for Computer System Design
- The Little Manual of API Design
- On Designing and Deploying Internet-Scale Services
- Time, Clocks, and the Ordering of Events in a Distributed System
- Above the Clouds: A Berkeley View of Cloud Computing
- The Byzantine Generals Problem
- How to Design a Good API and Why it Matters - Google Research
- Twitter - Automatic Management of Partitioned, Replicated Search Services
- High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads
- Making Reliable Distributed Systems in the Presence of Software Errors
- Fallacies of Distributed Computing Explained
Search
- The Anatomy of a Large-Scale Hypertextual Web Search Engine (Google Paper) (Weblink)
- The PageRank Citation Ranking: Bringing Order to the Web
- Web Search for a Planet: The Google Cluster Architecture
- Unicorn: A System for Searching the Social Graph (FB link)
P2P
- Chord: A scalable peer-to-peer lookup service for Internet applications
- Building peer-to-peer systems with Chord, a distributed lookup service
- Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web (CS 168: Consistent Hashing | Algorithmic Nuggets in Content Delivery)
- Web Caching with Consistent Hashing (Web link)
- Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
- Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems
Distributed Computing
- MapReduce: Simplified Data Processing on Large Clusters
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (Zaharia et al.)
- Kafka: a Distributed Messaging System for Log Processing
Distributed Database/Large-Scale Storage
- Dynamo: Amazon's Highly Available Key-value Store
- The Google File System
- Bigtable: A Distributed Storage System for Structured Data
- Spanner: Google's Globally-Distributed Database - Google Research
- TAO: Facebook’s Distributed Data Store for the Social Graph
- F1: A Distributed SQL Database That Scales
- Scuba: Diving into Data at Facebook
- f4: Facebook’s Warm BLOB Storage System
- Finding a needle in Haystack: Facebook’s photo storage
- Cassandra - A Decentralized Structured Storage System
Consistency
- Consistency Tradeoffs in Modern Distributed Database System Design
- Paxos Made Live - An Engineering Perspective
- Paxos Made Simple
- Existential Consistency: Measuring and Understanding Consistency at Facebook
- In Search of an Understandable Consensus Algorithm
Distributed Graph processing
- SQLGraph: An Efficient Relational-Based Property Graph Store
- One Trillion Edges: Graph Processing at Facebook-Scale
- Pregel: A System for Large-Scale Graph Processing
- Dremel: Interactive Analysis of Web-Scale Datasets
Company/Product specific
- Scaling Memcache at Facebook
- Realtime Data Processing at Facebook
- Holistic Configuration Management at Facebook
- The Unified Logging Infrastructure for Data Analytics at Twitter
- Scaling Big Data Mining Infrastructure: The Twitter Experience
- Large-scale cluster management at Google with Borg
Misc
- Differential Synchronization
- A Low-bandwidth Network File System
- Maglev: A Fast and Reliable Software Network Load Balancer
- The Chubby Lock Service for Loosely-Coupled Distributed Systems
- Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask
- Transactional storage for geo-replicated systems
- Highly Available Transactions: Virtues and Limitations
- The Log-Structured Merge-Tree
Web Links
General
- https://www.hiredintech.com/
- https://github.com/checkcheckzz/system-design-interview
- https://github.com/shashank88/system_design
- http://highscalability.com/ (All Time Favorites)
- http://blog.gainlo.co/index.php/category/system-design-interview-questions/
- http://www.allthingsdistributed.com/archives.html (Back-to-Basics series)
- Scalability for dummies - Part 1 (Part 2 | Part 3 | Part 4)
- Scalable Web Architecture and Distributed Systems
- Introduction to Architecting Systems for Scale
- Software Architect Roadmap
Examples
Highscalability.com has a wide collection of articles on scalable architecture. Individual web links are added below only if they are not already highlighted on popular sites such as highscalability.com.
- Trending at Instagram
- Search Architecture - Instagram
- THE UBER ENGINEERING TECH STACK, PART I (PART II)
- Processing Payments At Scale
- Personalized Group Recommendations on Flickr
- Building The LinkedIn Knowledge Graph
- Personal recommendations for the Foursquare homescreen
Books
- Microsoft Application Architecture Guide, 2nd Edition (Online)
- The Architecture of Open Source Applications
- Design Patterns: Elements of Reusable Object-Oriented Software
- Head First Design Patterns
- Patterns of Enterprise Application Architecture
- Domain Driven Design by Eric Evans
- Agile Software Development, Principles, Patterns and Practices by Robert Martin