  • Stars: 510
  • Rank: 86,627 (Top 2%)
  • Language: TeX
  • License: MIT License
  • Created over 8 years ago
  • Updated over 1 year ago

Repository Details

LaTeX source and supporting code for Think Data Structures: Algorithms and Information Retrieval in Java

ThinkDataStructures

Data structures and algorithms are among the most important inventions of the last 50 years, and they are fundamental tools software engineers need to know. But in my opinion, most of the books on these topics are too theoretical, too big, and too bottom-up:

  • Too theoretical: Mathematical analysis of algorithms is based on simplifying assumptions that limit its usefulness in practice. Many presentations of this topic gloss over the simplifications and focus on the math. In this book I present the most practical subset of this material and eliminate the rest.

  • Too big: Most books on these topics are at least 500 pages, and some are more than 1000. By focusing on the topics I think are most useful for software engineers, I kept this book under 250 pages.

  • Too bottom-up: Many data structures books focus on how data structures work (the implementations), with less about how to use them (the interfaces). In this book, I go "top down", starting with the interfaces. Readers learn to use the structures in the Java Collections Framework before getting into the details of how they work.
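
As a minimal sketch of what starting with the interface can look like, the method below is written against `java.util.List` and never names a concrete class. The class name `ListInterfaceSketch` and the method `sum` are hypothetical, chosen only for this illustration; they are not taken from the book's code.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
final class ListInterfaceSketch {

    // Depends only on the List interface, so any implementation can be passed in.
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two implementations of the same interface from the Java Collections Framework.
        List<Integer> arrayBacked = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> linked = new LinkedList<>(List.of(1, 2, 3));

        // The caller uses both the same way; only the internals differ.
        System.out.println(sum(arrayBacked));
        System.out.println(sum(linked));
    }
}
```

Because `sum` only knows about `List`, swapping `ArrayList` for `LinkedList` changes performance characteristics but not the calling code, which is the point of learning the interface first.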

Finally, many books present this material out of context and without motivation: it's just one damn data structure after another!

I try to alleviate the boredom by organizing the topics around an application, web search, that uses data structures extensively and is an interesting and important topic in its own right.
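
To give a rough sense of why web search leans on data structures, here is a small sketch of a term-to-page index built from a `Map` of `Set`s. It is an assumption-laden illustration, not the book's implementation: the class `TinyIndexSketch`, its methods `add` and `lookup`, and the example URLs are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class, for illustration only: maps each term to the pages that contain it.
final class TinyIndexSketch {
    private final Map<String, Set<String>> index = new HashMap<>();

    // Splits the text on non-word characters and records the page under each term.
    void add(String url, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) {
                continue; // split can yield an empty first element
            }
            index.computeIfAbsent(term, k -> new TreeSet<>()).add(url);
        }
    }

    // Returns the pages containing the term, or an empty set if there are none.
    Set<String> lookup(String term) {
        return index.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyIndexSketch idx = new TinyIndexSketch();
        idx.add("a.example", "Java lists and maps");
        idx.add("b.example", "Maps in Java");
        System.out.println(idx.lookup("java"));
    }
}
```

A `HashMap` gives fast lookup by term, and a `TreeSet` keeps each term's pages sorted and free of duplicates; other choices would work, and a real search index also has to deal with ranking and persistence.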

This application also motivates some topics that are not usually covered in an introductory data structures class, including persistent data structures, with Redis, and streaming algorithms.

I have made difficult decisions about what to leave out, but I have also made some compromises. I include a few topics that most readers will never use, but that they might be expected to know, possibly in a technical interview. For these topics, I present both the conventional wisdom and my reasons to be skeptical.

This book also presents basic aspects of software engineering practice, including version control and unit testing. Each chapter ends with an exercise that allows readers to apply what they have learned. Each exercise includes automated tests that check the solution. And for most exercises, I present my solution at the beginning of the next chapter.

This book is intended for college students in computer science and related fields, as well as professional software engineers, people training in software engineering, and people preparing for technical interviews.

I assume that the reader knows Java at an intermediate level, but I explain some Java features along the way, and provide pointers to supplementary material.

People who have read Think Java or Head First Java are prepared for this book.

More Repositories

1. ThinkStats2 (Jupyter Notebook, 3,899 stars): Text and supporting code for Think Stats, 2nd Edition
2. ThinkDSP (Jupyter Notebook, 3,476 stars): Think DSP: Digital Signal Processing in Python, by Allen B. Downey.
3. ThinkPython2 (TeX, 2,378 stars): LaTeX source and supporting code for Think Python, 2nd edition, by Allen Downey.
4. ThinkBayes (TeX, 1,627 stars): Code repository for Think Bayes.
5. ThinkBayes2 (Jupyter Notebook, 1,617 stars): Text and code for the forthcoming second edition of Think Bayes, by Allen Downey.
6. ThinkPython (PostScript, 930 stars): Code examples and exercise solutions from Think Python by Allen Downey, published by O'Reilly Media.
7. ModSimPy (HTML, 818 stars): Text and supporting code for Modeling and Simulation in Python
8. ThinkComplexity2 (Jupyter Notebook, 728 stars): Book and code for Think Complexity, 2nd edition
9. ThinkOS (TeX, 526 stars): Text and supporting code for Think OS: A Brief Introduction to Operating Systems, by Allen Downey.
10. ThinkJavaCode (Java, 364 stars): Supporting code for Think Java by Allen Downey and Chris Mayfield.
11. ElementsOfDataScience (Jupyter Notebook, 358 stars): An introduction to data science in Python, for people with no programming experience.
12. BayesMadeSimple (Jupyter Notebook, 330 stars): Code for a tutorial on Bayesian Statistics by Allen Downey.
13. LittleBookOfSemaphores (TeX, 237 stars): LaTeX source and supporting code for The Little Book of Semaphores, by Allen Downey.
14. CompStats (Jupyter Notebook, 215 stars): Code for a workshop on statistical inference using computational methods in Python.
15. empiricaldist (Jupyter Notebook, 152 stars): Python library that represents empirical distribution functions.
16. DSIRP (Jupyter Notebook, 128 stars): Data Structures and Information Retrieval in Python
17. BiteSizeBayes (Jupyter Notebook, 126 stars): An introduction to Bayesian statistics using Python and (coming soon) R.
18. ThinkCPP (PostScript, 111 stars): Text and code for Think C++ by Allen Downey
19. ExercisesInC (C, 103 stars): Exercises for people learning the C programming language
20. ThinkComplexity (PostScript, 96 stars): Code for Allen Downey's book Think Complexity, published by O'Reilly Media.
21. AstronomicalData (Jupyter Notebook, 87 stars): An introduction to working with astronomical data in Python.
22. Swampy (Python, 85 stars): Code for Swampy, a set of modules used in Think Python, first edition
23. PhysicalModelingInMatlab (TeX, 83 stars): Text and code for Physical Modeling in MATLAB
24. ProbablyOverthinkingIt (Jupyter Notebook, 82 stars): Supplementary material for my book, Probably Overthinking It.
25. ThinkPythonItalian (TeX, 81 stars): LaTeX source for the Italian Translation of Think Python.
26. DataExploration (Python, 71 stars): Supporting code for a video series on best practices for exploratory data analysis.
27. BayesianDecisionAnalysis (Jupyter Notebook, 64 stars): Repository for a workshop on Bayesian Decision Analysis
28. ExploratoryDataAnalysis (Jupyter Notebook, 63 stars): Repository for an online class on Exploratory Data Analysis in Python
29. ThinkJava (TeX, 57 stars): LaTeX source for Think Java, 1st edition, by Allen Downey and Chris Mayfield.
30. SurvivalAnalysisPython (Jupyter Notebook, 48 stars): Explorations of survival analysis in Python
31. BayesForUndergrads (46 stars): Materials for a workshop on developing undergraduate classes on Bayesian statistics.
32. DataScience (HTML, 42 stars): Site for a Data Science class taught by Allen Downey
33. ComplexityScience (Jupyter Notebook, 35 stars): Repository for a workshop on Complexity Science
34. ThinkX (Python, 30 stars)
35. ThinkStats3 (29 stars): Code and LaTeX source for Think Stats, third edition
36. BayesSeminar (Jupyter Notebook, 29 stars): Bayesian statistics seminars
37. BayesianInferencePyMC (Jupyter Notebook, 26 stars): Workshop on Bayesian inference using PyMC
38. ElementsOfDataScienceBook (TeX, 25 stars): Repository for the manuscript of Elements of Data Science
39. PoliticalAlignmentCaseStudy (Jupyter Notebook, 23 stars): Notebooks and data for a case study on political alignment, outlook, and beliefs
40. thinkjavasolutions5 (Java, 21 stars): Automatically exported from code.google.com/p/thinkjavasolutions
41. blair-walden-project (19 stars): The Blair Walden Project: in 1845 Henry David Thoreau went to live in the woods... a year later his journal was found.
42. Portfolio (HTML, 18 stars): Portfolio of Allen Downey at Olin College
43. ThinkPythonSolutions (Python, 17 stars): Automatically exported from code.google.com/p/thinkpythonsolutions
44. DataQnA (Jupyter Notebook, 17 stars): Data Q&A: Questions and answers about data and statistics
45. ProbablyOverthinkingIt2 (Jupyter Notebook, 16 stars): New repo for projects related to my blog, Probably Overthinking It.
46. MarriageNSFG (Stata, 15 stars): Repository for a project using NSFG data to explore marriage patterns in the US.
47. clink (C, 12 stars): A network measurement tool, described at http://allendowney.com/research/clink/
48. RecidivismCaseStudy (Jupyter Notebook, 11 stars): Case study on evaluating statistical tools that predict recidivism.
49. ModSim (Jupyter Notebook, 11 stars): Modeling and Simulation in Python and MATLAB/Octave
50. ThinkStats (Jupyter Notebook, 11 stars): Notebooks for the third edition of Think Stats
51. SignalsAndSystemsAndDynamics (MATLAB, 10 stars): Code and examples for an experimental class on signals, systems, and dynamics
52. GssReligion (Jupyter Notebook, 10 stars): Code and data for measuring and predicting religious affiliation using GSS data.
53. GunControlGenerational (Jupyter Notebook, 9 stars): Data and analysis related to generational changes in attitudes toward gun control
54. ThinkPerl6 (TeX, 9 stars): Text and supporting code for Think Perl 6 by Laurent Rosenfeld with Allen Downey
55. ModSimMatlab (Makefile, 8 stars): Text and supporting code for Modeling and Simulation.
56. JavaOOP (Java, 8 stars): Supporting code for the OOP in Java independent study
57. DSIRPSolutions (Jupyter Notebook, 8 stars): Solutions to the exercises in Data Structures and Information Retrieval in Python (DSIRP)
58. SoftwareSystems (C, 8 stars): Repo for software related to Software Systems at Olin College.
59. ThinkBayes2Translations (Jupyter Notebook, 8 stars): Translations of Think Bayes.
60. plastex-oreilly (TeX, 7 stars): Branch of plastex that generates DocBook 4.5 that meets O'Reilly style guidelines.
61. JupyterAsciidocTemplate (Jupyter Notebook, 7 stars): Template for converting Jupyter notebooks to an asciidoc book.
62. internet-religion (Python, 6 stars): Data and code for an analysis of Internet use and religious affiliation using data from the GSS.
63. AtmoChem (Jupyter Notebook, 6 stars): Atmospheric chemistry data and analysis
64. TheShakes (Jupyter Notebook, 5 stars)
65. complexity (PostScript, 5 stars): Automatically exported from code.google.com/p/complexity
66. PythonCounterPmf (Jupyter Notebook, 5 stars): Examples using Python's Counter collection to implement a probability mass function (PMF)
67. FirstLateNSFG (Jupyter Notebook, 4 stars): Data and analysis for "Are first babies more likely to be late?"
68. PythonFun (Jupyter Notebook, 4 stars)
69. ThinkJavaSequel (4 stars): Text and supporting code for Think DS: Data Structures in Java, by Allen Downey.
70. matlabsolutions (MATLAB, 4 stars): Automatically exported from code.google.com/p/matlabsolutions
71. ThinkOCaml (PostScript, 4 stars): Automatically exported from code.google.com/p/thinkocaml
72. Notebooks (4 stars): A repo for IPython notebooks.
73. ISSPRegression (Makefile, 4 stars): Exploration of the data from the Crowdsourced Replication Initiative
74. thinkjava5 (TeX, 3 stars): Automatically exported from code.google.com/p/thinkapjava
75. plastex-docbook (Python, 3 stars): DocBook renderer plugin templates and classes for the plasTeX engine
76. GssExtract (Jupyter Notebook, 3 stars)
77. SoftwareDesign (Python, 3 stars): Directories and unit tests for exercises in Software Design at Olin College.
78. InspectionParadox (Jupyter Notebook, 2 stars): Code and data for an article on length-biased sampling and the inspection paradox
79. OlinPyShop (2 stars): Code for Python workshops from Olin College
80. TeamAllocation (Python, 2 stars): Code for making team allocations under constraints.
81. QEACode (2 stars): Code for Quantitative Engineering Analysis (QEA) class at Olin College
82. thinkpythonchinese (TeX, 2 stars): Automatically exported from code.google.com/p/thinkpythonchinese
83. simulating (2 stars)
84. LongTailedDistributions (2 stars): Data and code from a series of papers about long-tailed distributions in the Internet.
85. AfroBarometer (Jupyter Notebook, 1 star)
86. python-in-hydrology (1 star): Automatically exported from code.google.com/p/python-in-hydrology
87. a-bad-synthesizer (Python, 1 star): Arduino-based analog-digital synthesizer
88. 2019-08-27-needham (Python, 1 star)
89. GssFeminism (Jupyter Notebook, 1 star): Exploration of changes in views related to feminism