Code and data to create a git repository representing the Unix source code history

Unix History Repository

The history and evolution of the Unix operating system is made available as a revision management repository, covering the period from its inception in 1970 as a 2.5 thousand line kernel and 26 commands, to 2018 as a widely-used 30 million line system. The 1.5GB repository contains about half a million commits and more than two thousand merges. The repository uses the Git system for its storage and is hosted on GitHub. It was created by using custom software to synthesize 24 snapshots of systems developed at Bell Labs, the University of California at Berkeley, and the 386BSD team, two legacy repositories, and the modern repository of the open source FreeBSD system. In total, about one thousand individual contributors are identified, the early ones through primary research. The data set can be used for empirical research in software engineering, information systems, and software archaeology.

You can read more details about the contents, creation, and uses of this repository through this link.

Two repositories are associated with the project:

  • unix-history-repo is a repository representing a reconstructed version of the Unix history, based on the currently available data. This repository will often be automatically regenerated from scratch, so it is not a place to make contributions. To ensure replicability, its users are encouraged to fork it or archive it.
  • unix-history-make is a repository containing code and metadata used to build the above repository. Contributions to this repository are welcomed.

Project status

The project has achieved its major goal with the establishment of a continuous timeline from 1970 until today. The repository contains:

  • snapshots of PDP-7, V1, V2, V3, V4, V5, V6, and V7 Research Edition,
  • Unix/32V,
  • all available BSD releases,
  • the CSRG SCCS history,
  • two releases of 386BSD,
  • the 386BSD patchkit,
  • the FreeBSD 1.0 to 1.1.5 CVS history,
  • an import of the FreeBSD repository starting from its initial imports that led to FreeBSD 2.0, and
  • the current FreeBSD repository.

The files appear to be added to the repository in chronological order according to their modification time, and large parts of the source code have been attributed to their actual authors. Commands like git blame and git log produce the expected results.

The repository contains a number of two-way merges.

  • 3 BSD is merged from Unix/32V and Research Edition 6
  • Various BSD releases are merged from the development branch and a time point of BSD-SCCS
  • FreeBSD 1.0 is merged from Net/2 BSD and 386BSD-0.1-patchkit
  • FreeBSD 2.0 is merged from BSD 4.4/Lite1 and FreeBSD 1.1.5

Blame is apportioned appropriately.

Tags and Branches

The following tags or branch names mark specific releases, listed in rough chronological order.

  • Epoch
  • Research-PDP7
  • Research-V1–6
  • BSD-1
  • BSD-2
  • Research-V7
  • Bell-32V
  • BSD-3, 4, 4_1_snap, 4_1c_2, 4_2, 4_3, 4_3_Reno, 4_3_Net_1, 4_3_Tahoe, 4_3_Net_2, 4_4, 4_4_Lite1, 4_4_Lite2, SCCS-END
  • 386BSD-0.0, 0.1, patchkit
  • FreeBSD-release/1.0, 1.1, 1.1.5
  • FreeBSD-release/2.0, 2.0.5, 2.1.0, 2.1.5, 2.1.6, 2.1.6.1, 2.1.7, 2.2.0, 2.2.1, 2.2.2, 2.2.5, 2.2.6, 2.2.7, 2.2.8
  • FreeBSD-release/3.0.0, 3.1.0, 3.2.0, 3.3.0, 3.4.0, 3.5.0
  • FreeBSD-release/4.0.0, 4.1.0, 4.1.1, 4.2.0, 4.3.0, 4.4.0, 4.5.0, 4.6.0, 4.6.1, 4.6.2, 4.7.0, 4.8.0, 4.9.0, 4.10.0, 4.11.0
  • FreeBSD-release/5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.3.0, 5.4.0, 5.5.0
  • FreeBSD-release/6.0.0, 6.1.0, 6.2.0, 6.3.0, 6.4.0
  • FreeBSD-release/7.0.0, 7.1.0, 7.2.0, 7.3.0, 7.4.0
  • FreeBSD-release/8.0.0, 8.1.0, 8.2.0, 8.3.0, 8.4.0
  • FreeBSD-release/9.0.0, 9.1.0, 9.2.0, 9.3.0
  • FreeBSD-release/10.0.0, 10.1.0, 10.2.0, 10.3.0, 10.4.0
  • FreeBSD-release/11.0.0, 11.0.1, 11.1.0, 11.2.0, 11.3.0, 11.4.0
  • FreeBSD-release/12.0.0, 12.1.0

A detailed description of the major tags is available in the file releases.md.

More tags and branches are available.

  • The -Snapshot-Development branches denote commits that have been synthesized from a time-ordered sequence of a snapshot's files.
  • The -VCS-Development tags denote the point along an imported version control history branch where a particular release occurred.
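
The suffix conventions above lend themselves to a quick way of sorting through the many refs in a clone. The following is a minimal sketch, not part of the project's tooling: classify_refs is a hypothetical helper name, and the "other" label for everything else is an assumption made for illustration.

```shell
# Classify ref names (one per line on stdin) by the suffix conventions
# described above. Output: the ref name, a tab, and a label.
# Hypothetical helper; typical use inside a clone would be:
#   git for-each-ref --format='%(refname:short)' | classify_refs
classify_refs() {
    while IFS= read -r ref; do
        case "$ref" in
            *-Snapshot-Development) printf '%s\tsnapshot-development\n' "$ref" ;;
            *-VCS-Development)      printf '%s\tvcs-development\n' "$ref" ;;
            *)                      printf '%s\tother\n' "$ref" ;;
        esac
    done
}
```

Matching on the suffix alone keeps the helper independent of whether a name is a local branch, a remote-tracking branch, or a tag.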

Cool things you can do

If you have a broadband network connection and about 1.5GB of free disk space, you can download the repository and run Git commands that go back decades. Run

git clone https://github.com/dspinellis/unix-history-repo
cd unix-history-repo
git checkout BSD-Release

to get a local copy of the Unix history repository.

View log across releases

Running

git log --reverse --date-order

will give you commits such as the following

commit 64d7600ea5210a9125bd1a06e5d184ef7547d23d
Author: Ken Thompson <[email protected]>
Date:   Tue Jun 20 05:00:00 1972 -0500

    Research V1 development
    Work on file u5.s

    Co-Authored-By: Dennis Ritchie <[email protected]>
    Synthesized-from: v1/sys
[...]
commit 4030f8318890a026e065bc8926cebefb71e9d353
Author: Ken Thompson <[email protected]>
Date:   Thu Aug 30 19:30:25 1973 -0500

    Research V3 development
    Work on file sys/ken/slp.c

    Synthesized-from: v3
[...]
commit c4999ec655319a01e84d9460d84df824006f9e2d
Author: Dennis Ritchie <[email protected]>
Date:   Thu Aug 30 19:33:01 1973 -0500

    Research V3 development
    Work on file sys/dmr/kl.c

    Synthesized-from: v3
[...]
commit 355c543c6840fa5f37d8daf2e2eaa735ea6daa4a
Author: Brian W. Kernighan <[email protected]>
Date:   Tue May 13 19:43:47 1975 -0500

    Research V6 development
    Work on file usr/source/rat/r.g

    Synthesized-from: v6
[...]
commit 0ce027f7fb2cf19b7e92d74d3f09eb02e8fea50e
Author: S. R. Bourne <[email protected]>
Date:   Fri Jan 12 02:17:45 1979 -0500

    Research V7 development
    Work on file usr/src/cmd/sh/blok.c

    Synthesized-from: v7
[...]
Author: Eric Schmidt <[email protected]>
Date:   Sat Jan 5 22:49:18 1980 -0800

    BSD 3 development

    Work on file usr/src/cmd/net/sub.c
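
The Synthesized-from lines in the commit messages above can be tallied to see how many synthesized commits each snapshot contributed. This is a rough sketch under the assumption that the trailer always appears on its own line in the form shown above; count_synthesized_from is a hypothetical helper name.

```shell
# Count commits per "Synthesized-from:" value in git log output read
# from stdin. Output: count, a tab, and the source, highest count first.
# Hypothetical helper; typical use inside a clone would be:
#   git log --reverse --date-order | count_synthesized_from
count_synthesized_from() {
    awk '
        # git log indents message lines, so the trailer name is field 1
        # and the snapshot identifier is field 2.
        $1 == "Synthesized-from:" { count[$2]++ }
        END { for (src in count) printf "%d\t%s\n", count[src], src }
    ' | sort -k1,1nr -k2,2
}
```

Commits imported from version control histories carry no such line, so they are simply not counted.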

View changes to a specific file

Run

git checkout Research-Release
git log --follow --simplify-merges usr/src/cmd/c/c00.c

to see dates on which the C compiler was modified.

Annotate lines in a specific file by their version

Run

git blame -C -C usr/sys/sys/pipe.c

to see how the Unix pipe functionality evolved over the years.

3cc1108b usr/sys/ken/pipe.c     (Ken Thompson 1974-11-26 18:13:21 -0500  53) 	rf->f_flag = FREAD|FPIPE;
3cc1108b usr/sys/ken/pipe.c     (Ken Thompson 1974-11-26 18:13:21 -0500  54) 	rf->f_inode = ip;
3cc1108b usr/sys/ken/pipe.c     (Ken Thompson 1974-11-26 18:13:21 -0500  55) 	ip->i_count = 2;
[...]
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 122) 	register struct inode *ip;
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 123) 
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 124) 	ip = fp->f_inode;
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 125) 	c = u.u_count;
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 126) 
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 127) loop:
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 128) 
1f183be2 usr/sys/sys/pipe.c     (Ken Thompson 1979-01-10 15:19:35 -0500 129) 	/*
9a9f6b22 usr/src/sys/sys/pipe.c (Bill Joy     1980-01-05 05:51:18 -0800 130) 	 * If error or all done, return.
9a9f6b22 usr/src/sys/sys/pipe.c (Bill Joy     1980-01-05 05:51:18 -0800 131) 	 */
9a9f6b22 usr/src/sys/sys/pipe.c (Bill Joy     1980-01-05 05:51:18 -0800 132) 
9a9f6b22 usr/src/sys/sys/pipe.c (Bill Joy     1980-01-05 05:51:18 -0800 133) 	if (u.u_error)
9a9f6b22 usr/src/sys/sys/pipe.c (Bill Joy     1980-01-05 05:51:18 -0800 134) 		return;
6d632e85 usr/sys/ken/pipe.c     (Ken Thompson 1975-07-17 10:33:37 -0500 135) 	plock(ip);
6d632e85 usr/sys/ken/pipe.c     (Ken Thompson 1975-07-17 10:33:37 -0500 136) 	if(c == 0) {
6d632e85 usr/sys/ken/pipe.c     (Ken Thompson 1975-07-17 10:33:37 -0500 137) 		prele(ip);
6d632e85 usr/sys/ken/pipe.c     (Ken Thompson 1975-07-17 10:33:37 -0500 138) 		u.u_count = 0;
6d632e85 usr/sys/ken/pipe.c     (Ken Thompson 1975-07-17 10:33:37 -0500 139) 		return;
6d632e85 usr/sys/ken/pipe.c     (Ken Thompson 1975-07-17 10:33:37 -0500 140) 	}
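
To turn an annotation like the one above into a per-author summary, the machine-readable output of git blame --line-porcelain is easier to parse than the default layout, because it repeats an "author" header line for every annotated source line. The following is a minimal sketch; blame_lines_per_author is a hypothetical helper name, not something the repository provides.

```shell
# Count annotated lines per author from "git blame --line-porcelain"
# output read from stdin. Output: count, a tab, and the author name,
# highest count first.
# Hypothetical helper; typical use inside a clone would be:
#   git blame -C -C --line-porcelain usr/sys/sys/pipe.c | blame_lines_per_author
blame_lines_per_author() {
    awk '
        # Only the "author <name>" header matches; author-mail, author-time
        # and author-tz use a hyphen, and source lines start with a tab.
        /^author / { sub(/^author /, ""); count[$0]++ }
        END { for (a in count) printf "%d\t%s\n", count[a], a }
    ' | sort -k1,1nr -k2
}
```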

How you can help

You can help if you were there at the time, or if you can locate a source that contains information that is currently missing.

  • If your current GitHub account is not linked to your past contributions (you can search for them through this page), associate your past email with your current account through your GitHub account settings. (Contact me for instructions on how to add email addresses to which you no longer have access.)
  • Look for errors and omissions in the files that map file paths to authors.
  • Look for parts of the system that have not yet been attributed in these files and propose suitable attributions. Keep in mind that attributions for parts that were developed in one place and modified elsewhere (e.g. developed at Bell Labs and modified at Berkeley) should be for the person who did the modification, not the original author.
  • Look for authors whose identifier starts with x- in the author id to name map files for Bell Labs and Berkeley, and provide or confirm their actual login identifier. (The one used is a guess.)
  • Contribute a map file from path regular expressions to contributors (see v7.map) for 4.2BSD, 4.3BSD, 4.3BSD-Reno, 4.3BSD-Tahoe, 4.3BSD-Alpha, and Net2.
  • Import further branches, such as 2BSD, NetBSD, OpenBSD, and Plan 9 from Bell Labs.

Re-creating the historical repository from scratch

The -make repository is provided to share and document the creation process, rather than as a bullet-proof way to get consistent and repeatable results. For instance, importing the snapshots on a system that is case-insensitive (NTFS, HFS Plus with default settings) will result in a few files getting lost.
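
Before importing on such a system, it can help to find out which paths would collide. The sketch below assumes that a file is lost exactly when two full paths differ only in letter case; case_collisions is a hypothetical helper name, and awk's tolower may handle only ASCII letters in some implementations.

```shell
# Print every path (one per line on stdin) that collides with another
# path when letter case is ignored, in input order.
# Hypothetical helper; typical use inside a clone would be:
#   git ls-tree -r --name-only HEAD | case_collisions
case_collisions() {
    awk '
        {
            key = tolower($0)
            if (key in first) {
                # Report the first path seen for this key once,
                # then every later path that folds to the same key.
                if (!(key in reported)) { print first[key]; reported[key] = 1 }
                print $0
            } else {
                first[key] = $0
            }
        }
    '
}
```

An empty result suggests that the listed tree can be checked out on a case-insensitive file system without paths overwriting each other.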

Prerequisites

  • Git
  • Perl
  • The Perl modules VCS::SCCS and Git::FastExport (Install with sudo cpanm VCS::SCCS Git::FastExport.)
  • If compiling patch under GNU/Linux, the libbsd library (e.g. the libbsd-dev package)
  • Sudo (and authorization to use it to mount ISO images)

Repository creation

The -repo repository can be created with the following commands.

make
./import.sh

Adding a single source

If you want to add a new source without running the full import process, you can do the following.

  • Prepare the source's maps and data
  • cd to the repo directory
  • Check out the repo at the point where the new source will branch out
  • Run a Perl command such as the following.
perl ../import-dir.pl [-v] -m Research-V7 -c ../author-path/Bell-32V \
-n ../bell.au -r Research-V7 -i ../ignore/Bell-32V \
$ARCHIVE/32v Bell 32V -0500 | gfi

Further reading

Acknowledgements

  • The following people helped with Bell Labs login identifiers.
    • Brian W. Kernighan
    • Doug McIlroy
    • Arnold D. Robbins
  • The following people helped with *BSD login identifiers.
    • Clem Cole
    • Era Eriksson
    • Mary Ann Horton
    • Warner Losh
    • Kirk McKusick
    • Jeremy C. Reed
    • Ingo Schwarze
    • Anatole Shaw
  • The BSD SCCS import code is based on work by
