This repository has been archived!
This IPFS-related repository has been archived, and all issues are therefore frozen. If you want to ask a question or open/continue a discussion related to this repo, please visit the official IPFS forums.
We archive repos for one or more of the following reasons:
- Code or content is unmaintained, and therefore might be broken
- Content is outdated, and therefore may mislead readers
- Code or content evolved into something else and/or has lived on in a different place
- The repository or project is not active in general
Please note that in order to keep the primary IPFS GitHub org tidy, most archived repos are moved into the ipfs-inactive org.
If you feel this repo should not be archived (or portions of it should be moved to a non-archived repo), please reach out and let us know. Archiving can always be reversed if needed.
IPFS Archives (archives)
Repo to coordinate archival efforts with IPFS
One of the fundamental goals of IPFS is to improve archival storage of humanity's public record. This is a critically important endeavor. In particular, our highest priority is research artifacts -- scientific publications, data repositories, Wikipedia, etc.
This repo helps us organize efforts. See the efforts in the issues.
Improving on the status quo of Archival
IPFS improves data storage
- Chunking: IPFS splits files into blocks using a pluggable chunker. Besides fixed-size chunking, it supports Rabin fingerprint chunking, a content-driven algorithm that places block boundaries based on the data itself and so optimizes for finding duplicate data blocks. Because chunking is modular, users can chunk data in whatever way is most useful to them, which is particularly useful for special chunking of audio and video media. (We may employ diff-chunking in the future.)
- Deduplication: the chunking strategy allows IPFS to deduplicate vast amounts of data in large repositories.
- Cryptographic Integrity: the integrity of the data is protected and guaranteed by cryptography. Bitrot will be caught and not passed to the user.
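The three storage properties above can be illustrated with a small toy model. This is a minimal sketch, not IPFS's implementation: it uses a simple polynomial rolling hash instead of a true Rabin fingerprint, plain SHA-256 hex digests instead of IPFS content identifiers, and an in-memory dictionary instead of a real block store. The names `chunk`, `BlockStore`, `add_file`, and `read_file`, and all size constants, are hypothetical and chosen only for illustration.

```python
import hashlib

WINDOW = 16           # bytes covered by the rolling hash (illustrative value)
MASK = (1 << 6) - 1   # cut when the low 6 bits are zero, roughly every 64 bytes
BASE = 257
MOD = (1 << 31) - 1


def chunk(data: bytes, min_size: int = 16, max_size: int = 256) -> list:
    """Split data where a rolling hash of the last WINDOW bytes hits a pattern.

    Boundaries depend on nearby content rather than on absolute offsets, so an
    insertion near the start of a file does not shift every later boundary.
    """
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # take in the newest byte
        size = i + 1 - start
        if (size >= min_size and (h & MASK) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


class BlockStore:
    """Toy content-addressed store: each block is keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def put(self, block: bytes) -> str:
        key = hashlib.sha256(block).hexdigest()
        self.blocks[key] = block  # identical blocks map to one entry (dedup)
        return key

    def get(self, key: str) -> bytes:
        block = self.blocks[key]
        # Re-hash on read so that silent corruption (bitrot) is detected
        # instead of being handed to the caller.
        if hashlib.sha256(block).hexdigest() != key:
            raise ValueError("block failed integrity check: " + key)
        return block


def add_file(store: BlockStore, data: bytes) -> list:
    """Chunk a file, store its blocks, and return the ordered list of keys."""
    return [store.put(c) for c in chunk(data)]


def read_file(store: BlockStore, keys: list) -> bytes:
    """Reassemble a file from its block keys, verifying every block."""
    return b"".join(store.get(k) for k in keys)


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    v1 = bytes(rng.getrandbits(8) for _ in range(50000))
    v2 = b"a few new bytes" + v1  # same archive with a small prepended change
    store = BlockStore()
    k1, k2 = add_file(store, v1), add_file(store, v2)
    print("blocks referenced:", len(k1) + len(k2))
    print("blocks stored:    ", len(store.blocks))
```

In this sketch, two versions of a file that differ by a small insertion end up sharing most of their blocks, so the store holds far fewer blocks than the two key lists reference. A fixed-size chunker would not behave this way, because an insertion shifts every later block boundary.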
IPFS improves distribution
- Bandwidth Sharing: as a meshing protocol, IPFS can use the bandwidth of replicas, which act like seeds in a swarm, NOT traditional HTTP mirrors. These can be dedicated systems, or even users currently viewing the files. This reduces the bandwidth usage for servers, and improves the download bandwidth for end users.
- Speed: IPFS can achieve high transfer speeds and good bandwidth utilization by leveraging immutability, cryptographic integrity checks, and data locality in the network.
- Replication: it is easy to replicate an archive: `ipfs daemon --init & ipfs pin add -r <path>`. One needs only to follow the HEAD of the archive (a hash reference), and retrieve the latest additions to the archive. (Today in git, in the future in an IPNS name.)
- Collaboration: the distribution model makes it extremely easy (one command) to replicate an archive for safe-keeping, and contribute bandwidth to the effort of serving it to others.
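The idea of following an archive's HEAD can be sketched with a toy Merkle DAG. This is an illustrative model under stated assumptions, not the IPFS API or wire protocol: blocks are JSON documents keyed by their SHA-256 digest, the "remote" and "local" stores are plain dictionaries, and `put_node` and `replicate` are hypothetical helper names. The sketch also assumes earlier replications ran to completion, so a block that is present locally implies that everything beneath it is present too.

```python
import hashlib
import json


def put_node(store: dict, payload: str, links=()) -> str:
    """Store a node (payload plus links to child hashes) and return its hash."""
    block = json.dumps(
        {"payload": payload, "links": list(links)}, sort_keys=True
    ).encode()
    key = hashlib.sha256(block).hexdigest()
    store[key] = block
    return key


def replicate(head: str, remote: dict, local: dict) -> int:
    """Copy every block reachable from head that local lacks.

    Returns the number of blocks fetched. Each fetched block is verified
    against the hash it was requested by before it is stored.
    """
    fetched = 0
    stack = [head]
    while stack:
        key = stack.pop()
        if key in local:
            continue  # already replicated, along with its whole subtree
        block = remote[key]
        if hashlib.sha256(block).hexdigest() != key:
            raise ValueError("remote returned a block that does not match " + key)
        local[key] = block
        fetched += 1
        stack.extend(json.loads(block)["links"])
    return fetched


if __name__ == "__main__":
    remote, local = {}, {}
    a = put_node(remote, "paper-1")
    b = put_node(remote, "paper-2")
    head_v1 = put_node(remote, "archive v1", [a, b])
    print("first sync fetched:", replicate(head_v1, remote, local))

    # The archive grows: one new paper and a new root that links to all three.
    c = put_node(remote, "paper-3")
    head_v2 = put_node(remote, "archive v2", [a, b, c])
    print("second sync fetched:", replicate(head_v2, remote, local))
```

Because the new HEAD links to unchanged sub-trees by hash, a replica that already holds the previous version only needs the new root and the newly added blocks.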
Current Projects
Current archival efforts are being coordinated via issues.
Feel free to suggest other open-access archives by opening a new issue. [However, please ensure that data is under an appropriate license (such as Creative Commons), or you have obtained proper permission, before copying it to IPFS.]
Some suggestions for possible future archival efforts can be found here.
Examples
- arXiv.org CC-BY Papers: https://ipfs.io/ipfs/QmfXH9XtP7xmoTH8WAp4HNSduqWMwLTH8B8TvbTkdgzNAa/
(TODO finish README)
Maintainers
Captain: @flyingzumwalt
If you're interested in captaining this repo, open an issue!
Contribute
Feel free to join in! Look at the existing discussions in the issues, or open an issue if you want to talk about something new. All welcome.
Want to hack on IPFS?
License
CC-BY 3.0 © 2016 Protocol Labs Inc.