  • Stars: 112
  • Rank 310,447 (Top 7 %)
  • Language: Python
  • Created almost 10 years ago
  • Updated almost 7 years ago

Repository Details

keepgrabbing.py

This is a transcription of the Python script that Aaron Swartz used to download a large number of documents from the JSTOR archive between 2010 and 2011.

I'm not sure what Aaron would have wanted us to do with this code, but my instinct is that he'd want it freely available. It's worth having in an executable, machine-readable format under version control, rather than on a hard drive somewhere that has long since stopped spinning. I guess this is sort of a memorial.

Rest in peace.

Todo

  • Line 5 contains a redacted hostname/domain. Does anyone know what it was?
  • #1 @speedplane points out that there was a second version of the script (keepgrabbing2.py), which is referenced in the indictment. If anyone has a copy of this, please get in touch or submit a pull request.