  • Stars: 2,393
  • Rank: 19,088 (Top 0.4%)
  • Language: Python
  • License: GNU General Publi...
  • Created almost 2 years ago
  • Updated almost 2 years ago

Repository Details

Python code to parse a Twitter archive and output in various ways

How do I use it?

  1. Download your Twitter archive (Settings > Your account > Download an archive of your data).
  2. Unzip to a folder.
  3. Right-click this link --> parser.py <-- and select "Save Link as", then save the file into the folder where you extracted the archive. (Alternatively, use wget or curl on that link, or clone the git repo.)
  4. Open a command prompt and change directory into the unzipped folder where you just saved parser.py.
    (Here's how to do that on Windows: Hold shift while right-clicking in the folder. Click on Open PowerShell.)
  5. Run parser.py with Python 3, e.g. python parser.py.
    (On Windows: When the command window opens, paste or enter python parser.py at the command prompt.)

If you run into problems, please check the issues list to see whether the problem has been reported before, and open a new issue if it has not.

What does it do?

The Twitter archive gives you a bunch of data and an HTML file (Your archive.html). Open that file to take a look! It lets you view your tweets in a nice interface. It has some flaws, but maybe that's all you need. If so, stop here: you don't need our script.

Flaws of the Twitter archive:

  • It shows you tweets you posted with images, but if you click on one of the images to expand it then it takes you to the Twitter website. If you are offline, have deleted your account, or twitter.com is down, then that won't work.
  • The tweets are stored in a complex JSON structure, so you can't just copy them into your blog, for example.
  • The images they give you are smaller than the ones you uploaded. I don't know why they would do this to us.
  • DMs are included but don't show you who they are from - many of the user handles aren't included in the archive.
  • The links are all obfuscated in a short form using t.co, which hides their origin and redirects traffic to Twitter, giving them analytics. Also they will stop working if t.co goes down.
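To illustrate the point about the JSON structure, here is a minimal sketch of reading one of the archive's data files from Python. It is not the code used by parser.py. It assumes the file is a JavaScript assignment whose right-hand side is a JSON array (for example `window.YTD.tweets.part0 = [ ... ]`), and the function name `load_archive_js` and the sample record are hypothetical.

```python
import json


def load_archive_js(text):
    """Parse the contents of an archive .js data file into Python objects.

    Assumption: the file is a JavaScript assignment whose right-hand side
    is a JSON array, e.g. `window.YTD.tweets.part0 = [ ... ]`.
    """
    start = text.index("[")  # skip the JavaScript assignment prefix
    return json.loads(text[start:])


# Hypothetical sample standing in for the contents of a tweets data file.
sample = 'window.YTD.tweets.part0 = [{"tweet": {"id_str": "1", "full_text": "hello"}}]'
tweets = load_archive_js(sample)
print(tweets[0]["tweet"]["full_text"])
```

With a real archive you would pass in the text of the data file instead of `sample`; the exact file names and nesting may differ between archive versions.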

Our script does the following:

  • Converts the tweets to markdown and also HTML, with embedded images, videos and links.
  • Replaces t.co URLs with their original versions (the ones that can be found in the archive).
  • Copies used images to an output folder, to allow them to be moved to a new home.
  • Queries Twitter for the missing user handles (checks with you first).
  • Converts DMs (including group DMs) to markdown with embedded media and links, including the handles that we retrieved.
  • Outputs lists of followers and following.
  • Downloads the original size images (checks with you first).
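The first two items in the list above can be sketched roughly as follows. This is an illustration, not the implementation in parser.py. It assumes each tweet record carries `full_text`, `created_at`, and an `entities.urls` list whose entries pair a shortened `url` with its `expanded_url`; the function name `tweet_to_markdown` and the sample tweet are hypothetical.

```python
def tweet_to_markdown(tweet):
    """Render one tweet as markdown, expanding t.co links where possible.

    Assumption: `tweet` has `full_text`, `created_at`, and optionally
    `entities.urls`, a list of dicts with `url` and `expanded_url` keys.
    """
    body = tweet["full_text"]
    for link in tweet.get("entities", {}).get("urls", []):
        short = link.get("url")
        expanded = link.get("expanded_url")
        if short and expanded:
            # Replace the shortened link with a markdown link to the original.
            body = body.replace(short, f"[{expanded}]({expanded})")
    return f"{body}\n\n*{tweet['created_at']}*"


# Hypothetical sample record.
sample_tweet = {
    "full_text": "New post: https://t.co/abc123",
    "created_at": "Wed Nov 09 12:00:00 +0000 2022",
    "entities": {
        "urls": [
            {"url": "https://t.co/abc123", "expanded_url": "https://example.com/post"}
        ]
    },
}
markdown = tweet_to_markdown(sample_tweet)
print(markdown)
```

Links whose original URL is not present in the archive are left as they are, which matches the caveat in the list above.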

For advanced users:

Some of the functionality requires the requests and imagesize modules. parser.py will offer to install these for you using pip. To avoid that you can install them before running the script.
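If you would like to check ahead of time whether those modules are present, a minimal sketch using only the standard library is shown below. It is not taken from parser.py; the helper names `missing_modules` and `install_with_pip` are hypothetical, and the install helper is defined but deliberately not called here.

```python
import importlib.util
import subprocess
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install_with_pip(names):
    """Install packages with pip for the interpreter running this script."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", *names])


missing = missing_modules(["requests", "imagesize"])
if missing:
    print("Missing modules:", ", ".join(missing))
    # To install them, call: install_with_pip(missing)
else:
    print("All optional modules are already installed.")
```

Running pip through `sys.executable -m pip` installs into the same Python environment that will run the script, which avoids surprises when several Python versions are installed.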

More Repositories

  1. opengl-canvas-wasm (C++, 339 stars): Minimal example of animating the HTML5 canvas from C++ using OpenGL through WebAssembly
  2. GravityIsNotAForce (JavaScript, 212 stars): Visualising the spacetime geodesics of general relativity
  3. sdl-canvas-wasm (C++, 206 stars): Minimal example of animating the HTML5 canvas from C++ using SDL through WebAssembly
  4. hyperplay (HTML, 86 stars): Explore tilings on the hyperbolic plane
  5. crochet-simulator (HTML, 57 stars): Predicting the 3D shape from a crochet pattern.
  6. mobius-transforms (HTML, 38 stars): Exploring Möbius transformations and implementing the book Indra's Pearls
  7. vtkpowercrust (C++, 31 stars): A VTK port of the PowerCrust surface reconstruction algorithm
  8. klein-quartic (Python, 28 stars): Exploring the Klein Quartic's geometry.
  9. mandelstir (C++, 27 stars): Animating fractional iterations in the Mandelbrot Set and Julia Sets.
  10. livingphysics (JavaScript, 27 stars): Artificial chemistry game
  11. zomes (HTML, 20 stars): Zomes are polar zonohedron domes, made from rhombi.
  12. squirm3 (C++, 18 stars): Artificial chemistry
  13. cmake-swig-java (CMake, 15 stars): Minimal example of using SWIG to call C++ code from Java, with a CMake cross-platform build system.
  14. latticegas (C++, 15 stars): Lattice Gas Explorer, previously at http://code.google.com/p/latticegas
  15. PseudosphereGeodesics (JavaScript, 10 stars): Visualising straight lines (geodesics) on the pseudosphere and related geometries.
  16. slinker (C++, 9 stars): Generating Slitherlink puzzles, previously at http://code.google.com/p/slinker
  17. linear-enzymes (C++, 7 stars): Artificial chemistry where chains of atoms can catalyse reactions
  18. voronoi-honeycombs (Python, 3 stars): Making space-filling polyhedra using Voronoi cells
  19. chessviz (HTML, 3 stars): Visualizing the space of chess moves
  20. turmite-trajectories (C++, 2 stars): Analysing the trajectories of turmites
  21. circle-squaring (HTML, 2 stars): Cutting up a circle to make a square - how close can you get?
  22. terminatingturmites (C++, 2 stars): Searching for Busy Beaver Turing machines and their higher-dimensional cousins the Terminating Turmites. Previously at http://code.google.com/p/terminatingturmites
  23. povray-polyhedra (POV-Ray SDL, 2 stars): Script to render different polyhedra for Wikipedia
  24. geofun (JavaScript, 2 stars): Exploring location-based fun
  25. grid-physics (C++, 2 stars): Flexible movement on a grid
  26. image-based-fractals (HTML, 2 stars): Image-based fractal rendering
  27. timhutton.github.io (Python, 1 star): Blog:
  28. twisty-quads (HTML, 1 star): Playing around with an optical illusion doodle
  29. curved-mirrors (POV-Ray SDL, 1 star): What does it look like when you are standing in a hall of mirrors and the walls curve inwards or outwards?
  30. hat-freedom (JavaScript, 1 star): Exploring the degrees of freedom in the new aperiodic monotile, 'the hat'.