Periscope

Periscope gives you "duplicate vision" to help you organize and de-duplicate your files without losing data.

Periscope demo

Periscope (psc) works differently from most other duplicate file finders. It is designed to be used interactively: Periscope will help you explore the filesystem, understand which files are duplicated, determine where duplicate copies live, and safely delete duplicates without losing data.

Following a psc scan, Periscope lets you navigate and explore your filesystem with the workflow you're already used to (your shell and commands like cd, ls, tree, and so on) while providing additional duplicate-aware commands that mirror core filesystem utilities. For example, psc ls gives a directory listing that highlights duplicates, and psc rm deletes files only if a duplicate exists elsewhere. This makes it easy to understand how data is organized (and duplicated), reorganize files, and delete duplicates without worrying about accidentally losing data.

Workflow · Commands · Installation · Contributing

Workflow

Find duplicates

Start with psc scan to scan folders for duplicates. Once you run this, you shouldn't need to run it again while looking at and deleting duplicates, unless you move files around. If you delete files manually (rather than with psc rm), you can make Periscope detect deletions with psc refresh, which runs much faster than a full scan. psc scan is incremental, so if you want to scan a new directory or re-analyze one that was already scanned, you can always run the command again.

Understand duplicates

You can get a high-level understanding of how many duplicates you have and where they are located:

  • psc summary gives statistics on duplicate files
  • psc report shows a full list of duplicates, sorted by file size

After identifying areas to explore with psc report, you can navigate to those directories in your shell with cd, and then you can use Periscope commands to understand duplicates:

  • psc ls gives a duplicate-aware directory listing (optionally recursively, with the -R flag)
  • psc info shows information on a specific file (and its duplicates)

Delete duplicates

You can use the psc rm command to delete duplicates. You can think of it like a safe version of rm: it will not let you delete files unless there are duplicate copies elsewhere. psc rm -r recursively deletes duplicates but leaves unique files in place. psc rm --contained <path> deletes duplicates only if a copy is contained in the given folder.

Remove duplicate database

When you're done with a Periscope session, you can delete the duplicate database with psc finish.

Commands

Run psc help to see the full list of commands and psc help [command] to see help on a specific command.

psc scan scans for duplicates

Scans paths for duplicates and populates the database with information about duplicates. Scans the current directory if given no argument. Scanning is incremental; if you want to start from scratch, run psc finish first.

psc refresh removes deleted files from the database

Removes deleted files from the duplicate database. psc rm does this automatically, so this command only needs to be used if you use some other program (e.g. coreutils rm) and want to remove missing files from the database. This command does not re-analyze files, so if you've made substantial changes to the filesystem, like moving files around or adding new files, it's best to do a psc scan of the relevant directories.

psc finish deletes the duplicate database

Deletes the duplicate database. Once you're done using Periscope, it's good to use this command to delete the duplicate database, so it doesn't waste space on disk.

psc summary reports statistics

Prints statistics about the duplicate database, such as number of duplicate files and the amount of space duplicates consume.

psc report reports scan results

Lists all duplicates in the duplicate database, sorted by file size. Because this list is usually large, it's helpful to pipe the output to a pager, e.g. psc report | less.

psc export exports scan results

Exports information about duplicates in a machine-readable format (default JSON). This is the only output from Periscope that other programs should consume. Future versions of Periscope may add to the information that's included in the dump, but the layout of existing data will not change.

psc ls lists a directory

Lists files and folders in the given directory (or the current directory, if none is given). This command shows the number of duplicates that each file has: 1 means that there is a single duplicate elsewhere in the filesystem; if a file has no duplicates, the number is omitted. Directories are tagged with a 'd', and special files are tagged with a character describing their type, e.g. 'p' for named pipes.

  • -a shows hidden files
  • -d lists only duplicates, while -u lists only unique files
  • -v lists all duplicates of every file
  • -r shows the path to the duplicate as a relative path instead of an absolute path
  • -R lists files recursively; this flag combines well with the -d flag, to list only duplicate files recursively contained in a given directory
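The duplicate count described for psc ls is the number of other copies, not the size of the group. A small sketch of that arithmetic follows, assuming duplicate groups are already known. The helper names and the printed layout are made up for illustration and do not reproduce psc ls output.

```go
package main

import (
	"fmt"
	"strconv"
)

// duplicateCounts maps each path to the number of other files with the same content.
func duplicateCounts(groups [][]string) map[string]int {
	counts := make(map[string]int)
	for _, group := range groups {
		for _, path := range group {
			counts[path] = len(group) - 1
		}
	}
	return counts
}

// label renders a count for display; unique files get an empty label.
func label(n int) string {
	if n == 0 {
		return ""
	}
	return strconv.Itoa(n)
}

func main() {
	groups := [][]string{
		{"a.jpg", "backup/a.jpg", "old/a.jpg"},
		{"b.txt", "tmp/b.txt"},
	}
	counts := duplicateCounts(groups)
	for _, name := range []string{"a.jpg", "b.txt", "notes.md"} {
		fmt.Printf("%2s %s\n", label(counts[name]), name)
	}
}
```

Here a.jpg is labeled 2 (two other copies), b.txt is labeled 1, and notes.md gets no number because it is unique.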

psc tree lists all duplicates in a given directory

Lists all files recursively contained in the given directory (or the current directory, if none is given) that have a duplicate file elsewhere. This command hides hidden files and folders by default; the -a flag includes hidden files.

This command shows a "flattened" representation; in most cases, a psc ls -Rd is more useful.

psc info inspects a file

Shows information about a single file's duplicates. Like with psc ls, the -r flag shows the path to the duplicate as a path relative to the given file.

psc rm deletes duplicates

Deletes duplicates but not unique files; no way of invoking this command will delete unique files. This command makes use of the database, but it double-checks files and their copies before it deletes anything, so a stale duplicate database will not result in data loss.

  • -n performs a dry run, listing files that would be deleted but not actually deleting anything
  • -r deletes duplicates recursively
  • --contained <path> gives more fine-grained control over deletion: files are only deleted if they have a duplicate in the given location. This is useful, for example, for deleting files from a "to organize" directory only if they are also contained in the "organized" directory, as in the demo video above.

By default, psc rm does not delete any files when it's given a set where there are no duplicates outside the set: for example, if files "/a/x1" and "/a/x2" are duplicates, recursively removing "/a" will leave both files untouched. Passing the --arbitrary flag will result in such duplicates being handled by arbitrarily choosing one file to save and deleting the rest.
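The rules for psc rm described here can be read as a decision made per group of identical files. The sketch below is one possible reading, not Periscope's code. The options struct, planDeletions, and under are hypothetical names, and keeping the lexicographically first file under --arbitrary is just one arbitrary choice.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// options mirrors the two flags discussed above; the struct itself is hypothetical.
type options struct {
	contained string // if non-empty, a surviving copy must live under this directory
	arbitrary bool   // if true, keep one file when every copy is in the delete set
}

// under reports whether path lies inside dir.
func under(path, dir string) bool {
	return strings.HasPrefix(path, strings.TrimSuffix(dir, "/")+"/")
}

// planDeletions returns the targets that can be removed without losing data.
// group is one set of files with identical content; targets is the set given to rm.
func planDeletions(group []string, targets map[string]bool, opt options) []string {
	var inside, survivors []string
	for _, p := range group {
		if targets[p] {
			inside = append(inside, p)
		} else {
			survivors = append(survivors, p)
		}
	}
	if opt.contained != "" {
		for _, p := range survivors {
			if under(p, opt.contained) {
				return inside // a copy survives in the required location
			}
		}
		return nil
	}
	if len(survivors) > 0 {
		return inside // at least one copy lives outside the delete set
	}
	if opt.arbitrary && len(inside) > 1 {
		sort.Strings(inside)
		return inside[1:] // keep one file, delete the rest
	}
	return nil // unique files, or all copies in the set without --arbitrary
}

func main() {
	group := []string{"/a/x1", "/a/x2"}
	targets := map[string]bool{"/a/x1": true, "/a/x2": true}
	fmt.Println(planDeletions(group, targets, options{}))                // [] (both files kept)
	fmt.Println(planDeletions(group, targets, options{arbitrary: true})) // [/a/x2]

	photos := []string{"/inbox/p.jpg", "/photos/p.jpg"}
	inbox := map[string]bool{"/inbox/p.jpg": true}
	fmt.Println(planDeletions(photos, inbox, options{contained: "/photos"})) // [/inbox/p.jpg]
}
```

In every branch at least one copy of the content is left on disk, which is the property the command description promises.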

Installation

Install with Homebrew (on macOS):

brew install periscope

Download a binary release: Periscope releases.

Periscope has binary releases for macOS and Linux. It has not been tested on Windows.

Install from source with go install:

go install -v github.com/anishathalye/periscope/cmd/psc@latest

Periscope depends on go-sqlite3, which uses cgo, so you need a C compiler present in your path. You might also need to set CGO_ENABLED=1 if you have it disabled otherwise.

Contributing

Bug reports, feature requests, feedback on the tool or documentation, and pull requests are all appreciated. If you are planning on making substantial changes that you hope to have merged, it is highly recommended that you first open an issue to discuss your proposed change.

License

Copyright (c) Anish Athalye. Released under GPLv3. See LICENSE.txt for details.
