Python codecs extension featuring CLI tools for encoding/decoding anything

CodExt

Encode/decode anything.

CodExt is a (Python2-3 compatible) library that extends the native codecs library (namely for adding new custom encodings and character mappings) and provides 120+ new codecs, hence its name combining CODecs EXTension. It also features a guess mode for decoding multiple layers of encoding and CLI tools for convenience.
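The native mechanism that such an extension builds on can be sketched with the standard library alone. The example below registers a toy codec through `codecs.register`; the codec name `demo_reverse` and the helper functions are hypothetical and only illustrate the stdlib hook, not how CodExt itself declares its codecs.

```python
import codecs

def _reverse_encode(text, errors="strict"):
    # A codec function returns (output, number_of_input_items_consumed).
    return text[::-1], len(text)

def _reverse_decode(text, errors="strict"):
    return text[::-1], len(text)

def _search(name):
    # The search function receives the normalized codec name and returns
    # a CodecInfo object, or None when it does not know the name.
    if name == "demo_reverse":  # hypothetical codec name
        return codecs.CodecInfo(
            encode=_reverse_encode,
            decode=_reverse_decode,
            name="demo_reverse",
        )
    return None

codecs.register(_search)

print(codecs.encode("Test string", "demo_reverse"))
print(codecs.decode("gnirts tseT", "demo_reverse"))
```

Once registered, the toy codec is reachable through the usual `codecs.encode` and `codecs.decode` calls.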

$ pip install codext
Want to contribute a new codec? Check the documentation first, then PR your new codec.
Want to contribute a new macro? PR your updated version of macros.json.

💻 Usage (main CLI tool)

$ codext -i test.txt encode dna-1
GTGAGCGGGTATGTGA

$ echo -en "test" | codext encode morse
- . ... -

$ echo -en "test" | codext encode braille
⠞⠑⠎⠞

$ echo -en "test" | codext encode base100
👫👜👪👫

Chaining codecs

$ echo -en "Test string" | codext encode reverse
gnirts tseT

$ echo -en "Test string" | codext encode reverse morse
--. -. .. .-. - ... / - ... . -

$ echo -en "Test string" | codext encode reverse morse dna-2
AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC

$ echo -en "Test string" | codext encode reverse morse dna-2 octal
101107124103101107124103101107124107101107101101101107124103101107124107101107101101101107124107101107124107101107101101101107124107101107124103101107124107101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124124101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124107101107101101101107124103

$ echo -en "AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC" | codext -d dna-2 morse reverse
test string
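The chaining logic shown above (encode left to right, decode in the opposite order) can be sketched with codecs that ship with Python. The helper names `encode_chain` and `decode_chain` are hypothetical, and the stdlib codecs `zlib`, `base64` and `hex` stand in for CodExt's own codecs.

```python
import codecs

def encode_chain(data, names):
    # Apply each codec in the order given.
    for name in names:
        data = codecs.encode(data, name)
    return data

def decode_chain(data, names):
    # Undo the chain by walking the same list backwards.
    for name in reversed(names):
        data = codecs.decode(data, name)
    return data

chain = ["zlib", "base64", "hex"]
encoded = encode_chain(b"Test string", chain)
print(encoded)
print(decode_chain(encoded, chain))
```

Note that the CLI example above lists the codecs in decoding order, whereas this sketch reverses the encoding list internally.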

Using macros

$ codext add-macro my-encoding-chain gzip base63 lzma base64

$ codext list macros
example-macro, my-encoding-chain

$ echo -en "Test string" | codext encode my-encoding-chain
CQQFAF0AAIAAABuTgySPa7WaZC5Sunt6FS0ko71BdrYE8zHqg91qaqadZIR2LafUzpeYDBalvE///ug4AA==

$ codext remove-macro my-encoding-chain

$ codext list macros
example-macro
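Conceptually, a macro is a name that expands to a list of codecs. The sketch below assumes a simple name-to-list mapping loaded from JSON; this layout is hypothetical and may differ from the actual macros.json, and the stdlib codecs are placeholders for CodExt's.

```python
import codecs
import json

# Hypothetical macro definitions; the real macros.json layout may differ.
MACROS = json.loads('{"demo-chain": ["zlib", "base64"]}')

def expand(names, macros=MACROS):
    # Replace every macro name by its codec list, keep plain codec names.
    expanded = []
    for name in names:
        expanded.extend(macros.get(name, [name]))
    return expanded

def encode(data, names):
    for name in expand(names):
        data = codecs.encode(data, name)
    return data

print(expand(["demo-chain", "hex"]))
print(encode(b"Test string", ["demo-chain"]))
```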

💻 Usage (base CLI tool)

$ echo "Test string !" | base122
*.7!ft9οΏ½-f9Γ‚

$ echo "Test string !" | base91 
"ONK;WDZM%Z%xE7L

$ echo "Test string !" | base91 | base85
B2P|BJ6A+nO(j|-cttl%

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr
QVx5tvgjvCAkXaMSuKoQmCnjeCV1YyyR3WErUUErFf

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | base58-flickr -d | base36 -d | base85 -d | base91 -d
Test string !

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -m 3
Test string !

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -f Test
Test string !
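Guessing through several layers of encoding can be pictured as a search over candidate decoders. The following is a minimal sketch, not the algorithm used by unbase: it runs a breadth-first search over four stdlib base decoders and stops when a known substring shows up. The parameters `found` and `max_depth` are hypothetical and are not claimed to match the meaning of the `-f` and `-m` options above.

```python
import base64
import binascii
from collections import deque

# Candidate decoders, limited here to what the standard library offers.
DECODERS = {
    "base16": base64.b16decode,
    "base32": base64.b32decode,
    "base64": lambda b: base64.b64decode(b, validate=True),
    "base85": base64.b85decode,
}

def guess_decode(data, found, max_depth=3):
    # Breadth-first search: shorter decoding paths are tried first.
    queue = deque([(data, [])])
    while queue:
        current, path = queue.popleft()
        if found in current:
            return current, path
        if len(path) >= max_depth:
            continue
        for name, decode in DECODERS.items():
            try:
                candidate = decode(current)
            except (binascii.Error, ValueError):
                continue  # this decoder rejects the input
            if candidate:
                queue.append((candidate, path + [name]))
    return None, []

layered = base64.b32encode(base64.b64encode(b"Test string !"))
print(guess_decode(layered, b"Test"))
```

Decoders that accept the input by accident produce junk branches, which is why a stop criterion such as a known substring is useful.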

💻 Usage (Python)

Getting the list of available codecs:

>>> import codecs
>>> import codext

>>> codext.list()
['ascii85', 'base85', 'base100', 'base122', ..., 'tomtom', 'dna', 'html', 'markdown', 'url', 'resistor', 'sms', 'whitespace', 'whitespace-after-before']

>>> codext.encode("this is a test", "base58-bitcoin")
'jo91waLQA1NNeBmZKUF'

>>> codext.encode("this is a test", "base58-ripple")
'jo9rA2LQwr44eBmZK7E'

>>> codext.encode("this is a test", "base58-url")
'JN91Wzkpa1nnDbLyjtf'

>>> codecs.encode("this is a test", "base100")
'👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫'

>>> codecs.decode("👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫", "base100")
'this is a test'

>>> for i in range(8):
        print(codext.encode("this is a test", "dna-%d" % (i + 1)))
GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA
CTCACGGACGGCCTATAGAACGGCCTATAGAACGACAGAACTCACGCCCTATCTCA
ACAGATTGATTAACGCGTGGATTAACGCGTGGATGAGTGGACAGATAAACGCACAG
AGACATTCATTAAGCGCTCCATTAAGCGCTCCATCACTCCAGACATAAAGCGAGAC
TCTGTAAGTAATTCGCGAGGTAATTCGCGAGGTAGTGAGGTCTGTATTTCGCTCTG
TGTCTAACTAATTGCGCACCTAATTGCGCACCTACTCACCTGTCTATTTGCGTGTC
GAGTGCCTGCCGGATATCTTGCCGGATATCTTGCTGTCTTGAGTGCGGGATAGAGT
CACTCGGTCGGCCATATGTTCGGCCATATGTTCGTCTGTTCACTCGCCCATACACT

>>> codext.decode("GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA", "dna-1")
'this is a test'
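The dna-1 output above is consistent with mapping every pair of bits to one nucleotide. The sketch below uses a mapping inferred from that sample output (00→A, 01→G, 10→C, 11→T); it is an illustration under that assumption, not CodExt's implementation, and the function names are hypothetical.

```python
# Mapping inferred from the dna-1 sample output above (an assumption).
RULE_1 = {"00": "A", "01": "G", "10": "C", "11": "T"}
RULE_1_INV = {v: k for k, v in RULE_1.items()}

def dna1_encode(text):
    # Turn each byte into 8 bits, then each pair of bits into a nucleotide.
    bits = "".join(format(b, "08b") for b in text.encode("latin-1"))
    return "".join(RULE_1[bits[i:i + 2]] for i in range(0, len(bits), 2))

def dna1_decode(sequence):
    bits = "".join(RULE_1_INV[n] for n in sequence)
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return raw.decode("latin-1")

print(dna1_encode("test"))              # GTGAGCGGGTATGTGA
print(dna1_decode("GTGAGCGGGTATGTGA"))  # test
```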

>>> codecs.encode("this is a test", "morse")
'- .... .. ... / .. ... / .- / - . ... -'

>>> codecs.decode("- .... .. ... / .. ... / .- / - . ... -", "morse")
'this is a test'

>>> with open("morse.txt", 'w', encoding="morse") as f:
	f.write("this is a test")
14

>>> with open("morse.txt",encoding="morse") as f:
	f.read()
'this is a test'

>>> codext.decode("""
      =            
              X         
   :            
      x         
  n  
    r 
        y   
      Y            
              y        
     p    
         a       
 `          
            n            
          |    
  a          
o    
       h        
          `            
          g               
           o 
   z      """, "whitespace-after+before")
'CSC{not_so_invisible}'

>>> print(codext.encode("An example test string", "baudot-tape"))
***.**
   . *
***.* 
*  .  
   .* 
*  .* 
   . *
** .* 
***.**
** .**
   .* 
*  .  
* *. *
   .* 
* *.  
* *. *
*  .  
* *.  
* *. *
***.  
  *.* 
***.* 
 * .* 

📃 List of codecs

BaseXX

  • base1: useless, but for the sake of completeness
  • base2: simple conversion to binary (with a variant with a reversed alphabet)
  • base3: conversion to ternary (with a variant with a reversed alphabet)
  • base4: conversion to quaternary (with a variant with a reversed alphabet)
  • base8: simple conversion to octal (with a variant with a reversed alphabet)
  • base10: simple conversion to decimal
  • base11: conversion to the ten digits plus the letter "a"
  • base16: simple conversion to hexadecimal (with a variant holding an alphabet with digits and letters inverted)
  • base26: conversion to alphabet letters
  • base32: classical conversion according to the RFC4648 with all its variants (zbase32, extended hexadecimal, geohash, Crockford)
  • base36: Base36 conversion to letters and digits (with a variant inverting both groups)
  • base45: Base45 DRAFT algorithm (with a variant inverting letters and digits)
  • base58: multiple versions of Base58 (bitcoin, flickr, ripple)
  • base62: Base62 conversion to lower- and uppercase letters and digits (with a variant with letters and digits inverted)
  • base63: similar to base62 with the "_" added
  • base64: classical conversion according to RFC4648 with its variant URL (or file) (it also holds a variant with letters and digits inverted)
  • base67: custom conversion using some more special characters (also with a variant with letters and digits inverted)
  • base85: all variants of Base85 (Ascii85, z85, Adobe, (x)btoa, RFC1924, XML)
  • base91: Base91 custom conversion
  • base100 (or emoji): Base100 custom conversion
  • base122: Base122 custom conversion
  • base-genericN: see base encodings ; supports any possible base

This category also contains ascii85, adobe, [x]btoa, zeromq with the base85 codec.
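The principle behind a generic base-N conversion can be sketched as follows: the input bytes are read as one big integer, which is then written out in the target base. This is a minimal illustration; the alphabet and the function names are hypothetical, and the sketch drops leading zero bytes, which a real codec would have to handle.

```python
import string

# Hypothetical digit set, good for bases 2 to 36.
ALPHABET = string.digits + string.ascii_lowercase

def to_base(data: bytes, base: int) -> str:
    if not 2 <= base <= len(ALPHABET):
        raise ValueError("unsupported base")
    number = int.from_bytes(data, "big")
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits)) or ALPHABET[0]

def from_base(text: str, base: int) -> bytes:
    number = 0
    for char in text:
        number = number * base + ALPHABET.index(char)
    return number.to_bytes((number.bit_length() + 7) // 8, "big")

print(to_base(b"A", 2))       # 1000001
print(to_base(b"\xff", 16))   # ff
print(from_base(to_base(b"test", 36), 36))
```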

Binary

  • baudot: supports CCITT-1, CCITT-2, EU/FR, ITA1, ITA2, MTK-2 (Python3 only), UK, ...
  • baudot-spaced: variant of baudot ; groups of 5 bits are whitespace-separated
  • baudot-tape: variant of baudot ; outputs a string that looks like a perforated tape
  • bcd: Binary Coded Decimal, encodes characters from their (zero-left-padded) ordinals
  • bcd-extended0: variant of bcd ; encodes characters from their (zero-left-padded) ordinals using prefix bits 0000
  • bcd-extended1: variant of bcd ; encodes characters from their (zero-left-padded) ordinals using prefix bits 1111
  • excess3: uses Excess-3 (aka Stibitz code) binary encoding to convert characters from their ordinals
  • gray: aka reflected binary code
  • manchester: XORs each bit of the input with 01
  • manchester-inverted: variant of manchester ; XORs each bit of the input with 10
  • rotateN: rotates characters by the specified number of bits (N belongs to [1, 7] ; Python 3 only)
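Two of the binary codecs above lend themselves to a short sketch. The Manchester function reads "XORs each bit with 01" as: every input bit is XORed with both bits of the pattern and so becomes two output bits. That reading, the bit-string interface and the function names are assumptions for illustration only. The Gray function uses the standard reflected binary formula.

```python
def manchester(bits, pattern="01"):
    # Each input bit yields two output bits: bit XOR pattern[0], bit XOR pattern[1].
    return "".join(str(int(b) ^ int(p)) for b in bits for p in pattern)

def gray(number):
    # Reflected binary code: consecutive integers differ by exactly one bit.
    return number ^ (number >> 1)

print(manchester("10"))              # 1001
print(manchester("10", "10"))        # 0110 (inverted variant)
print([gray(i) for i in range(4)])   # [0, 1, 3, 2]
```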

Common

  • a1z26: keeps words whitespace-separated and uses a custom character separator
  • cases: set of case-related encodings (including camel-, kebab-, lower-, pascal-, upper-, snake- and swap-case, slugify, capitalize, title)
  • dummy: set of simple encodings (including integer, replace, reverse, word-reverse, substitute and strip-spaces)
  • octal: dummy octal conversion (converts to 3-digit groups)
  • octal-spaced: variant of octal ; dummy octal conversion, handling whitespace separators
  • ordinal: dummy character ordinals conversion (converts to 3-digit groups)
  • ordinal-spaced: variant of ordinal ; dummy character ordinals conversion, handling whitespace separators
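The 3-digit grouping used by the octal codec can be sketched in a few lines. The function names are hypothetical, and the sketch only handles characters whose ordinal fits in three octal digits (below 512).

```python
def octal_encode(text):
    # Each character becomes its ordinal written as exactly 3 octal digits.
    return "".join(format(ord(char), "03o") for char in text)

def octal_decode(digits):
    # Read the digits back in groups of 3.
    return "".join(chr(int(digits[i:i + 3], 8)) for i in range(0, len(digits), 3))

print(octal_encode("AGTC"))          # 101107124103
print(octal_decode("101107124103"))  # AGTC
```

The first output matches the start of the octal result in the chaining example of the CLI usage section.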

Compression

  • gzip: standard Gzip compression/decompression
  • lz77: compresses the given data with the algorithm of Lempel and Ziv of 1977
  • lz78: compresses the given data with the algorithm of Lempel and Ziv of 1978
  • pkzip_deflate: standard Zip-deflate compression/decompression
  • pkzip_bzip2: standard BZip2 compression/decompression
  • pkzip_lzma: standard LZMA compression/decompression

⚠️ Compression functions are NOT encoding functions; they are implemented to leverage the .encode(...) API from codecs.

Cryptography

  • affine: aka Affine Cipher
  • atbash: aka Atbash Cipher
  • bacon: aka Baconian Cipher
  • barbie-N: aka Barbie Typewriter (N belongs to [1, 4])
  • citrix: aka Citrix CTX1 password encoding
  • railfence: aka Rail Fence Cipher
  • rotN: aka Caesar cipher (N belongs to [1,25])
  • scytaleN: encrypts using the number of letters on the rod (N is any integer ≥ 1)
  • shiftN: shift ordinals (N belongs to [1,255])
  • xorN: XOR with a single byte (N belongs to [1,255])

⚠️ Crypto functions are NOT encoding functions; they are implemented to leverage the .encode(...) API from codecs.
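As an illustration of the two simplest entries in this category, here is a minimal sketch of a Caesar rotation and a single-byte XOR. The function names are hypothetical and the code is not taken from CodExt.

```python
import string

def rot(text, n):
    # Rotate letters by n positions; other characters are left untouched.
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[n:] + lower[:n] + upper[n:] + upper[:n],
    )
    return text.translate(table)

def xor(data, key):
    # XOR every byte with the same single-byte key; applying it twice restores the input.
    return bytes(byte ^ key for byte in data)

print(rot("Test", 13))            # Grfg
print(rot(rot("Test", 3), 23))    # Test
print(xor(xor(b"Test", 42), 42))  # b'Test'
```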

Hashing

  • blake: includes BLAKE2b and BLAKE2s (Python 3 only ; relies on hashlib)
  • checksums: includes Adler32 and CRC32 (relies on zlib)
  • crypt: Unix's crypt hash for passwords (Python 3 and Unix only ; relies on crypt)
  • md: aka Message Digest ; includes MD4 and MD5 (relies on hashlib)
  • sha: aka Secure Hash Algorithms ; includes SHA1, 224, 256, 384, 512 (Python2/3) but also SHA3-224, -256, -384 and -512 (Python 3 only ; relies on hashlib)
  • shake: aka SHAKE hashing (Python 3 only ; relies on hashlib)

⚠️ Hash functions are NOT encoding functions; they are implemented for convenience with the .encode(...) API from codecs and are useful for chaining codecs.
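A hash can only sit at the end of a chain, since it cannot be decoded. The sketch below shows the idea with hashlib, on which several of the codecs above rely; the helper names are hypothetical.

```python
import base64
import hashlib

def sha256_hex(data: bytes) -> str:
    # One-way step: there is no matching decode operation.
    return hashlib.sha256(data).hexdigest()

def encode_then_hash(data: bytes) -> str:
    # A reversible encoding followed by a final hashing step.
    return sha256_hex(base64.b64encode(data))

print(sha256_hex(b"abc"))
print(encode_then_hash(b"Test string"))
```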

Languages

  • braille: well-known braille language (Python 3 only)
  • ipsum: aka lorem ipsum
  • galactic: aka galactic alphabet or Minecraft enchantment language (Python 3 only)
  • leetspeak: based on minimalistic elite speaking rules
  • morse: uses whitespace as a separator
  • navajo: only handles letters (not full words from the Navajo dictionary)
  • radio: aka NATO or radio phonetic alphabet
  • southpark: converts letters to Kenny's language from Southpark (whitespace is also handled)
  • southpark-icase: case insensitive variant of southpark
  • tap: converts text to tap/knock code, commonly used by prisoners
  • tomtom: similar to morse, using slashes and backslashes

Others

  • dna: implements the 8 rules of DNA sequences (N belongs to [1,8])
  • letter-indices: encodes consonants and/or vowels with their corresponding indices
  • markdown: unidirectional encoding from Markdown to HTML

Steganography

  • hexagram: uses Base64 and encodes the result to a charset of I Ching hexagrams (as implemented here)
  • klopf: aka Klopf code ; Polybius square with trivial alphabetical distribution
  • resistor: aka resistor color codes
  • rick: aka Rick cipher (in reference to Rick Astley's song "Never gonna give you up")
  • sms: also called T9 code ; uses "-" as a separator for encoding, "-" or "_" or whitespace for decoding
  • whitespace: replaces bits with whitespaces and tabs
  • whitespace_after_before: variant of whitespace ; encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. "whitespace+2*after-3*before")
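The basic whitespace codec, which replaces bits with whitespace characters, can be sketched as below. Which bit maps to the space and which to the tab is an assumption here (0 to space, 1 to tab), and the function names are hypothetical.

```python
# Assumed mapping: bit 0 -> space, bit 1 -> tab.
TO_WS = str.maketrans("01", " \t")
FROM_WS = str.maketrans(" \t", "01")

def ws_encode(data: bytes) -> str:
    bits = "".join(format(byte, "08b") for byte in data)
    return bits.translate(TO_WS)

def ws_decode(text: str) -> bytes:
    bits = text.translate(FROM_WS)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

hidden = ws_encode(b"CSC")
print(repr(hidden))
print(ws_decode(hidden))
```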

Web

  • html: implements entities according to this reference
  • url: aka URL encoding
