  • Stars: 1,568
  • Rank 29,857 (Top 0.6%)
  • Language: Python
  • License: MIT License
  • Created almost 11 years ago
  • Updated over 1 year ago

Repository Details

A collection of common regular expressions bundled with an easy-to-use interface.

CommonRegex

Find all times, dates, links, phone numbers, emails, IP addresses, prices, hex colors, and credit card numbers in a string. We did the hard work so you don't have to.

Pull requests welcome!

Installation

Install via pip

sudo pip install commonregex

or via setup.py

python setup.py install

Usage

>>> from commonregex import CommonRegex
>>> parsed_text = CommonRegex("""John, please get that article on www.linkedin.com to me by 5:00PM 
                               on Jan 9th 2012. 4:00 would be ideal, actually. If you have any 
                               questions, You can reach me at (519)-236-2723x341 or get in touch with
                               my associate at [email protected]""")
>>> parsed_text.times
['5:00PM', '4:00']
>>> parsed_text.dates
['Jan 9th 2012']
>>> parsed_text.links
['www.linkedin.com']
>>> parsed_text.phones
['(519)-236-2723']
>>> parsed_text.phones_with_exts
['(519)-236-2723x341']
>>> parsed_text.emails
['[email protected]']

Alternatively, you can generate a single CommonRegex instance and use it to parse multiple segments of text.

>>> parser = CommonRegex()
>>> parser.times("When are you free?  Do you want to meet up for coffee at 4:00?")
['4:00']

Finally, all regular expressions used are publicly exposed.

>>> from commonregex import email
>>> import re
>>> text = "...get in touch with my associate at [email protected]"
>>> re.sub(email, "[email protected]", text)
'...get in touch with my associate at [email protected]'
>>> from commonregex import time
>>> for m in time.finditer("Does 6:00 or 7:00 work better?"):
...     print(m.start(), m.group())
...
5 6:00
13 7:00

Please note that this module is currently English/US specific.
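To illustrate what the English/US focus means in practice, here is a minimal sketch that uses only the standard `re` module. The `US_PHONE` pattern is a simplified, hypothetical stand-in and not the expression shipped with the library; it only shows how a pattern written for North American phone formatting can miss a number written in another convention.

```python
import re

# Hypothetical US-style phone pattern for illustration only; it is NOT the
# expression shipped with the library.
US_PHONE = re.compile(r"\(?\b\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b")

samples = {
    "us_style": "Call (519)-236-2723 after lunch.",
    "uk_style": "Call +44 20 7946 0958 after lunch.",
}

# Run the same pattern over both samples and collect the matches.
results = {name: US_PHONE.findall(text) for name, text in samples.items()}

print(results["us_style"])  # ['(519)-236-2723']
print(results["uk_style"])  # []
```

The digit grouping of the second sample (2, 2, 4, 4) never lines up with the 3-3-4 grouping the pattern expects, so nothing is returned. Text in other locales may need patterns of its own.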

Supported Methods/Attributes

  • obj.dates, obj.dates()
  • obj.times, obj.times()
  • obj.phones, obj.phones()
  • obj.phones_with_exts, obj.phones_with_exts()
  • obj.links, obj.links()
  • obj.emails, obj.emails()
  • obj.ips, obj.ips()
  • obj.ipv6s, obj.ipv6s()
  • obj.prices, obj.prices()
  • obj.hex_colors, obj.hex_colors()
  • obj.credit_cards, obj.credit_cards()
  • obj.btc_addresses, obj.btc_addresses()
  • obj.street_addresses, obj.street_addresses()
  • obj.zip_codes, obj.zip_codes()
  • obj.po_boxes, obj.po_boxes()
  • obj.ssn_number, obj.ssn_number()
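Each name above appears both as an attribute (`obj.dates`) and as a method (`obj.dates()`), depending on whether the instance was created with text. One way such a dual interface could be built is sketched below. This is an assumption-laden toy, not the library's actual code: `MiniParser`, `PATTERNS`, and both regular expressions are hypothetical and deliberately simplified.

```python
import re

# Hypothetical, simplified patterns for illustration only; the library's
# real expressions are more thorough.
PATTERNS = {
    "times": re.compile(r"\b\d{1,2}:\d{2}(?:\s?(?:AM|PM))?", re.IGNORECASE),
    "hex_colors": re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b"),
}


class MiniParser:
    """Toy stand-in for the attribute-or-method interface listed above."""

    def __init__(self, text=None):
        for name, pattern in PATTERNS.items():
            if text is None:
                # No text bound: expose a callable that takes the text later.
                setattr(self, name, pattern.findall)
            else:
                # Text bound at construction: expose the matches directly.
                setattr(self, name, pattern.findall(text))


bound = MiniParser("Lunch at 12:30PM, paint it #ff8800.")
print(bound.times)       # ['12:30PM']
print(bound.hex_colors)  # ['#ff8800']

reusable = MiniParser()
print(reusable.times("Does 6:00 or 7:00 work better?"))  # ['6:00', '7:00']
```

Binding the text at construction suits one-off parsing, while the text-free instance avoids rebuilding the object when many strings need to be scanned.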

CommonRegex Ports:

  • CommonRegexRust
  • CommonRegexJS: https://github.com/talyssonoc/CommonRegexJS
  • CommonRegexScala: https://github.com/everpeace/CommonRegexScala
  • CommonRegexJava: https://github.com/talyssonoc/CommonRegexJava
  • CommonRegexCobra: https://github.com/PurityLake/CommonRegex-Cobra
  • CommonRegexDart: https://github.com/aufdemrand/CommonRegexDart
  • CommonRegexRuby: https://github.com/talyssonoc/CommonRegexRuby
  • CommonRegexPHP: https://github.com/james2doyle/CommonRegexPHP
