• Stars
    star
    1,781
  • Rank 25,957 (Top 0.6 %)
  • Language
    Python
  • License
    MIT License
  • Created over 11 years ago
  • Updated 2 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

%%sql magic for IPython, hopefully evolving into full SQL client

ipython-sql

Author: Catherine Devlin, http://catherinedevlin.blogspot.com

Introduces a %sql (or %%sql) magic.

Legacy project

IPython-SQL's functionality and maintenance have been eclipsed by JupySQL, a fork maintained and developed by the Ploomber team. Future work will be directed into JupySQL - please file issues there, as well!

Description

Connect to a database, using SQLAlchemy URL connect strings, then issue SQL commands within IPython or IPython Notebook.

screenshot of ipython-sql in the Notebook

Examples

In [1]: %load_ext sql

In [2]: %%sql postgresql://will:longliveliz@localhost/shakes
   ...: select * from character
   ...: where abbrev = 'ALICE'
   ...:
Out[2]: [(u'Alice', u'Alice', u'ALICE', u'a lady attending on Princess Katherine', 22)]

In [3]: result = _

In [4]: print(result)
charid   charname   abbrev                description                 speechcount
=================================================================================
Alice    Alice      ALICE    a lady attending on Princess Katherine   22

In [4]: result.keys
Out[5]: [u'charid', u'charname', u'abbrev', u'description', u'speechcount']

In [6]: result[0][0]
Out[6]: u'Alice'

In [7]: result[0].description
Out[7]: u'a lady attending on Princess Katherine'

After the first connection, connect info can be omitted:

In [8]: %sql select count(*) from work
Out[8]: [(43L,)]

Connections to multiple databases can be maintained. You can refer to an existing connection by username@database

In [9]: %%sql will@shakes
   ...: select charname, speechcount from character
   ...: where  speechcount = (select max(speechcount)
   ...:                       from character);
   ...:
Out[9]: [(u'Poet', 733)]

In [10]: print(_)
charname   speechcount
======================
Poet       733

If no connect string is supplied, %sql will provide a list of existing connections; however, if no connections have yet been made and the environment variable DATABASE_URL is available, that will be used.

For secure access, you may dynamically access your credentials (e.g. from your system environment or getpass.getpass) to avoid storing your password in the notebook itself. Use the $ before any variable to access it in your %sql command.

In [11]: user = os.getenv('SOME_USER')
   ....: password = os.getenv('SOME_PASSWORD')
   ....: connection_string = "postgresql://{user}:{password}@localhost/some_database".format(user=user, password=password)
   ....: %sql $connection_string
Out[11]: u'Connected: some_user@some_database'

You may use multiple SQL statements inside a single cell, but you will only see any query results from the last of them, so this really only makes sense for statements with no output

In [11]: %%sql sqlite://
   ....: CREATE TABLE writer (first_name, last_name, year_of_death);
   ....: INSERT INTO writer VALUES ('William', 'Shakespeare', 1616);
   ....: INSERT INTO writer VALUES ('Bertold', 'Brecht', 1956);
   ....:
Out[11]: []

As a convenience, dict-style access for result sets is supported, with the leftmost column serving as key, for unique values.

In [12]: result = %sql select * from work
43 rows affected.

In [13]: result['richard2']
Out[14]: (u'richard2', u'Richard II', u'History of Richard II', 1595, u'h', None, u'Moby', 22411, 628)

Results can also be retrieved as an iterator of dictionaries (result.dicts()) or a single dictionary with a tuple of scalar values per key (result.dict())

Variable substitution

Bind variables (bind parameters) can be used in the "named" (:x) style. The variable names used should be defined in the local namespace.

In [15]: name = 'Countess'

In [16]: %sql select description from character where charname = :name
Out[16]: [(u'mother to Bertram',)]

In [17]: %sql select description from character where charname = '{name}'
Out[17]: [(u'mother to Bertram',)]

Alternately, $variable_name or {variable_name} can be used to inject variables from the local namespace into the SQL statement before it is formed and passed to the SQL engine. (Using $ and {} together, as in ${variable_name}, is not supported.)

Bind variables are passed through to the SQL engine and can only be used to replace strings passed to SQL. $ and {} are substituted before passing to SQL and can be used to form SQL statements dynamically.

Assignment

Ordinary IPython assignment works for single-line %sql queries:

In [18]: works = %sql SELECT title, year FROM work
43 rows affected.

The << operator captures query results in a local variable, and can be used in multi-line %%sql:

In [19]: %%sql works << SELECT title, year
    ...: FROM work
    ...:
43 rows affected.
Returning data to local variable works

Connecting

Connection strings are SQLAlchemy URL standard.

Some example connection strings:

mysql+pymysql://scott:tiger@localhost/foo
oracle://scott:[email protected]:1521/sidname
sqlite://
sqlite:///foo.db
mssql+pyodbc://username:password@host/database?driver=SQL+Server+Native+Client+11.0

Note that mysql and mysql+pymysql connections (and perhaps others) don't read your client character set information from .my.cnf. You need to specify it in the connection string:

mysql+pymysql://scott:tiger@localhost/foo?charset=utf8

Note that an impala connection with impyla for HiveServer2 requires disabling autocommit:

%config SqlMagic.autocommit=False
%sql impala://hserverhost:port/default?kerberos_service_name=hive&auth_mechanism=GSSAPI

Connection arguments not whitelisted by SQLALchemy can be provided as a flag with (-a|--connection_arguments)the connection string as a JSON string. See SQLAlchemy Args.

%sql --connection_arguments {"timeout":10,"mode":"ro"} sqlite:// SELECT * FROM work;
%sql -a '{"timeout":10, "mode":"ro"}' sqlite:// SELECT * from work;

DSN connections

Alternately, you can store connection info in a configuration file, under a section name chosen to refer to your database.

For example, if dsn.ini contains

[DB_CONFIG_1]
drivername=postgres
host=my.remote.host
port=5433
database=mydatabase
username=myuser
password=1234

then you can

%config SqlMagic.dsn_filename='./dsn.ini'
%sql --section DB_CONFIG_1

Configuration

Query results are loaded as lists, so very large result sets may use up your system's memory and/or hang your browser. There is no autolimit by default. However, autolimit (if set) limits the size of the result set (usually with a LIMIT clause in the SQL). displaylimit is similar, but the entire result set is still pulled into memory (for later analysis); only the screen display is truncated.

In [2]: %config SqlMagic
SqlMagic options
--------------
SqlMagic.autocommit=<Bool>
    Current: True
    Set autocommit mode
SqlMagic.autolimit=<Int>
    Current: 0
    Automatically limit the size of the returned result sets
SqlMagic.autopandas=<Bool>
    Current: False
    Return Pandas DataFrames instead of regular result sets
SqlMagic.column_local_vars=<Bool>
    Current: False
    Return data into local variables from column names
SqlMagic.displaycon=<Bool>
    Current: False
    Show connection string after execute
SqlMagic.displaylimit=<Int>
    Current: None
    Automatically limit the number of rows displayed (full result set is still
    stored)
SqlMagic.dsn_filename=<Unicode>
    Current: 'odbc.ini'
    Path to DSN file. When the first argument is of the form [section], a
    sqlalchemy connection string is formed from the matching section in the DSN
    file.
SqlMagic.feedback=<Bool>
    Current: False
    Print number of rows affected by DML
SqlMagic.short_errors=<Bool>
    Current: True
    Don't display the full traceback on SQL Programming Error
SqlMagic.style=<Unicode>
    Current: 'DEFAULT'
    Set the table printing style to any of prettytable's defined styles
    (currently DEFAULT, MSWORD_FRIENDLY, PLAIN_COLUMNS, RANDOM)

In[3]: %config SqlMagic.feedback = False

Please note: if you have autopandas set to true, the displaylimit option will not apply. You can set the pandas display limit by using the pandas max_rows option as described in the pandas documentation.

Pandas

If you have installed pandas, you can use a result set's .DataFrame() method

In [3]: result = %sql SELECT * FROM character WHERE speechcount > 25

In [4]: dataframe = result.DataFrame()

The --persist argument, with the name of a DataFrame object in memory, will create a table name in the database from the named DataFrame. Or use --append to add rows to an existing table by that name.

In [5]: %sql --persist dataframe

In [6]: %sql SELECT * FROM dataframe;

Graphing

If you have installed matplotlib, you can use a result set's .plot(), .pie(), and .bar() methods for quick plotting

In[5]: result = %sql SELECT title, totalwords FROM work WHERE genretype = 'c'

In[6]: %matplotlib inline

In[7]: result.pie()

pie chart of word count of Shakespeare's comedies

Dumping

Result sets come with a .csv(filename=None) method. This generates comma-separated text either as a return value (if filename is not specified) or in a file of the given name.

In[8]: result = %sql SELECT title, totalwords FROM work WHERE genretype = 'c'

In[9]: result.csv(filename='work.csv')

PostgreSQL features

psql-style "backslash" meta-commands commands (\d, \dt, etc.) are provided by PGSpecial. Example:

In[9]: %sql \d

Options

-l / --connections
List all active connections
-x / --close <session-name>
Close named connection
-c / --creator <creator-function>
Specify creator function for new connection
-s / --section <section-name>
Section of dsn_file to be used for generating a connection string
-p / --persist
Create a table name in the database from the named DataFrame
--append
Like --persist, but appends to the table if it already exists
-a / --connection_arguments <"{connection arguments}">
Specify dictionary of connection arguments to pass to SQL driver
-f / --file <path>
Run SQL from file at this path

Caution

Comments

Because ipyton-sql accepts ---delimited options like --persist, but -- is also the syntax to denote a SQL comment, the parser needs to make some assumptions.

  • If you try to pass an unsupported argument, like --lutefisk, it will be interpreted as a SQL comment and will not throw an unsupported argument exception.
  • If the SQL statement begins with a first-line comment that looks like one of the accepted arguments - like %sql --persist is great! - it will be parsed like an argument, not a comment. Moving the comment to the second line or later will avoid this.

Installing

Install the latest release with:

pip install ipython-sql

or download from https://github.com/catherinedevlin/ipython-sql and:

cd ipython-sql
sudo python setup.py install

Development

https://github.com/catherinedevlin/ipython-sql

Credits

  • Matthias Bussonnier for help with configuration
  • Olivier Le Thanh Duong for %config fixes and improvements
  • Distribute
  • Buildout
  • modern-package-template
  • Mike Wilson for bind variable code
  • Thomas Kluyver and Steve Holden for debugging help
  • Berton Earnshaw for DSN connection syntax
  • Bruno Harbulot for DSN example
  • Andrés Celis for SQL Server bugfix
  • Michael Erasmus for DataFrame truth bugfix
  • Noam Finkelstein for README clarification
  • Xiaochuan Yu for << operator, syntax colorization
  • Amjith Ramanujam for PGSpecial and incorporating it here
  • Alexander Maznev for better arg parsing, connections accepting specified creator
  • Jonathan Larkin for configurable displaycon
  • Jared Moore for connection-arguments support
  • Gilbert Brault for --append
  • Lucas Zeer for multi-line bugfixes for var substitution, <<
  • vkk800 for --file
  • Jens Albrecht for MySQL DatabaseError bugfix
  • meihkv for connection-closing bugfix
  • Abhinav C for SQLAlchemy 2.0 compatibility

More Repositories

1

ddl-generator

Guesses table DDL based on data
HTML
263
star
2

opensourceshakespeare

Forsooth, a dataset! opensourceshakespeare.org data in PostgreSQL form.
Python
107
star
3

python_learners_glossary

Definitions of Pardon jargon to help Python beginners understand Pythonista gobbletigook
Python
53
star
4

ipython_doctester

Helps you make tutorials with IPython Notebook, using doctests for student feeback.
Python
28
star
5

db-introspection-notebook

Toolkit of queries for examining a PostgreSQL database, in executable IPython Notebook format.
18
star
6

talks

Container for assorted talk slides and materials
HTML
7
star
7

mpwfw_exercises

IPython Notebooks with code exercises for the Midwest Python Workshop for women and their friends
Python
7
star
8

data-dispenser

Given a source of rowlike data, acts as a generator of Python OrderedDicts.
Python
7
star
9

ipython-pyohio2013-talk

IPython Notebook Revolution - PyOhio 2013 talk
Shell
6
star
10

imperial-walker

Walks complex nested data structures to map or do arbitrary commands.
Python
6
star
11

just-enough-python

Python intro at PyOhio 2019
Python
5
star
12

peer_learning_party_game

Description of a techie party game to encourage casual peer learning
5
star
13

argument-clinic

Sample code for demonstrating executable documentation techniques
Python
5
star
14

texas-targeted-books

List of books under investigation by Texas state legislature
Jupyter Notebook
4
star
15

quilt-data-packaging-notebooks

Jupyter Notebooks used in the course of creating Quilt data packages.
Jupyter Notebook
3
star
16

reddit-to-sqlite

Save data from Reddit to SQLite. Dogsheep-based.
Python
3
star
17

docassemble-GLHDaytonNLPStandardsSubQuestionnaireDemo

A docassemble extension.
Python
3
star
18

fuzzy-sequence-matcher

Finds closest pairings of elements between two sequences
Python
3
star
19

rideshare-matchmaker

Service to find and notify potential carpool partners among travelers to a given place
Python
3
star
20

smalldata

A curated list of resources for beginner-level data manipulation in Python
3
star
21

pgopen2014-data-science

Frictionless Data Science with Python - presentation from Postgres Open 2014
Shell
3
star
22

rogue-sql

Preprocesses enhanced pseudo-SQL into legal SQL
2
star
23

pycon2015_sqla_lightning

SQLAlchemy-themed lightning talk - PyCon Montreal 2015
2
star
24

goodtables-py

RESTful web service applying GoodTables Table Schemas
Python
2
star
25

harvest-wild-data-olf-2018

Harvesting Wild Data workshop. Example notebooks.
Jupyter Notebook
2
star
26

sql_quest

Ohio LinuxFest 2015 talk introducing databases in tabletop RPG terms
HTML
2
star
27

sql-legalizer

Preprocesses enhanced pseudo-SQL into legal SQL
1
star
28

holy-grail-run-away

A proof of concept arcade game. Death awaits you all, with nasty, big, pointy teeth.
Python
1
star
29

gameplay_analysis_exercise

A CLI for analyzing gameplay statistics. Exercise
Python
1
star
30

sql-insert-generator

Generates INSERT statements with clarifying comments
Python
1
star
31

minimal_ipython_extension

Skeleton of an IPython 1.0 extension; think Hello World
Python
1
star
32

htsql_tweak_tags

tweak for HTSQL allowing active, unescaped tags in HTML output
1
star
33

ipython_docent

ipython_docent
Python
1
star
34

eatfeed

Downloads items from an RSS feed
1
star
35

hashed_id_rest_service

A sample Django REST Framework app with no data storage.
Python
1
star
36

pythology-git-seekret-demo

Throwaway demo of Git Seekret
Jupyter Notebook
1
star
37

miami-valley-school-demo

Jupyter Notebook
1
star
38

reddit-comment-history-grabber

Pulls quotes from a user's recent comments
Python
1
star
39

data-federation-ingest

Tool for accepting and validating data uploads.
1
star
40

hashtag-to-schedule

Generates a schedule grid based on hashtags
1
star
41

reddit2sql

Backs up history of subreddits or users to RDBMS
Jupyter Notebook
1
star
42

python-intro-exercises

Jupyter Notebook
1
star
43

python-over-the-bumps

Intro Python tutorial including just the hard parts
1
star
44

crossdb-psql-meta-commands

Cross-RDBMS implementation of psql's meta commands
Python
1
star