• Stars
    star
    136
  • Rank 266,686 (Top 6 %)
  • Language
    Python
  • License
    BSD 3-Clause "New...
  • Created about 8 years ago
  • Updated 3 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Python AST that abstracts the underlying Python version

GAST, daou naer!

A generic AST to represent Python2 and Python3's Abstract Syntax Tree(AST).

GAST provides a compatibility layer between the AST of various Python versions, as produced by ast.parse from the standard ast module.

Basic Usage

>>> import ast, gast
>>> code = open('file.py').read()
>>> tree = ast.parse(code)
>>> gtree = gast.ast_to_gast(tree)
>>> ... # process gtree
>>> tree = gast.gast_to_ast(gtree)
>>> ... # do stuff specific to tree

API

gast provides the same API as the ast module. All functions and classes from the ast module are also available in the gast module, and work accordingly on the gast tree.

Three notable exceptions:

  1. gast.parse directly produces a GAST node. It's equivalent to running
    gast.ast_to_gast on the output of ast.parse.
  2. gast.dump dumps the gast common representation, not the original
    one.
  3. gast.gast_to_ast and gast.ast_to_gast can be used to convert
    from one ast to the other, back and forth.

Version Compatibility

GAST is tested using tox and Travis on the following Python versions:

  • 2.7
  • 3.4
  • 3.5
  • 3.6
  • 3.7
  • 3.8
  • 3.9
  • 3.10
  • 3.11

AST Changes

Python3

The AST used by GAST is the same as the one used in Python3.9, with the notable exception of ast.arg being replaced by an ast.Name with an ast.Param context.

For minor version before 3.9, please note that ExtSlice and Index are not used.

For minor version before 3.8, please note that Ellipsis, Num, Str, Bytes and NamedConstant are represented as Constant.

Python2

To cope with Python3 features, several nodes from the Python2 AST are extended with some new attributes/children, or represented by different nodes

  • ModuleDef nodes have a type_ignores attribute.
  • FunctionDef nodes have a returns attribute and a type_comment attribute.
  • ClassDef nodes have a keywords attribute.
  • With's context_expr and optional_vars fields are hold in a withitem object.
  • For nodes have a type_comment attribute.
  • Raise's type, inst and tback fields are hold in a single exc field, using the transformation raise E, V, T => raise E(V).with_traceback(T).
  • TryExcept and TryFinally nodes are merged in the Try node.
  • arguments nodes have a kwonlyargs and kw_defaults attributes.
  • Call nodes loose their starargs attribute, replaced by an argument wrapped in a Starred node. They also loose their kwargs attribute, wrapped in a keyword node with the identifier set to None, as done in Python3.
  • comprehension nodes have an async attribute (that is always set to 0).
  • Ellipsis, Num and Str nodes are represented as Constant.
  • Subscript which don't have any Slice node as slice attribute (and Ellipsis are not Slice nodes) no longer hold an ExtSlice but an Index(Tuple(...)) instead.

Pit Falls

  • In Python3, None, True and False are parsed as Constant while they are parsed as regular Name in Python2.

ASDL

This closely matches the one from https://docs.python.org/3/library/ast.html#abstract-grammar, with a few trade-offs to cope with legacy ASTs.

-- ASDL's six builtin types are identifier, int, string, bytes, object, singleton

module Python
{
    mod = Module(stmt* body, type_ignore *type_ignores)
        | Interactive(stmt* body)
        | Expression(expr body)
        | FunctionType(expr* argtypes, expr returns)

        -- not really an actual node but useful in Jython's typesystem.
        | Suite(stmt* body)

    stmt = FunctionDef(identifier name, arguments args,
                       stmt* body, expr* decorator_list, expr? returns,
                       string? type_comment)
          | AsyncFunctionDef(identifier name, arguments args,
                             stmt* body, expr* decorator_list, expr? returns,
                             string? type_comment)

          | ClassDef(identifier name,
             expr* bases,
             keyword* keywords,
             stmt* body,
             expr* decorator_list)
          | Return(expr? value)

          | Delete(expr* targets)
          | Assign(expr* targets, expr value, string? type_comment)
          | AugAssign(expr target, operator op, expr value)
          -- 'simple' indicates that we annotate simple name without parens
          | AnnAssign(expr target, expr annotation, expr? value, int simple)

          -- not sure if bool is allowed, can always use int
          | Print(expr? dest, expr* values, bool nl)

          -- use 'orelse' because else is a keyword in target languages
          | For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
          | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
          | While(expr test, stmt* body, stmt* orelse)
          | If(expr test, stmt* body, stmt* orelse)
          | With(withitem* items, stmt* body, string? type_comment)
          | AsyncWith(withitem* items, stmt* body, string? type_comment)

          | Match(expr subject, match_case* cases)

          | Raise(expr? exc, expr? cause)
          | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
          | TryStar(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
          | Assert(expr test, expr? msg)

          | Import(alias* names)
          | ImportFrom(identifier? module, alias* names, int? level)

          -- Doesn't capture requirement that locals must be
          -- defined if globals is
          -- still supports use as a function!
          | Exec(expr body, expr? globals, expr? locals)

          | Global(identifier* names)
          | Nonlocal(identifier* names)
          | Expr(expr value)
          | Pass | Break | Continue

          -- XXX Jython will be different
          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset)

          -- BoolOp() can use left & right?
    expr = BoolOp(boolop op, expr* values)
         | NamedExpr(expr target, expr value)
         | BinOp(expr left, operator op, expr right)
         | UnaryOp(unaryop op, expr operand)
         | Lambda(arguments args, expr body)
         | IfExp(expr test, expr body, expr orelse)
         | Dict(expr* keys, expr* values)
         | Set(expr* elts)
         | ListComp(expr elt, comprehension* generators)
         | SetComp(expr elt, comprehension* generators)
         | DictComp(expr key, expr value, comprehension* generators)
         | GeneratorExp(expr elt, comprehension* generators)
         -- the grammar constrains where yield expressions can occur
         | Await(expr value)
         | Yield(expr? value)
         | YieldFrom(expr value)
         -- need sequences for compare to distinguish between
         -- x < 4 < 3 and (x < 4) < 3
         | Compare(expr left, cmpop* ops, expr* comparators)
         | Call(expr func, expr* args, keyword* keywords)
         | Repr(expr value)
         | FormattedValue(expr value, int? conversion, expr? format_spec)
         | JoinedStr(expr* values)
         | Constant(constant value, string? kind)

         -- the following expression can appear in assignment context
         | Attribute(expr value, identifier attr, expr_context ctx)
         | Subscript(expr value, slice slice, expr_context ctx)
         | Starred(expr value, expr_context ctx)
         | Name(identifier id, expr_context ctx, expr? annotation,
                string? type_comment)
         | List(expr* elts, expr_context ctx)
         | Tuple(expr* elts, expr_context ctx)

          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset)

    expr_context = Load | Store | Del | AugLoad | AugStore | Param

    slice = Slice(expr? lower, expr? upper, expr? step)
          | ExtSlice(slice* dims)
          | Index(expr value)

    boolop = And | Or

    operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
                 | RShift | BitOr | BitXor | BitAnd | FloorDiv

    unaryop = Invert | Not | UAdd | USub

    cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn

    comprehension = (expr target, expr iter, expr* ifs, int is_async)

    excepthandler = ExceptHandler(expr? type, expr? name, stmt* body)
                    attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    arguments = (expr* args, expr* posonlyargs, expr? vararg, expr* kwonlyargs,
                 expr* kw_defaults, expr? kwarg, expr* defaults)

    -- keyword arguments supplied to call (NULL identifier for **kwargs)
    keyword = (identifier? arg, expr value)

    -- import name with optional 'as' alias.
    alias = (identifier name, identifier? asname)
            attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    withitem = (expr context_expr, expr? optional_vars)

    match_case = (pattern pattern, expr? guard, stmt* body)

    pattern = MatchValue(expr value)
            | MatchSingleton(constant value)
            | MatchSequence(pattern* patterns)
            | MatchMapping(expr* keys, pattern* patterns, identifier? rest)
            | MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)

            | MatchStar(identifier? name)
            -- The optional "rest" MatchMapping parameter handles capturing extra mapping keys

            | MatchAs(pattern? pattern, identifier? name)
            | MatchOr(pattern* patterns)

             attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

    type_ignore = TypeIgnore(int lineno, string tag)
}

More Repositories

1

pythran

Ahead of Time compiler for numeric kernels
C++
1,993
star
2

frozen

a header-only, constexpr alternative to gperf for C++14 users
C++
1,287
star
3

beniget

Extract semantic information about static Python code
Python
51
star
4

numpy-benchmarks

A collection of scientific kernels using the numpy module for benchmarking purpose
Python
37
star
5

mapping-line-to-offset

C
12
star
6

params14

keyword parameters for C++14
C++
11
star
7

num-utils-ng

programs for dealing with numbers from the command line
C
8
star
8

talks

serge sans paille's talks
JavaScript
8
star
9

fortify-test-suite

A minimal test suite for compilers targeting -D_FORTIFY_SOURCE=1 compatibility
C
6
star
10

stack-clash-tracer

A QBDI based tracer to check `-fstack-clash-protection` implementation
C
6
star
11

pythran-openblas

Wheel builder for OpenBLAS
Python
5
star
12

scipy-kernels

Jupyter Notebook
4
star
13

tog

Static typing of python scientific program using Hindley Milner algorithm
Python
4
star
14

hardening-artefacts

A collection of small tests for hardening artefacts produced by various compilers
Shell
3
star
15

sivart

Poor man's build farm
Python
3
star
16

pythran-stories

pythran blog
HTML
3
star
17

sautoconf

simpler autoconf
C
3
star
18

cxx-snippets

Explore C++ compilation through various code snippets
C++
3
star
19

isl

Integer Set Library mirror
C
2
star
20

penty

Python
2
star
21

scipy2021

slide deck for scipy 2021 conference
CSS
2
star
22

compil-ou-face

Material for some teachings on compilation (in french)
Jupyter Notebook
2
star
23

jit-do-it

JIT compilation for C
LLVM
2
star
24

frozen_conan

conan setup for frozen
Python
2
star
25

pythran-asv

airspeed velocity-powered performance regression testing
Python
1
star
26

preprocessor-utils

Python
1
star
27

pythran-asv-snapshot

Python
1
star