Language: Scala
License: Apache License 2.0

Code Property Graph: specification, query language, and utilities

Code Property Graph - Specification and Tooling

You can find a clickable specification at:

https://cpg.joern.io

Note: for first-time users, we recommend building "joern" at https://github.com/joernio/joern/ instead. It combines this repo with a C/C++ language frontend to construct a complete code analysis platform.

A Code Property Graph (CPG) is an extensible and language-agnostic representation of program code designed for incremental and distributed code analysis. This repository hosts the base specification together with a build process that generates data structure definitions for accessing the graph with different programming languages.

We are publishing the Code Property Graph specification as a suggestion for an open standard for the exchange of code in intermediate representations along with analysis results. With this goal in mind, the specification consists of a minimal base schema that can be augmented via extension schemas to enable storage of application-specific data.
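
The relationship between a base schema and an extension schema can be sketched in a few lines of Scala. This is only an illustration of the idea, not the schema machinery used in this repository: the Schema class, its methods, and the RISK_SCORE and FINDING entries are hypothetical names chosen for the example.

```scala
// Hypothetical miniature schema model. A schema maps each node label to the
// property keys that nodes of that type are allowed to carry.
final case class Schema(nodeTypes: Map[String, Set[String]]) {

  // Merge an extension schema: unknown node labels are added, and labels that
  // already exist gain the extension's additional property keys.
  def extendWith(ext: Schema): Schema = {
    val merged = ext.nodeTypes.foldLeft(nodeTypes) { case (acc, (label, props)) =>
      acc.updated(label, acc.getOrElse(label, Set.empty[String]) ++ props)
    }
    Schema(merged)
  }

  // A node is accepted if its label is declared and it uses only declared keys.
  def accepts(label: String, properties: Map[String, Any]): Boolean =
    nodeTypes.get(label).exists(allowed => properties.keySet.subsetOf(allowed))
}

object SchemaExample {
  // Minimal base schema (illustrative subset only).
  val base: Schema = Schema(Map(
    "METHOD" -> Set("NAME", "FULL_NAME"),
    "CALL"   -> Set("NAME", "CODE")
  ))

  // Application-specific extension (hypothetical labels and keys).
  val extension: Schema = Schema(Map(
    "METHOD"  -> Set("RISK_SCORE"),
    "FINDING" -> Set("TITLE")
  ))

  val combined: Schema = base.extendWith(extension)
}
```

With these definitions, a METHOD node carrying RISK_SCORE is rejected by SchemaExample.base and accepted by SchemaExample.combined, which is the sense in which an extension augments the base schema without changing it.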

Usage as a dependency

build.sbt:

libraryDependencies += "io.shiftleft" %% "codepropertygraph" % "x.y.z"

Building the code

The build process has been verified on Linux, and it should be possible to build on OS X and BSD systems as well. The commands below assume that a JDK, sbt, and git-lfs are installed.

Some binary files required for testing are managed through git-lfs. If you haven't cloned this repository yet, run git lfs install before cloning. If you have cloned it already, additionally run git lfs pull from within the repository.

Additional build-time dependencies are automatically downloaded as part of the build process. To build and install into your local Maven cache, issue the command sbt clean test publishM2.

This repo cross-compiles to Scala 2.13.x and 3.x. The default version is Scala 3.x, which means all sbt commands compile with Scala 3.x unless told otherwise. To run an sbt command for both versions, prefix it with a "+", as in sbt +test. In an open sbt shell you can switch the compiler version for the duration of the session with ++, e.g. ++2.13.7 or ++3.1.0.
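
For readers unfamiliar with sbt cross-building, the fragment below shows how a default version and a set of cross versions are typically declared in a single-project build.sbt. It is a sketch for a hypothetical downstream project, not a copy of this repository's build definition; the version numbers are simply the ones mentioned above.

```scala
// build.sbt of a hypothetical single-project build (assumption: not this repo's build)

// Default compiler version, used by plain commands such as `sbt test`.
scalaVersion := "3.1.0"

// Versions covered when a command is prefixed with "+", such as `sbt +test`.
crossScalaVersions := Seq("2.13.7", "3.1.0")
```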

Code style

Code style is automatically verified by scalafmt.

If your PR build fails the code formatting check, simply run sbt scalafmt and submit the change along with the rest of the code. The command runs the necessary formatting in the right order.

Creating Protocol Buffer bindings for different languages

The codepropertygraph-VERSION.jar artifact contains a Protocol Buffer definition file cpg.proto that you can use to generate your own language-specific bindings. For instance, to create C++ and Python bindings, issue the following series of commands:

sbt package
cd codepropertygraph/target
mkdir cpp python
protoc --cpp_out=cpp --python_out=python cpg.proto

Loading a codepropertygraph

Here's how you can load a cpg into ShiftLeft Tinkergraph [3] in the sbt console - the next section will list some queries you can interactively run from there.

There are some sample cpgs in this repository in the resources/testcode/cpgs directory. You can run ./regenerate-test-cpgs.sh to update them, but this requires the proprietary java2cpg to be installed locally.

Tinkergraph (in memory reference db)

sbt semanticcpg/console
import io.shiftleft.codepropertygraph.Cpg
import io.shiftleft.semanticcpg.language._
val cpg = io.shiftleft.codepropertygraph.cpgloading.CpgLoader.load("./resources/testcode/cpgs/hello-shiftleft-0.0.5/cpg.bin.zip")

Querying the cpg

Once you've loaded a cpg you can run queries, which are provided by the query-primitives subproject. Note that if you're in the sbt shell you can play with it interactively: TAB completion is your friend. Otherwise your IDE will assist.

Here are some simple traversals to get all the base nodes. Running all of these without errors is a good test to ensure that your cpg is valid:

cpg.literal.toList
cpg.file.toList
cpg.namespace.toList
cpg.types.toList
cpg.methodReturn.toList
cpg.parameter.toList
cpg.member.toList
cpg.call.toList
cpg.local.toList
cpg.identifier.toList
cpg.argument.toList
cpg.typeDecl.toList
cpg.method.toList
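
If you want to run such a list of traversals as one check instead of typing each line, a small helper along the following lines can collect the ones that throw. This is a minimal sketch using only the Scala standard library; TraversalSmokeTest and failures are hypothetical names and are not part of this repository.

```scala
import scala.util.{Failure, Success, Try}

object TraversalSmokeTest {
  // Runs every named check and returns the names of those that threw,
  // together with the exception. An empty result means all checks passed.
  // Note: Try only captures non-fatal exceptions.
  def failures(checks: Seq[(String, () => Any)]): Seq[(String, Throwable)] =
    checks.flatMap { case (name, run) =>
      Try(run()) match {
        case Success(_) => None
        case Failure(e) => Some(name -> e)
      }
    }
}
```

In the sbt console, with a cpg loaded as shown earlier, the checks could be passed as Seq("literal" -> (() => cpg.literal.toList), "file" -> (() => cpg.file.toList)) and so on for the remaining traversals.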

From here you can traverse through the cpg. The query-primitives DSL ensures that only valid steps are available - anything else will result in a compile error:

cpg.method.name("getAccountList").parameter.toList
/* List(
 *   MethodParameterIn(Some(v[7054781587948444580]),this,0,this,BY_SHARING,io.shiftleft.controller.AccountController,Some(28),None,None,None),
 *   MethodParameterIn(Some(v[7054781587948444584]),request,2,request,BY_SHARING,javax.servlet.http.HttpServletRequest,Some(28),None,None,None),
 *   MethodParameterIn(Some(v[7054781587948444582]),response,1,response,BY_SHARING,javax.servlet.http.HttpServletResponse,Some(28),None,None,None)
 *   )
 **/

cpg.method.name("getAccountList").definingTypeDecl.toList.head
// TypeDecl(Some(v[464]),AccountController,io.shiftleft.controller.AccountController,false,List(java.lang.Object))
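
To illustrate how a DSL can make invalid steps a compile error, here is a self-contained sketch of the general technique: steps are attached only to traversals of the node type they apply to. All names in it (TypedSteps, Traversal, MethodSteps, and the node classes) are hypothetical simplifications for this example and do not reproduce the actual implementation in this repository.

```scala
object TypedSteps {
  // Hypothetical, heavily simplified node types.
  final case class Parameter(name: String)
  final case class Method(name: String, parameters: List[Parameter])
  final case class Literal(code: String)

  // A traversal is a thin wrapper around the elements it currently holds.
  final case class Traversal[A](elements: List[A]) {
    def toList: List[A] = elements
  }

  // Steps that only make sense for methods exist on Traversal[Method] only.
  implicit class MethodSteps(trav: Traversal[Method]) {
    // Keep only methods with exactly this name.
    def name(value: String): Traversal[Method] =
      Traversal(trav.elements.filter(_.name == value))

    // Move from methods to their parameters.
    def parameter: Traversal[Parameter] =
      Traversal(trav.elements.flatMap(_.parameters))
  }
}

object TypedStepsDemo {
  import TypedSteps._

  val methods: Traversal[Method] = Traversal(List(
    Method("getAccountList", List(Parameter("request"), Parameter("response"))),
    Method("other", Nil)
  ))

  val params: List[Parameter] =
    methods.name("getAccountList").parameter.toList

  // The following line would not compile, because no `parameter` step is
  // defined for Traversal[Literal]:
  // Traversal(List(Literal("42"))).parameter
}
```

The design choice here is that the compiler, rather than a runtime check, decides which steps are available: a step that is not defined for the current element type simply does not exist.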

Scripts

Dump the CPG schema

You can dump the current Code Property Graph schema using the schema2json.sh bash script.

./schema2json.sh
Schema written to: /tmp/schema.json

Further Reading
