Repository Details

Definition and DDLs for the OMOP Common Data Model (CDM)

How to Use this Repository

If you are looking for the SQL DDLs and don't wish to generate them through R, they can be accessed here.

If you are looking for information on how to submit a bugfix, skip to the "Bug Fixes/Model Updates" section.

Generating the DDLs

This module demonstrates two ways to use the CDM R package to create the CDM tables in your environment. First, it uses the buildRelease function to create the DDL files on your machine. This is intended for end users who wish to generate these scripts from R without cloning or downloading the source code from GitHub. The SQL scripts created through this process are available as zip files as part of the latest release. They are also available on the master branch here.

Second, the script shows the executeDdl function, which connects to your database directly (assuming your dbms is one of the supported dialects) and instantiates the tables through R.

Dependencies and prerequisites

This process requires RStudio to be installed, as well as the DatabaseConnector and SqlRender packages.

Create DDL, Foreign Keys, Primary Keys, and Indexes from R

First, install the package from GitHub

install.packages("devtools")
devtools::install_github("OHDSI/CommonDataModel")

List the currently supported SQL dialects

CommonDataModel::listSupportedDialects()

List the currently supported CDM versions

CommonDataModel::listSupportedVersions()

1. Use the buildRelease function

This function will generate the text files in the dialect you choose, putting the output files in the folder you specify.

CommonDataModel::buildRelease(cdmVersions = "5.4",
                              targetDialects = "postgresql",
                              outputfolder = "/pathToOutput")
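Before calling buildRelease, it can help to confirm that the version and dialect you plan to pass are among those the package reports. The following is a minimal sketch of that check, not part of the package: `checkSupported` is a hypothetical helper, and the guarded block only runs when CommonDataModel is installed.

```r
# Hypothetical helper (not part of the CommonDataModel package):
# stop with a readable message when a requested value is not supported.
checkSupported <- function(value, supported, what) {
  if (!(value %in% supported)) {
    stop(sprintf("Unsupported %s '%s'. Choose one of: %s",
                 what, value, paste(supported, collapse = ", ")))
  }
  invisible(value)
}

# Only runs when the package is installed.
if (requireNamespace("CommonDataModel", quietly = TRUE)) {
  checkSupported("5.4", CommonDataModel::listSupportedVersions(), "CDM version")
  checkSupported("postgresql", CommonDataModel::listSupportedDialects(), "SQL dialect")
}
```

If both checks pass silently, the same values can be passed to buildRelease as cdmVersions and targetDialects.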

2. Use the executeDdl function

If you have an empty schema ready to go, the package will connect and instantiate the tables for you. To start, you need to download DatabaseConnector in order to connect to your database.

devtools::install_github("OHDSI/DatabaseConnector")

cd <- DatabaseConnector::createConnectionDetails(dbms = "postgresql",
                                                 server = "localhost/ohdsi",
                                                 user = "postgres",
                                                 password = "postgres",
                                                 pathToDriver = "/pathToDriver"
                                                 )

CommonDataModel::executeDdl(connectionDetails = cd,
                            cdmVersion = "5.4",
                            cdmDatabaseSchema = "ohdsi_demo"
                            )
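After executeDdl finishes, you may want to confirm that the tables were actually created. The sketch below is one possible check, assuming the `cd` connection details and the "ohdsi_demo" schema from the example above. `missingTables` and `checkCdmTables` are hypothetical helpers, and the two table names in the commented call are only a small illustrative subset of the CDM.

```r
# Hypothetical helper: which expected tables are absent? The comparison is
# case-insensitive because table-name casing varies between platforms.
missingTables <- function(expected, actual) {
  expected[!(tolower(expected) %in% tolower(actual))]
}

# Hypothetical wrapper around DatabaseConnector: connect, list the tables
# in the schema, and always disconnect afterwards.
checkCdmTables <- function(connectionDetails, cdmDatabaseSchema, expected) {
  connection <- DatabaseConnector::connect(connectionDetails)
  on.exit(DatabaseConnector::disconnect(connection))
  actual <- DatabaseConnector::getTableNames(connection, cdmDatabaseSchema)
  missingTables(expected, actual)
}

# Example call (requires a live database, so it is left commented out):
# checkCdmTables(cd, "ohdsi_demo", c("person", "observation_period"))
```

An empty result means every expected table was found in the schema.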

Bug Fixes/Model Updates

NOTE: This information is for the maintainers of the CDM as well as anyone looking to submit a pull request. If you want to suggest an update or addition to the OMOP Common Data Model itself, please open an issue using the proposal template. The instructions here describe the process by which bugs in the DDL code should be addressed and/or new versions of the CDM are produced.

Just looking for the latest version of the CDM and you don't care about the R package? Please visit the releases tab and download the latest. It will include the DDLs for all currently supported versions of the CDM for all supported SQL dialects.

Typically, new CDM versions and updates are decided by the CDM working group (details on joining the meetings are on the homepage). These changes are tracked as issues in the GitHub repo. Once the working group decides which changes make up a version, all the corresponding issues should be tagged with a version number, e.g. v5.4, and added to a project board.

Step 0

Changes to the model structure should be made in the representative csv files by adding, removing, or renaming fields or tables. ETL conventions are not currently tracked by CDM version unless they are specific to new fields. For example, CONDITION_STATUS was added in v5.3, and its convention specifies how primary condition designations should be captured.

Bug fixes are made much the same way using the csv files, but they should be limited to typos, primary/foreign key relationships, and formatting (like datetime vs datetime2).

Step 1

If you are making changes to the model structure, request a new branch in the CommonDataModel repository for the new version of the CDM you are creating. Then, fork the repository and clone the newly made branch. If you are squashing bugs, fork the repository and clone the master branch.

Step 1.1

For changes to the model structure, rename the table level and field level inst/csv files from the current released version to the new version. For example, if the new version you are creating is v5.4 and the most recent released version is v5.3, rename the csv files named "OMOP_CDMv5.3_Field_Level.csv" and "OMOP_CDMv5.3_Table_Level.csv" to "OMOP_CDMv5.4_Field_Level.csv" and "OMOP_CDMv5.4_Table_Level.csv". These files serve multiple functions; they serve as the basis for the CDM DDL, CDM documentation, and Data Quality Dashboard (DQD). You can read more about the DQD here.

For squashing bugs, make the necessary changes in the csv file corresponding to the major.minor version you are fixing. For example, if you are working on fixes to v5.3.3, you would make changes in the v5.3 files. (Skip to step 2.)
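The rename described in Step 1.1 can also be done from R. This is a minimal sketch using only base R. `renameCdmCsvFiles` is a hypothetical helper, and the demonstration runs in a throwaway temporary directory rather than in inst/csv.

```r
# Hypothetical helper: rename the field- and table-level csv files
# from one CDM version to the next.
renameCdmCsvFiles <- function(csvDir, fromVersion, toVersion) {
  levels <- c("Field_Level", "Table_Level")
  from <- file.path(csvDir, sprintf("OMOP_CDMv%s_%s.csv", fromVersion, levels))
  to <- file.path(csvDir, sprintf("OMOP_CDMv%s_%s.csv", toVersion, levels))
  # Refuse to run if a source file is missing or a target already exists.
  stopifnot(all(file.exists(from)), !any(file.exists(to)))
  stopifnot(all(file.rename(from, to)))
  invisible(to)
}

# Demonstration with empty placeholder files in a temporary directory.
csvDir <- tempfile("csv_demo_")
dir.create(csvDir)
file.create(file.path(csvDir, c("OMOP_CDMv5.3_Field_Level.csv",
                                "OMOP_CDMv5.3_Table_Level.csv")))
renamed <- renameCdmCsvFiles(csvDir, "5.3", "5.4")
print(basename(renamed))
```

In a real clone you would point csvDir at inst/csv. If the earlier version's files need to stay in place, file.copy could be substituted for file.rename.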

Step 1.2

The csv files can now be updated with the changes and additions for the new CDM version. If a new table should be added, add a line to the Table_Level.csv with the table name and description and list it as part of the CDM schema. The remaining columns are quality checks that can be run. Details here on what those are.

After adding any tables, make any changes or additions to CDM fields in the Field_Level.csv. The columns are meant to mimic how a DDL is structured, which is how it will eventually be generated. A yes in the isRequired field indicates a NOT NULL constraint, and the datatype field should be filled in exactly as it would appear in the DDL. Any additions or changes should also be reflected in the userGuidance and etlConventions fields, which are the basis for the documentation.

DO NOT MAKE ANY CHANGES IN THE DDL ITSELF. The structure is set up so that the csv files are the ground truth. If changes are made in the DDL instead of the csv files, the DDL will be out of sync with the documentation and the DQD.
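To illustrate how field-level rows relate to a DDL, here is a toy sketch in base R. It is not the package's generator. The column names (cdmTableName, cdmFieldName, cdmDatatype), the `fieldsToDdl` function, and the "cdm" schema are assumptions made for illustration; only isRequired and its NOT NULL meaning come from the text above.

```r
# Toy field-level rows. Column names other than isRequired are assumed.
fields <- data.frame(
  cdmTableName = c("person", "person", "person"),
  cdmFieldName = c("person_id", "gender_concept_id", "birth_datetime"),
  isRequired   = c("Yes", "Yes", "No"),
  cdmDatatype  = c("integer", "integer", "datetime"),
  stringsAsFactors = FALSE
)

# Hypothetical generator: build one CREATE TABLE statement for one table.
fieldsToDdl <- function(fields, tableName, schema = "cdm") {
  rows <- fields[fields$cdmTableName == tableName, ]
  # "Yes" in isRequired becomes a NOT NULL constraint.
  nullability <- ifelse(tolower(rows$isRequired) == "yes", "NOT NULL", "NULL")
  columns <- sprintf("  %s %s %s", rows$cdmFieldName, rows$cdmDatatype, nullability)
  paste0("CREATE TABLE ", schema, ".", tableName, " (\n",
         paste(columns, collapse = ",\n"), "\n);")
}

ddl <- fieldsToDdl(fields, "person")
cat(ddl, "\n")
```

Because the datatype is copied verbatim into the statement, whatever is typed into the csv is what ends up in the DDL, which is why the text asks for it to be filled in exactly.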

Step 2

Once all changes are made to the csv files, rebuild the package and then open extras/codeToRun.R. To make sure that your new version is recognized by the package, run the function listSupportedVersions(). If you do not see it, make sure your new csv files are in inst/csv and that you have rebuilt the package. Once you have confirmed that the package recognizes your new version, run the function buildRelease(). You should now see a file in inst/ddl for your new version.
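If the new version does not show up, a quick look at the csv file names can rule out a naming problem before you rebuild. The sketch below is a hypothetical pre-check in base R, based only on the file-name pattern from Step 1.1. It is not how the package itself detects versions.

```r
# Hypothetical pre-check (not part of the package): report the CDM versions
# for which both the field-level and table-level csv files are present.
csvVersions <- function(fileNames) {
  pattern <- "^OMOP_CDMv([0-9.]+)_(Field|Table)_Level\\.csv$"
  hits <- fileNames[grepl(pattern, fileNames)]
  if (length(hits) == 0) {
    return(character(0))
  }
  version <- sub(pattern, "\\1", hits)
  level <- sub(pattern, "\\2", hits)
  complete <- vapply(split(level, version),
                     function(x) all(c("Field", "Table") %in% x),
                     logical(1))
  names(complete)[complete]
}

# v5.4 is missing its table-level file here, so only v5.3 is reported.
found <- csvVersions(c("OMOP_CDMv5.3_Field_Level.csv",
                       "OMOP_CDMv5.3_Table_Level.csv",
                       "OMOP_CDMv5.4_Field_Level.csv"))
print(found)
```

In a clone of the repository, the same function could be called as csvVersions(list.files("inst/csv")).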

NOTE ABOUT CDM v6.0

Please be aware that v6.0 of the OMOP CDM is not fully supported by the OHDSI suite of tools and methods. The major difference between CDM v5.3 and CDM v6.0 is that the *_datetime fields become mandatory rather than optional. This switch radically changes the assumptions related to exposure and outcome timing. Rather than move forward with v6.0, please transform your data to CDM v5.4 until we as a community have fully defined the role of dates vs datetimes, both in the model and in the evidence we generate.

More Repositories

1

Atlas

ATLAS is an open source software tool for researchers to conduct scientific analyses on standardized observational data
JavaScript
266
star
2

Vocabulary-v5.0

Build process for the OHDSI Standardized Vocabularies. Currently not available as independent release.
PLpgSQL
214
star
3

PatientLevelPrediction

An R package for performing patient level prediction in an observational database in the OMOP Common Data Model.
HTML
187
star
4

WhiteRabbit

WhiteRabbit is a small application that can be used to analyse the structure and contents of a database as preparation for designing an ETL. It comes with RabbitInAHat, an application for interactive design of an ETL to the OMOP Common Data Model with the help of the scan report generated by White Rabbit.
Java
177
star
5

DataQualityDashboard

A tool to help improve data quality standards in observational data science.
JavaScript
136
star
6

WebAPI

OHDSI WebAPI contains all OHDSI services that can be called from OHDSI applications
Java
128
star
7

Achilles

Automated Characterization of Health Information at Large-scale Longitudinal Evidence Systems (ACHILLES) - descriptive statistics about an OMOP CDM database
R
128
star
8

TheBookOfOhdsi

The Book of OHDSI repository
R
104
star
9

ETL-Synthea

A package supporting the conversion from Synthea CSV to OMOP CDM
R
97
star
10

Usagi

Usagi is an application to help create mappings between coding systems and the Vocabulary standard concepts.
Java
91
star
11

ETL-CMS

Workproducts to ETL CMS datasets into OMOP Common Data Model
Python
84
star
12

CohortMethod

An R package for performing new-user cohort studies in an observational database in the OMOP Common Data Model.
R
82
star
13

SqlRender

This is an R package and Java library for rendering parameterized SQL, and translating it to different SQL dialects.
R
77
star
14

OHDSIonAWS

Automation code and documentation for standing up the OHDSI toolstack in an AWS environment
Shell
72
star
15

Broadsea

Broadsea deploys the core OHDSI technology stack (Atlas & R Hades), using cross-platform Docker container technology.
R
70
star
16

MIMIC

MIMIC (Medical Information Mart for Intensive Care) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. This repository contains the ETL to the OMOP CDM.
Python
70
star
17

Athena

Web application for distributing and browsing the Standardized Vocabularies for all instances of an OMOP CDM
Java
57
star
18

FeatureExtraction

An R package for generating features (covariates) for a cohort using data in the Common Data Model.
R
57
star
19

OncologyWG

Oncology Working Group Repository
Ruby
56
star
20

KnowledgeBase

Source code used to develop the OHDSI knowledge base of sources with information relevant for assessing associations between drugs and health outcomes of interest.
HTML
55
star
21

DatabaseConnector

An R package for connecting to databases using JDBC.
R
54
star
22

Criteria2Query

[In Development] An application to parse freetext inclusion criteria and produce a structured cohort definition that can be executed against OMOP CDM
Java
49
star
23

ETL-CDMBuilder

ETL-CDMBuilder is a repo containing a .NET Core application to perform ETL to OMOP CDM for multiple databases
C#
49
star
24

OHDSI-in-a-Box

Virtual Machine containing SynPUF data in OMOP CDM, a RDBS including query client, WebAPI, ATLAS, R and Python.
49
star
25

Tutorial-ETL

Course materials for OHDSI ETL tutorial
R
45
star
26

Eunomia

An R package that facilitates access to a variety of OMOP CDM sample data sets.
R
42
star
27

CohortDiagnostics

An R package for performing various cohort diagnostics.
R
41
star
28

PhenotypeLibrary

A repository to store, organize and maintain the content of the OHDSI Phenotype library. OHDSI Forum post https://forums.ohdsi.org/t/ohdsi-phenotype-library-announcements/16910
R
38
star
29

StudyProtocols

Repository of OHDSI Collaborative Research Protocols
R
37
star
30

Cyclops

Cyclops (Cyclic coordinate descent for logistic, Poisson and survival analysis) is an R package for performing large scale regularized regressions.
C++
34
star
31

Aphrodite

[in development]
R
33
star
32

NLPTools

[in development] Tools to support Natural Language Processing of freetext to create structured data elements for analysis
Java
32
star
33

StudyProtocolSandbox

This repository is for developing study packages for OHDSI studies. Once completed, they can be moved to the StudyProtocols repository.
R
32
star
34

Perseus

[under development] Tools for ETL into OMOP CDM and deployment of OHDSI toolstack
TypeScript
32
star
35

Themis

Repository for OMOP CDM conventions as defined by THEMIS. These can be reference lists of concepts, pieces of standardized code for data generation or quality certification, and debates.
27
star
36

ShinyDeploy

Shiny apps in this repository will be automatically deployed to the OHDSI Shiny server.
R
26
star
37

OMOP-Queries

Jupyter Notebook
23
star
38

Hades

Health Analytics Data-to-Evidence Suite (HADES): A collection of R packages for performing analytics against the Common Data Model.
R
23
star
39

Tutorial-CDM

Training materials for Vocabulary & CDM Tutorial.
TSQL
20
star
40

ClinicalTrialsWGETL

[under development] ETL materials to support proposal for CDM enhancements for clinical trial data
HTML
20
star
41

AchillesWeb

Interactive web site for reviewing the results of the Achilles R package.
JavaScript
19
star
42

bayes-bridge

Bayesian sparse regression with regularized shrinkage and conjugate gradient acceleration
Jupyter Notebook
19
star
43

Genomic-CDM

Repository for development of the genomic module of the CDM.
18
star
44

PheValuator

An R package for evaluating phenotype algorithms.
R
17
star
45

Radiology-CDM

Pilot model and converter for integration of radiology data into OMOP-CDM
R
17
star
46

FhirToCdm

Conversion from FHIR HL7 to OMOP CDM
C#
16
star
47

FuzzyForest

[Under development] Random classification and regression trees
R
16
star
48

Capr

Cohort definition Application Programming in R
R
15
star
49

dbt-synthea

[Under development] A dbt ETL project to convert a Synthea synthetic data set into the OMOP CDM
Python
15
star
50

CommonEvidenceModel

Common Evidence Model (CEM) is a structure for standardizing evidence about drug-outcome relationship across disparate evidence sources.
R
14
star
51

InspectOMOP

InspectOmop is a lightweight Python 3 package that assists in the extraction of electronic health record (EHR) data from relational databases following the OHDSI OMOP Common Data Model (CDM) standard v>=5.
Python
14
star
52

ArachneUI

Network infrastructure for collaborative studies across disparate data nodes and researches
SCSS
14
star
53

ETL-LambdaBuilder

CDM Builder leveraging AWS Lambda
C#
14
star
54

CureIdRegistry

TSQL
14
star
55

SelfControlledCaseSeries

An R package for performing Self-Controlled Case Series (SCCS) analyses in an observational database in the OMOP Common Data Model.
R
13
star
56

QueryLibrary

This is an R package that implements a library of standard queries that run against the OMOP-CDM.
R
13
star
57

Covid-19

The OHDSI repository to provide comprehensive evidence for COVID-19
HTML
13
star
58

OMOP-Standardized-Vocabularies

This repository is no longer active. Its only purpose was to create releases of the Standardized Vocabularies, i.e. the content, not those of the Pallas Vocabulary Build System itself. As of 17-July-2018, vocabulary releases are also processed by Pallas. Please visit https://github.com/OHDSI/Vocabulary-v5.0/releases.
13
star
59

RcppXsimd

R package wrapper for the C++ header-only library Xsimd that provides parallelized math implementations using SIMD
C++
12
star
60

Ares

A Research Exploration System
Vue
12
star
61

CohortGenerator

Cohort Generation for the OMOP Common Data Model
R
11
star
62

Andromeda

AsynchroNous Disk-based Representation of MassivE DAta: An R package aimed at replacing ff for storing large data objects.
R
11
star
63

Apollo

[Under development] Assessment of Pre-trained Observational Large Language-models in OHDSI (APOLLO)
Python
11
star
64

DeepPatientLevelPrediction

An R package for performing patient level prediction using deep learning in an observational database in the OMOP Common Data Model.
R
11
star
65

ROhdsiWebApi

An R package for interfacing with a WebAPI instance
R
10
star
66

GIS

R
10
star
67

EmpiricalCalibration

An R package for performing empirical calibration of observational study estimates
R
10
star
68

ParallelLogger

An R package for easy parallel computing, logging, and function call automation.
R
10
star
69

Broadsea-WebTools

A Docker container that includes the OHDSI WebAPI (running in Apache Tomcat) and the OHDSI web applications.
Dockerfile
10
star
70

circe-be

CIRCE is a cohort definition and syntax compiler tool for OMOP CDMv5
Java
9
star
71

Koios

Tool to identify concept in the OMOP Genomic vocabulary from VCF and other files as well as HGVS notations
R
9
star
72

Visualizations

[Under development] Visualizations is a collection of JavaScript modules to support D3 visualizations in web-based applications
JavaScript
8
star
73

ETL-German-FHIR-Core

ETL process from FHIR (defined by the German Medical Informatics Initiative) to OMOP
Java
8
star
74

Tutorial-PLP

R
8
star
75

OMOPV4_PCORNetV1_ETL

ETL script to transform data from OMOP v4 CDM to PCORNet V1 CDM
8
star
76

CaseControl

An R package for performing (nested) matched case-control analyses in an observational database in the OMOP Common Data Model.
R
8
star
77

EvidenceSynthesis

An R package for combining evidence from multiple sources (e.g. multiple data sites)
R
8
star
78

OhdsiShinyModules

An R package containing Shiny modules used by various OHDSI Shiny apps
R
8
star
79

Nostos

Navigate OMOP-structured data via text-to-SQL
Jupyter Notebook
7
star
80

ETL---Korean-NSC

ETL code for converting Korean National Sample Cohort (NSC) derived from national insurance health service into OMOP-CDM v5 developed by Ajou University
R
7
star
81

ImageWG

Repository for medical image working group
Rich Text Format
7
star
82

OSIM-v5

An updated version of OSIM for CDM v5
PLpgSQL
7
star
83

Strategus

[Under development] An R package for coordinating and executing analytics using HADES modules
R
7
star
84

AthenaUI

UI of web application for distributing and browsing the Standardized Vocabularies for the OMOP CDM
TypeScript
6
star
85

BigKnn

An R package implementing a large scale k-nearest neighbor classifier using the Lucene search engine
R
6
star
86

IcTemporalPatternDiscovery

An R package for performing the IC Temporal Pattern Discovery method.
R
6
star
87

Legend

An R package implementing Large-Scale Evidence Generation and Evaluation in a Network of Databases (LEGEND).
R
6
star
88

OhdsiRTools

An R package of support tools that didn’t fit other categories, including tools for maintaining R libraries.
R
6
star
89

DbDiagnostics

Package to profile a database and execute data diagnostics based on individual analysis settings
R
6
star
90

CohortIncidence

Contains the Java and R assets to perform Incidence calculations on a CDM
R
6
star
91

CirceR

R package wrapper for CIRCE
R
6
star
92

SelfControlledCohort

An R package for performing self-controlled cohort analyses, a method to estimate risk by comparing time exposed with time unexposed among the exposed cohort.
R
6
star
93

MethodEvaluation

An R package for the evaluation of estimation methods
R
6
star
94

Hermes

(DEPRECATED) HERMES is a vocabulary browser tool for OMOP CDM v5
JavaScript
6
star
95

Hestia

Hestia is an API for function calling on the OMOP CDM.
Python
6
star
96

Circe

[Under development] CIRCE is a cohort definition and syntax compiler tool for OMOP CDMv5
JavaScript
5
star
97

EunomiaDatasets

Hosting of sample CDM datasets in CSV format for use in testing throughout the OHDSI community. Eunomia R package to manage the datasets can be accessed at https://github.com/OHDSI/Eunomia.
5
star
98

ArachneCentralAPI

Arachne Central middle-tier including Service API.
Java
5
star
99

Calypso

CALYPSO (Criteria Assessment Logic for Your Population Study in Observational data) is a web user interface to define, instantiate and evaluate a study population and the implications of inclusion criteria
JavaScript
5
star
100

StandardizedAnalysisAPI

Interfaces for standardized OHDSI analyses (Cohort Characterization, TxPathway, Incidence Rate etc.) used as an exchange standard or implementation guide
Java
5
star