Connect to ODBC databases (using the DBI interface)

odbc

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

The goal of the odbc package is to provide a DBI-compliant interface to ODBC drivers. This makes it easy to connect to databases such as SQL Server, Oracle, Databricks, and Snowflake.

The odbc package is an alternative to the RODBC and RODBCDBI packages, and is typically much faster. See vignette("benchmarks") to learn more.

Overview

The odbc package is one piece of the R interface to databases with support for ODBC:

A diagram containing four boxes with arrows linking each pointing left to right. The boxes read, in order, "R interface," "driver manager," "ODBC driver," and "DBMS." The left-most box, R interface, contains three smaller components, labeled "dbplyr," "DBI," and "odbc."

Support for a given DBMS is provided by an ODBC driver, which defines how to interact with that DBMS using the standardized syntax of ODBC and SQL. Drivers can be downloaded from the DBMS vendor or, if you're a Posit customer, using the professional drivers.

Drivers are managed by a driver manager, which is responsible for configuring driver locations and, optionally, named data sources that describe how to connect to a specific database. Windows is bundled with a driver manager, while macOS and Linux require installation of unixODBC. Drivers often require some manual configuration; see vignette("setup") for details.

In the R interface, the DBI package provides a front-end while odbc implements a back-end to communicate with the driver manager. The odbc package is built on top of the nanodbc C++ library. To interface with DBMSs using R and odbc:

A high-level workflow for using the R interface in 3 steps. In step 1, configure drivers and data sources, the functions odbcListDrivers() and odbcListDataSources() help to interface with the driver manager. In step 2, the dbConnect() function, called with the first argument odbc(), connects to a database using the specified ODBC driver to create a connection object "con." Finally, in step 3, that connection object can be passed to various functions to retrieve information on database structure, iteratively develop queries, and query data objects.
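The three steps above can be sketched in code roughly as follows. This is a minimal sketch, not the package's prescribed workflow: the helper names list_odbc_setup() and connect_to_dsn() and the data source name "my-dsn" are hypothetical, and the sketch assumes a data source has already been configured with the driver manager.

```r
# Step 1: ask the driver manager which drivers and named data sources
# it knows about. Both functions return data frames.
# `list_odbc_setup` is a hypothetical helper name used for illustration.
list_odbc_setup <- function() {
  list(
    drivers = odbc::odbcListDrivers(),
    data_sources = odbc::odbcListDataSources()
  )
}

# Step 2: create the connection object "con" from a named data source.
# `connect_to_dsn` is also a hypothetical helper; "my-dsn" below is a
# placeholder for a data source configured on your machine.
connect_to_dsn <- function(dsn) {
  DBI::dbConnect(odbc::odbc(), dsn = dsn)
}

# Step 3: pass the connection to DBI functions, then disconnect.
# Uncomment once a driver and data source are configured:
# setup <- list_odbc_setup()
# con <- connect_to_dsn("my-dsn")
# DBI::dbListTables(con)
# DBI::dbDisconnect(con)
```

Connecting through a data source name keeps driver and server details in the driver manager's configuration rather than in the R script.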

You might also use the dbplyr package to automatically generate SQL from your dplyr code.
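As a rough illustration of the dbplyr route, the sketch below builds a lazy query with dplyr verbs. It assumes the dplyr and dbplyr packages are installed and that the database has a flights table with origin and tailnum columns; the helper name flights_by_origin() is hypothetical.

```r
# Build a lazy query against a remote table using dplyr verbs.
# Nothing is sent to the database until the result is collected.
# `flights_by_origin` is a hypothetical helper name; the "flights" table
# and its `origin` and `tailnum` columns are assumed to exist.
flights_by_origin <- function(con) {
  flights <- dplyr::tbl(con, "flights")                # reference to the remote table
  with_tail <- dplyr::filter(flights, !is.na(tailnum)) # becomes a WHERE clause
  dplyr::count(with_tail, origin)                      # becomes GROUP BY + COUNT
}

# Usage, given an open connection `con`:
# lazy <- flights_by_origin(con)
# dplyr::show_query(lazy)   # print the SQL that dbplyr generated
# dplyr::collect(lazy)      # run the query and return a local tibble
```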

Installation

Install the latest release of odbc from CRAN with the following code:

install.packages("odbc")

To get a bug fix or to use a feature from the development version, you can install the development version of odbc from GitHub:

# install.packages("pak")
pak::pak("r-dbi/odbc")

Usage

To use odbc, begin by creating a database connection, which might look something like this:

library(DBI)

con <- dbConnect(
  odbc::odbc(),
  driver = "SQL Server",
  server = "my-server",
  database = "my-database",
  uid = "my-username",
  pwd = rstudioapi::askForPassword("Database password")
)

(See vignette("setup") for examples of connecting to a variety of databases.)

dbListTables() is used for listing all existing tables in a database.

dbListTables(con)

dbReadTable() will read a full table into an R data.frame().

data <- dbReadTable(con, "flights")

dbWriteTable() will write an R data.frame() to a SQL table.

dbWriteTable(con, "iris", iris)

dbGetQuery() will submit a SQL query and fetch the results:

df <- dbGetQuery(
  con,
  "SELECT flight, tailnum, origin FROM flights ORDER BY origin"
)

It is also possible to submit the query and fetch separately with dbSendQuery() and dbFetch(). This allows you to use the n argument to dbFetch() to iterate over results that would otherwise be too large to fit in memory.
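One way this chunked pattern might look is sketched below. The helper name fetch_in_chunks(), its handle_chunk callback, and the default chunk size are assumptions made for illustration; only dbSendQuery(), dbFetch(), dbHasCompleted(), and dbClearResult() come from DBI.

```r
# Send a query once, then pull the result set in fixed-size chunks so that
# only one chunk is held in memory at a time.
# `fetch_in_chunks` and `handle_chunk` are hypothetical names.
fetch_in_chunks <- function(con, sql, handle_chunk, chunk_size = 10000) {
  res <- DBI::dbSendQuery(con, sql)
  # Always release the result set, even if handle_chunk() throws an error.
  on.exit(DBI::dbClearResult(res), add = TRUE)

  total_rows <- 0
  while (!DBI::dbHasCompleted(res)) {
    chunk <- DBI::dbFetch(res, n = chunk_size)  # at most chunk_size rows
    if (nrow(chunk) == 0) break                 # nothing left to read
    handle_chunk(chunk)                         # e.g. aggregate or write to disk
    total_rows <- total_rows + nrow(chunk)
  }
  invisible(total_rows)
}

# Usage, given an open connection `con`:
# fetch_in_chunks(
#   con,
#   "SELECT flight, tailnum, origin FROM flights ORDER BY origin",
#   handle_chunk = function(chunk) print(nrow(chunk)),
#   chunk_size = 5000
# )
```

Clearing the result with dbClearResult() matters here because an open result set can hold resources on the connection until it is released.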