Learn NoSQL fundamentals in this hands-on workshop

🎓🔥 Introduction to NoSQL Databases

License Apache2 Discord

These instructions will lead you step by step through the workshop introducing NoSQL database technologies.

Materials for the Session

Whether you join our workshop live or prefer to work through it at your own pace, we have you covered. In this repository, you'll find everything you need for this workshop.

Participation Badge / Homework

To get the verified badge, you have to complete the following steps:

  1. Complete the practice steps of this workshop as explained below. Steps 1-4 (Astra account + tabular/document/key-value databases) are mandatory, step 5 (graph database) is optional. Take a screenshot of completion of the last step for sections 2, 3 and 4 (either a CQL command output or a response in the Swagger UI). NOTE: When taking screenshots ensure NOT to copy your Astra DB secrets!
  2. Submit the practice here, answering a few "theory" questions and also attaching the screenshots.

Practice

  1. Login or Register to AstraDB and create database
  2. Tabular Databases
  3. Document Databases
  4. Key-Value Databases
  5. Graph Databases

1. Login or Register to AstraDB and create database

Astra DB is the simplest way to run Cassandra with zero operations at all: just push the button and get your cluster. No credit card is required, and you get a monthly free credit covering about 20M reads/writes and 80GB of storage (sufficient to run small production workloads), all for FREE.

1a. Register a free account on Astra

Log in or register on DataStax Astra DB. You can use your GitHub or Google account, or register with an email address.

Use the following values when creating the database (this makes your life easier further on):

 Field          | Value
----------------+----------------------------------------------------------------------
 database name  | workshops
 keyspace       | nosql1
 Cloud Provider | Stick to GCP and then pick an "unlocked" region to start immediately

More info on account creation here.

You will see your new database as pending or initializing on the Dashboard. The status will then change to Active when the database is ready: this will only take 2-3 minutes. At that point you will also receive a confirmation email.

2. Tabular databases

In a tabular database we will store ... tables! The Astra DB Service is built on Apache Cassandra™, which is tabular. Let's start with this.

Tabular databases organize data in rows and columns, but with a twist from the traditional RDBMS. Also known as wide-column stores or partitioned row stores, they provide the option to organize related rows in partitions that are stored together on the same replicas to allow fast queries. Unlike RDBMSs, the tabular format is not necessarily strict. For example, Apache Cassandra™ does not require all rows to contain values for all columns in the table. Like Key/Value and Document databases, Tabular databases use hashing to retrieve rows from the table. Examples include: Cassandra, HBase, and Google Bigtable.
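
To make the partitioning idea more concrete, here is a minimal Python sketch (not Cassandra's actual implementation). It hashes a partition key to choose a node and keeps all rows with the same key together. The node names, the use of MD5 and the helper functions are assumptions made for illustration only; Cassandra uses its own partitioner (Murmur3 by default), a token ring and replication.

```python
import hashlib
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members


def node_for(partition_key: str) -> str:
    """Map a partition key to a node by hashing it (simplified: no token ring, no replicas)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


# node -> partition key -> list of rows (each row is a dict; columns may be missing)
storage = defaultdict(lambda: defaultdict(list))


def insert(partition_key: str, row: dict) -> None:
    storage[node_for(partition_key)][partition_key].append(row)


def read_partition(partition_key: str) -> list:
    # Only one node is consulted: the hash tells us where the partition lives.
    return storage[node_for(partition_key)][partition_key]


alice = "1cafb6a4-396c-4da1-8180-83531b6a41e3"
insert(alice, {"account_type": "Savings", "account_balance": 1500})
insert(alice, {"account_type": "Checking"})  # a row need not fill every column

print(node_for(alice), read_partition(alice))
```

Both rows end up in the same partition on the same (hypothetical) node, so reading them back requires contacting that node only.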

2a. Describe your Keyspace

At database creation you provided a keyspace, a logical grouping for tables. Let's visualize it. In Astra DB, go to the CQL Console to enter the following commands.

Select your database

Go to the CQL Console

Enter the describe command

... and press Enter:

DESCRIBE KEYSPACES;

2b. Create table

Table creation

Execute the following Cassandra Query Language commands

USE nosql1;

CREATE TABLE IF NOT EXISTS accounts_by_user (
  user_id         UUID,
  account_id      UUID,
  account_type    TEXT,
  account_balance DECIMAL,
  user_name       TEXT      STATIC,
  user_email      TEXT      STATIC,
  PRIMARY KEY ( (user_id), account_id)
)   WITH CLUSTERING ORDER BY (account_id ASC);

Check

Check keyspace contents and structure:

DESCRIBE KEYSPACE nosql1;

👁️ Expected output

CREATE KEYSPACE nosql1 WITH replication = {'class': 'NetworkTopologyStrategy', 'eu-central-1': '3'}  AND durable_writes = true;

CREATE TABLE nosql1.accounts_by_user (
    user_id uuid,
    account_id uuid,
    account_balance decimal,
    account_type text,
    user_email text static,
    user_name text static,
    PRIMARY KEY (user_id, account_id)
) WITH CLUSTERING ORDER BY (account_id ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

2c. Working with DATA

Insert some entries into the table

INSERT INTO accounts_by_user(user_id, account_id, account_balance, account_type, user_email, user_name)
VALUES(
    1cafb6a4-396c-4da1-8180-83531b6a41e3,
    811b56c3-cead-40d9-9a3d-e230dcd64f2f,
    1500,
    'Savings',
    '[email protected]',
    'Alice'
);

INSERT INTO accounts_by_user(user_id, account_id, account_balance, account_type)
VALUES(
    1cafb6a4-396c-4da1-8180-83531b6a41e3,
    83428a85-5c8f-4398-8019-918d6e1d3a93,
    2500,
    'Checking'
);

INSERT INTO accounts_by_user(user_id, account_id, account_balance, account_type, user_email, user_name)
VALUES(
    0d2b2319-9c0b-4ecb-8953-98687f6a99ce,
    81def5e2-84f4-4885-a920-1c14d2be3c20,
    1000,
    'Checking',
    '[email protected]',
    'Bob'
);

Read values

SELECT * FROM accounts_by_user;

Such a full-table query is strongly discouraged in most distributed databases as it involves contacting many nodes to assemble the whole result dataset: here we are using it for learning purposes, not in production and on a table with very few rows!

👁️ Expected output

 user_id                              | account_id                           | user_email        | user_name | account_balance | account_type
--------------------------------------+--------------------------------------+-------------------+-----------+-----------------+--------------
 0d2b2319-9c0b-4ecb-8953-98687f6a99ce | 81def5e2-84f4-4885-a920-1c14d2be3c20 |   [email protected] |       Bob |            1000 |     Checking
 1cafb6a4-396c-4da1-8180-83531b6a41e3 | 811b56c3-cead-40d9-9a3d-e230dcd64f2f | [email protected] |     Alice |            1500 |      Savings
 1cafb6a4-396c-4da1-8180-83531b6a41e3 | 83428a85-5c8f-4398-8019-918d6e1d3a93 | [email protected] |     Alice |            2500 |     Checking

(3 rows)

Notice that all three rows are "filled with data", despite the second of the insertions above skipping the user_email and user_name columns: this is because these are static columns (i.e. associated to the whole partition) and their value had been written already in the first insertion.
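
The behaviour of static columns can be pictured with a small Python model. This is only a sketch under simplifying assumptions: the `Partition` class, its methods and the shortened account identifiers are hypothetical and do not reflect how Cassandra stores data internally.

```python
class Partition:
    """One partition: static columns are stored once, regular columns once per row."""

    def __init__(self):
        self.static = {}  # shared by every row in the partition
        self.rows = {}    # clustering key -> regular columns

    def insert(self, clustering_key, regular, static=None):
        self.rows[clustering_key] = dict(regular)
        if static:
            self.static.update(static)

    def select_all(self):
        # Each returned row merges the shared static values with its own columns.
        return [
            {"account_id": key, **self.static, **cols}
            for key, cols in sorted(self.rows.items())
        ]


alice = Partition()
alice.insert("811b56c3", {"account_type": "Savings"}, static={"user_name": "Alice"})
alice.insert("83428a85", {"account_type": "Checking"})  # no static values given

for row in alice.select_all():
    print(row)
```

The second row shows `user_name` as well, although its insertion never mentioned it: the value belongs to the partition, not to the individual row.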

Read by primary key

SELECT user_email, account_type, account_balance
  FROM accounts_by_user
  WHERE user_id=0d2b2319-9c0b-4ecb-8953-98687f6a99ce
    AND account_id=81def5e2-84f4-4885-a920-1c14d2be3c20;

👁️ Expected output

 user_email      | account_type | account_balance
-----------------+--------------+-----------------
 [email protected] |     Checking |            1000

(1 rows)

2d. Working with PARTITIONS

Data can also be grouped: we store together what should be retrieved together.

Try a query not compatible with the data model

SELECT account_id, account_type, account_balance
   FROM accounts_by_user
   WHERE account_id=81def5e2-84f4-4885-a920-1c14d2be3c20;

Yes, we know: the query fails with an error, because it does not specify the partition key (user_id). Now let's see why.

TRACING ON;
SELECT account_id, account_type, account_balance
   FROM accounts_by_user
   WHERE account_id=81def5e2-84f4-4885-a920-1c14d2be3c20
   ALLOW FILTERING;
TRACING OFF;

Note: ALLOW FILTERING should almost never be used in production. We use it here only to see what happens!

👁️ Output

 account_id                           | account_type | account_balance
--------------------------------------+--------------+-----------------
 81def5e2-84f4-4885-a920-1c14d2be3c20 |     Checking |            1000

(1 rows)

But also ("Anatomy of a full-cluster scan"):

Tracing session: e97b98b0-d146-11ec-a4e5-19251c2b96e1

 activity                                                                                                                   | timestamp                  | source      | source_elapsed | client
----------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------+----------------+-----------------------------------------
                                                                                                         Execute CQL3 query | 2022-05-11 16:25:03.675000 | 10.0.63.218 |              0 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
 Parsing SELECT[....]_by_user\n   WHERE account_id=81def5e2-84f4-4885-a920-1c14d2be3c20\n   ALLOW FILTERING; [CoreThread-0] | 2022-05-11 16:25:03.676000 | 10.0.63.218 |            229 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                                         Preparing statement [CoreThread-0] | 2022-05-11 16:25:03.676000 | 10.0.63.218 |            445 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                                Computing ranges to query... [CoreThread-0] | 2022-05-11 16:25:03.681000 | 10.0.63.218 |           5970 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                         READS.RANGE_READ message received from /10.0.63.218 [CoreThread-9] | 2022-05-11 16:25:03.682000 | 10.0.31.189 |             -- | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                Submitting range requests on 25 ranges with a concurrency of 1 (0.0 rows per range expected) [CoreThread-0] | 2022-05-11 16:25:03.682000 | 10.0.63.218 |           6197 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                       Submitted 1 concurrent range requests [CoreThread-0] | 2022-05-11 16:25:03.682000 | 10.0.63.218 |           6312 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                             Sending READS.RANGE_READ message to /10.0.32.75, size=227 bytes [CoreThread-9] | 2022-05-11 16:25:03.682000 | 10.0.63.218 |           6436 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                            Sending READS.RANGE_READ message to /10.0.31.189, size=227 bytes [CoreThread-8] | 2022-05-11 16:25:03.682000 | 10.0.63.218 |           6436 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                         READS.RANGE_READ message received from /10.0.63.218 [CoreThread-4] | 2022-05-11 16:25:03.683000 |  10.0.32.75 |             -- | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
             Executing seq scan across 0 sstables for (min(-9223372036854775808), min(-9223372036854775808)] [CoreThread-4] | 2022-05-11 16:25:03.683000 |  10.0.32.75 |            444 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
             Executing seq scan across 0 sstables for (min(-9223372036854775808), min(-9223372036854775808)] [CoreThread-9] | 2022-05-11 16:25:03.684000 | 10.0.31.189 |            356 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                       Read 1 live rows and 0 tombstone ones [CoreThread-4] | 2022-05-11 16:25:03.684000 |  10.0.32.75 |            789 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                       Read 1 live rows and 0 tombstone ones [CoreThread-9] | 2022-05-11 16:25:03.684000 | 10.0.31.189 |            731 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                          Enqueuing READS.RANGE_READ response to /10.0.32.75 [CoreThread-4] | 2022-05-11 16:25:03.684000 |  10.0.32.75 |            897 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                         Enqueuing READS.RANGE_READ response to /10.0.31.189 [CoreThread-9] | 2022-05-11 16:25:03.684000 | 10.0.31.189 |            731 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                            Sending READS.RANGE_READ message to /10.0.63.218, size=212 bytes [CoreThread-7] | 2022-05-11 16:25:03.684000 |  10.0.32.75 |            954 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                            Sending READS.RANGE_READ message to /10.0.63.218, size=212 bytes [CoreThread-1] | 2022-05-11 16:25:03.684000 | 10.0.31.189 |           1098 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                          READS.RANGE_READ message received from /10.0.32.75 [CoreThread-9] | 2022-05-11 16:25:03.685000 | 10.0.63.218 |           9626 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                         READS.RANGE_READ message received from /10.0.31.189 [CoreThread-1] | 2022-05-11 16:25:03.702000 | 10.0.63.218 |          27526 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                        Processing response from /10.0.32.75 [CoreThread-0] | 2022-05-11 16:25:03.856000 | 10.0.63.218 |         181075 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                       Processing response from /10.0.31.189 [CoreThread-0] | 2022-05-11 16:25:03.856000 | 10.0.63.218 |         181193 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
Didn't get enough response rows; actual rows per range: 0.04; remaining rows: 99, new concurrent requests: 1 [CoreThread-0] | 2022-05-11 16:25:03.856000 | 10.0.63.218 |         181384 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b
                                                                                                           Request complete | 2022-05-11 16:25:03.856560 | 10.0.63.218 |         181560 | 2898:d2d9:30d9:4a4f:acec:3e3a:3a76:4a7b

Retrieve data from a whole partition

SELECT account_id, account_type, account_balance
  FROM accounts_by_user
  WHERE user_id=1cafb6a4-396c-4da1-8180-83531b6a41e3;

👁️ Expected output

 account_id                           | account_type | account_balance
--------------------------------------+--------------+-----------------
 811b56c3-cead-40d9-9a3d-e230dcd64f2f |      Savings |            1500
 83428a85-5c8f-4398-8019-918d6e1d3a93 |     Checking |            2500

(2 rows)
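
The difference between the two access patterns above can be sketched in a few lines of Python. This is an illustrative model, not Cassandra code: the table is a plain dict keyed by partition key, and the identifiers (`user-alice`, `acc-1`, ...) are hypothetical shortened stand-ins for the UUIDs used in the workshop.

```python
# partition key (user) -> rows; shortened, hypothetical identifiers
table = {
    "user-alice": [
        {"account_id": "acc-1", "account_type": "Savings", "account_balance": 1500},
        {"account_id": "acc-2", "account_type": "Checking", "account_balance": 2500},
    ],
    "user-bob": [
        {"account_id": "acc-3", "account_type": "Checking", "account_balance": 1000},
    ],
}


def by_partition(user_id):
    """Partition-key lookup: goes straight to one partition."""
    return table.get(user_id, []), 1  # (rows, partitions inspected)


def by_filtering(account_id):
    """ALLOW FILTERING-style scan: every partition must be inspected."""
    matches, inspected = [], 0
    for rows in table.values():
        inspected += 1
        matches.extend(r for r in rows if r["account_id"] == account_id)
    return matches, inspected


print(by_partition("user-alice"))
print(by_filtering("acc-3"))
```

With two partitions the extra cost is negligible, but the number of partitions inspected by the scan grows with the table, while the partition lookup always touches exactly one.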

🏠 Back to Table of Contents

3. Document Databases

Let's do some hands-on with document database queries.

Document databases expand on the basic idea of key-value stores where “documents” are more complex, in that they contain data and each document is assigned a unique key, which is used to retrieve the document. These are designed for storing, retrieving, and managing document-oriented information, often stored as JSON. Since the Document database can inspect the document contents, the database can perform some additional retrieval processing. Unlike RDBMSs which require a static schema, Document databases have a flexible schema as defined by the document contents. Examples include: MongoDB and CouchDB. Note that some RDBMS and NoSQL databases outside of pure document stores are able to store and query JSON documents, including Cassandra.

3a. Cassandra native JSON support

It is not widely known, but Cassandra accepts JSON queries out of the box. You can find more information here.

Show native JSON support

JSON syntax for insertions

Insert data into Cassandra with JSON syntax:

INSERT INTO accounts_by_user JSON '{
  "user_id": "1cafb6a4-396c-4da1-8180-83531b6a41e3",
  "account_id": "811b56c3-cead-40d9-9a3d-e230dcd64f2f",
  "user_email": "[email protected]",
  "user_name": "Alice",
  "account_type": "Savings",
  "account_balance": "8500"
}' ;

Warning: missing fields in the provided JSON will entail explicit insertion of corresponding null values.
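
The effect described in this warning can be modelled in Python. The sketch below assumes a row is a simple dict and that `insert_json` (a hypothetical helper, not a driver function) mimics the default behaviour; the short identifiers `u1` and `a1` are placeholders.

```python
import json

COLUMNS = ["user_id", "account_id", "user_email", "user_name",
           "account_type", "account_balance"]


def insert_json(existing_row: dict, payload: str) -> dict:
    """Model of the default behaviour: every column absent from the JSON is written as null."""
    doc = json.loads(payload)
    new_values = {col: doc.get(col) for col in COLUMNS}  # missing -> None
    return {**existing_row, **new_values}


row = {"user_id": "u1", "account_id": "a1", "user_name": "Alice",
       "account_type": "Savings", "account_balance": 1500, "user_email": None}

updated = insert_json(row, '{"user_id": "u1", "account_id": "a1", "account_balance": 8500}')
print(updated["account_balance"], updated["account_type"])
```

The balance is updated as intended, but `account_type` and `user_name` are wiped out because they were absent from the JSON. This is why the insertion above spells out every column.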

JSON output when querying

In the same way you can retrieve JSON out of Cassandra (more info here).

SELECT JSON account_type, account_balance
  FROM accounts_by_user
  WHERE user_id=1cafb6a4-396c-4da1-8180-83531b6a41e3;

👁️ Output

 [json]
-------------------------------------------------------
  {"account_type": "Savings", "account_balance": 8500}
 {"account_type": "Checking", "account_balance": 2500}

(2 rows)

This JSON support is but a wrapper around access to the same fixed-schema tables seen in the previous section ("Tabular").

3b. Create a token and open Swagger

We now turn to using Astra DB's Document API.

Token creation

To do so, first you need to create an Astra DB token, which will be used for authentication to your database.

Create a token with "Database Administrator" privileges following the instructions here: Create an Astra DB token. (See also the official docs on tokens.)

Keep the "token" ready to use (it is the long string starting with AstraCS:.....).

⚠️ Important

The instructor will show you on screen how to create a token, but will have to destroy the token immediately for security reasons.

Swagger UI

The Document API can be easily accessed through a Swagger UI: go to the "Connect" page, stay in the "Document API" subpage, and locate the URL under the "Launching Swagger UI" heading.

Locate the "documents" section in the Swagger UI. You are now ready to fire requests to the Document API.

3c. Create a new empty collection

Swagger 3c

  • Access Create a new empty collection in a namespace
  • Click Try it out button
  • Fill Header X-Cassandra-Token with <your_token>
  • For namespace-id use nosql1
  • For body use
{ "name": "users" }
  • Click the Execute button

You will get an HTTP 201 - Created return code.

Note: the response you just got from actually calling the API endpoint is given under the "Server response" heading. Do not confuse it with the "Responses" found immediately after, which simply document all possible response codes (the return objects they quote are static example JSONs).
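
For readers who prefer code to the Swagger UI, the same call might look roughly as follows in Python. This is a sketch under assumptions: the base URL pattern, the `/v2/namespaces/.../collections` path and the placeholder values `DB_ID`, `REGION` and `TOKEN` are not taken from this workshop, so copy the actual URL from your Swagger UI page. The code only builds the request; it does not send it.

```python
import json
import urllib.request

# Hypothetical placeholders: copy the real base URL from your Swagger UI page.
BASE_URL = "https://DB_ID-REGION.apps.astra.datastax.com/api/rest"
TOKEN = "AstraCS:your_token"  # never commit or share a real token


def create_collection_request(namespace: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an empty collection."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/namespaces/{namespace}/collections",
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={"X-Cassandra-Token": TOKEN, "Content-Type": "application/json"},
        method="POST",
    )


req = create_collection_request("nosql1", "users")
print(req.get_method(), req.full_url)
# To actually send it, pass req to urllib.request.urlopen(); as described above,
# a successful creation is expected to return HTTP 201.
```

Building the request separately from sending it makes it easy to inspect the URL, headers and body before any token leaves your machine.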

3d. Create new documents

Add a first document

Swagger 3d

  • Access Create a new document
  • Click Try it out button
  • Fill Header X-Cassandra-Token with AstraCS:...[your_token]...
  • For namespace-id use nosql1
  • For collection-id use users
  • For body use
{
    "accounts": [
        {
            "balance": "1000",
            "id": "81def5e2-84f4-4885-a920-1c14d2be3c20",
            "type": "Checking"
        }
    ],
    "email": "[email protected]",
    "id": "0d2b2319-9c0b-4ecb-8953-98687f6a99ce",
    "name": "Bob"
}
  • Click the Execute button

👁️ Expected output (your documentId will be different)

{
  "documentId": "137d8609-87f6-4cb7-9506-e52f338e79e9"
}

Add another document

Repeat with the following body, which has a different structure:

{
    "accounts": [
        {
            "balance": "2500",
            "id": "83428a85-5c8f-4398-8019-918d6e1d3a93",
            "type": "Checking"
        },
        {
            "balance": "1500",
            "id": "811b56c3-cead-40d9-9a3d-e230dcd64f2f",
            "type": "Savings"
        }
    ],
    "email": "[email protected]",
    "id": "1cafb6a4-396c-4da1-8180-83531b6a41e3",
    "name": "Alice"
}

As before, the document will automatically be given an internal unique documentId.

3e. Retrieve a document by its ID

Swagger 3e

  • Access Get a document
  • Click Try it out button
  • Fill Header X-Cassandra-Token with <your_token>
  • For namespace-id use nosql1
  • For collection-id use users
  • For document-id use Bob's documentId (e.g. 137d8609-87f6-4cb7-9506-e52f338e79e9 in the above sample output)
  • Click the Execute button

👁️ Expected output

{
  "documentId": "137d8609-87f6-4cb7-9506-e52f338e79e9",
  "data": {
    "accounts": [
      {
        "balance": "1000",
        "id": "81def5e2-84f4-4885-a920-1c14d2be3c20",
        "type": "Checking"
      }
    ],
    "email": "[email protected]",
    "id": "0d2b2319-9c0b-4ecb-8953-98687f6a99ce",
    "name": "Bob"
  }
}

3f. Find all documents in a collection

Swagger 3f

  • Access Search documents in a collection
  • Click Try it out button
  • Fill Header X-Cassandra-Token with <your_token>
  • For namespace-id use nosql1
  • For collection-id use users

Leave the other fields blank (note in particular that every query is paged in Cassandra).

  • Click the Execute button

👁️ Expected output (take note of the documentIds of your output for later)

{
  "data": {
    "6d0aafd9-3c2c-461f-92c6-08322eaef5da": {
      "accounts": [
        {
          "balance": "2500",
          "id": "83428a85-5c8f-4398-8019-918d6e1d3a93",
          "type": "Checking"
        },
        {
          "balance": "1500",
          "id": "811b56c3-cead-40d9-9a3d-e230dcd64f2f",
          "type": "Savings"
        }
      ],
      "email": "[email protected]",
      "id": "1cafb6a4-396c-4da1-8180-83531b6a41e3",
      "name": "Alice"
    },
    "137d8609-87f6-4cb7-9506-e52f338e79e9": {
      "accounts": [
        {
          "balance": "1000",
          "id": "81def5e2-84f4-4885-a920-1c14d2be3c20",
          "type": "Checking"
        }
      ],
      "email": "[email protected]",
      "id": "0d2b2319-9c0b-4ecb-8953-98687f6a99ce",
      "name": "Bob"
    }
  }
}

3g. Search document with a "where" clause

The endpoint you just used also supports where clauses, expressed as JSON. You don't need to navigate away from it to try the following:

Swagger 3g

  • Access Search documents in a collection (you should be there already)
  • Click Try it out button
  • Fill Header X-Cassandra-Token with <your_token>
  • For namespace-id use nosql1
  • For collection-id use users
  • For where use {"name": {"$eq": "Alice"}}
  • Click the Execute button

👁️ Expected output

{
  "data": {
    "6d0aafd9-3c2c-461f-92c6-08322eaef5da": {
      "accounts": [
        {
          "balance": "2500",
          "id": "83428a85-5c8f-4398-8019-918d6e1d3a93",
          "type": "Checking"
        },
        {
          "balance": "1500",
          "id": "811b56c3-cead-40d9-9a3d-e230dcd64f2f",
          "type": "Savings"
        }
      ],
      "email": "[email protected]",
      "id": "1cafb6a4-396c-4da1-8180-83531b6a41e3",
      "name": "Alice"
    }
  }
}
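
To illustrate what such a where clause means, here is a minimal Python sketch that evaluates it against documents held in memory. It is not how the Document API is implemented: the `matches` helper is hypothetical, it handles only the `$eq` operator on top-level fields, and the document ids and contents are shortened placeholders.

```python
def matches(document: dict, where: dict) -> bool:
    """Evaluate a where clause supporting only the $eq operator on top-level fields."""
    for field, condition in where.items():
        for operator, expected in condition.items():
            if operator != "$eq":
                raise ValueError(f"unsupported operator in this sketch: {operator}")
            if document.get(field) != expected:
                return False
    return True


collection = {
    "doc-1": {"name": "Alice", "accounts": [{"type": "Checking"}, {"type": "Savings"}]},
    "doc-2": {"name": "Bob", "accounts": [{"type": "Checking"}]},
}

where = {"name": {"$eq": "Alice"}}
result = {doc_id: doc for doc_id, doc in collection.items() if matches(doc, where)}
print(list(result))
```

Because the database can look inside each document, the filter works even though the two documents do not have identical shapes.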

🏠 Back to Table of Contents

4. Key/Value Databases

Key/Value databases are among the simplest, yet most powerful, databases: all of the data within them consists of an indexed key and a value. Key-value databases use a hashing mechanism so that, given a key, the database can quickly retrieve the associated value. Hashing mechanisms provide constant time access, which means they maintain high performance even at large scale. The keys can be any type of object, but are typically strings. The values are generally opaque blobs (i.e. a sequence of bytes that the database does not interpret). Examples include: Redis, Amazon DynamoDB, Riak, and Oracle NoSQL database. Some tabular NoSQL databases, like Cassandra, can also service key/value needs.

4a. Create a table for Key/Value

Go to the CQL Console again and issue the following commands to create a new, simple table with just two columns:

USE nosql1;

CREATE TABLE users_kv (
  key   TEXT PRIMARY KEY,
  value TEXT
);

4b. Populate the table

Insert all the following entries into the table. Note that all inserted values, regardless of their "true" data type, have been coerced into strings according to the table schema. Also note how the keys are structured and how some entries reference others, effectively creating a set of interconnected pieces of information on the users:

INSERT INTO users_kv (key, value) VALUES ('user:1cafb6a4-396c-4da1-8180-83531b6a41e3:name',       'Alice');
INSERT INTO users_kv (key, value) VALUES ('user:1cafb6a4-396c-4da1-8180-83531b6a41e3:email',      '[email protected]');
INSERT INTO users_kv (key, value) VALUES ('user:1cafb6a4-396c-4da1-8180-83531b6a41e3:accounts',   '{83428a85-5c8f-4398-8019-918d6e1d3a93, 811b56c3-cead-40d9-9a3d-e230dcd64f2f}');

INSERT INTO users_kv (key, value) VALUES ('user:0d2b2319-9c0b-4ecb-8953-98687f6a99ce:name',       'Bob');
INSERT INTO users_kv (key, value) VALUES ('user:0d2b2319-9c0b-4ecb-8953-98687f6a99ce:email',      '[email protected]');
INSERT INTO users_kv (key, value) VALUES ('user:0d2b2319-9c0b-4ecb-8953-98687f6a99ce:accounts',   '{81def5e2-84f4-4885-a920-1c14d2be3c20}');

INSERT INTO users_kv (key, value) VALUES ('account:83428a85-5c8f-4398-8019-918d6e1d3a93:type',    'Checking');
INSERT INTO users_kv (key, value) VALUES ('account:83428a85-5c8f-4398-8019-918d6e1d3a93:balance', '2500');

INSERT INTO users_kv (key, value) VALUES ('account:811b56c3-cead-40d9-9a3d-e230dcd64f2f:type',    'Savings');
INSERT INTO users_kv (key, value) VALUES ('account:811b56c3-cead-40d9-9a3d-e230dcd64f2f:balance', '1500');

INSERT INTO users_kv (key, value) VALUES ('account:81def5e2-84f4-4885-a920-1c14d2be3c20:type',    'Checking');
INSERT INTO users_kv (key, value) VALUES ('account:81def5e2-84f4-4885-a920-1c14d2be3c20:balance', '1000');

4c. Update a value

You can imagine an application "navigating the keys" (e.g., from a user to an account), for instance when it must update a balance. The actual update would look like:

INSERT INTO users_kv (key, value) VALUES ('account:81def5e2-84f4-4885-a920-1c14d2be3c20:balance', '9000');

Let's check:

SELECT * FROM users_kv WHERE key = 'account:81def5e2-84f4-4885-a920-1c14d2be3c20:balance';

👁️ Expected output

 key                                                  | value
------------------------------------------------------+-------
 account:81def5e2-84f4-4885-a920-1c14d2be3c20:balance |  9000

(1 rows)

Alternative update syntax

The same kind of overwrite is obtained with the UPDATE syntax (here writing yet another value):

UPDATE users_kv SET value = '-500' WHERE key = 'account:81def5e2-84f4-4885-a920-1c14d2be3c20:balance';

Indeed, in most key-value data stores, inserting and updating are one and the same operation since the main goal is usually the highest performance (hence, row-existence checks are skipped altogether).

Thus, writing entries with the key of a pre-existing entry will simply overwrite the less recent values, enabling a very efficient and simple deduplication strategy.

Check once more what's in the table:

SELECT * FROM users_kv ;

👁️ Expected output

 key                                                  | value
------------------------------------------------------+------------------------------------------------------------------------------
 account:81def5e2-84f4-4885-a920-1c14d2be3c20:balance |                                                                         -500
   user:0d2b2319-9c0b-4ecb-8953-98687f6a99ce:accounts |                                       {81def5e2-84f4-4885-a920-1c14d2be3c20}
 account:811b56c3-cead-40d9-9a3d-e230dcd64f2f:balance |                                                                         1500
   user:1cafb6a4-396c-4da1-8180-83531b6a41e3:accounts | {83428a85-5c8f-4398-8019-918d6e1d3a93, 811b56c3-cead-40d9-9a3d-e230dcd64f2f}
      user:1cafb6a4-396c-4da1-8180-83531b6a41e3:email |                                                            [email protected]
       user:1cafb6a4-396c-4da1-8180-83531b6a41e3:name |                                                                        Alice
       user:0d2b2319-9c0b-4ecb-8953-98687f6a99ce:name |                                                                          Bob
      user:0d2b2319-9c0b-4ecb-8953-98687f6a99ce:email |                                                              [email protected]
    account:83428a85-5c8f-4398-8019-918d6e1d3a93:type |                                                                     Checking
    account:811b56c3-cead-40d9-9a3d-e230dcd64f2f:type |                                                                      Savings
    account:81def5e2-84f4-4885-a920-1c14d2be3c20:type |                                                                     Checking
 account:83428a85-5c8f-4398-8019-918d6e1d3a93:balance |                                                                         2500

(12 rows)
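
The "navigate the keys, then overwrite" pattern of this section can be mimicked with a plain Python dict. This is a sketch, not a client for any real key/value store: the `put` helper and the shortened keys (`user:bob:accounts`, `acc-3`) are hypothetical.

```python
store = {}  # a Python dict stands in for the key/value table


def put(key: str, value: str) -> None:
    """Insert and update are the same operation: the last write wins."""
    store[key] = value


put("user:bob:accounts", "{acc-3}")  # hypothetical shortened keys
put("account:acc-3:balance", "1000")

# "Navigate the keys": from the user entry to the account entry.
account_ids = store["user:bob:accounts"].strip("{}").split(", ")
balance_key = f"account:{account_ids[0]}:balance"

put(balance_key, "9000")  # overwrites the existing value
put(balance_key, "-500")  # overwrites it again

print(balance_key, store[balance_key], len(store))
```

The store still holds two entries after four writes: writing to an existing key replaces the value instead of adding a duplicate.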

🏠 Back to Table of Contents

5. Graph Databases

Graph databases store their data using a graph metaphor to exploit the relationships between data. Nodes in the graph represent data items, and edges represent the relationships between the data items. Graph databases are designed for highly complex and connected data, which outpaces the relationship and JOIN capabilities of an RDBMS. Graph databases are often exceptionally good at finding commonalities and anomalies among large data sets. Examples of Graph databases include DataStax Graph, Neo4J, JanusGraph, and Amazon Neptune.
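
As a rough illustration of nodes, edges and a "commonality" query, consider the following Python sketch. It uses an in-memory adjacency map rather than a graph database, and the data (including a jointly owned account `acc-2`) is hypothetical and not part of the workshop dataset.

```python
from collections import defaultdict

edges = [  # (source node, relationship, target node); hypothetical data
    ("alice", "OWNS", "acc-1"),
    ("alice", "OWNS", "acc-2"),
    ("bob", "OWNS", "acc-2"),
    ("bob", "OWNS", "acc-3"),
]

outgoing = defaultdict(set)
for source, _relationship, target in edges:
    outgoing[source].add(target)


def shared_targets(node_a: str, node_b: str) -> set:
    """Commonality query: targets that both nodes point to."""
    return outgoing[node_a] & outgoing[node_b]


print(shared_targets("alice", "bob"))
```

A graph database answers this kind of question by following edges directly, where a relational database would typically need one or more JOINs.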

Astra DB does not yet offer a way to implement graph database use cases. At DataStax, however, we do have a product called DataStax Graph that you can use for free when not in production.

For graph databases, the presenter will show a demo based on the example in the slides.

The hands-on practice for you is different. But since it cannot be done in the browser using Astra DB like the rest, it is kept separate and not included in today's curriculum.

🔥 Yet, you are strongly encouraged to try it at your own pace, on your own computer, by following the instructions given here: Graph Databases Practice. 🔥

Try it out, it's super cool!

THE END

Congratulations! You made it to the END.

See you next time!

🏠 Back to Table of Contents

More Repositories

1

workshop-graphql-netflix

Workshop to illustrate how to use GraphQL
JavaScript
623
star
2

workshop-intro-to-cassandra

Learn Apache Cassandra fundamentals in this hands-on workshop
219
star
3

workshop-ai-as-api

Python
132
star
4

workshop-ecommerce-app

Are you building or do you support an e-commerce website? If so, then this content is for you! Worldwide digital sales in 2020 eclipsed four trillion dollars (USD). Businesses that want to compete, need a high performing e-commerce website. Here, we will demonstrate how to build a high performing persistence layer with DataStax ASTRA DB.
Java
86
star
5

appdev-week1-todolist

Summer Series Week 1 - Run your first frontend application the Todolist
JavaScript
85
star
6

workshop-intro-to-graphql

Learn the basics of GraphQL. Plenty of hands-on experience covering: a React client interacting with Astra DB using the Netlify stack; usage of playgrounds and tooling; running your own Java GraphQL backend based on the Netflix DGS framework.
JavaScript
79
star
7

bootcamp-fullstack-apps-with-cassandra

Learn how to build a backend for Cassandra, from data model to drivers to API exposition
Java
69
star
8

workshop-spring-stargate

Building Spring Boot/data Application leveraging Cassandra and Stargate
JavaScript
58
star
9

learningpath-docker

Docker online course for developers
JavaScript
57
star
10

workshop-spring-data-cassandra

Build a Todolist with Spring Data Cassandra
Java
39
star
11

react-basics

A quick overview of React Basics
33
star
12

workshop-streaming-game

A simple multiplayer online game (with in-game chat) featuring: Astra Streaming, Astra DB, WebSockets, React for the front-end and FastAPI for the back-end. Also spiders!
Python
30
star
13

workshop-microservices-java

Rest API for Todobackend on top of Cassandra
Java
26
star
14

workshop-cassandra-data-modeling

This session looks at how to effectively design a data model for your application. You’ll leave knowing how to create data models that scale effectively as your system grows
22
star
15

workshop-pulsar

Getting started with Pulsar and Cassandra
Java
21
star
16

workshop-intro-quarkus-cassandra

20
star
17

workshop-cassandra-fundamentals

Welcome to the Apache Cassandra™ Fundamentals workshop! In this two-hour workshop, we show the most important fundamentals of the powerful distributed NoSQL database Apache Cassandra™.
19
star
18

event-streaming-series

Materials and content for the 3-episode Event Streaming Series by DataStax Developers
11
star
19

workshop-cassandra-application-development

Learn about drivers, connectivity and requests by running a simple API with Apache Cassandra/Astra DB as its data backend.
Java
11
star
20

workshop-sql-to-nosql-migration

Shell
11
star
21

awesome-astra

⚒️ Collection of how-tos for using technologies with Astra
10
star
22

workshop-storage-attached-indexes

Welcome to the 'Scalable Indexing for Cassandra using DataStax Astra' workshop! In this two-hour workshop, the Developer Advocate team of DataStax will explain the new Storage Attached Indexing (SAI) feature using Astra, the cloud based Cassandra-as-a-Service platform delivered by DataStax, to demonstrate how you can use them to add some much wante
10
star
23

conference-2024-devoxx-france

Generative AI with Java
Java
9
star
24

conference-2022-devoxx

Deep Dive - A Java Developer Journey into Cassandra
Java
8
star
25

workshop-introduction-to-machine-learning

Come ready to discover the goals and approaches of machine learning, and how to build effective algorithms and solutions!
Jupyter Notebook
8
star
26

ai-agent-java

Online workshop to learn how to build your own Java AI Agent
7
star
27

ai_ml_workshop_notebooks

Jupyter Notebook
6
star
28

workshop-your-k8s-to-cloud

Taking your K8s App to the cloud Online Workshop
JavaScript
5
star
29

workshop-intro-streaming-and-cdc

We will visit the basics of change data capture (CDC) and why it's become so popular recently - with CDC, developers create smarter applications when real-time data is connected to the whole data ecosystem. During the workshop we will highlight use cases that make it your go-to solution when modernizing applications, getting more value out of your data (real-time), and joining new machine learning techniques with old applications. We will introduce you to Apache Pulsar, show you how to enable the CDC feature in DataStax Astra, how to set up Astra Streaming, and (of course) how to get it all working together!
Java
5
star
30

cassandra-day

Public resources for Cassandra Day (slide decks and other useful content)
4
star
31

advanced-cdc-for-astra

Java
4
star
32

workshop-battlestax

JavaScript
4
star
33

conference-2020-javafest-api

Source code for the talk on May 30th
Java
3
star
34

workshop-streaming-graph-quine

Workshop with partner Quine showing the power of Astra with Quine.io graph analytics
Shell
3
star
35

conf-grpc-rest-graphql-data-apis

Expose REST, GraphQL, and gRPC APIs on top of your databases
Java
3
star
36

workshop-realtime-data-pipelines

You will inspect and run a sample architecture making use of Apache Pulsar™ and Pulsar Functions for real-time, event-streaming-based data ingestion, cleaning and processing.
Python
3
star
37

workshop-swinburne

Repository for Swinburne University Hanoi workshop
Java
3
star
38

workshop-spring-quarkus-micronaut-cassandra

Support Material for workshops going over the 3 frameworks
Java
3
star
39

feast-cassandra-online-store

A Feast custom online store using Cassandra / Astra DB
Python
3
star
40

cdc-for-astra-guides

2
star
41

netlify-astra-example

JavaScript
2
star
42

conference-2021-apidays-stargate

Demos of Stargate with the APS
Java
2
star
43

workshop-beam

Getting Started with Beam and Astra
Java
2
star
44

conference-2020-dataconla-stargate

Material and Demo for a Talk on 2020/10/24
2
star
45

quarkus-astra-intro-demo

JavaScript
2
star
46

conference-2022-devoxx-france

DEVOXX FRANCE, Build Performant applications with Apache Cassandra
Java
2
star
47

workshop-wikipedia-qa

Real-time document Q&A using Pulsar, Cassandra, LangChain, and open-source language models.
Python
2
star
48

workshop-rag-fashion-buddy

Material and step by step instructions for the Fashion Buddy workshop
Python
2
star
49

workshop-cassandra-driver-nodejs-portugues

Portuguese version of the Cassandra driver JavaScript/Node.js workshop
JavaScript
2
star
50

workshop-astra-block

AstraBlock Workshop
TypeScript
2
star
51

astrapy

Python
1
star
52

datastaxdevs.github.io

Website for DataStax developers initiatives
HTML
1
star
53

demo-astra-sdk-java

Sample code for the tutorial series on getting started with the SDK
JavaScript
1
star
54

conference-2021-apachecon-stargate

Materials for the ApacheCon 2021 conference
1
star
55

learningpath-terraform-it

HCL
1
star
56

workshop-IOS-Swift-Astra

Sample iOS app in Swift that connects to DataStax Astra
Swift
1
star
57

workshop-nosqlbench

The goal of this workshop is to get you familiar with the powerful and versatile tool NoSQLBench. With that, you can perform industry-grade, robust benchmarks aimed at several (distributed) target systems, especially NoSQL databases.
Python
1
star
58

terraform-tryout

HCL
1
star