Cass Operator

The DataStax Kubernetes Operator for Apache Cassandra®

⚠️ *** Ongoing development of cass-operator and future releases have been migrated to the K8ssandra project. Head over to the new repo for more information on using cass-operator! *** ⚠️

Getting Started

Quick start:

# *** This is for GKE Regular Channel - k8s 1.16 -> Adjust based on your cloud or storage options
kubectl create -f https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/docs/user/cass-operator-manifests-v1.16.yaml
kubectl create -f https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/operator/k8s-flavors/gke/storage.yaml
kubectl -n cass-operator create -f https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/operator/example-cassdc-yaml/cassandra-3.11.x/example-cassdc-minimal.yaml

Loading the operator

Installing the Cass Operator itself is straightforward. We have provided manifests for each Kubernetes version from 1.15 through 1.19. Apply the relevant manifest to your cluster as follows:

K8S_VER=v1.16
kubectl apply -f https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/docs/user/cass-operator-manifests-$K8S_VER.yaml

Note that since the manifest will install a Custom Resource Definition, the user running the above command will need cluster-admin privileges.
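Because manifests are provided only for Kubernetes 1.15 through 1.19, it can help to validate the version before building the URL. The following is a minimal sketch, not part of cass-operator: the function name manifest_url and the variable CASS_OPERATOR_TAG are hypothetical.

```shell
# Sketch: build the manifest URL for a given Kubernetes minor version.
# CASS_OPERATOR_TAG and manifest_url are illustrative names (assumptions).
CASS_OPERATOR_TAG="v1.6.0"

manifest_url() {
  # $1 is a version such as "v1.16"; only v1.15 through v1.19 have manifests.
  case "$1" in
    v1.15|v1.16|v1.17|v1.18|v1.19)
      echo "https://raw.githubusercontent.com/datastax/cass-operator/${CASS_OPERATOR_TAG}/docs/user/cass-operator-manifests-$1.yaml"
      ;;
    *)
      echo "no manifest published for Kubernetes version '$1'" >&2
      return 1
      ;;
  esac
}
```

A caller could then run `url=$(manifest_url "$K8S_VER") && kubectl apply -f "$url"`, so that kubectl is skipped when the version has no matching manifest.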

This will deploy the operator, along with any requisite resources such as Role, RoleBinding, etc., to the cass-operator namespace. You can check to see if the operator is ready as follows:

$ kubectl -n cass-operator get pods --selector name=cass-operator
NAME                             READY   STATUS    RESTARTS   AGE
cass-operator-555577b9f8-zgx6j   1/1     Running   0          25h
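In a script, the same readiness check can be made blocking with kubectl wait rather than by reading the table by eye. This is a sketch under assumptions: wait_for_operator is a hypothetical helper name and the 120s default timeout is arbitrary.

```shell
# Sketch: block until the operator pod reports Ready, or fail after a timeout.
# wait_for_operator and the default timeout are assumptions, not project defaults.
wait_for_operator() {
  local timeout="${1:-120s}"
  # Uses the same namespace and label selector as the manual check.
  kubectl -n cass-operator wait pod \
    --selector name=cass-operator \
    --for=condition=Ready \
    --timeout="$timeout"
}
```

Note that kubectl wait returns an error immediately if no pod matches the selector yet, so a script may need to retry shortly after applying the manifest.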

Creating a storage class

You will need to create an appropriate storage class that defines the type of storage to use for Cassandra nodes in a cluster. For example, here is a storage class for using SSDs in GKE, which you can also find at operator/k8s-flavors/gke/storage.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: server-storage
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: none
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete

Apply the above as follows:

kubectl apply -f https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/operator/k8s-flavors/gke/storage.yaml

Creating a CassandraDatacenter

The following resource defines a Cassandra 3.11.7 datacenter with 3 nodes on one rack, which you can also find at operator/example-cassdc-yaml/cassandra-3.11.x/example-cassdc-minimal.yaml:

apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: cluster1
  serverType: cassandra
  serverVersion: 3.11.7
  managementApiAuth:
    insecure: {}
  size: 3
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: server-storage
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  config:
    cassandra-yaml:
      authenticator: org.apache.cassandra.auth.PasswordAuthenticator
      authorizer: org.apache.cassandra.auth.CassandraAuthorizer
      role_manager: org.apache.cassandra.auth.CassandraRoleManager
    jvm-options:
      initial_heap_size: 800M
      max_heap_size: 800M

Apply the above as follows:

kubectl -n cass-operator apply -f https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/operator/example-cassdc-yaml/cassandra-3.11.x/example-cassdc-minimal.yaml

You can check the status of pods in the Cassandra cluster as follows:

$ kubectl -n cass-operator get pods --selector cassandra.datastax.com/cluster=cluster1
NAME                         READY   STATUS    RESTARTS   AGE
cluster1-dc1-default-sts-0   2/2     Running   0          26h
cluster1-dc1-default-sts-1   2/2     Running   0          26h
cluster1-dc1-default-sts-2   2/2     Running   0          26h

You can check the progress of bringing the Cassandra datacenter online by reading the cassandraOperatorProgress field of the CassandraDatacenter's status sub-resource as follows:

$ kubectl -n cass-operator get cassdc/dc1 -o "jsonpath={.status.cassandraOperatorProgress}"
Ready

(cassdc and cassdcs are supported short forms of CassandraDatacenter.)

A value of "Ready", as above, means the operator has finished setting up the Cassandra datacenter.
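For automation, the progress field can be polled until it reads "Ready". The following is a minimal sketch, assuming the cass-operator namespace used above; the helper name wait_for_cassdc, its arguments, and its default retry values are hypothetical.

```shell
# Sketch: poll cassandraOperatorProgress until it reads "Ready".
# wait_for_cassdc, its arguments, and the defaults are illustrative assumptions.
wait_for_cassdc() {
  local dc="${1:-dc1}" attempts="${2:-60}" delay="${3:-10}"
  local progress="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # Same jsonpath query as the manual check; errors are treated as "not ready yet".
    progress=$(kubectl -n cass-operator get "cassdc/$dc" \
      -o "jsonpath={.status.cassandraOperatorProgress}" 2>/dev/null)
    if [ "$progress" = "Ready" ]; then
      echo "datacenter $dc is Ready"
      return 0
    fi
    echo "datacenter $dc progress: ${progress:-unknown}; retrying in ${delay}s" >&2
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for datacenter $dc" >&2
  return 1
}
```

For example, `wait_for_cassdc dc1 30 20` would check up to 30 times, 20 seconds apart, and return non-zero if the datacenter never reports Ready.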

You can also check the Cassandra cluster status using nodetool by invoking it on one of the pods in the cluster as follows:

$ kubectl -n cass-operator exec -it -c cassandra cluster1-dc1-default-sts-0 -- nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving/Stopped
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.233.105.125  224.82 KiB  1            65.4%             5e29b4c9-aa69-4d53-97f9-a3e26115e625  r1
UN  10.233.92.96    186.48 KiB  1            61.6%             b119eae5-2ff4-4b06-b20b-c492474e59a6  r1
UN  10.233.90.54    205.1 KiB   1            73.1%             0a96e814-dcf6-48b9-a2ca-663686c8a495  r1

The operator creates a secure Cassandra cluster by default, with a new superuser (not the traditional cassandra user) and a random password. You can get those out of a Kubernetes secret and use them to log into your Cassandra cluster for the first time. For example:

$ # get CASS_USER and CASS_PASS variables into the current shell
$ CASS_USER=$(kubectl -n cass-operator get secret cluster1-superuser -o json | jq -r '.data.username' | base64 --decode)
$ CASS_PASS=$(kubectl -n cass-operator get secret cluster1-superuser -o json | jq -r '.data.password' | base64 --decode)
$ kubectl -n cass-operator exec -ti cluster1-dc1-default-sts-0 -c cassandra -- sh -c "cqlsh -u '$CASS_USER' -p '$CASS_PASS'"

Connected to cluster1 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.

cluster1-superuser@cqlsh> select * from system.peers;

 peer      | data_center | host_id                              | preferred_ip | rack    | release_version | rpc_address | schema_version                       | tokens
-----------+-------------+--------------------------------------+--------------+---------+-----------------+-------------+--------------------------------------+--------------------------
 10.28.0.4 |         dc1 | 4bf5e110-6c19-440e-9d97-c013948f007c |         null | default |          3.11.6 |   10.28.0.4 | e84b6a60-24cf-30ca-9b58-452d92911703 | {'-7957039572378599263'}
 10.28.5.5 |         dc1 | 3e84b0f1-9c1e-4deb-b6f8-043731eaead4 |         null | default |          3.11.6 |   10.28.5.5 | e84b6a60-24cf-30ca-9b58-452d92911703 | {'-3984092431318102676'}

(2 rows)
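If jq is not available, the secret lookup above can be done with kubectl's jsonpath output instead. This is a sketch only: superuser_field is a hypothetical helper, and the <clusterName>-superuser secret name is an assumption generalized from the cluster1-superuser example.

```shell
# Sketch: read one field of the superuser secret without jq.
# superuser_field is a hypothetical helper; the "<clusterName>-superuser"
# naming is assumed from the cluster1-superuser example.
superuser_field() {
  local cluster="$1" field="$2"
  # Secret data is base64-encoded, so decode it after extraction.
  kubectl -n cass-operator get secret "${cluster}-superuser" \
    -o "jsonpath={.data.${field}}" | base64 --decode
}
```

Usage would look like `CASS_USER=$(superuser_field cluster1 username)` and `CASS_PASS=$(superuser_field cluster1 password)`.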

(Optional) Loading the operator via Helm

Helm may be used to install the operator. Consider installing it from our Helm Charts repo:

helm repo add datastax https://datastax.github.io/charts
helm repo update

# Helm 2
helm install datastax/cass-operator

# Helm 3
helm install cass-operator datastax/cass-operator

or via a local checkout:

kubectl create namespace cass-operator-system
helm install --namespace=cass-operator-system cass-operator ./charts/cass-operator-chart

The following Helm default values may be overridden:

clusterWideInstall: false
serviceAccountName: cass-operator
clusterRoleName: cass-operator-cr
clusterRoleBindingName: cass-operator-crb
roleName: cass-operator
roleBindingName: cass-operator
webhookClusterRoleName: cass-operator-webhook
webhookClusterRoleBindingName: cass-operator-webhook
deploymentName: cass-operator
deploymentReplicas: 1
defaultImage: "datastax/cass-operator:1.6.0"
imagePullPolicy: IfNotPresent
imagePullSecret: ""

NOTE: roleName and roleBindingName will be used for a clusterRole and clusterRoleBinding if clusterWideInstall is set to true.

NOTE: Helm does not install a storage-class for the cassandra pods.

If clusterWideInstall is set to true, then the operator will be able to administer CassandraDatacenters in all namespaces of the Kubernetes cluster. A namespace must still be provided because some of the Kubernetes resources for the operator require one.

Example:

kubectl create namespace cass-operator-system
helm install --set clusterWideInstall=true --namespace=cass-operator-system cass-operator ./charts/cass-operator-chart

Using a custom Docker registry with the Helm Chart

A custom Docker registry may be used as the source of the operator Docker image. Before "helm install" is run, a Secret of type "docker-registry" should be created with the proper credentials.

Then the "imagePullSecret" helm value may be set to the name of the ImagePullSecret to cause the custom Docker registry to be used.

Custom Docker registry example: GitHub Packages

GitHub Packages may be used as a custom Docker registry.

First, a GitHub personal access token must be created.

See:

https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token

Second, the access token will be used to create the Secret:

kubectl create secret docker-registry github-docker-registry --docker-username=USERNAME --docker-password=ACCESSTOKEN --docker-server docker.pkg.github.com

Replace USERNAME with the GitHub username and ACCESSTOKEN with the personal access token.

Now we can run "helm install" with the override value for imagePullSecret. This is often used with an override value for image so that a specific tag can be chosen. Note that the image value should include the full path to the custom registry.

helm install --set image=docker.pkg.github.com/datastax/cass-operator/operator:latest-ubi --set imagePullSecret=github-docker-registry cass-operator ./charts/cass-operator-chart

Features

  • Proper token ring initialization, with only one node bootstrapping at a time
  • Seed node management - one per rack, or three per datacenter, whichever is more
  • Server configuration integrated into the CassandraDatacenter CRD
  • Rolling restart of nodes by changing the CRD
  • Store data in a rack-safe way - one replica per cloud AZ
  • Scale up racks evenly with new nodes
  • Scale down racks evenly by decommissioning existing nodes
  • Replace dead/unrecoverable nodes
  • Multi DC clusters (limited to one Kubernetes namespace)

All features are documented in the User Documentation.
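As an illustration of the scaling features listed above, the node count of a datacenter can be changed by updating the size field of its CassandraDatacenter resource. The following is a minimal sketch, assuming the cass-operator namespace from the quick start; scale_cassdc is a hypothetical helper name, and the User Documentation remains the authority on how scaling behaves.

```shell
# Sketch: change the node count of a CassandraDatacenter by patching spec.size.
# scale_cassdc is a hypothetical helper name (assumption).
scale_cassdc() {
  local dc="$1" size="$2"
  # Reject anything that is not a plain non-negative integer.
  case "$size" in
    ''|*[!0-9]*) echo "size must be a non-negative integer" >&2; return 1 ;;
  esac
  # Custom resources need a merge patch rather than a strategic merge patch.
  kubectl -n cass-operator patch cassdc "$dc" --type merge \
    -p "{\"spec\":{\"size\":${size}}}"
}
```

For example, `scale_cassdc dc1 6` would request six nodes for dc1; progress can then be followed through the cassandraOperatorProgress status field.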

Containers

The operator is comprised of the following container images working in concert:

Overriding properties of cass-operator created Containers

If the CassandraDatacenter specifies a podTemplateSpec field, then containers with specific names can be used to override default settings in containers that will be created by cass-operator.

Currently cass-operator will create an InitContainer with the name of "server-config-init". Normal Containers that will be created have the names "cassandra", "server-system-logger", and optionally "reaper".

In general, the values specified in this way by the user will override anything generated by cass-operator.

Of special note is that user-specified environment variables, ports, and volumes in the corresponding containers will be added to the values that cass-operator automatically generates for those containers.

apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: cluster1
  serverType: cassandra
  serverVersion: 3.11.7
  managementApiAuth:
    insecure: {}
  size: 3
  podTemplateSpec:
    spec:
      initContainers:
        - name: "server-config-init"
          env:
          - name: "EXTRA_PARAM"
            value: "123"
      containers:
        - name: "cassandra"
          terminationMessagePath: "/dev/other-termination-log"
          terminationMessagePolicy: "File"
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: server-storage
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  config:
    cassandra-yaml:
      authenticator: org.apache.cassandra.auth.PasswordAuthenticator
      authorizer: org.apache.cassandra.auth.CassandraAuthorizer
      role_manager: org.apache.cassandra.auth.CassandraRoleManager
    jvm-options:
      initial_heap_size: 800M
      max_heap_size: 800M

Requirements

  • Kubernetes cluster, 1.15 or newer.

Contributing

As of version 1.0, Cass Operator is maintained by a team at DataStax and it is part of what powers DataStax Astra. We would love for open source users to contribute bug reports, documentation updates, tests, and features.

Developer setup

Almost every build, test, or development task requires the following pre-requisites...

  • Golang 1.14
  • Docker, either the docker.io packages on Ubuntu, Docker Desktop for Mac, or your preferred docker distribution.
  • mage: There are some tips for using mage in docs/developer/mage.md

Building

The operator uses mage for its build process.

Build the Operator Container Image

This build task will create the operator container image, building or rebuilding the binary from golang sources if necessary:

mage operator:buildDocker

Build the Operator Binary

If you wish to perform ONLY the golang build or rebuild, without creating a container image:

mage operator:buildGo

Testing

mage operator:testGo

End-to-end Automated Testing

Run fully automated end-to-end tests...

mage integ:run

Docs about testing are here. These work against any k8s cluster with six or more worker nodes.

Manual Local Testing

There are a number of ways to run the operator; see the following docs for more information:

  • k8s targets: A set of mage targets for automating a variety of tasks for several different supported k8s flavors. At the moment, we support KIND, k3d, and GKE. These targets can set up and manage a local cluster in either KIND or k3d, and also a remote cluster in GKE. Both KIND and k3d can simulate a k8s cluster with multiple worker nodes on a single physical machine, though it's necessary to dial down the database memory requests.

The user documentation also contains information on spinning up your first operator instance that is useful regardless of what Kubernetes distribution you're using to do so.

Not (Yet) Supported Features

  • Cassandra:
    • Integrated data repair solution
    • Integrated backup and restore solution
  • DSE:
    • Advanced Workloads, like Search / Graph / Analytics

Uninstall

This will destroy all of your data!

Delete your CassandraDatacenters first, otherwise Kubernetes will block deletion because we use a finalizer.

kubectl delete cassdcs --all-namespaces --all

Remove the operator Deployment, CRD, etc.

kubectl delete -f https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/docs/user/cass-operator-manifests-v1.16.yaml
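Since the order of the two steps matters, a script can refuse to remove the operator while any CassandraDatacenter still exists. This is a sketch under assumptions: uninstall_cass_operator is a hypothetical helper, and the manifest URL should match the version you installed.

```shell
# Sketch: refuse to remove the operator while CassandraDatacenters still exist.
# uninstall_cass_operator is a hypothetical helper; adjust the manifest
# URL to the version that was originally applied.
uninstall_cass_operator() {
  local manifest="https://raw.githubusercontent.com/datastax/cass-operator/v1.6.0/docs/user/cass-operator-manifests-v1.16.yaml"
  local remaining
  # An empty result means no datacenters are left in any namespace.
  remaining=$(kubectl get cassdcs --all-namespaces -o name) || return 1
  if [ -n "$remaining" ]; then
    echo "CassandraDatacenters still present; delete them first:" >&2
    echo "$remaining" >&2
    return 1
  fi
  kubectl delete -f "$manifest"
}
```

The guard only checks for leftover resources; it does not delete any datacenters itself, so the data-destroying step stays an explicit, separate command.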

Contacts

For development questions, please reach out on Gitter, or by opening an issue on GitHub.

For usage questions, please visit our Community Forums: https://community.datastax.com

License

Copyright DataStax, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
