• This repository has been archived on 10/Jan/2019
  • Language
    Java
  • License
    Apache License 2.0
  • Created almost 11 years ago
  • Updated about 7 years ago

Repository Details

[DEPRECATED] This project is deprecated. It will be archived on December 1, 2017.

DEPRECATED This project has been replaced by DC/OS Cassandra Service https://github.com/mesosphere/mesosphere/dcos-commons/frameworks/cassandra.

Cassandra Mesos Framework

------------

DISCLAIMER This is a very early version of Cassandra-Mesos framework. This document, code behavior, and anything else may change without notice and/or break older installations.


Documentation

Cassandra-Mesos documentation is available on the Cassandra-Mesos GitHub pages site.

Contributing

We heartily welcome external contributions to Cassandra-Mesos's documentation. Documentation should be committed to the master branch and published to our GitHub pages site using the instructions in docs/README.md.

Design

The design document outlining the features and characteristics being targeted by the Cassandra Mesos Framework can be found at the index of the docs.

Current Status

Implemented

  • The framework can register with Mesos, providing a failover timeout so that tasks continue to run if the framework disconnects from Mesos.
  • The number of nodes and the amount of resources (CPU, RAM, disk and ports) are configurable and are evaluated when resource offers from Mesos are considered.
  • cassandra.yaml and variables for cassandra-env.sh are provided by the scheduler as part of the task definition.
  • Health checks are performed by the executor, and results are sent back to the scheduler using the messaging mechanisms provided by Mesos.
  • The framework can restart and reregister with Mesos without killing tasks.
  • The scheduler can send tasks to nodes to perform 'nodetool repair'.
  • The scheduler can send tasks to nodes to perform 'nodetool cleanup'.
  • The framework can be launched by Marathon, allowing for easy installation.
  • Repair Job coordination
  • Cleanup Job coordination
  • Replace Node
  • Rolling restart
  • Improved heap calculation to allow for memory mapped files

Near Term Tasks

  • Integration tests
  • Create stress tests to try and simulate real world workloads and to identify bugs in fault tolerance handling

Running the Framework

Currently the recommended way to run the Cassandra-Mesos Framework is via Marathon. A marathon.json from the latest build can be found here.

Once you've downloaded the marathon.json, update the MESOS_ZK URL and any other parameters you would like to change. Then POST the marathon.json to your Marathon instance and the framework will bootstrap itself.
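The two steps above (edit the environment, then POST) can be sketched in Python. This is a minimal illustration, not part of the project: the helper names, the example hosts, and the assumption that the settings live in a top-level "env" map of marathon.json are all hypothetical. The only external fact relied on is that Marathon accepts new app definitions via POST on /v2/apps.

```python
import copy
import json
import urllib.request


def with_env_overrides(app_definition, overrides):
    """Return a copy of a Marathon app definition with entries in its "env" map replaced.

    Assumes the settings (MESOS_ZK etc.) are stored in a top-level "env" object.
    """
    updated = copy.deepcopy(app_definition)
    updated.setdefault("env", {}).update(overrides)
    return updated


def build_post_request(marathon_url, app_definition):
    """Build (but do not send) a POST request for Marathon's /v2/apps endpoint."""
    body = json.dumps(app_definition).encode("utf-8")
    return urllib.request.Request(
        url=marathon_url.rstrip("/") + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical stand-in for the downloaded marathon.json.
example_app = {"id": "/cassandra/dev", "env": {"MESOS_ZK": "zk://localhost:2181/mesos"}}

# Hypothetical ZooKeeper and Marathon hosts.
updated_app = with_env_overrides(example_app, {"MESOS_ZK": "zk://zk1.example.com:2181/mesos"})
request = build_post_request("http://marathon.example.com:8080", updated_app)

# To actually submit the app: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the final JSON before anything reaches the cluster.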

Mesos Node Configuration

You will need to expand the port range managed by Mesos on each node so that it includes the standard Cassandra ports.

This can be done by passing the following flag to the mesos-slave process:

--resources='ports:[31000-32000,7000-7001,7199-7199,9042-9042,9160-9160]'
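Before starting a cluster it can help to verify that a given flag value really covers the Cassandra ports. The following Python sketch parses the ports entry shown above and reports any uncovered port. The function names are hypothetical, and the parser only handles the simple `ports:[a-b,c-d]` form used in this document, not the full Mesos resources syntax.

```python
import re

# Ports listed in the flag above: inter-node (7000), inter-node over SSL (7001),
# JMX (7199), CQL native transport (9042) and Thrift (9160).
CASSANDRA_PORTS = [7000, 7001, 7199, 9042, 9160]


def parse_port_ranges(resources_flag):
    """Extract (low, high) tuples from a value such as 'ports:[31000-32000,7000-7001]'."""
    match = re.search(r"ports:\[([^\]]*)\]", resources_flag)
    if match is None:
        return []
    ranges = []
    for part in match.group(1).split(","):
        part = part.strip()
        if not part:
            continue
        low, high = part.split("-")
        ranges.append((int(low), int(high)))
    return ranges


def missing_ports(resources_flag, required_ports):
    """Return the required ports that no range in the flag value covers."""
    ranges = parse_port_ranges(resources_flag)
    return [
        port
        for port in required_ports
        if not any(low <= port <= high for low, high in ranges)
    ]


FLAG = "ports:[31000-32000,7000-7001,7199-7199,9042-9042,9160-9160]"
print(missing_ports(FLAG, CASSANDRA_PORTS))
```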

Configuration

All configuration is handled through environment variables, which makes it easy to configure Marathon to run the framework.

Framework Runtime Configuration

The following environment variables can be used to bootstrap the configuration of the framework. After the first run, configuration is read from the framework state in ZooKeeper so that it stays consistent across restarts.

# name of the cassandra cluster, this will be part of the framework name in Mesos
CASSANDRA_CLUSTER_NAME=dev-cluster

# Mesos ZooKeeper URL to locate leading master
MESOS_ZK=zk://localhost:2181/mesos

# ZooKeeper URL to be used to store framework state
CASSANDRA_ZK=zk://localhost:2181/cassandra-mesos

# The number of nodes in the cluster (default 3)
CASSANDRA_NODE_COUNT=3

# The number of seed nodes in the cluster (default 2)
# set this to 1, if you only want to spawn one node
CASSANDRA_SEED_COUNT=2

# The number of CPU Cores for each Cassandra Node (default 2.0)
CASSANDRA_RESOURCE_CPU_CORES=2.0

# The number of Megabytes of RAM for each Cassandra Node (default 2048)
CASSANDRA_RESOURCE_MEM_MB=2048

# The number of Megabytes of Disk for each Cassandra Node (default 2048)
CASSANDRA_RESOURCE_DISK_MB=2048

# The number of seconds between each health check of the cassandra node (default 60)
CASSANDRA_HEALTH_CHECK_INTERVAL_SECONDS=60

# The default bootstrap grace time - the minimum interval between two node starts
# You may set this to a lower value in pure local development environments.
CASSANDRA_BOOTSTRAP_GRACE_TIME_SECONDS=120

# The number of seconds that should be used as the mesos framework timeout (default 604800 seconds / 7 days)
CASSANDRA_FAILOVER_TIMEOUT_SECONDS=604800

# The Mesos role used to reserve resources (default *). If this is set, the framework accepts offers that have resources for that role or the default role *
CASSANDRA_FRAMEWORK_MESOS_ROLE=*

# A pre-defined data directory specifying where Cassandra should write its data.
# Ensure that this directory can be created by the user the framework is running as (default . [mesos sandbox]).
# NOTE:
# This field will be removed once MESOS-1554 is released and the framework will
# be able to allocate the data volume itself.
CASSANDRA_DATA_DIRECTORY=.
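To make the defaults above concrete, here is a small Python sketch that reads a subset of these variables from a mapping and applies the documented defaults. It is an illustration only, not the framework's own loader: the class and function names are hypothetical, "dev-cluster" is the example value from above rather than a documented default, and the seed-count checks are assumptions inferred from the note on CASSANDRA_SEED_COUNT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameworkConfig:
    cluster_name: str
    node_count: int
    seed_count: int
    cpu_cores: float
    mem_mb: int
    disk_mb: int
    health_check_interval_seconds: int
    failover_timeout_seconds: int


def load_config(env):
    """Build a FrameworkConfig from a mapping of environment variables."""
    config = FrameworkConfig(
        # Example value from this document, not a documented default.
        cluster_name=env.get("CASSANDRA_CLUSTER_NAME", "dev-cluster"),
        node_count=int(env.get("CASSANDRA_NODE_COUNT", "3")),
        seed_count=int(env.get("CASSANDRA_SEED_COUNT", "2")),
        cpu_cores=float(env.get("CASSANDRA_RESOURCE_CPU_CORES", "2.0")),
        mem_mb=int(env.get("CASSANDRA_RESOURCE_MEM_MB", "2048")),
        disk_mb=int(env.get("CASSANDRA_RESOURCE_DISK_MB", "2048")),
        health_check_interval_seconds=int(
            env.get("CASSANDRA_HEALTH_CHECK_INTERVAL_SECONDS", "60")
        ),
        failover_timeout_seconds=int(
            env.get("CASSANDRA_FAILOVER_TIMEOUT_SECONDS", "604800")
        ),
    )
    # Assumed sanity checks: at least one seed, and no more seeds than nodes.
    if config.seed_count < 1:
        raise ValueError("CASSANDRA_SEED_COUNT must be at least 1")
    if config.seed_count > config.node_count:
        raise ValueError("CASSANDRA_SEED_COUNT must not exceed CASSANDRA_NODE_COUNT")
    return config
```

For example, `load_config(os.environ)` would pick up the values exported in the current shell, and `load_config({"CASSANDRA_NODE_COUNT": "1"})` fails because the default seed count of 2 exceeds a single node.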

System configuration

Cassandra requires some operating system settings. The recommended production settings are described on the page [Cassandra 2.1 recommended production settings]. Please follow this guideline carefully for the operating system user running Cassandra.

Cassandra memory usage

Memory used by Cassandra can be roughly categorized into:

  • Java heap memory. The amount of memory used by the Java VM for heap memory.
  • Off heap memory. Off heap is used for several reasons by Cassandra:
    • index-summary (default: 5% of the heap size, may be exceeded), configured in cassandra.yaml via index_summary_capacity_in_mb
    • key-cache (default: 5% of the heap size), configured in cassandra.yaml via key_cache_size_in_mb
    • row-cache (default: off, i.e. 0), configured in cassandra.yaml via row_cache_size_in_mb (must be explicitly enabled in taskEnv)
    • counter-cache (default: min(2.5% of heap in MB, 50MB); 0 means no cache), configured in cassandra.yaml via counter_cache_size_in_mb
    • memtables (default: on-heap), configured in cassandra.yaml via memtable_allocation_type and memtable_offheap_space_in_mb
    • file-cache (default: min(25% of heap in MB, 512MB)), configured in cassandra.yaml via file_cache_size_in_mb
    • overhead during flushes/compactions/cleanup implicitly defined by workload
  • OS buffer cache. The amount of (provisioned) memory reserved for the operating system for disk block buffers.

The default configuration simply assumes that you need as much off-heap memory as Java heap memory. It divides the provisioned amount of memory by 2 and assigns one half to the Java heap.
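As a rough worked example of this split, the Python sketch below halves the provisioned memory and then applies the default percentages listed above to the resulting heap size. It is an estimate under those stated defaults only: the function name is hypothetical, and the framework's actual heap calculation may differ in detail.

```python
def estimate_memory_split(mem_mb):
    """Estimate heap and default off-heap cache sizes (in MB) for a node with mem_mb of RAM."""
    heap_mb = mem_mb / 2  # half of the provisioned memory goes to the Java heap
    return {
        "java_heap_mb": heap_mb,
        "index_summary_mb": 0.05 * heap_mb,            # 5% of heap
        "key_cache_mb": 0.05 * heap_mb,                # 5% of heap
        "counter_cache_mb": min(0.025 * heap_mb, 50),  # min(2.5% of heap, 50MB)
        "file_cache_mb": min(0.25 * heap_mb, 512),     # min(25% of heap, 512MB)
    }


for mem_mb in (2048, 16384):
    print(mem_mb, estimate_memory_split(mem_mb))
```

With 16384 MB provisioned, the heap comes out at 8192 MB, and both the counter cache and the file cache hit their caps (50 MB and 512 MB).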

A well-planned production system is sized to meet its workload requirements. That means choosing proper values for the Cassandra process environment, cassandra.yaml and memory sizing.

You should not run Cassandra (even in test environments) with less than 4 GB configured in memMb. The recommended minimum for memMb is 16 GB. As RAM gets cheaper, provision as much as you can afford, with 8 to 16 GB for memJavaHeapMb. Determine the numbers you actually need through load and stress tests with your application.

Rest API

See the Rest API Doc

Build

The Cassandra Mesos Framework is a Maven project with modules for the Framework, Scheduler, Executor and Model. Standard Maven conventions apply. The Framework and Executor are each built as a jar-with-dependencies in addition to their standalone jar, so that they are easy to run and distribute.

Install Maven

The Cassandra Mesos Framework requires an install of Maven 3.2.x.

Setup Maven toolchain for protoc

  1. Download version 2.5.0 of protobuf here

  2. Install

    • Linux (make sure the g++ compiler is installed)

      1. Run the following commands to build protobuf

    ```
    tar xzf protobuf-2.5.0.tar.gz
    cd protobuf-2.5.0
    ./configure
    make
    ```

  3. Create ~/.m2/toolchains.xml with the following contents, updating PROTOBUF_HOME to match the directory in which you ran make

<?xml version="1.0" encoding="UTF-8"?>
<toolchains>
  <toolchain>
    <type>protobuf</type>
    <provides>
      <version>2.5.0</version>
    </provides>
    <configuration>
      <protocExecutable>$PROTOBUF_HOME/src/protoc</protocExecutable>
    </configuration>
  </toolchain>
</toolchains>
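If you would rather generate this file than edit it by hand, the following Python sketch builds the same structure with the protobuf directory filled in. It is a convenience illustration only: the function name and the example path are hypothetical, and nothing is written to disk unless you add that step yourself.

```python
import xml.etree.ElementTree as ET


def render_toolchains_xml(protobuf_home):
    """Return toolchains.xml content that points protoc at <protobuf_home>/src/protoc."""
    toolchains = ET.Element("toolchains")
    toolchain = ET.SubElement(toolchains, "toolchain")
    ET.SubElement(toolchain, "type").text = "protobuf"
    provides = ET.SubElement(toolchain, "provides")
    ET.SubElement(provides, "version").text = "2.5.0"
    configuration = ET.SubElement(toolchain, "configuration")
    protoc = ET.SubElement(configuration, "protocExecutable")
    protoc.text = protobuf_home.rstrip("/") + "/src/protoc"
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(toolchains, encoding="unicode")


# Hypothetical build directory; use the directory in which you ran make.
print(render_toolchains_xml("/home/dev/protobuf-2.5.0"))
```

The returned string can then be saved as ~/.m2/toolchains.xml.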

Resources

Running unit tests

mvn clean test

Packaging artifacts

mvn clean package

To skip running tests when rebuilding the packages during local development, run the following (note that -Dmaven.test.skip=true also skips compiling the tests):

mvn -Dmaven.test.skip=true package

Framework Package

The packaging script package.bash can be used to package the framework and create a marathon.json for running the framework on Marathon:

./package.bash package

Generating the marathon.json depends on the JSON command line tool jq, which allows accurate manipulation of JSON documents through its pipelining functionality. See package.bash for an example.

Development

For development of the Cassandra Framework you will need access to a Mesos Cluster (for help setting up a cluster see Setting up a Mesosphere Cluster).

The main class of the framework, io.mesosphere.mesos.frameworks.cassandra.framework.Main, can safely be run from your IDE if that is your preferred development environment.

Run dev-run.bash to start up the framework. You should then be able to see tasks being launched in your Mesos UI.

Configuration

The following environment variables (with example values) should be specified for local development:

# The port the http server used for serving assets to tasks should use.
# In normal operations this dynamic port will be provided by Marathon as part of the task that
# will run the framework
## Any port will do, just so long as it can be bound on your dev machine and is accessible from
## the mesos slaves.
PORT0=18080

# The file path to where the cassandra-mesos-executor jar-with-dependencies is on the local file system
# This file will be served by the built-in http server so that tasks will be able to easily access
# the jar.
EXECUTOR_FILE_PATH=${PROJECT_DIR}/cassandra-mesos-executor/target/cassandra-mesos-executor-0.2.1-SNAPSHOT-jar-with-dependencies.jar

# The file path to where a tar of the Oracle JRE version 7 update 75 is on the local file system.
# This file will be served by the built-in http server so that tasks will be able to easily access
# the jre, and it doesn't have to be provided by the slave host.
JRE_FILE_PATH=${PROJECT_DIR}/target/framework-package/jdk.tar.gz

# The file path to where a tar of Apache Cassandra 2.1.4 is on the local file system.
# This file will be served by the built-in http server so that tasks will be able to easily access
# the cassandra server, and it doesn't have to be provided by the slave host.
CASSANDRA_FILE_PATH=${PROJECT_DIR}/target/framework-package/cassandra.tar.gz
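A quick pre-flight check of these variables can save a failed launch. The Python sketch below reports missing variables and paths that do not exist. It is an illustration under assumptions: the function name is hypothetical, and the project's own dev-run.bash may perform different or additional checks.

```python
import os

# Variables from the list above that must point at files on the local file system.
REQUIRED_FILE_VARS = ("EXECUTOR_FILE_PATH", "JRE_FILE_PATH", "CASSANDRA_FILE_PATH")


def check_dev_environment(env, file_exists=os.path.isfile):
    """Return a list of human-readable problems; an empty list means the settings look usable."""
    problems = []
    port = env.get("PORT0", "")
    if not port.isdigit() or not 0 < int(port) < 65536:
        problems.append("PORT0 must be a TCP port number, got %r" % port)
    for name in REQUIRED_FILE_VARS:
        path = env.get(name)
        if not path:
            problems.append(name + " is not set")
        elif not file_exists(path):
            problems.append(name + " points to a missing file: " + path)
    return problems


for problem in check_dev_environment(dict(os.environ)):
    print(problem)
```

The `file_exists` parameter exists only so the check can be exercised without real files on disk.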

Using Cassandra tools

The shell scripts whose names start with com- let you run the standard command line tools shipped with Apache Cassandra against clusters running on Apache Mesos. These tools use the framework's live nodes API.

These are:

  • com-cqlsh invokes cqlsh without requiring you to know the actual endpoints. It connects to any (random) live Cassandra node.
  • com-nodetool invokes nodetool without requiring you to know the actual endpoints. It connects to any (random) live Cassandra node.
  • com-stress invokes cassandra-stress without requiring you to know the actual endpoints. It connects to any (random) live Cassandra node.

All these tools are configured using environment variables and special command line options. These command line options must be specified directly after the command name.

Environment variables:

  • CASSANDRA_HOME path to where your local unpacked Apache Cassandra distribution lives. Defaults to .
  • API_HOST host name where the Cassandra-Mesos scheduler is running. Defaults to 127.0.0.1
  • API_PORT port on which the Cassandra-Mesos scheduler is listening. Defaults to 18080

Command line options:

  • --limit N the number of live nodes to use. Has no effect for cqlsh or nodetool.

Important security notice

CVE-2015-0225 describes a security vulnerability in Cassandra, which allows an attacker to execute arbitrary code via JMX/RMI.

Some non-critical tools of the Cassandra-Mesos framework rely on JMX functionality being available remotely.

  1. com-nodetool (like nodetool itself) requires the JMX port to be reachable from outside.
  2. com-qa-report uses com-nodetool for some functionality, so it also requires the JMX port to be reachable from outside.

Do not open the JMX port without authentication, SSL and proper firewall rules unless you know exactly what you are doing! Opening the JMX port exposes your Cassandra nodes to this vulnerability. If you are really sure that you want to expose the JMX port, you can pass the environment variables CASSANDRA_JMX_LOCAL=false and CASSANDRA_JMX_NO_AUTHENTICATION=true to the framework upon initial invocation (i.e. when the framework first registers).

CASSANDRA-9089 is meant to let JMX listen on a specific IP address, but it is not included in Cassandra 2.1.4.

References:

Resources

Cassandra 2.1 recommended production settings
