  • Language
    Scala
  • License
    Apache License 2.0

A replicated Akka Persistence journal backed by Apache Cassandra

Cassandra Plugins for Akka Persistence

Please note: work in this repository has been discontinued. The official location of the akka-persistence-cassandra project is now here.

Replicated Akka Persistence journal and snapshot store backed by Apache Cassandra.

Dependencies

Latest release

To include the latest release of the Cassandra plugins for Cassandra 2.1.x or 2.2.x into your sbt project, add the following lines to your build.sbt file:

resolvers += "krasserm at bintray" at "http://dl.bintray.com/krasserm/maven"

libraryDependencies += "com.github.krasserm" %% "akka-persistence-cassandra" % "0.7"

This version of akka-persistence-cassandra depends on Akka 2.4 and Scala 2.11.6. It is compatible with Cassandra 2.1.6 or higher (versions < 2.1.6 have a static column bug). Versions of the Cassandra plugins that are compatible with Cassandra 1.2.x are maintained on the cassandra-1.2 branch.

Latest release for Cassandra 3.x

To include the latest release of the Cassandra plugins for Cassandra 3.x into your sbt project, add the following lines to your build.sbt file:

resolvers += "krasserm at bintray" at "http://dl.bintray.com/krasserm/maven"

libraryDependencies += "com.github.krasserm" %% "akka-persistence-cassandra-3x" % "0.6"

This version of akka-persistence-cassandra depends on Akka 2.4 and Scala 2.11.6. It is compatible with Cassandra 3.0.0 or higher.

It implements the following Persistence Queries:

  • allPersistenceIds, currentPersistenceIds
  • eventsByPersistenceId, currentEventsByPersistenceId
  • eventsByTag, currentEventsByTag

Schema changes mean that you can't currently upgrade from a Cassandra 2.x version of the plugin to the Cassandra 3.x version and use existing data.

You should be able to export the data and load it to the new table definition.

Development snapshot

To include a current development snapshot of the Cassandra plugins into your sbt project, add the following lines to your build.sbt file:

resolvers += "OJO Snapshots" at "https://oss.jfrog.org/oss-snapshot-local" 

libraryDependencies += "com.github.krasserm" %% "akka-persistence-cassandra" % "0.7-SNAPSHOT"

This version of akka-persistence-cassandra depends on Akka 2.4 and Scala 2.11.6. It is compatible with Cassandra 2.1.6 or higher (versions < 2.1.6 have a static column bug).

Migrating from 0.3.x (Akka 2.3.x)

Schema and property changes mean that you can't currently upgrade from 0.3 to 0.4 SNAPSHOT and use existing data. This will be addressed in Issue 64.

Journal plugin

Features

  • All operations required by the Akka Persistence journal plugin API are fully supported.
  • The plugin uses Cassandra in a purely log-oriented way, i.e. data are only ever inserted and never updated (deletions are made only on user request or by persistent channels, see also Caveats).
  • Writes of messages and confirmations are batched to optimize throughput. See batch writes for details on how to configure batch sizes. The plugin was tested to work properly under high load.
  • Messages written by a single processor are partitioned across the cluster to achieve scalability with data volume by adding nodes.
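
To make the partitioning idea concrete, here is a minimal sketch of how a sequence number might map to a partition when a fixed target partition size is used (see cassandra-journal.target-partition-size under Configuration). The object name PartitionSketch, the helper partitionNr and the formula itself are illustrative assumptions, not taken from the plugin's source.

```scala
object PartitionSketch {
  // Hypothetical helper (an assumption for illustration): maps a 1-based
  // sequence number to a 0-based partition number.
  def partitionNr(sequenceNr: Long, targetPartitionSize: Long): Long =
    (sequenceNr - 1L) / targetPartitionSize

  def main(args: Array[String]): Unit = {
    val size = 500000L // documented default of target-partition-size
    println(partitionNr(1L, size))      // 0
    println(partitionNr(500000L, size)) // 0
    println(partitionNr(500001L, size)) // 1
  }
}
```

Under this assumption, a processor's messages roll over to a new partition every 500000 sequence numbers, so a long-lived processor is spread over many partitions rather than one ever-growing row.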

Configuration

To activate the journal plugin, add the following line to your Akka application.conf:

akka.persistence.journal.plugin = "cassandra-journal"

This will run the journal with its default settings. The default settings can be changed with the following configuration keys:

  • cassandra-journal.contact-points. A comma-separated list of contact points in a Cassandra cluster. Default value is [127.0.0.1]. Host:Port pairs are also supported. In that case the port parameter will be ignored.
  • cassandra-journal.port. Port to use to connect to the Cassandra host. Default value is 9042. Will be ignored if the contact point list is defined by host:port pairs.
  • cassandra-journal.keyspace. Name of the keyspace to be used by the plugin. Default value is akka.
  • cassandra-journal.keyspace-autocreate. Boolean parameter indicating whether the keyspace should be automatically created if it doesn't exist. Default value is true.
  • cassandra-journal.keyspace-autocreate-retries. Int parameter which defines a number of retries before giving up on automatic schema creation. Default value is 1.
  • cassandra-journal.table. Name of the table to be used by the plugin. If the table doesn't exist it is automatically created. Default value is messages.
  • cassandra-journal.table-compaction-strategy. Settings for the table's compaction strategy. Please refer to the tests for example configurations. Default value is SizeTieredCompactionStrategy. Refer to http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html for more information about the available properties.
  • cassandra-journal.replication-strategy. Replication strategy to use: SimpleStrategy or NetworkTopologyStrategy.
  • cassandra-journal.replication-factor. Replication factor to use when a keyspace is created by the plugin. Default value is 1.
  • cassandra-journal.data-center-replication-factors. Replication factor list for data centers, e.g. ["dc1:3", "dc2:2"]. Is only used when replication-strategy is NetworkTopologyStrategy.
  • cassandra-journal.max-message-batch-size. Maximum number of messages that will be batched when using persistAsync. Also used as the max batch size for deletes.
  • cassandra-journal.write-retries. The number of retries when a write request returns a TimeoutException or an UnavailableException. Default value is 3.
  • cassandra-journal.delete-retries. Number of retries before giving up on a delete. Deletes first write a metadata entry; the actual messages are then deleted asynchronously. Default value is 3.
  • cassandra-journal.target-partition-size. Target number of messages per Cassandra partition. Default value is 500000. A partition will only exceed the target if you use persistAll or persistAllAsync. Do not change this setting after table creation (not checked yet).
  • cassandra-journal.max-result-size. Maximum number of entries returned per query. Queries are executed recursively, if needed, to achieve recovery goals. Default value is 50001.
  • cassandra-journal.write-consistency. Write consistency level. Default value is QUORUM.
  • cassandra-journal.read-consistency. Read consistency level. Default value is QUORUM.
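
As an illustration of how several of these keys fit together, the following sketch builds a configuration programmatically with the Typesafe Config library that Akka already depends on. The host addresses, keyspace name and data center names are placeholders chosen for this example; only the key names come from the list above.

```scala
import com.typesafe.config.ConfigFactory

object JournalConfigExample extends App {
  // Overrides a few journal settings; everything not listed keeps its default.
  val config = ConfigFactory.parseString(
    """
      |akka.persistence.journal.plugin = "cassandra-journal"
      |cassandra-journal {
      |  contact-points = ["10.0.0.1", "10.0.0.2"]   # placeholder hosts
      |  port = 9042
      |  keyspace = "my_app_journal"                 # placeholder keyspace
      |  replication-strategy = "NetworkTopologyStrategy"
      |  data-center-replication-factors = ["dc1:3", "dc2:2"]
      |  write-consistency = "QUORUM"
      |  read-consistency = "QUORUM"
      |}
    """.stripMargin).withFallback(ConfigFactory.load())

  // Print one value to confirm the override was picked up.
  println(config.getString("cassandra-journal.keyspace"))
}
```

The same settings can of course be placed directly in application.conf; the resulting Config object can also be passed to ActorSystem("name", config).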

The default read and write consistency levels ensure that processors can read their own writes. During normal operation, processors only write to the journal; reads occur only during recovery.
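
The read-your-own-writes property follows from simple arithmetic: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The sketch below works through that rule; the object and method names are made up for this example.

```scala
object QuorumSketch {
  // Cassandra's QUORUM level requires a majority of replicas: floor(rf / 2) + 1.
  def quorum(replicationFactor: Int): Int = replicationFactor / 2 + 1

  // A read overlaps the latest acknowledged write when R + W > RF.
  def readsOwnWrites(readReplicas: Int, writeReplicas: Int, replicationFactor: Int): Boolean =
    readReplicas + writeReplicas > replicationFactor

  def main(args: Array[String]): Unit = {
    val rf = 3
    val q = quorum(rf)                // 2 replicas for a replication factor of 3
    println(readsOwnWrites(q, q, rf)) // true: QUORUM reads and QUORUM writes overlap
    println(readsOwnWrites(1, 1, rf)) // false: ONE/ONE may miss the latest write
  }
}
```

With the default replication factor of 1, QUORUM degenerates to a single replica, so the guarantee holds trivially; it becomes meaningful once the keyspace is replicated.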

To connect to the Cassandra hosts with credentials, add the following lines:

  • cassandra-journal.authentication.username. The username to use to log in to Cassandra hosts. No authentication is set as default.
  • cassandra-journal.authentication.password. The password corresponding to username. No authentication is set as default.

To connect to the Cassandra host with SSL enabled, add the following configuration. For detailed instructions, please refer to the DataStax Cassandra chapter about SSL Encryption.

  • cassandra-journal.ssl.truststore.path. Path to the JKS Truststore file.
  • cassandra-journal.ssl.truststore.password. Password to unlock the JKS Truststore.
  • cassandra-journal.ssl.keystore.path. Path to the JKS Keystore file.
  • cassandra-journal.ssl.keystore.password. Password to unlock the JKS Keystore and access the private key (both must use the same password).

To limit the Cassandra hosts this plugin connects with to a specific datacenter, use the following setting:

  • cassandra-journal.local-datacenter. The id of the local datacenter of the Cassandra hosts that this module should connect to. By default, this property is not set, resulting in DataStax's standard round robin policy being used.
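
The authentication, SSL and datacenter settings above nest naturally in HOCON. The following sketch shows one possible layout; the user name, passwords, file paths and datacenter id are placeholders, and only the key names are taken from the lists above.

```scala
import com.typesafe.config.ConfigFactory

object SecureJournalConfig extends App {
  val config = ConfigFactory.parseString(
    """
      |cassandra-journal {
      |  authentication {
      |    username = "akka_user"   # placeholder credentials
      |    password = "change-me"
      |  }
      |  ssl {
      |    truststore { path = "/path/to/truststore.jks", password = "change-me" }
      |    keystore   { path = "/path/to/keystore.jks",   password = "change-me" }
      |  }
      |  local-datacenter = "dc1"   # placeholder datacenter id
      |}
    """.stripMargin)

  // Nested blocks resolve to the same dotted keys documented above.
  println(config.getString("cassandra-journal.ssl.truststore.path"))
  println(config.getString("cassandra-journal.local-datacenter"))
}
```

In a real deployment the passwords would typically come from environment variables or a secrets store rather than being written into the configuration file.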

Caveats

  • Detailed tests under failure conditions are still missing.
  • Range deletion performance (i.e. deleteMessages up to a specified sequence number) depends on the extent of previous deletions:
    • linearly increases with the number of tombstones generated by previous permanent deletions and drops to a minimum after compaction
    • linearly increases with the number of plugin-level deletion markers generated by previous logical deletions (recommended: always use permanent range deletions)

These issues are likely to be resolved in future versions of the plugin.

Snapshot store plugin

Features

  • Implements its own handler of the (internal) Akka Persistence snapshot protocol, making snapshot IO fully asynchronous (i.e. does not implement the Akka Persistence snapshot store plugin API directly).

Configuration

To activate the snapshot-store plugin, add the following line to your Akka application.conf:

akka.persistence.snapshot-store.plugin = "cassandra-snapshot-store"

This will run the snapshot store with its default settings. The default settings can be changed with the following configuration keys:

  • cassandra-snapshot-store.contact-points. A comma-separated list of contact points in a Cassandra cluster. Default value is [127.0.0.1]. Host:Port pairs are also supported. In that case the port parameter will be ignored.
  • cassandra-snapshot-store.port. Port to use to connect to the Cassandra host. Default value is 9042. Will be ignored if the contact point list is defined by host:port pairs.
  • cassandra-snapshot-store.keyspace. Name of the keyspace to be used by the plugin. Default value is akka_snapshot.
  • cassandra-snapshot-store.keyspace-autocreate. Boolean parameter indicating whether the keyspace should be automatically created if it doesn't exist. Default value is true.
  • cassandra-snapshot-store.keyspace-autocreate-retries. Int parameter which defines a number of retries before giving up on automatic schema creation. Default value is 1.
  • cassandra-snapshot-store.table. Name of the table to be used by the plugin. If the table doesn't exist it is automatically created. Default value is snapshots.
  • cassandra-snapshot-store.table-compaction-strategy. Settings for the table's compaction strategy. Please refer to the tests for example configurations. Default value is SizeTieredCompactionStrategy. Refer to http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html for more information about the available properties.
  • cassandra-snapshot-store.replication-strategy. Replication strategy to use: SimpleStrategy or NetworkTopologyStrategy.
  • cassandra-snapshot-store.replication-factor. Replication factor to use when a keyspace is created by the plugin. Default value is 1.
  • cassandra-snapshot-store.data-center-replication-factors. Replication factor list for data centers, e.g. ["dc1:3", "dc2:2"]. Is only used when replication-strategy is NetworkTopologyStrategy.
  • cassandra-snapshot-store.max-metadata-result-size. Maximum number of snapshot metadata entries to load per recursion (when trying to find a snapshot that matches the specified selection criteria). Default value is 10. Only increase this value when selection criteria frequently select snapshots that are much older than the most recent snapshot, i.e. if there are many more than 10 snapshots between the most recent one and the selected one. This setting only affects the load efficiency of snapshots.
  • cassandra-snapshot-store.write-consistency. Write consistency level. Default value is ONE.
  • cassandra-snapshot-store.read-consistency. Read consistency level. Default value is ONE.
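
For comparison with the journal, here is a short sketch that activates the snapshot store and overrides a few of its defaults, using host:port contact points. Hosts, keyspace name and the chosen values are placeholders for illustration; the key names are those listed above.

```scala
import com.typesafe.config.ConfigFactory

object SnapshotStoreConfig extends App {
  val config = ConfigFactory.parseString(
    """
      |akka.persistence.snapshot-store.plugin = "cassandra-snapshot-store"
      |cassandra-snapshot-store {
      |  # host:port pairs, so the separate port setting is ignored
      |  contact-points = ["10.0.0.1:9042", "10.0.0.2:9042"]
      |  keyspace = "my_app_snapshot"      # placeholder keyspace
      |  replication-factor = 3
      |  max-metadata-result-size = 20
      |  write-consistency = "QUORUM"
      |  read-consistency = "QUORUM"
      |}
    """.stripMargin)

  println(config.getInt("cassandra-snapshot-store.max-metadata-result-size"))
}
```

Raising the consistency levels from the default ONE to QUORUM is a design choice shown here only as an example; it trades some latency for the ability to read a snapshot immediately after it was written on a replicated keyspace.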

To connect to the Cassandra hosts with credentials, add the following lines:

  • cassandra-snapshot-store.authentication.username. The username to use to log in to Cassandra hosts. No authentication is set as default.
  • cassandra-snapshot-store.authentication.password. The password corresponding to username. No authentication is set as default.

To connect to the Cassandra host with SSL enabled, add the following configuration. For detailed instructions, please refer to the DataStax Cassandra chapter about SSL Encryption.

  • cassandra-snapshot-store.ssl.truststore.path. Path to the JKS Truststore file.
  • cassandra-snapshot-store.ssl.truststore.password. Password to unlock the JKS Truststore.
  • cassandra-snapshot-store.ssl.keystore.path. Path to the JKS Keystore file.
  • cassandra-snapshot-store.ssl.keystore.password. Password to unlock the JKS Keystore and access the private key (both must use the same password).

To limit the Cassandra hosts this plugin connects with to a specific datacenter, use the following setting:

  • cassandra-snapshot-store.local-datacenter. The id of the local datacenter of the Cassandra hosts that this module should connect to. By default, this property is not set, resulting in DataStax's standard round robin policy being used.
