3 Mapping to Cassandra Tables - Reference Documentation

Authors: Paras Lakhani

Version: 5.0.8.RELEASE

3 Mapping to Cassandra Tables

Basic Mapping

The GORM for Cassandra plugin maps each domain class to a Cassandra table. For example, given a domain class such as:

class Person {
    String firstName
    String lastName    
}

This will map onto a Cassandra table called "person" and generate the following table if schema creation is on:

CREATE TABLE person (id uuid, firstname text, lastname text, version bigint, PRIMARY KEY (id));

The plugin transparently adds an implicit id property of type UUID which is auto-generated when an entity is saved.

Data types

In general, a property's Java type maps onto the corresponding CQL3 data type. Some Java types can map onto more than one CQL3 data type; in the list below the default mapping is given first:
  • java.util.UUID - CQL uuid (default) or timeuuid
  • java.lang.String - CQL text (default), ascii or varchar
  • long / java.lang.Long - CQL bigint (default) or counter

Java byte and short map onto CQL int.
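The type choices above can be summarised as a small lookup, sketched here in plain Groovy. The cqlTypes table and the resolveCqlType closure are hypothetical names used only for illustration and are not part of the plugin. The sketch assumes the first CQL type in each entry is the default, which is consistent with the CREATE TABLE statement shown under Basic Mapping.

```groovy
// Hypothetical lookup table built from the list above; not the plugin's API.
// The first CQL type in each list is treated as the default (assumption).
def cqlTypes = [
    (UUID)  : ['uuid', 'timeuuid'],
    (String): ['text', 'ascii', 'varchar'],
    (Long)  : ['bigint', 'counter'],
    (Byte)  : ['int'],
    (Short) : ['int'],
]

// Returns the default CQL type, or the requested one if it is allowed for the Java type.
def resolveCqlType = { Class javaType, String override = null ->
    List<String> allowed = cqlTypes[javaType]
    if (allowed == null) {
        throw new IllegalArgumentException("No mapping listed for ${javaType.name}")
    }
    if (override == null) {
        return allowed[0]
    }
    if (!(override in allowed)) {
        throw new IllegalArgumentException("${javaType.name} cannot map onto CQL ${override}")
    }
    override
}

println resolveCqlType(String)            // default mapping for String
println resolveCqlType(UUID, 'timeuuid')  // explicit override, as with type:"timeuuid"
```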

To map onto a different CQL type specify the type attribute in the mapping. Example:

class Person {
    String firstName
    String lastName
    UUID timeuuid    
    String ascii
    String varchar
    long counter

    static mapping = {
        timeuuid type:"timeuuid"
        ascii type:'ascii'
        varchar type:'varchar'
        counter type:'counter'
    }
}

Embedded Collections and Maps

You can map embedded lists, sets and maps of standard CQL data types simply by defining the appropriate collection type:

class Person {
    String firstName
    String lastName    
    List<Integer> scores
    Set<String> friends
    Map<String, String> pets	    
}

...

new Person(friends:['Fred', 'Bob'], pets:[chuck:"Dog", eddie:'Parrot']).save()

There are certain limitations on collections and only the standard CQL data types can be stored inside embedded collections and maps.

When persisting a domain class containing embedded collections or maps using the save method, the entire collection or map is saved or updated to Cassandra. This may not be appropriate if you only want to persist the non-collection properties, in which case you can use the updateSimpleTypes instance method. Example:

def person = Person.get(uuid)
person.age = 31
person.updateSimpleTypes(flush:true)

If you want to add or remove an item from a collection or map and have only that change written to Cassandra, you can use the dynamic methods listed in the "Domain Classes" section of this reference. Example:

person.prependToScores(5)
Person.appendToFriends(person.id, 'Barney')
Person.deleteFromPets(person.id, 'eddie', [flush:true])

The last flush:true argument causes the session to flush the pending collection updates to the datastore.

Customized Database Mapping

You may wish to customize how a domain class maps onto a Cassandra table. This is possible using the mapping block as follows:

class Person {
    ..
    static mapping = {
        table "the_person"
    }
}

In this example we see that the Person entity has been mapped to a table called "the_person".

You can also control how an individual property maps onto a table column (the default is to use the property name itself):

class Person {
    ..
    static mapping = {
        firstName column:"first_name"
    }
}

3.1 Identity Generation

By default, Cassandra GORM domain classes are supplied with a UUID-based identifier. So, for example, the following entity:

class Person {}

Has a property called id of type java.util.UUID. In this case GORM for Cassandra will generate a UUID identifier using java.util.UUID.randomUUID(). For a timeuuid it will generate one using the Java driver.
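As a rough illustration of the two identifier kinds, the plain Groovy snippet below calls java.util.UUID.randomUUID(), which is the call named above and produces a random (version 4) UUID. The timeuuid line is left as a comment because it needs the DataStax Java driver on the classpath. UUIDs.timeBased() is the driver 2.x/3.x helper for time-based (version 1) UUIDs, but whether the plugin calls exactly that method is an assumption.

```groovy
// Default id generation as described above: a random, version 4 UUID from the JDK.
UUID id = UUID.randomUUID()
println "uuid id: ${id} (version ${id.version()})"

// A timeuuid is a time-based, version 1 UUID. With the DataStax Java driver
// (2.x/3.x) available it could be created as follows. Assumption: the text
// does not say which driver call the plugin uses.
// UUID timeId = com.datastax.driver.core.utils.UUIDs.timeBased()
```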

You can name the id property something else, in which case you have to set the name attribute in the mapping:

class Person {
    UUID primary

    static mapping = {
        id name:"primary"
    }
}

Assigned Identifiers

If you want to manually assign an identifier, the following mapping should be used:

class Person {
    String firstName
    String lastName

    static mapping = {
        id name:"lastName", generator:"assigned"
    }
}

Note that it is important to specify generator:"assigned" so GORM can work out whether you are trying to achieve an insert or an update. Example:

def p = new Person(lastName:"Wilson")
// to insert
p.save()
// to update
p.save()

An existing manually-assigned entity will only be updated with the save method if it is in the current persistence session. Otherwise GORM will try to insert the entity again, which will result in an upsert to Cassandra (with no version checking if versioning is on). So if the entity is not in the session and you want to explicitly direct an update to Cassandra then use the update method instead of the save method. Example:

def p = new Person(lastName:"Wilson")
// to insert
p.save()
session.clear() // or p.discard()
// to update
p.update()

3.2 Compound Primary Keys

In Cassandra, a compound primary key consists of more than one column and treats only one column as the partition key. The other columns are treated as clustering columns. To define a compound primary key on a domain class, each property that is part of the key has to be defined in the mapping block with a primaryKey map. Example:

class Person {
    String lastName
    String firstName
    Integer age = 0
    String location

    static mapping = {
        id name:"lastName", primaryKey:[ordinal:0, type:"partitioned"], generator:"assigned"
        firstName index:true, primaryKey:[ordinal:1, type: "clustered"]
        age primaryKey:[ordinal:2, type: "clustered"]
        version false
    }
}

The above mapping will generate the following Cassandra table if schema creation is on:

CREATE TABLE person (lastname text, firstname text, age int, location text, PRIMARY KEY (lastname, firstname, age))

Composite Partition Key

A composite partition key consists of multiple columns and treats more than one column as the partition key. The other columns are treated as clustering columns. To define a composite partition key on a domain class, each property that is a part of the key has to have its primaryKey type attribute set to "partitioned". Example:

class Person {
    String lastName
    String firstName
    Integer age = 0
    String location

    static mapping = {
        id name:"lastName", primaryKey:[ordinal:0, type:"partitioned"], generator:"assigned"
        firstName index:true, primaryKey:[ordinal:1, type: "partitioned"]
        age index:true, primaryKey:[ordinal:0, type: "clustered"]
        version false
    }
}

…

CREATE TABLE person (lastname text, firstname text, age int, location text, PRIMARY KEY ((lastname, firstname), age))

The mapping block

The first column of the partition key is always mapped using id, with its name attribute set to the name of the actual property.

You should then add the primaryKey map to all columns of the compound/composite primary key. The two attributes are:

  • ordinal - specifies the order of the column in the compound/composite primary key.
  • type - "partitioned" or "clustered". For a compound primary key only one property is type "partitioned" and the rest are type "clustered". For a composite partition key more than one property is type "partitioned".
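How ordinal and type combine into the PRIMARY KEY clause can be sketched in plain Groovy. The primaryKeyClause helper below is hypothetical and is not part of the plugin. It assumes partition columns are emitted first, then clustering columns, with each group sorted by its ordinal, which reproduces the two CREATE TABLE statements shown in this section.

```groovy
// Hypothetical helper: builds a CQL PRIMARY KEY clause from key column descriptors.
// Each descriptor is a map with name, ordinal and type ("partitioned" or "clustered").
String primaryKeyClause(List<Map> keys) {
    // Column names of one key type, ordered by ordinal and lower-cased like the generated CQL.
    def names = { String type ->
        keys.findAll { it.type == type }
            .sort(false) { it.ordinal }
            .collect { it.name.toLowerCase() }
    }
    List<String> partition = names('partitioned')
    List<String> clustering = names('clustered')
    // More than one partition column means a composite partition key, wrapped in its own parentheses.
    String partitionPart = partition.size() > 1 ? "(${partition.join(', ')})" : partition[0]
    "PRIMARY KEY (${([partitionPart] + clustering).join(', ')})"
}

// Compound primary key: one partition column, two clustering columns.
println primaryKeyClause([
    [name: 'lastName',  ordinal: 0, type: 'partitioned'],
    [name: 'firstName', ordinal: 1, type: 'clustered'],
    [name: 'age',       ordinal: 2, type: 'clustered'],
])

// Composite partition key: two partition columns, one clustering column.
println primaryKeyClause([
    [name: 'lastName',  ordinal: 0, type: 'partitioned'],
    [name: 'firstName', ordinal: 1, type: 'partitioned'],
    [name: 'age',       ordinal: 0, type: 'clustered'],
])
```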

Persistence and Querying for Compound/Composite Primary Key domain classes

Where you need to pass in an id to a persistence or query method, use a map containing the components of the compound/composite primary key instead.

Example:

def person = Person.get([firstName:"Bob", lastName: "Wilson", age: 25])
Person.updateProperties([firstName:"Bob", lastName: "Wilson", age: 25], [location: "London"], [flush:true])

3.3 Query Indexes

Basics

Cassandra doesn't require that you specify indices to query. Cassandra supports creating an index on most columns, including a clustering column of a compound primary key or on the partition (primary) key itself. Indexing can impact performance greatly, so before creating an index make sure you understand when an index is appropriate and when it is not.

With that in mind it is important to specify the properties you plan to query using the mapping block:

class Person {
    String name
    static mapping = {
        name index:true
    }
}

The above mapping will generate the following Cassandra index if schema creation is on:

CREATE INDEX IF NOT EXISTS ON person (name)

3.4 Table Properties

You can configure clustering order, caching, compaction, and a number of other operations that Cassandra performs on a new table.

Clustering order

An explanation of clustering order is provided in the Cassandra docs. Clustering order can only be set on a clustered primary key column; to use it, set that column's order attribute. Example:

class Person {
    ..
    static mapping = {
        id name:"lastName", primaryKey:[ordinal:0, type:"partitioned"], generator:"assigned"
        firstName index:true, primaryKey:[ordinal:1, type: "clustered"], order: "desc" //or order: "asc"
    }
}

Setting a table property

Available properties and their descriptions are defined in the Cassandra docs. If you want to set a table property, define a static tableProperties block. Below is an example of the properties you can set with Cassandra GORM:

class Person {
    ..	
    static mapping = {
        id name:"lastName", primaryKey:[ordinal:0, type:"partitioned"]
        firstName primaryKey:[ordinal:1, type: "clustered"], order:"desc"
    }

    static tableProperties = {
        comment "table comment"
        "COMPACT STORAGE" true //OR "compact_storage" true
        replicate_on_write false
        caching "all"
        bloom_filter_fp_chance 0.2
        read_repair_chance 0.1
        dclocal_read_repair_chance 0.2
        gc_grace_seconds 900000
        compaction class: "SizeTieredCompactionStrategy", bucket_high: 2.5, bucket_low: 0.6, max_threshold: 40, min_threshold: 5, min_sstable_size: 60
        compression sstable_compression: "LZ4Compressor", chunk_length_kb: 128, crc_check_chance: 0.85
    }
}

The above mapping will generate the following Cassandra table if schema creation is on:

CREATE TABLE person (lastname text, firstname text, version bigint, PRIMARY KEY (lastname, firstname)) 
WITH CLUSTERING ORDER BY (firstname DESC) AND comment = 'table comment' AND COMPACT STORAGE AND replicate_on_write = 'false' 
AND caching = 'all' AND bloom_filter_fp_chance = 0.2 AND read_repair_chance = 0.1 AND dclocal_read_repair_chance = 0.2 
AND gc_grace_seconds = 900000 
AND compaction = { 'class' : 'SizeTieredCompactionStrategy', 'bucket_high' : 2.5, 'bucket_low' : 0.6, 'max_threshold' : 40, 'min_threshold' : 5, 
                   'min_sstable_size' : 60 } 
AND compression = { 'sstable_compression' : 'LZ4Compressor', 'chunk_length_kb' : 128, 'crc_check_chance' : 0.85 };

Table property options

  • comment : String
  • "COMPACT STORAGE" OR "compact_storage" : boolean
  • replicate_on_write : boolean
  • caching : "all", "keys_only", "rows_only", "none"
  • bloom_filter_fp_chance : double
  • read_repair_chance : double
  • dclocal_read_repair_chance : double
  • gc_grace_seconds : long
  • compaction : class, tombstone_threshold, tombstone_compaction_interval, min_sstable_size, min_threshold, max_threshold, bucket_low, bucket_high, sstable_size_in_mb
  • compression : sstable_compression, chunk_length_kb, crc_check_chance

3.5 Stateless Mode

GORM for Cassandra supports both stateless and stateful modes for mapping domain classes to Cassandra. In general, stateful mapping is superior for write-heavy applications and stateless mode is better for read-heavy applications (particularly when large amounts of data are involved).

Stateful mode

Domain classes are by default stateful, which means when they are read from Cassandra their state is stored in the user session (which is typically bound to the request in Grails). This has several advantages for write heavy applications:

  • GORM can automatically detect whether a call to save() is an update or an insert and act appropriately
  • GORM can store the current version and therefore implement optimistic locking
  • Repeated reads of the same entity can be retrieved from the cache, thus optimizing reads as well

An example of when a stateful domain class is better is batching (TO BE IMPLEMENTED)

Stateless Domain classes

However, stateful domain classes can cause problems for read-heavy applications. Take for example the following code:

def books = Book.list() // read 100,000 books
for(b in books) {
    println b.title
}

The above example will read 100,000 books and print the title of each. In stateful mode this will almost certainly run out of memory, because the session holds on to each Cassandra row as well as each Book instance built from it. Rewriting the code as follows will solve the problem:

Book.withStatelessSession {
    def books = Book.list() // read 100,000 books
    for(b in books) {
        println b.title
    }    
}

Alternatively you can map the domain class as stateless, in which case its state will never be stored in the session:

class Book {
    …
    static mapping = {
        stateless true
    }
}
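The memory difference between the two modes can be illustrated with a toy model in plain Groovy. ToySession and its firstLevelCache are hypothetical and only mimic the idea of a session that retains loaded state; they do not reflect the actual GORM session implementation.

```groovy
// Toy model only: ToySession is hypothetical and is not part of GORM.
class ToySession {
    boolean stateless = false
    Map<Object, Map> firstLevelCache = [:]

    // Simulates reading one row: a stateful session keeps a reference to it,
    // a stateless session hands it back without retaining anything.
    Map load(Object id, Map row) {
        if (!stateless) {
            firstLevelCache[id] = row
        }
        row
    }
}

def statefulSession = new ToySession()
def statelessSession = new ToySession(stateless: true)

(1..1000).each { i ->
    statefulSession.load(i, [title: "Book $i"])
    statelessSession.load(i, [title: "Book $i"])
}

println "stateful session retains ${statefulSession.firstLevelCache.size()} rows"
println "stateless session retains ${statelessSession.firstLevelCache.size()} rows"
```

In the stateful case every loaded row stays reachable until the session is cleared, which is why a large list() call can exhaust memory.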

Disadvantages of Stateless Mode

There are several disadvantages to using stateless domain classes as the default. One disadvantage is that if you are using assigned identifiers GORM cannot detect whether you want to do an insert or an update so you have to be explicit about which one you want:

def b = new Book(id:"The Book")
b.insert()
b.revenue = 100
b.update()

In the above case we use the explicit insert or update method to tell Cassandra GORM what to do.