Kundera: knight in the shining armor!

The idea behind Kundera is to make working with Cassandra drop-dead simple, and fun. Kundera does not reinvent the wheel by making another client library; rather, it leverages the existing libraries and builds a wrap-around API on top of them that helps developers do away with unnecessary boilerplate code and write neater, cleaner code that reduces complexity and improves quality. Above all, it improves productivity.

Download Kundera here: http://code.google.com/p/kundera/

Note: Kundera is now JPA 1.0 compatible, and there are some ensuing changes. You should read about it here: https://anismiles.wordpress.com/2010/07/14/kundera-now-jpa-1-0-compatible/

Objectives:

To completely remove unnecessary details, such as Column lists, SuperColumn lists, byte arrays, Data encoding etc.
To be able to work directly with Domain models just with the help of annotations
To eliminate “code plumbing”, so as to keep the flow of data processing clear and obvious
To completely separate Cassandra and its concerns from application-level logic, for robust application development
To include the latest Cassandra developments without breaking anything, anywhere in the business layer

Cassandra Data Models

At the most basic level, Cassandra has Column and SuperColumn to hold your data. A Column is a tuple of a name, a value and a timestamp, while a SuperColumn is a Column of Columns. Columns are stored in a ColumnFamily, and SuperColumns in a SuperColumnFamily. The most important thing to note is that Cassandra is not your old relational database; it is a flat system. No joins, no foreign keys, nothing. Everything you store here is 100% de-normalized.
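
To make the Column and SuperColumn shapes concrete, here is a minimal sketch in plain Java. The class names `SimpleColumn`, `SimpleSuperColumn` and `DataModelSketch` are made up for illustration; they are not Cassandra's or Kundera's own classes, just one way to picture the structure described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only (hypothetical names, not a Cassandra or Kundera API).
// A column is a (name, value, timestamp) tuple.
final class SimpleColumn {
    final String name;
    final String value;
    final long timestamp;

    SimpleColumn(String name, String value, long timestamp) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
    }
}

// A super column is a named group of columns, keyed by column name.
final class SimpleSuperColumn {
    final String name;
    final Map<String, SimpleColumn> columns = new LinkedHashMap<String, SimpleColumn>();

    SimpleSuperColumn(String name) {
        this.name = name;
    }

    void put(SimpleColumn column) {
        columns.put(column.name, column);
    }
}

class DataModelSketch {
    // Builds a "post" super column holding two columns for a single row.
    static SimpleSuperColumn samplePost() {
        long now = System.currentTimeMillis();
        SimpleSuperColumn post = new SimpleSuperColumn("post");
        post.put(new SimpleColumn("title", "Cats are funny animals", now));
        post.put(new SimpleColumn("author", "Ronald Mathies", now));
        return post;
    }

    public static void main(String[] args) {
        SimpleSuperColumn post = samplePost();
        System.out.println(post.name + " has " + post.columns.size() + " columns");
    }
}
```

Note that nothing here links one row to another: there are no foreign keys to follow, which is the "flat system" point in a nutshell.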

Using Kundera

Kundera defines a range of annotations to describe your Entity objects. Kundera is now JPA 1.0 compatible, and it builds its own annotations on top of the JPA annotations to suit its needs. Here are the basic rules:

General Rules

Entity classes must have a default no-argument constructor.
Entity classes must be annotated with @Entity (the earlier @CassandraEntity annotation has been dropped in favor of JPA’s @Entity)
Entity classes for ColumnFamily must be annotated with @ColumnFamily(“column-family-name”)
Entity classes for SuperColumnFamily must be annotated with @SuperColumnFamily(“super-column-family-name”)
Each entity must have a field annotated with @Id
- The @Id field must be of String type. (Since you can define sorting strategies in Cassandra’s storage-conf file, keeping @Id of String type makes life simpler, as you will see later.)
- There must be 1 and only 1 @Id per entity.

Note: Kundera works only at property level for now, so all method level annotations are ignored. Idea: keep life simple. 🙂
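
If you want to see the general rules as code, here is a small sketch that checks them with reflection. It is not how Kundera itself validates entities (I am not showing Kundera internals here); the `@Id` annotation below is a hypothetical stand-in for the JPA one so that the example compiles on its own, and `EntityRules` and the sample entity classes are invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical stand-in for javax.persistence.Id, so the sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Id {}

class GoodEntity {
    @Id String key;
    String other;
    GoodEntity() {}
}

class BadIdTypeEntity {
    @Id Long key;                    // breaks the "String @Id" rule
    BadIdTypeEntity() {}
}

class NoIdEntity {
    String other;                    // no @Id at all
}

class NoDefaultCtorEntity {
    @Id String key;
    NoDefaultCtorEntity(String key) { this.key = key; }
}

class EntityRules {
    // Returns null when the class follows the rules, else the first violation found.
    static String check(Class<?> type) {
        try {
            type.getDeclaredConstructor();   // throws if there is no no-arg constructor
        } catch (NoSuchMethodException e) {
            return "missing no-argument constructor";
        }
        int ids = 0;
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Id.class)) {
                ids++;
                if (field.getType() != String.class) {
                    return "@Id field must be a String";
                }
            }
        }
        return ids == 1 ? null : "expected exactly one @Id field, found " + ids;
    }
}
```

Only fields are inspected, which mirrors the property-level note above: annotations on methods would simply never be looked at.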

ColumnFamily Rules

You must define the name of the column family in @ColumnFamily, like @ColumnFamily (“Authors”). Kundera will link this entity class with the “Authors” column family.
Entities annotated with @ColumnFamily are scanned for properties with @Column annotations.
Each such field will qualify to become a Cassandra Column with
1. Name: name of the property.
2. Value: value of the property
By default the name of the column will be the name of the property. However, if you fancy changing the name, you can override it like @Column (name=”fancy-name”):
```
@Column (name="email")          // override column-name
String emailAddress;
```
Properties of type Integer, String, Long and Date are supported natively; all other types are serialized before they are saved and de-serialized when they are read. Serialization has some inherent limitations, which is why Kundera discourages you from using custom objects as Cassandra Column properties. However, you are free to do as you want. Just read the serialization tweaks before insanity reigns over you. 😉
Kundera also supports Collection and Map properties. However, there are a few things you must take care of:
- You must initialize any Collection or Map properties, like
```
List<String> list = new ArrayList<String>();
Set<String> set = new HashSet<String>();
Map<String, String> map = new HashMap<String, String>();
```
- Type parameters follow the same rule as property types described above: Integer, String, Long and Date are supported natively, and everything else is serialized.
- If you don’t explicitly define the type parameter, elements will be serialized/de-serialized before saving and retrieving.
- There is no guarantee that the Collection element order will be maintained.
- A Collection or a Map creates as many columns as it has elements.
- Collection will break into Columns like,
  1. Name~0: Element at index 0
  2. Name~1: Element at index 1 and so on.
  Name follows the column-naming rule above: the property name, unless overridden with @Column (name=...).
- Map will break into Columns like,
  1. Name~key1: Element at key1
  2. Name~key2: Element at key2 and so on.
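
The Collection and Map naming scheme above can be sketched in a few lines of Java. This is my own illustration of the `Name~index` / `Name~key` convention, not Kundera's actual code; `ColumnFlattener` is a hypothetical helper, and values are kept as plain Strings for simplicity.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing how one property fans out into many columns.
class ColumnFlattener {
    // One column per list element: "<name>~<index>" -> element.
    static Map<String, String> flatten(String name, List<String> values) {
        Map<String, String> columns = new LinkedHashMap<String, String>();
        for (int i = 0; i < values.size(); i++) {
            columns.put(name + "~" + i, values.get(i));
        }
        return columns;
    }

    // One column per map entry: "<name>~<key>" -> value.
    static Map<String, String> flatten(String name, Map<String, String> values) {
        Map<String, String> columns = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> entry : values.entrySet()) {
            columns.put(name + "~" + entry.getKey(), entry.getValue());
        }
        return columns;
    }

    public static void main(String[] args) {
        // prints {tags~0=cats, tags~1=animals}
        System.out.println(flatten("tags", Arrays.asList("cats", "animals")));
    }
}
```

Because the index is baked into the column name, the stored order depends on how Cassandra sorts those names, which is one way to see why element order is not guaranteed on the way back.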

SuperColumnFamily Rules

You must define the name of the super column family in @SuperColumnFamily, like @SuperColumnFamily (“Posts”). Kundera will link this entity class with the “Posts” super column family.
Entities annotated with @SuperColumnFamily are scanned for properties for 2 annotations:
1. @Column and
2. @SuperColumn
Only properties annotated with both annotations are picked up, and each such property qualifies to become a Column and fall under SuperColumn.
You can define the name of the column like you did for ColumnFamily.
However, you must also define the name of the SuperColumn that a particular Column falls under, like @SuperColumn(column = “super-column-name”):
```
@Column
@SuperColumn(column = "post")  // column 'title' will fall under super-column 'post'
String title;
```
Everything else is the same as for ColumnFamily.
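
Here is a rough sketch of the "both annotations or nothing" rule, again using reflection. The `@Column` and `@SuperColumn` annotations below are hypothetical stand-ins declared inside the example so it compiles alone (the real ones come from JPA and Kundera), and `SuperColumnLayout` and `SamplePost` are invented names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-ins for the real annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column { String name() default ""; }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SuperColumn { String column(); }

class SamplePost {
    @Column @SuperColumn(column = "post") String title;
    @Column(name = "text") @SuperColumn(column = "post") String body;
    @Column String onlyColumn;                       // one annotation only: skipped
    @Column @SuperColumn(column = "tags") String tag;
}

class SuperColumnLayout {
    // Maps each super-column name to the (sorted) column names stored under it.
    static Map<String, Set<String>> layout(Class<?> type) {
        Map<String, Set<String>> result = new TreeMap<String, Set<String>>();
        for (Field field : type.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            SuperColumn superColumn = field.getAnnotation(SuperColumn.class);
            if (column == null || superColumn == null) {
                continue;                            // needs both annotations
            }
            // Default column name is the property name unless overridden.
            String columnName = column.name().isEmpty() ? field.getName() : column.name();
            Set<String> names = result.get(superColumn.column());
            if (names == null) {
                names = new TreeSet<String>();
                result.put(superColumn.column(), names);
            }
            names.add(columnName);
        }
        return result;
    }
}
```

Sorted collections are used only to make the result predictable, since reflection does not promise any particular field order.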

Up and running in 5 minutes

Let’s learn by example. We will create a simple Blog application. We will have Posts, Tags and Authors.

Cassandra data model for “Authors” might be like,

ColumnFamily: Authors = {
    “Eric Long”:{		// row 1
        “email”:{
            name:“email”,
            value:“eric (at) long.com”
        },
        “country”:{
            name:“country”,
            value:“United Kingdom”
        },
        “registeredSince”:{
            name:“registeredSince”,
            value:“01/01/2002”
        }
    },
    ...
}

And data model for “Posts” might be like,

SuperColumnFamily: Posts = {
	“cats-are-funny-animals”:{		// row 1
		“post” :{		// super-column
			“title”:{
				“Cats are funny animals”
			},
			“body”:{
				“Bla bla bla… long story…”
			},
			“author”:{
				“Ronald Mathies”
			},
			“created”:{
				“01/02/2010”
			}
		},
		“tags” :{
			“0”:{
				“cats”
			},
			“1”:{
				“animals”
			}
		}
	},
	// row 2
}

Create a new Cassandra Keyspace: “Blog”

<Keyspace Name="Blog">
<!-- family definitions -->

<!-- Necessary for Cassandra -->
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
<ReplicationFactor>1</ReplicationFactor>
<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>

Create 2 column families: SuperColumnFamily for “Posts” and ColumnFamily for “Authors”

<Keyspace Name="Blog">
<!-- family definitions -->
<ColumnFamily CompareWith="UTF8Type" Name="Authors"/>
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Posts"/>

<!-- Necessary for Cassandra -->
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
<ReplicationFactor>1</ReplicationFactor>
<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>

Create entity classes

Author.java

@Entity			// makes it an entity class
@ColumnFamily ("Authors")	// assign ColumnFamily type and name
public class Author {

    @Id						// row identifier
    String username;

    @Column (name="email")	// override column-name
    String emailAddress;

    @Column
    String country;

    @Column (name="registeredSince")
    Date registered;

    String name;

    public Author () {		// must have a default constructor
    }

    ... // getters/setters etc.
}

Post.java

@Entity					// makes it an entity class
@SuperColumnFamily("Posts")			// assign column-family type and name
public class Post {

	@Id								// row identifier
	String permalink;

	@Column
	@SuperColumn(column = "post")	// column 'title' will be stored under super-column 'post'
	String title;

	@Column
	@SuperColumn(column = "post")
	String body;

	@Column
	@SuperColumn(column = "post")
	String author;

	@Column
	@SuperColumn(column = "post")
	Date created;

	@Column
	@SuperColumn(column = "tags")	// column 'tag' will be stored under super-column 'tags'
	List<String> tags = new ArrayList<String>();

	public Post () {		// must have a default constructor
	}

       ... // getters/setters etc.
}

Note the annotations, and match them against the rules described above. Also see how the “tags” property has been initialized. This is very important because Kundera uses Java Reflection to read and populate the entity classes. Anyway, once we have the entity classes in place…
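
To picture why the no-argument constructor and the initialized collection matter, here is a small reflection sketch. It assumes (and this is only my assumption for the example, not a description of Kundera's internals) that a populator creates the entity through its default constructor and then adds elements into the collection it finds on the field. `BlogEntry` and `ReflectivePopulator` are hypothetical names.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class BlogEntry {
    String title;
    List<String> tags = new ArrayList<String>();   // initialized up front
    BlogEntry() {}
}

class ReflectivePopulator {
    // Creates an instance via the no-arg constructor and fills "title" and "tags" by name.
    @SuppressWarnings("unchecked")
    static <T> T populate(Class<T> type, String title, List<String> tags) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T entity = constructor.newInstance();

            Field titleField = type.getDeclaredField("title");
            titleField.setAccessible(true);
            titleField.set(entity, title);

            Field tagsField = type.getDeclaredField("tags");
            tagsField.setAccessible(true);
            List<String> target = (List<String>) tagsField.get(entity);
            if (target == null) {
                // An uninitialized collection gives the populator nothing to add to.
                throw new IllegalStateException("collection property 'tags' is not initialized");
            }
            target.addAll(tags);
            return entity;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not populate " + type.getName(), e);
        }
    }
}
```

With an entity whose `tags` field is left null, this sketch fails fast with an exception instead of silently losing the elements.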

Instantiate EntityManager

Kundera now works as a JPA provider; here is how you can instantiate an EntityManager: https://anismiles.wordpress.com/2010/07/14/kundera-now-jpa-1-0-compatible/#entity-manager

EntityManager manager = new EntityManagerImpl();
manager.setClient(new PelopsClient());
manager.getClient().setKeySpace("Blog");
And that’s about it. You are ready to rock-and-roll like a football. Sorry, I just got swayed with FIFA fever. 😉

Supported Operations

Kundera supports JPA EntityManager based operations, along with JPA queries. Read more here: https://anismiles.wordpress.com/2010/07/14/kundera-now-jpa-1-0-compatible/#entity-operations

Save entities

Post post = ... // new post object
try {
    manager.save(post);
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}

If the entity is already saved in the Cassandra database, it will be updated; otherwise a new entity will be saved.

Load entity

try {
    Post post = manager.load(Post.class, key); // key is the identifier, for our case, "permalink"
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}

Load multiple entities

try {
    List<Post> posts = manager.load(Post.class, key1, key2, key3...); // key is the identifier, "permalink"
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}

Delete entity

try {
    manager.delete(Post.class, key); // key is the identifier, "permalink"
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}

Wow! Was it fun? Was it easy? I’m sure it was. Keep an eye on Kundera; sooner than you imagine, we will be rolling out more features, like:

Transaction support
More fine-grained methods for better control
Lazy-Loading/Selective-Loading of entity properties and many more.

Written by Animesh

June 30, 2010 at 7:12 pm

Posted in Technology

Tagged with Cassandra, HPC, Java, Kundera, NoSql

37 Responses

[…] Here is how to get started with kundera in 5 minutes -https://anismiles.wordpress.com/2010/06/30/kundera-knight-in-the-shining-armor/ […]

kundera- making life easy for Apache Cassandra users « Sanjay Sharma’s Weblog

July 2, 2010 at 10:24 am

Great work, thanks

Hernan

July 10, 2010 at 1:46 am

hi Animesh,
Nice work. Is there any example of how I can get all the Entities or all the Keys?
-Rishi

rishi

July 11, 2010 at 8:48 am

- Hi Rishi,
  
  In the current release, this is not possible. However, in a day or two, I will release a JPA compliant version with Querying facility, through which you can easily perform things like these.
  
  🙂 thanks for interest, though.
  
  -Animesh
  
  Animesh
  
  July 11, 2010 at 10:44 am
  
[…] to try out a persistence API for Cassandra. There’s a JPA implementation for Cassandra: Kundera, as well as JDO implementation, on top (or using) datanucleus: datanucleus-cassandra. Just to […]

Stuff to research: JDO on Cassandra, GIT on Windows, Restlet, VMForce – Gerbrand on ICT

July 31, 2010 at 3:45 am

Hi Animesh

We have to use Cassandra in my application, and I am trying to implement it.
I did not find any Kundera downloads at this link:
http://code.google.com/p/kundera/

Can you help me with using Kundera and Cassandra?

Thanks and regards
Mallik

mallikarjungunda

August 6, 2010 at 12:39 pm

- Hi Mallik,
  
  You can easily build Kundera on your own using Maven build tool. Here is a small wiki to help you.
  
  http://code.google.com/p/kundera/wiki/BuildSteps
  
  And to understand Cassandra, you can read these blogs:
  https://anismiles.wordpress.com/2010/05/17/cassandra-first-touch/
  https://anismiles.wordpress.com/2010/05/18/cassandra-data-model/
  
  Feel free to write me if you are stuck at any point.
  
  Animesh
  
  August 6, 2010 at 1:24 pm
  
  - Apologies for the silly question, but I can’t seem to find the repository location for Kundera!
    
    Arijit
    
    October 13, 2010 at 2:53 pm
  - Extremely sorry, I found the location.
    
    But it appears to be for Cassandra 0.6.3. Is there a newer version of Kundera? I’m using Cassandra 0.6.5 and plan to move on to 0.7 once it’s released.
    
    Arijit
    
    October 13, 2010 at 3:04 pm
  - Kundera should support the 0.6.x versions seamlessly. Work on 0.7 is in progress, and it will be rolled out soon. However, given that 0.7 is still in beta and not stable, I would strongly suggest not using 0.7 in production for some time.
    
    Animesh
    
    October 13, 2010 at 3:19 pm
Great work on this plugin! I’m working on the Datanucleus JDO one with github. In the local dev copy I’m working on, I also tried to use Lucandra. Unfortunately this bug completely killed my plugin since numeric queries don’t work.

https://issues.apache.org/jira/browse/CASSANDRA-1235

I wanted to give you a heads up: this one bit me a couple of months ago, and it’s still an issue until the fix is released.

Todd Nine

August 10, 2010 at 3:59 am

Hi Animesh,

This is very nice work. I am amazed at the way kundera is designed.
But, how can we connect to keyspaces with username & password? I mean is this feature already supported / we need to add this?

Dinesh Ilindra

September 30, 2010 at 1:31 am

- Hey Dinesh,
  
  Glad that you appreciated the effort. I am afraid that Kundera doesn’t yet support binding keyspaces to username/password. We might need to work towards getting this done.
  
  Animesh
  
  September 30, 2010 at 11:27 am
  
  - Sure, I am happy to contribute. Let me know if I would be of any help.
    
    Dinesh Ilindra
    
    September 30, 2010 at 9:34 pm
Great work Animesh.

Question for you: How would using JPA 2 affect your work? What still needs to be done to get that to work?

Thanks
bjorn

Bjorn Harvold

October 1, 2010 at 8:27 pm

- Thanks Bjorn.
  
  JPA2: Though it’s out, I don’t see a substantial movement towards it, possibly because of massive non-relational moves. Idea behind Kundera is to make development with Cassandra easy, joy and fun. I don’t think JPA-2 is what Kundera is looking for.
  
  Suggestions?
  
  Animesh
  
  October 2, 2010 at 10:11 am
  
Hi Animesh,
Kundera is indeed very easy to use. Great!
I’m wondering about the following:
We’re having a scenario (using your example) that a Person has millions of Posts. How can I delete a Person efficiently with all its posts? Just call ‘entityManager.remove(person)’? Will all posts be fetched in memory before deleting them?

Regards,
Ramon

Ramon

November 25, 2010 at 5:58 pm

Hi,

Could you post /update the above example for assigning multiple posts for an Author..

I added the property and the corresponding getters and setters in Author. When I tried to persist an Author instance, it was persisted, but when I retrieved it, it didn’t contain any Posts!

thanks

@Column
@SuperColumn(column = "posts")
List posts;

sateesh

February 9, 2011 at 7:37 am

Does Kundera build on top of Hector or does it replace it?

Please help me understand where the two live on the Cassandra food chain.

David Engel

March 17, 2011 at 4:51 am

- Kundera builds on top of popular Cassandra clients like Hector and Pelops. Kundera is an ORM, and it uses Hector/Pelops to connect to and speak with Cassandra.
  
  Animesh
  
  March 17, 2011 at 7:41 am
  
  - So, in other words, instead of dealing with Hashmaps, one deals with User Defined Class objects? But don’t the Hashmaps contain the key and Value (which is User defined object anyway)?
    
    Animesh, I am new to No-SQL stuff, can you please explain the purpose of this layer in greater detail, as David Engel requested.
    
    Thanks.
    
    Vinay Soni
    
    April 5, 2011 at 9:16 pm
  - Vinay,
    
    The idea of Kundera is to help people quickly move to a non-relational NoSQL environment from the traditional relational world. In the traditional world, you generally use JPA/Hibernate to abstract your DB and access your rows/columns through Java objects instead of writing SQL queries, right?
    
    What Kundera does is implement the JPA API for Cassandra. So now you can model your Java objects in almost the same fashion as you did for MySQL or another DB. This way you can quickly migrate to Cassandra.
    
    another simple analogy could be:
    
    JDBC API == Pelops/Hector of cassandra
    Hibernate/JPA == Kundera of Cassandra
    
    hope it helps you grasp the bigger picture.
    
    Animesh
    
    April 5, 2011 at 9:53 pm
  - Nice Explanation. Now that I understand Column, SuperColumn, Column Family, Super column Family, I understand exactly what you mean.
    
    So, the idea is to provide JPA-like ORM access for NoSQL Cassandra. This will involve building automatic loading of related associations by means of JPA-like annotations. Also, JPQL-like queries (in Kundera) make up for the absence of SQL in Cassandra. (Here, I wonder how you store the access plan to avoid parsing the query: maybe as an AST?)
    
    It is a very interesting project that you have invented/conceptualized. Much like Gavin King’s creation of Hibernate in 2002. I hope you take it to a larger platform like Apache or JBoss, so that you can get the support of the great minds that flock to these OSF groups.
    
    All the very best with the project. I sure am going to dig deeper into Cassandra and Kundera.
    
    BTW: I am not sure if there is a Kundera forum – is there one?
    
    Best Regards,
    
    Vinay
    
    Vinay
    
    April 6, 2011 at 8:17 am
Hi Animesh, Why don’t you make this an Apache project? That way, your project’s lifecycle can be properly ensured.

Best Regards,

Vinay

Vinay Soni

April 5, 2011 at 9:12 pm

- how to do that?
  
  Animesh
  
  April 5, 2011 at 9:54 pm
  
  - You will have to look at the apache incubator project.
    
    The Incubator project is the entry path into The Apache Software Foundation (ASF) for projects and codebases wishing to become part of the Foundation’s efforts.
    
    The Apache Incubator has two primary goals:
    
    * Ensure all donations are in accordance with the ASF legal standards
    * Develop new communities that adhere to our guiding principles
    
    http://incubator.apache.org/
    
    (Hey I would be happy to be a part of your community as a developer. )
    
    One of the first things is to develop a proposal. The members of the ASF then vote to accept or reject the proposal. Meritocracy is one of the key goals of ASF – so they recognize merit.
    
    http://incubator.apache.org/guides/proposal.html
    
    I hope this helps.
    
    Best Regards,
    
    Vinay
    
    Vinay
    
    April 6, 2011 at 7:58 am
Hi

I am trying to build a prototype using Kundera. However, I am unable to build the maven project. I have raised an issue on the project site.
Highly appreciate your help in resolving the build issue.

– Ranga

Ranga

April 28, 2011 at 1:32 am

- Hi,
  
  Looks like you are trying to build it on a Mac. Well, I haven’t tried building Kundera on a Mac… I will do it and get back to you.
  
  -Animesh
  
  Animesh
  
  April 28, 2011 at 9:20 am
  
[…] who are new to Kundera, should read this to get an idea on what all it […]

Working with MongoDB using Kundera « Amresh

May 2, 2011 at 1:22 am

Can you say something regarding Envers (http://www.jboss.org/envers) and Kundera: will they work together?

-Peter

Peter

May 31, 2011 at 8:27 pm

- Peter,
  
  I haven’t tried Envers with Kundera, but it should work seamlessly, at least theoretically. Envers is a bean auditing library, and it stores audit data back into the database. So, if you can somehow direct Envers to store its data in Cassandra/Mongo/HBase instead of a relational DB such as Oracle/MySQL… you should be good to go.
  
  -Animesh
  
  Animesh
  
  June 1, 2011 at 11:57 am
  
We are happy to announce release of Kundera 2.0.4

Kundera is a JPA 2.0 based, Object-Datastore Mapping Library for NoSQL
Datastores. The idea behind Kundera is to make working with NoSQL Databases
drop-dead simple and fun. It currently supports Cassandra, HBase,
MongoDB and MySql.

Major Changes in this release:
———————–

– Cross-datastore persistence
– support for relational databases
– replace solandra with lucene based indexing.
– Support added for bi-directional associations.
– Performance improvement fixes.

To download, use or contribute to Kundera, visit:
http://github.com/impetus-opensource/Kundera
An example twitter like application built using Kundera can be found at:
http://github.com/impetus-opensource/Kundera-Examples

NOSQL is as easy as SQL, if done through Kundera !
Happy working with NoSQL!!

Sincerely,
Vivek

mevivs

December 12, 2011 at 1:33 pm

A lot of performance improvement was done in this release.
As a dry run, we were able to process 1 million inserts in 6 minutes on an AWS instance.

mevivs

December 12, 2011 at 1:35 pm

- Wow. that’s quite a figure. Amazing job!
  
  Was this for Cassandra or some other database? What sort of document did you benchmark with?
  
  -Animesh
  
  Animesh
  
  December 12, 2011 at 10:55 pm
  
  - This was with Cassandra only.
    We have compiled a document for this; here it is:
    https://github.com/impetus-opensource/Kundera/wiki/Kundera-Performance
    
    Sincerely,
    Vivek
    
    mevivs
    
    December 12, 2011 at 11:10 pm
Hi, I am trying to use Kundera with the Play framework without Maven, but I am always getting exceptions. I really do not have much idea about how and where to put persistence.xml, what persistence.xml and application.conf should contain, and how to create Entity classes.
Could you please give us a detailed example?
Thank you

Emin

February 2, 2013 at 6:57 pm

- This link will help you out.
  http://architects.dzone.com/articles/play-nosql-building-nosql
  
  Amresh
  
  July 11, 2013 at 6:11 pm
  