animesh kumar

Running water never grows stale. Keep flowing!

Kundera: knight in the shining armor!


The idea behind Kundera is to make working with Cassandra drop-dead simple, and fun. Kundera does not reinvent the wheel by building yet another client library; rather, it leverages the existing libraries and builds a wrap-around API on top of them, so that developers can do away with unnecessary boilerplate and write neater, cleaner code. That reduces code complexity, improves quality and, above all, improves productivity.

Download Kundera here: http://code.google.com/p/kundera/

Note: Kundera is now JPA 1.0 compatible, and there are some ensuing changes. You should read about it here: https://anismiles.wordpress.com/2010/07/14/kundera-now-jpa-1-0-compatible/

Objectives:

  • To completely remove unnecessary details, such as Column lists, SuperColumn lists, byte arrays, Data encoding etc.
  • To be able to work directly with Domain models just with the help of annotations
  • To eliminate “code plumbing”, so as to keep the flow of data processing clear and obvious
  • To completely separate Cassandra and its concerns from application-level logic, for robust application development
  • To include the latest Cassandra developments without breaking anything, anywhere in the business layer

Cassandra Data Models

At the most basic level, Cassandra has Column and SuperColumn to hold your data. A Column is a tuple of a name, a value and a timestamp, while a SuperColumn is a Column of Columns. Columns are stored in a ColumnFamily, and SuperColumns in a SuperColumnFamily. The most important thing to note is that Cassandra is not your old relational database; it is a flat system. No joins, no foreign keys, nothing. Everything you store here is 100% de-normalized.

Read more details here: https://anismiles.wordpress.com/2010/05/18/cassandra-data-model/

Using Kundera

Kundera defines a range of annotations to describe your entity objects. Since it is now JPA 1.0 compatible, it builds its own annotations on top of the JPA ones to suit its needs. Here are the basic rules:

General Rules

  • Entity classes must have a default no-argument constructor.
  • Entity classes must be annotated with @Entity. (The earlier @CassandraEntity annotation has been dropped in favor of JPA's @Entity.)
  • Entity classes for a ColumnFamily must be annotated with @ColumnFamily("column-family-name")
  • Entity classes for a SuperColumnFamily must be annotated with @SuperColumnFamily("super-column-family-name")
  • Each entity must have a field annotated with @Id
    • The @Id field must be of String type. (Since you can define sorting strategies in Cassandra’s storage-conf file, keeping @Id of String type makes life simpler, as you will see later.)
    • There must be 1 and only 1 @Id per entity.
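
To make the general rules concrete, here is a minimal sketch of a checker that tests a class against them using plain Java reflection. This is not Kundera's code: the @Id annotation below is a local stand-in for the JPA one so that the snippet compiles on its own, and EntityRuleChecker, GoodAuthor and BadAuthor are hypothetical names made up for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Local stand-in for the JPA @Id annotation (assumption: keeps the sketch self-contained).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Id {}

class EntityRuleChecker {

    /** Returns a list of rule violations; an empty list means the class passes. */
    static List<String> check(Class<?> type) {
        List<String> problems = new ArrayList<>();

        // Rule: a default no-argument constructor must exist.
        try {
            type.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            problems.add("missing no-argument constructor");
        }

        // Rule: exactly one @Id field, and it must be a String.
        int idCount = 0;
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Id.class)) {
                idCount++;
                if (field.getType() != String.class) {
                    problems.add("@Id field '" + field.getName() + "' is not a String");
                }
            }
        }
        if (idCount != 1) {
            problems.add("expected exactly one @Id field, found " + idCount);
        }
        return problems;
    }
}

// Hypothetical sample: follows all the rules.
class GoodAuthor {
    @Id String username;

    GoodAuthor() {}
}

// Hypothetical sample: no default constructor, two @Id fields, one of them not a String.
class BadAuthor {
    @Id long id;
    @Id String username;

    BadAuthor(String username) {
        this.username = username;
    }
}
```

Calling EntityRuleChecker.check(GoodAuthor.class) returns an empty list, while BadAuthor is reported with three problems, one per broken rule.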

Note: Kundera works only at property level for now, so all method level annotations are ignored. Idea: keep life simple. 🙂

ColumnFamily Rules

  1. You must define the name of the column family in @ColumnFamily, like @ColumnFamily ("Authors"). Kundera will link this entity class with the "Authors" column family.
  2. Entities annotated with @ColumnFamily are scanned for properties with @Column annotations.
  3. Each such field will qualify to become a Cassandra Column with
    1. Name: name of the property.
    2. Value: value of the property
  4. By default, the name of the column will be the name of the property. However, if you fancy changing the name, you can override it like @Column (name="fancy-name")
    @Column (name="email")          // override column-name
    String emailAddress;
    
  5. Properties of type Integer, String, Long and Date are supported natively; all other types are serialized before they are saved and de-serialized when they are read. Serialization has some inherent limitations, which is why Kundera discourages you from using custom objects as Cassandra Column properties. However, you are free to do as you want. Just read the serialization tweaks before insanity reigns over you. 😉
  6. Kundera also supports Collection and Map properties. However, there are a few things you must take care of:
    • You must initialize any Collection or Map properties, like
      List<String> list = new ArrayList<String>();
      Set<String> set = new HashSet<String>();
      Map<String, String> map = new HashMap<String, String>();
      
    • Type parameters follow the same rule, described in #5.
    • If you don’t explicitly define the type parameter, elements will be serialized/de-serialized before saving and retrieving.
    • There is no guarantee that the Collection element order will be maintained.
    • A Collection or Map will create as many columns as it has elements.
    • A Collection will break into Columns like,
      1. Name~0: Element at index 0
      2. Name~1: Element at index 1 and so on.

      Name follows rule #4.

    • Map will break into Columns like,
      1. Name~key1: Element at key1
      2. Name~key2: Element at key2 and so on.
    • Again, name follows rule #4.
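
The naming scheme for Collection and Map properties can be sketched in a few lines. This only illustrates the Name~index and Name~key convention described above; it is not Kundera's actual implementation. ColumnNames is a hypothetical helper, all values are assumed to be Strings, and it preserves insertion order purely to keep the output readable (recall that Kundera gives no ordering guarantee).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnNames {

    /** One column per list element: name~0, name~1, ... */
    static Map<String, String> fromList(String name, List<String> values) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            columns.put(name + "~" + i, values.get(i));
        }
        return columns;
    }

    /** One column per map entry: name~key1, name~key2, ... */
    static Map<String, String> fromMap(String name, Map<String, String> values) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : values.entrySet()) {
            columns.put(name + "~" + entry.getKey(), entry.getValue());
        }
        return columns;
    }

    public static void main(String[] args) {
        // prints {tags~0=cats, tags~1=animals}
        System.out.println(fromList("tags", Arrays.asList("cats", "animals")));

        Map<String, String> emails = new LinkedHashMap<>();
        emails.put("home", "eric (at) long.com");
        // prints {emails~home=eric (at) long.com}
        System.out.println(fromMap("emails", emails));
    }
}
```

Here "tags" and "emails" play the role of the column name from rule #4, so an overridden name would simply replace the prefix.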

SuperColumnFamily Rules

  1. You must define the name of the super column family in @SuperColumnFamily, like @SuperColumnFamily ("Posts"). Kundera will link this entity class with the "Posts" super column family.
  2. Entities annotated with @SuperColumnFamily are scanned for properties for 2 annotations:
    1. @Column and
    2. @SuperColumn
  3. Only properties annotated with both annotations are picked up, and each such property qualifies to become a Column and fall under SuperColumn.
  4. You can define the name of the column like you did for ColumnFamily.
  5. However, you must also define the name of the SuperColumn that a particular Column falls under, like @SuperColumn(column = "super-column-name")
    @Column
    @SuperColumn(column = "post")  // column 'title' will fall under super-column 'post'
    String title;
    
  6. The rest of the rules are the same as for ColumnFamily above.
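
Rule 3 is the one that trips people up, so here is a small sketch of what "picked up only when both annotations are present" means in practice. It is a guess at the behaviour based on the rules above, not Kundera's source: the @Column and @SuperColumn annotations are local stand-ins so the snippet compiles on its own, and SamplePost and SuperColumnLayout are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Local stand-ins for the real annotations (assumption: only the attributes used here).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column { String name() default ""; }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SuperColumn { String column(); }

class SamplePost {
    @Column @SuperColumn(column = "post") String title;
    @Column(name = "text") @SuperColumn(column = "post") String body; // overridden column name
    @Column @SuperColumn(column = "meta") String author;
    @Column String orphan;  // no @SuperColumn: skipped
    String scratch;         // no annotations at all: skipped
}

class SuperColumnLayout {

    /** Maps each super-column name to the sorted set of column names stored under it. */
    static Map<String, Set<String>> of(Class<?> type) {
        Map<String, Set<String>> layout = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            SuperColumn superColumn = field.getAnnotation(SuperColumn.class);
            if (column == null || superColumn == null) {
                continue; // both annotations are required
            }
            // Default column name is the property name unless overridden.
            String columnName = column.name().isEmpty() ? field.getName() : column.name();
            layout.computeIfAbsent(superColumn.column(), key -> new TreeSet<>()).add(columnName);
        }
        return layout;
    }

    public static void main(String[] args) {
        // prints {meta=[author], post=[text, title]}
        System.out.println(of(SamplePost.class));
    }
}
```

The sorted maps are only there to make the printed layout deterministic; the point is that "orphan" and "scratch" never show up.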

Up and running in 5 minutes

Let’s learn by example. We will create a simple Blog application. We will have Posts, Tags and Authors.

Cassandra data model for “Authors” might be like,

ColumnFamily: Authors = {
    "Eric Long":{        // row 1
        "email":{
            name:"email",
            value:"eric (at) long.com"
        },
        "country":{
            name:"country",
            value:"United Kingdom"
        },
        "registeredSince":{
            name:"registeredSince",
            value:"01/01/2002"
        }
    },
    ...
}

And data model for “Posts” might be like,

SuperColumnFamily: Posts = {
    "cats-are-funny-animals":{        // row 1
        "post":{        // super-column
            "title":{
                "Cats are funny animals"
            },
            "body":{
                "Bla bla bla… long story…"
            },
            "author":{
                "Ronald Mathies"
            },
            "created":{
                "01/02/2010"
            }
        },
        "tags":{
            "0":{
                "cats"
            },
            "1":{
                "animals"
            }
        }
    },
    // row 2
}
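
If it helps to see the two layouts as Java data structures, they can be pictured as nested sorted maps. This is just a mental model under simplifying assumptions: timestamps are left out, every value is a String, and BlogDataModel is a hypothetical class name, not something Kundera or Cassandra provides.

```java
import java.util.Map;
import java.util.TreeMap;

class BlogDataModel {

    /** ColumnFamily: row key -> column name -> value. */
    static Map<String, Map<String, String>> authors() {
        Map<String, String> eric = new TreeMap<>();
        eric.put("email", "eric (at) long.com");
        eric.put("country", "United Kingdom");
        eric.put("registeredSince", "01/01/2002");

        Map<String, Map<String, String>> authors = new TreeMap<>();
        authors.put("Eric Long", eric);
        return authors;
    }

    /** SuperColumnFamily: row key -> super-column name -> column name -> value. */
    static Map<String, Map<String, Map<String, String>>> posts() {
        Map<String, String> post = new TreeMap<>();
        post.put("title", "Cats are funny animals");
        post.put("body", "Bla bla bla... long story...");
        post.put("author", "Ronald Mathies");
        post.put("created", "01/02/2010");

        Map<String, String> tags = new TreeMap<>();
        tags.put("0", "cats");
        tags.put("1", "animals");

        Map<String, Map<String, String>> row = new TreeMap<>();
        row.put("post", post);
        row.put("tags", tags);

        Map<String, Map<String, Map<String, String>>> posts = new TreeMap<>();
        posts.put("cats-are-funny-animals", row);
        return posts;
    }
}
```

The one extra level of nesting in posts() is the whole difference between a ColumnFamily and a SuperColumnFamily. TreeMap is used because Cassandra keeps columns sorted by the configured comparator.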

Create a new Cassandra Keyspace: “Blog”

<Keyspace Name="Blog">
<!-- family definitions -->

<!-- Necessary for Cassandra -->
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
<ReplicationFactor>1</ReplicationFactor>
<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>

Create 2 column families: SuperColumnFamily for “Posts” and ColumnFamily for “Authors”

<Keyspace Name="Blog">
<!-- family definitions -->
<ColumnFamily CompareWith="UTF8Type" Name="Authors"/>
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Posts"/>

<!-- Necessary for Cassandra -->
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
<ReplicationFactor>1</ReplicationFactor>
<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>

Create entity classes

Author.java

@Entity			// makes it an entity class
@ColumnFamily ("Authors")	// assign ColumnFamily type and name
public class Author {

    @Id						// row identifier
    String username;

    @Column (name="email")	// override column-name
    String emailAddress;

    @Column
    String country;

    @Column (name="registeredSince")
    Date registered;

    String name;

    public Author () {		// must have a default constructor
    }

    ... // getters/setters etc.
}

Post.java

@Entity					// makes it an entity class
@SuperColumnFamily("Posts")			// assign column-family type and name
public class Post {

	@Id								// row identifier
	String permalink;

	@Column
	@SuperColumn(column = "post")	// column 'title' will be stored under super-column 'post'
	String title;

	@Column
	@SuperColumn(column = "post")
	String body;

	@Column
	@SuperColumn(column = "post")
	String author;

	@Column
	@SuperColumn(column = "post")
	Date created;

	@Column
	@SuperColumn(column = "tags")	// column 'tag' will be stored under super-column 'tags'
	List<String> tags = new ArrayList<String>();

	public Post () {		// must have a default constructor
	}

	... // getters/setters etc.
}

Note the annotations and match them against the rules described above. Also note how the “tags” property has been initialized. This is very important because Kundera uses Java Reflection to read and populate the entity classes. Anyway, once we have the entity classes in place…
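
Before moving on, a quick aside on why that initialization matters. The sketch below shows one plausible way a reflection-based mapper could fill an entity: create it through the no-argument constructor, then add values to whatever collection the field already holds. This is an assumption about the general technique, not a copy of Kundera's internals, and TaggedPost and ReflectivePopulator are hypothetical names. With a field declared as List, the mapper has no way to know which implementation you wanted, so an uninitialized field leaves it with nothing to add to.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class TaggedPost {
    List<String> tags = new ArrayList<String>(); // initialized up front
    List<String> links;                          // deliberately left null

    TaggedPost() {}
}

class ReflectivePopulator {

    /** Creates an instance via its no-arg constructor and fills one collection field. */
    @SuppressWarnings("unchecked")
    static <T> T populate(Class<T> type, String fieldName, List<String> values) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T entity = constructor.newInstance();

            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);
            Collection<String> target = (Collection<String>) field.get(entity);
            if (target == null) {
                // Nothing to add to: the entity never initialized this collection.
                throw new IllegalStateException(
                        "collection field '" + fieldName + "' was never initialized");
            }
            target.addAll(values);
            return entity;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot populate " + type.getName(), e);
        }
    }
}
```

Populating "tags" works, while populating "links" fails with an IllegalStateException, which is the kind of surprise the initialization rule saves you from.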

Instantiate EntityManager

Kundera now works as a JPA provider, and here is how you can instantiate EntityManager. https://anismiles.wordpress.com/2010/07/14/kundera-now-jpa-1-0-compatible/#entity-manager

EntityManager manager = new EntityManagerImpl();
manager.setClient(new PelopsClient());
manager.getClient().setKeySpace("Blog");

And that’s about it. You are ready to rock-and-roll like a football. Sorry, I just got swayed with FIFA fever. 😉

Supported Operations

Kundera supports JPA EntityManager based operations, along with JPA queries. Read more here: https://anismiles.wordpress.com/2010/07/14/kundera-now-jpa-1-0-compatible/#entity-operations


Save entities

Post post = ... // new post object
try {
    manager.save(post);
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}

If the entity is already saved in Cassandra, it will be updated; otherwise a new entity will be saved.

Load entity

try {
    Post post = manager.load(Post.class, key); // key is the identifier, for our case, "permalink"
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}

Load multiple entities

try {
    List posts = manager.load(Post.class, key1, key2, key3...); // key is the identifier, "permalink"
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}

Delete entity

try {
    manager.delete(Post.class, key); // key is the identifier, "permalink"
} catch (IllegalEntityException e) {
    e.printStackTrace();
} catch (EntityNotFoundException e) {
    e.printStackTrace();
}


Wow! Was it fun? Was it easy? I’m sure it was. Keep an eye on Kundera; sooner than you imagine, we will be rolling out more features like:

  1. Transaction support
  2. More fine-grained methods for better control
  3. Lazy-Loading/Selective-Loading of entity properties and many more.

Written by Animesh

June 30, 2010 at 7:12 pm

Posted in Technology

37 Responses

  1. […] Here is how to get started with kundera in 5 minutes -https://anismiles.wordpress.com/2010/06/30/kundera-knight-in-the-shining-armor/ […]

  2. Great work, thanks

    Hernan

    July 10, 2010 at 1:46 am

  3. hi Animesh,
    nice work, Any example how can I get all the Entities or all the Keys ?
    -Rishi

    rishi

    July 11, 2010 at 8:48 am

    • Hi Rishi,

      In the current release, this is not possible. However, in a day or two, I will release a JPA compliant version with Querying facility, through which you can easily perform things like these.

      🙂 thanks for interest, though.

      -Animesh

      Animesh

      July 11, 2010 at 10:44 am

  4. […] to try out a persistence API for Cassendra. There’s a JPA implementation for Cassandra: Kundera, as well as JDO implementation, on top (or using) datanucleus: datanucleus-cassandra. Just to […]

  5. Hi Animesh

    We have to use Cassandra in my application. I am trying to implement it.
    I did not find any downloads related to Kundera at the link:
    http://code.google.com/p/kundera/

    Can you help me in using Kundera and Cassandra?

    Thanks and regards
    Mallik

    mallikarjungunda

    August 6, 2010 at 12:39 pm

  6. Great work on this plugin! I’m working on the Datanucleus JDO one with github. In the local dev copy I’m working on, I also tried to use Lucandra. Unfortunately this bug completely killed my plugin since numeric queries don’t work.

    https://issues.apache.org/jira/browse/CASSANDRA-1235

    I wanted to give you a heads up, this one bit me a couple of months ago, and it’s still an issue until the fix is released.

    Todd Nine

    August 10, 2010 at 3:59 am

  7. Hi Animesh,

    This is very nice work. I am amazed at the way kundera is designed.
    But, how can we connect to keyspaces with username & password? I mean is this feature already supported / we need to add this?

    Dinesh Ilindra

    September 30, 2010 at 1:31 am

    • Hey Dinesh,

      Glad that you appreciated the effort. I am afraid that Kundera doesn’t yet support binding keyspaces to username/password. We might need to work towards getting this done.

      Animesh

      September 30, 2010 at 11:27 am

      • Sure, I am happy to contribute. Let me know if I would be of any help.

        Dinesh Ilindra

        September 30, 2010 at 9:34 pm

  8. Great work Animesh.

    Question for you: How would using JPA 2 affect your work? What still needs to be done to get that to work?

    Thanks
    bjorn

    Bjorn Harvold

    October 1, 2010 at 8:27 pm

    • Thanks Bjorn.

      JPA2: Though it’s out, I don’t see a substantial movement towards it, possibly because of massive non-relational moves. The idea behind Kundera is to make development with Cassandra easy, joyful and fun. I don’t think JPA-2 is what Kundera is looking for.

      Suggestions?

      Animesh

      October 2, 2010 at 10:11 am

  9. Hi Animesh,
    Kundera is indeed very easy to use. Great!
    I’m wondering about the following:
    We’re having a scenario (using your example) that a Person has millions of Posts. How can I delete a Person efficiently with all its posts? Just call ‘entityManager.remove(person)’? Will all posts be fetched in memory before deleting them?

    Regards,
    Ramon

    Ramon

    November 25, 2010 at 5:58 pm

  10. Hi,

    Could you post /update the above example for assigning multiple posts for an Author..

    I added the property and the corresponding getter and setters in Author and when i tried to persists an Author instance it persisted the Author Instance and when i retrieved it it didn’t contain any Posts in it !!

    thanks

    @Column
    @SuperColumn(column = "posts")
    List posts;

    sateesh

    February 9, 2011 at 7:37 am

  11. Does Kundera build on top of Hector or does it replace it?

    Please help me understand where the two live on the Cassandra food chain.

    David Engel

    March 17, 2011 at 4:51 am

    • Kundera builds on top of popular Cassandra clients like, Hector and Pelops. Kundera is an ORM and it uses Hector/Pelops to connect to and speak with Cassandra.

      Animesh

      March 17, 2011 at 7:41 am

      • So, in other words, instead of dealing with Hashmaps, one deals with User Defined Class objects? But don’t the Hashmaps contain the key and Value (which is User defined object anyway)?

        Animesh, I am new to No-SQL stuff, can you please explain the purpose of this layer in greater detail, as David Engel requested.

        Thanks.

        Vinay Soni

        April 5, 2011 at 9:16 pm

      • Vinay,

        The idea of Kundera is to help people quickly move to a non-relational NoSQL environment from the traditional relational world. In the traditional world, you generally use JPA/Hibernate to abstract your DB, and access your rows/columns from Java objects instead of writing SQL queries, right?

        What Kundera does is implement the JPA API for Cassandra. So now you can model your Java objects in almost the same fashion as you did for MySQL or another DB. This way you can quickly migrate to Cassandra.

        another simple analogy could be:

        JDBC API == Pelops/Hector of cassandra
        Hibernate/JPA == Kundera of Cassandra

        hope it helps you grasp the bigger picture.

        Animesh

        April 5, 2011 at 9:53 pm

      • Nice Explanation. Now that I understand Column, SuperColumn, Column Family, Super column Family, I understand exactly what you mean.

        So, the idea is to provide JPA-like ORM access for NoSQL Cassandra. This will involve building automatic loading of related associations by means of JPA-like annotations. Also, JPQL-like queries (in Kundera) make up for the absence of SQL in Cassandra. (Here, I wonder how you store the access plan to avoid re-parsing the query; maybe as an AST?)

        It is a very interesting project that you have invented/conceptualized. Much like Gavin King’s Hibernate creation in 2002. I hope you take it to a larger platform like Apache or JBoss, so that you can get the support of the great minds that flock to these OSF groups.

        All the very best with the project. I sure am going to dig deeper into Cassandra and Kundera.

        BTW: I am not sure if there is a Kundera forum – is there one?

        Best Regards,

        Vinay

        Vinay

        April 6, 2011 at 8:17 am

  12. Hi Animesh, Why don’t you make this an Apache project? That way, your project’s lifecycle can be properly ensured.

    Best Regards,

    Vinay

    Vinay Soni

    April 5, 2011 at 9:12 pm

    • how to do that?

      Animesh

      April 5, 2011 at 9:54 pm

      • You will have to look at the apache incubator project.

        The Incubator project is the entry path into The Apache Software Foundation (ASF) for projects and codebases wishing to become part of the Foundation’s efforts.

        The Apache Incubator has two primary goals:

        * Ensure all donations are in accordance with the ASF legal standards
        * Develop new communities that adhere to our guiding principles

        http://incubator.apache.org/

        (Hey I would be happy to be a part of your community as a developer. )

        One of the first things is to develop a proposal. The members of the ASF then vote to accept or reject the proposal. Meritocracy is one of the key goals of ASF – so they recognize merit.

        http://incubator.apache.org/guides/proposal.html

        I hope this helps.

        Best Regards,

        Vinay

        Vinay

        April 6, 2011 at 7:58 am

  13. Hi

    I am trying to build a prototype using Kundera. However, I am unable to build the maven project. I have raised an issue on the project site.
    Highly appreciate your help in resolving the build issue.

    – Ranga

    Ranga

    April 28, 2011 at 1:32 am

    • Hi,

      Looks like you are trying to build it on a Mac. Well, I haven’t tried building Kundera on a Mac… I will do it and get back to you.

      -Animesh

      Animesh

      April 28, 2011 at 9:20 am

  14. […] who are new to Kundera, should read this to get an idea on what all it […]

  15. Can you say something regarding Envers (http://www.jboss.org/envers) and Kundera: will they work together?

    -Peter

    Peter

    May 31, 2011 at 8:27 pm

    • Peter,

      I haven’t tried Envers with Kundera, but it should work seamlessly, at least theoretically. Envers is a bean auditing library, and it stores audit data back into the database. So, if you can somehow direct Envers to store its data in Cassandra/Mongo/HBase instead of an Oracle/MySQL kind of relational DB, you should be good to go.

      -Animesh

      Animesh

      June 1, 2011 at 11:57 am

  16. We are happy to announce release of Kundera 2.0.4

    Kundera is a JPA 2.0 based, Object-Datastore Mapping Library for NoSQL
    Datastores. The idea behind Kundera is to make working with NoSQL Databases
    drop-dead simple and fun. It currently supports Cassandra, HBase,
    MongoDB and MySql.

    Major Changes in this release:
    ———————–

    – Cross-datastore persistence
    – support for relational databases
    – replace solandra with lucene based indexing.
    – Support added for bi-directional associations.
    – Performance improvement fixes.

    To download, use or contribute to Kundera, visit:
    http://github.com/impetus-opensource/Kundera
    An example twitter like application built using Kundera can be found at:
    http://github.com/impetus-opensource/Kundera-Examples

    NOSQL is as easy as SQL, if done through Kundera !
    Happy working with NoSQL!!

    Sincerely,
    Vivek

    mevivs

    December 12, 2011 at 1:33 pm

  17. A lot of performance improvement has been done in this release.
    As a dry run, we were able to process 1 million inserts in 6 minutes on an AWS instance.

    mevivs

    December 12, 2011 at 1:35 pm

  18. Hi, I am trying to use Kundera with the Play framework, without Maven, but I am always getting exceptions. I really do not have much idea about how and where to put persistence.xml, what persistence.xml and application.conf should contain, and how to create entity classes.
    Could you please give us a detailed example?
    Thank you

    Emin

    February 2, 2013 at 6:57 pm

