animesh kumar

Running water never grows stale. Keep flowing!

Fiddling with Cassandra 0.7-beta2

I have been dilly-dallying with Cassandra 0.7 for quite some time. My intention was to build Cassandra 0.7 support into Kundera (a JPA 1.0 compliant ORM library for Cassandra). I must admit that I was often frustrated by the lack of documentation on Cassandra and on the client libraries I had planned to use, Pelops and Hector. So I decided to post my findings in the hope that they help you.

Now that Cassandra 0.7 beta-2 has been released, I will focus on this release.

Installing Cassandra 0.7

  • Download 0.7.0-beta2 (released on 2010-10-01) from here: http://cassandra.apache.org/download/
  • Extract the archive to some location, say D:\apache-cassandra-0.7.0-beta2
  • Set CASSANDRA_HOME environment variable to D:\apache-cassandra-0.7.0-beta2
  • You can also update your PATH variable to include %CASSANDRA_HOME%\bin ($CASSANDRA_HOME/bin on Unix-like systems)
  • Now, to start the server you would need to run this command:
    > cassandra -start

That’s it.
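If you want to confirm from code that the server actually came up, a plain TCP probe is enough. The sketch below is only an illustration: the class name CassandraPortCheck is made up for this post, and it assumes the Thrift port is 9160 (the port used with cassandra-cli and Hector later on). It only tells you that something is listening, not that it is Cassandra.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Cassandra or any client library.
class CassandraPortCheck {

    // Returns true if something accepts a TCP connection on host:port
    // within timeoutMillis, false on refusal, timeout or any I/O error.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: Thrift listens on 9160; adjust if your cassandra.yaml differs.
        boolean up = isListening("localhost", 9160, 2000);
        System.out.println(up
                ? "Something is listening on localhost:9160"
                : "Nothing is listening on localhost:9160");
    }
}
```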

Okay, now that you have the basics in place, I would like to tell you a few important things about this new Cassandra release.

  1. Unlike the 0.6.x versions, 0.7.x uses YAML instead of XML for configuration; that is, you will find cassandra.yaml instead of storage-conf.xml.
  2. 0.7 allows you to manage the entire cluster, Keyspaces, and Column Families from the Thrift API.
  3. There is also support for Apache Avro. (I haven’t explored this though, so no more comment)
  4. 0.7 comes with secondary index features. What does it mean? It means, you can look for your data not just by Row Identifier, but also by Column Values. Interesting huh?
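To make the secondary index idea concrete, here is a tiny in-memory sketch in plain Java. It is not how Cassandra implements indexes internally and it uses no Cassandra API; the class and method names are invented for illustration. It only shows the principle: next to the usual "row key to columns" storage, you keep a second map from column value back to row keys.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Conceptual sketch only; all names here are hypothetical.
class SecondaryIndexSketch {

    // Primary storage: row key -> (column name -> column value).
    private final Map<String, Map<String, String>> rows = new HashMap<>();
    // Secondary index: column name -> (column value -> row keys with that value).
    private final Map<String, Map<String, Set<String>>> index = new HashMap<>();

    void insert(String rowKey, String column, String value) {
        Map<String, String> row = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
        String old = row.put(column, value);
        if (old != null) {
            // The column was overwritten, so drop the stale index entry.
            index.get(column).get(old).remove(rowKey);
        }
        index.computeIfAbsent(column, c -> new HashMap<>())
             .computeIfAbsent(value, v -> new TreeSet<>())
             .add(rowKey);
    }

    // Lookup by row identifier: the only kind of lookup without an index.
    Map<String, String> getByKey(String rowKey) {
        return rows.getOrDefault(rowKey, Collections.<String, String>emptyMap());
    }

    // Lookup by column value: what a secondary index makes possible.
    Set<String> findKeysByValue(String column, String value) {
        return index.getOrDefault(column, Collections.<String, Set<String>>emptyMap())
                    .getOrDefault(value, Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        SecondaryIndexSketch store = new SecondaryIndexSketch();
        store.insert("id-1", "city", "Pune");
        store.insert("id-2", "city", "Delhi");
        store.insert("id-3", "city", "Pune");
        // prints [id-1, id-3]
        System.out.println(store.findKeysByValue("city", "Pune"));
    }
}
```

The cost is visible even in this toy: every write has to touch the index as well as the row, which is the usual trade-off for being able to query by value.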

If you look into cassandra.yaml, you will find a default Keyspace1 and a few Column Families too, but Cassandra doesn’t load them at startup. I am not sure why; in theory, everything defined in the yaml file should have been created at startup. I am going to dig into this. Anyway, for now, let’s create a Keyspace and a few Column Families ourselves. We can use the Thrift API (or the Cassandra client, which itself uses Thrift) or the JMX interface.

Dealing with Cassandra Client

Cassandra comes with a command line interface tool cassandra-cli. This tool is really really impressive. You should certainly spend some time with it.

  • Start the client,
    > cassandra-cli
  • Connect to server,
    > [default@unknown] connect localhost/9160
  • Create a new keyspace, (I picked this up from cassandra.yaml)
    > [default@unknown] create keyspace Keyspace1 with replication_factor=1
  • Create Column Families,
    > [default@unknown] use Keyspace1
    > [default@Keyspace1] create column family Standard1 with column_type = 'Standard' and comparator = 'BytesType'
  • Describe keyspace,
    > [default@Keyspace1] describe keyspace Keyspace1

And so on. Use ‘help’ to learn more about cassandra-cli.
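If the terms in those commands are new to you, the following plain-Java sketch may help as a mental model. It is not Cassandra code and says nothing about the real storage engine; the class name is made up. It models one 'Standard' column family as a map from row key to columns, where the columns of each row are kept sorted by comparing their names as raw unsigned bytes, which is what the 'BytesType' comparator means.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Mental model of a Standard column family; all names are hypothetical.
class ColumnFamilySketch {

    // Lexicographic comparison of byte arrays, each byte treated as unsigned.
    static final Comparator<byte[]> BYTES_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    // Row key -> columns of that row, sorted by column name.
    private final Map<String, TreeMap<byte[], byte[]>> rows = new HashMap<>();

    void insert(String rowKey, String name, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>(BYTES_ORDER))
            .put(name.getBytes(StandardCharsets.UTF_8), value.getBytes(StandardCharsets.UTF_8));
    }

    String get(String rowKey, String name) {
        TreeMap<byte[], byte[]> row = rows.get(rowKey);
        if (row == null) {
            return null;
        }
        byte[] value = row.get(name.getBytes(StandardCharsets.UTF_8));
        return value == null ? null : new String(value, StandardCharsets.UTF_8);
    }

    // Column names of a row in comparator order.
    List<String> columnNames(String rowKey) {
        List<String> names = new ArrayList<>();
        TreeMap<byte[], byte[]> row = rows.get(rowKey);
        if (row != null) {
            for (byte[] name : row.keySet()) {
                names.add(new String(name, StandardCharsets.UTF_8));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        ColumnFamilySketch standard1 = new ColumnFamilySketch();
        standard1.insert("id-1", "b", "2");
        standard1.insert("id-1", "a", "1");
        standard1.insert("id-1", "B", "3");
        // prints [B, a, b] because 'B' (0x42) sorts before 'a' (0x61)
        System.out.println(standard1.columnNames("id-1"));
    }
}
```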

JConsole

As I mentioned above, you can also use JMX to check what Keyspaces and Column Families exist in your server. But there is a little problem. Cassandra does not come with the mx4j-tools.jar, so you need to download and copy this jar to Cassandra’s lib folder. Download it from here:  http://www.java2s.com/Code/Jar/MNOPQR/Downloadmx4jtoolsjar.htm

Now, just run ‘jconsole’ and pick ‘org.apache.cassandra.thrift.CassandraDaemon’ process.
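If you would rather look at the same MBeans from code than from the JConsole window, the JDK's own JMX client classes can do it. The sketch below is a rough illustration with a made-up class name. It assumes Cassandra exposes JMX on localhost port 8080, which I believe is the default in this release but you should verify against your startup script; pass a different JMX service URL as the first argument if yours differs.

```java
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper that lists MBean names matching a pattern.
class JmxBeanLister {

    // Returns the sorted canonical names of all MBeans matching the pattern.
    static Set<String> listNames(MBeanServerConnection connection, String pattern)
            throws IOException, MalformedObjectNameException {
        Set<String> names = new TreeSet<>();
        for (ObjectName name : connection.queryNames(new ObjectName(pattern), null)) {
            names.add(name.getCanonicalName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: JMX is reachable on localhost:8080; override via args[0].
        String url = args.length > 0
                ? args[0]
                : "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi";
        JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url));
        try {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            // Every MBean whose domain starts with org.apache.cassandra.
            for (String name : listNames(connection, "org.apache.cassandra.*:*")) {
                System.out.println(name);
            }
        } finally {
            connector.close();
        }
    }
}
```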

Java clientèle

Well, there are two serious contenders, Pelops and Hector. Both have released experimental support for version 0.7. I had worked with Pelops earlier, so I thought it was time to give Hector a chance.

  • Download Hector (Sync release with Cassandra 0.7.0-beta2) from here: http://github.com/rantav/hector/downloads
    You can also use ‘git clone‘ to download the latest source.
  • Hector is a maven project. To compile the source into ‘jar’, just extract the release and run,
    > mvn package

My first program

To start with Hector, I decided to write a very small program that inserts a Column and then fetches it back. If you remember, in the previous section we created a keyspace 'Keyspace1' and a Column Family 'Standard1', and now we are going to make use of them.

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.exceptions.HectorException;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class HectorFirstExample {

	public static void main(String[] args) throws Exception {

		String keyspaceName = "Keyspace1";
		String columnFamilyName = "Standard1";
		String serverAddress = "localhost:9160";

		// Create Cassandra cluster
		Cluster cluster = HFactory.getOrCreateCluster("Cluster-Name", serverAddress);
		// Create Keyspace
		Keyspace keyspace = HFactory.createKeyspace(keyspaceName, cluster);

		try {
			// Mutation
			Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
			// Insert a new column with row-id 'id-1'
			mutator.insert("id-1", columnFamilyName, HFactory.createStringColumn("Animesh", "Kumar"));

			// Look up the same column (key, column name and value are all Strings)
			ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keyspace);
			columnQuery.setColumnFamily(columnFamilyName).setKey("id-1").setName("Animesh");
			QueryResult<HColumn<String, String>> result = columnQuery.execute();

			System.out.println("Read HColumn from cassandra: " + result.get());
		} catch (HectorException e) {
			e.printStackTrace();
		}
	}
}

That was simple. By the way, ‘Nate McCall‘ has written a set of example classes to help us understand Hector with Cassandra 0.7. Check it out here: http://github.com/zznate/hector-examples

I am working towards introducing Cassandra 0.7 support in Kundera, and will be publishing my findings intermittently.

Written by Animesh

October 14, 2010 at 9:26 pm

Posted in Technology

11 Responses

  1. Awesome .. I am currently trying to get Cassandra-0.7-beta2 & 0.6.5(with hector) to work with Lucandra and found the lack of documentation,etc. troubling as well…but blogs like this help!

    Jeryl Cook

    October 14, 2010 at 11:31 pm

    • Jeryl, the problem is that Lucandra uses the Thrift interface to connect to Cassandra, and 0.7 has major changes in the API. Hence Lucandra won’t work unless you make the relevant changes in CassandraUtils.java

      -Animesh

      Animesh

      October 15, 2010 at 9:23 am

  2. Very informative.

    Javier Sotelo

    October 15, 2010 at 10:04 pm

  3. Can already do JPA with Cassandra, using a plugin for DataNucleus written by Todd Nine
    http://www.datanucleus.org/products/accessplatform_2_2/datastores_thirdparty.html

    and that is JPA2.

    Andy Jefferson

    October 23, 2010 at 7:35 pm

  4. Animesh Nerurkar (Goa) demos Adriane Knoppix…

    I found your entry interesting thus I’ve added a Trackback to it on my weblog :)…

    Darel Philip

    November 12, 2010 at 6:24 am

  5. I am new to Cassandra. I want to use the DataNucleus plugin to connect to Cassandra, but I am not finding relevant sample code that uses DataNucleus. I have already used the Hector plugin and want to try it out with DataNucleus.

    Raj

    November 29, 2010 at 4:34 pm

    • You might want to try Kundera instead, or maybe go to the DataNucleus site to find some samples.

      Animesh

      November 29, 2010 at 5:59 pm

  6. Hi, Animesh!
    Thanks for the great blog.
    I’m just curious, when do you plan to commit a new version of Kundera with Cassandra 0.7 and secondary indexes support? It is going to be amazing feature. Many people are waiting for it.
    Thanks a lot!

    stoweesh

    February 15, 2011 at 7:13 pm

  7. hi Animesh. to load the keyspaces in your yaml you need to open a console, start the jconsole application, and connect to Cassandra via JMX. Then, execute the operation loadSchemaFromYAML.
    hope this helps

    david lee

    March 30, 2011 at 6:40 am

  8. Hi,
    Kundera code base and tests have been updated. It is now tested for compatibility with Cassandra 0.7.x version onwards.
    For Sample tests, please refer to TestCassandra , QueryTest junits.

    Git repo is
    https://github.com/impetus-opensource/Kundera.git

    mevivs

    June 15, 2011 at 4:13 pm

    • This is freaking awesome Vivek (mevivs). Thanks for the update here. I am gonna tell everyone who are using Kundera to give it a try.

      -Animesh

      Animesh

      June 15, 2011 at 10:18 pm

