animesh kumar

Running water never grows stale. Keep flowing!

Connecting to Cassandra – 1

Cassandra uses the Apache Thrift framework as its client API. Apache Thrift is a remote procedure call framework for “scalable cross-language services development”. You define data types and service interfaces in a Thrift definition file, from which the compiler generates code in your chosen languages. Effectively, it combines a software stack with a code generation engine to build services that work efficiently and seamlessly across a number of languages.

Apache Thrift, though a state-of-the-art engineering feat, is not the best choice for a client API, especially for Cassandra.

  1. A Cassandra cluster has multiple nodes, and a client can connect to any of them at any time. This is a great property: if a node goes down, the client can connect to another available node without bringing the system down. Alas, Apache Thrift doesn’t support this inherently; you need to make your client aware of node failures and write a strategy to pick the next live node.
  2. Thrift doesn’t support connection pooling. So you either connect to the server on every request, keep a single connection alive for a long time, or write your own connection pool. Sad!
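
To make the first point concrete, here is a minimal sketch of a "pick the next live node" strategy. It is not part of Thrift or Cassandra: FailoverConnector, FailoverDemo and fakeOpen are hypothetical names, and the opener function stands in for whatever code actually opens a Thrift connection to a host.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper (not a Thrift or Cassandra class): tries each node in
// order and returns the first connection that opens successfully.
class FailoverConnector<C> {
    private final List<String> hosts;
    private final Function<String, C> opener; // expected to throw RuntimeException on failure

    FailoverConnector(List<String> hosts, Function<String, C> opener) {
        this.hosts = hosts;
        this.opener = opener;
    }

    C connect() {
        RuntimeException last = null;
        for (String host : hosts) {
            try {
                return opener.apply(host);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next node
            }
        }
        throw new IllegalStateException("no Cassandra node reachable", last);
    }
}

class FailoverDemo {
    // Stand-in for a real connect call: pretends that only node-b is alive.
    static String fakeOpen(String host) {
        if (!host.equals("node-b")) {
            throw new RuntimeException(host + " is down");
        }
        return "connected:" + host;
    }

    public static void main(String[] args) {
        FailoverConnector<String> connector = new FailoverConnector<>(
                Arrays.asList("node-a", "node-b", "node-c"), FailoverDemo::fakeOpen);
        System.out.println(connector.connect()); // prints "connected:node-b"
    }
}
```

A real client would probably also remember which nodes failed recently and retry them later, but that is beyond this sketch.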

There are a few clients available that make these things easier for you. They are wrappers over Thrift that save you a lot of nuisance. Still, since even those clients work on top of Thrift, it makes sense to learn Thrift first and build a strong foundation.
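
To give a feel for what such wrappers do about the pooling problem, here is a rough sketch of a tiny connection pool. SimplePool is a hypothetical class, not taken from any real Cassandra client; the factory is assumed to be whatever code creates a fresh connection.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical minimal pool: reuses idle connections and creates new ones on demand.
class SimplePool<C> {
    private final Deque<C> idle = new ArrayDeque<>();
    private final Supplier<C> factory;
    private final int maxIdle;

    SimplePool(Supplier<C> factory, int maxIdle) {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    // Hands out an idle connection if one exists, otherwise creates a new one.
    synchronized C borrow() {
        C connection = idle.pollFirst();
        return (connection != null) ? connection : factory.get();
    }

    // Returns a connection to the pool; beyond maxIdle it is simply dropped here
    // (a real pool would close it instead).
    synchronized void release(C connection) {
        if (idle.size() < maxIdle) {
            idle.addFirst(connection);
        }
    }

    synchronized int idleCount() {
        return idle.size();
    }
}

class PoolDemo {
    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new, 2);
        StringBuilder first = pool.borrow();
        pool.release(first);
        StringBuilder second = pool.borrow();
        System.out.println(first == second); // the same object was reused
    }
}
```

Real pools also validate connections before handing them out and cap the total number of open connections; this sketch only shows the reuse idea.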

Let’s first create a dummy Keyspace for ourselves:

<Keyspace Name="AddressBook">
<ColumnFamily CompareWith="UTF8Type" Name="Users" />

<!-- Necessary for Cassandra -->
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy
</ReplicaPlacementStrategy>
<ReplicationFactor>1</ReplicationFactor>
<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>

We created a new Keyspace “AddressBook” with a ColumnFamily “Users” whose columns are sorted by name as UTF-8 strings (CompareWith="UTF8Type").
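
If the data model feels abstract, one way to picture a ColumnFamily is as a map of row keys to sorted maps of columns. The sketch below is only a mental model in plain Java, not Cassandra code: ColumnFamilySketch and the sample values are made up, and Java’s String ordering is used as a rough stand-in for UTF8Type comparison.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Mental model only: row key -> (column name -> value), with columns kept sorted by name.
class ColumnFamilySketch {
    private final Map<String, TreeMap<String, String>> rows = new HashMap<>();

    void put(String rowKey, String columnName, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnName, value);
    }

    // Column names of one row, in sorted order; empty if the row does not exist.
    List<String> columnNames(String rowKey) {
        TreeMap<String, String> columns = rows.get(rowKey);
        if (columns == null) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(columns.keySet());
    }
}

class ColumnFamilyDemo {
    public static void main(String[] args) {
        ColumnFamilySketch users = new ColumnFamilySketch();
        // Inserted out of order on purpose; values are sample data.
        users.put("alice", "zip", "12345");
        users.put("alice", "email", "alice@example.com");
        users.put("alice", "phone", "555-0100");
        System.out.println(users.columnNames("alice")); // prints [email, phone, zip]
    }
}
```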

Connect to Cassandra Server:

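// Package names as of Cassandra 0.6 and its bundled Thrift; adjust to your version.
import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.apache.thrift.transport.TTransportException;

private TTransport transport = null;
private Cassandra.Client client = null;

public Cassandra.Client connect(String host, int port) {
    try {
        transport = new TSocket(host, port);
        TProtocol protocol = new TBinaryProtocol(transport);
        // Assign to the field instead of shadowing it with a local variable.
        client = new Cassandra.Client(protocol);
        transport.open();
        return client;
    } catch (TTransportException e) {
        e.printStackTrace();
    }
    return null;
}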

The above code is pretty fundamental:

  1. Creates a socket transport for the given host and port.
  2. Defines a protocol; in this case, it’s binary.
  3. Instantiates the client object.
  4. Opens the transport and returns the client object for further operations.

Note: Cassandra uses “9160” as its default port.

Disconnect from Cassandra Server:

public void disconnect() {
    try {
        if (null != transport) {
            transport.flush();
            transport.close();
        }
    } catch (TTransportException e) {
        e.printStackTrace();
    }
}

To close the connection in a decent way, invoke “flush” first to take care of any data that might still be in the transport buffer.
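
Why does flushing matter? The sketch below illustrates the general idea with plain java.io streams rather than Thrift (an analogy, not the Thrift transport itself): bytes written to a buffered stream only reach the underlying sink once the buffer is flushed. FlushDemo and visibleBytes are made-up names.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FlushDemo {
    // Returns {bytes visible in the sink before flush, bytes visible after flush}.
    static int[] visibleBytes(byte[] payload) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            BufferedOutputStream buffered = new BufferedOutputStream(sink, 1024);
            buffered.write(payload);   // small payloads stay in the 1024-byte buffer
            int before = sink.size();
            buffered.flush();          // push the buffered bytes down to the sink
            int after = sink.size();
            buffered.close();
            return new int[] { before, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = visibleBytes("hello".getBytes());
        System.out.println(sizes[0] + " bytes before flush, " + sizes[1] + " after");
    }
}
```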

Store a data object:

Let’s say our User object looks something like this:

public class User {
    // unique
    private String username;
    private String email;
    private String phone;
    private String zip;

    // getter and setter here.
}

To model one User in Cassandra, we need three columns to store email, phone and zip; the row key is the username. Right? Let’s create a list to store these columns.

List<ColumnOrSuperColumn> columns = new ArrayList<ColumnOrSuperColumn>();

The List contains ColumnOrSuperColumn objects. Cassandra gives us an aggregate object that can contain either a Column or a SuperColumn. You wonder why? Because Apache Thrift doesn’t support inheritance. Anyway, we will now create the columns and store them in this list.
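
The "either one or the other" holder idea can be sketched in plain Java. The classes below (SimpleColumn, SimpleSuperColumn, ColumnOrSuper) are hypothetical stand-ins written for this post, not the Thrift-generated ones; they only show how a single wrapper type lets one list carry both kinds of element without a common base class.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the generated Column and SuperColumn types.
class SimpleColumn {
    final String name;
    final String value;

    SimpleColumn(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

class SimpleSuperColumn {
    final String name;
    final List<SimpleColumn> columns;

    SimpleSuperColumn(String name, List<SimpleColumn> columns) {
        this.name = name;
        this.columns = columns;
    }
}

// Holder with two optional slots; exactly one of them is filled.
class ColumnOrSuper {
    private final SimpleColumn column;
    private final SimpleSuperColumn superColumn;

    private ColumnOrSuper(SimpleColumn column, SimpleSuperColumn superColumn) {
        this.column = column;
        this.superColumn = superColumn;
    }

    static ColumnOrSuper of(SimpleColumn column) {
        return new ColumnOrSuper(column, null);
    }

    static ColumnOrSuper of(SimpleSuperColumn superColumn) {
        return new ColumnOrSuper(null, superColumn);
    }

    boolean isColumn() { return column != null; }
    SimpleColumn getColumn() { return column; }
    SimpleSuperColumn getSuperColumn() { return superColumn; }
}

class UnionDemo {
    public static void main(String[] args) {
        SimpleColumn email = new SimpleColumn("email", "alice@example.com");
        SimpleSuperColumn address = new SimpleSuperColumn("address",
                Arrays.asList(new SimpleColumn("zip", "12345")));
        List<ColumnOrSuper> mixed = Arrays.asList(
                ColumnOrSuper.of(email), ColumnOrSuper.of(address));
        for (ColumnOrSuper item : mixed) {
            System.out.println(item.isColumn() ? "column" : "super column");
        }
    }
}
```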

// generate a timestamp.
long timestamp = new Date().getTime();
ColumnOrSuperColumn c = null;

// add email
c = new ColumnOrSuperColumn();
c.setColumn(new Column("email".getBytes("utf-8"), user.getEmail().getBytes("utf-8"), timestamp));
columns.add(c);

// add phone
c = new ColumnOrSuperColumn();
c.setColumn(new Column("phone".getBytes("utf-8"), user.getPhone().getBytes("utf-8"), timestamp));
columns.add(c);

// add zip
c = new ColumnOrSuperColumn();
c.setColumn(new Column("zip".getBytes("utf-8"), user.getZip().getBytes("utf-8"), timestamp));
columns.add(c);

Okay, so we have the list of columns populated. Now we need a Map that holds the columns to write: each key is the name of a ColumnFamily, and each value is the list of columns for that ColumnFamily.


Map<String, List<ColumnOrSuperColumn>> data = new HashMap<String, List<ColumnOrSuperColumn>>();
data.put("Users", columns); // “Users” is our ColumnFamily Name.

Great. We have everything in place. Now we will use client.batch_insert to store everything at once. This creates a row, identified by the given key, in the ColumnFamily.


client.batch_insert( "AddressBook",          // Keyspace
                      user.getUsername(),    // Row identifier key
                      data,                  // Map which contains the list of columns.
                      ConsistencyLevel.ANY   // Consistency level. Explained below.
);

The ConsistencyLevel parameter is used for both read and write operations to determine when a request made by the client counts as successful. ConsistencyLevel.ANY means that a write is successful once it has been written to at least one node. Read the Cassandra Wiki for detailed information.
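
As a rough illustration of how a few of the levels differ, the sketch below computes how many acknowledgements a write waits for at a given replication factor. Level is a hypothetical enum written for this post, not Cassandra’s ConsistencyLevel class, and it covers only four levels; the quorum rule used is the usual "more than half of the replicas".

```java
// Hypothetical enum for illustration; it is not Cassandra's ConsistencyLevel class.
enum Level {
    ANY, ONE, QUORUM, ALL;

    // Number of acknowledgements a write waits for, given the replication factor.
    int requiredAcks(int replicationFactor) {
        switch (this) {
            case ANY:    return 1; // at least one node has accepted the write
            case ONE:    return 1; // at least one replica has accepted the write
            case QUORUM: return replicationFactor / 2 + 1; // more than half of the replicas
            default:     return replicationFactor;         // ALL: every replica
        }
    }
}

class ConsistencyDemo {
    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " with RF=3 waits for "
                    + level.requiredAcks(3) + " ack(s)");
        }
    }
}
```

With the ReplicationFactor of 1 used in our Keyspace, all four levels come down to a single acknowledgement; the difference only shows once you replicate to more nodes.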

In the next post, we will see how to delete and update a record in Cassandra.

Written by Animesh

May 24, 2010 at 10:42 am

Posted in Technology

13 Responses

  1. […] https://anismiles.wordpress.com/2010/05/24/connecting-to-cassandra-1/ […]

  2. Hi,

    I have a requirement to use Cassandra in my application. In my application there is one table with a lot of data, and most of my application uses that table. Because of the amount of data, the performance of the application decreases when that table is in Oracle.

    So, I have decided to use the Cassandra database for that one table and all other tables in oracle. Lot of business logic is dependent on that table.

    Now my question is: can I use Cassandra for a table that has a lot of business logic?

    I am unable to implement lot of where clauses for Cassandra database.

    Is there any supporting tool to use Cassandra in an efficient way?

    Please let me know…
    i am in urgency..

    Thanks in advance

    By Mallik

    mallikarjungunda

    August 18, 2010 at 1:18 pm

  3. hi
    when i want to connect to my cassandra (i use ur code) an error appears, it is Connect.Java:18 IllegalStateException. what is this? and how can i solve it? u know i’m a new programmer and really need ur help.
    and may i ask u to mail the answer to me? please
    i’m really in a hurry, in danger of losing my new work
    oh my god

    saber

    March 10, 2011 at 9:20 pm

  4. oh thanks i solved it, but i’ll ask u more in future.
    thanks for ur post

    saber

    March 10, 2011 at 9:33 pm

  5. Saber, you can shoot any queries you might have anytime. 🙂

    Animesh

    March 10, 2011 at 9:43 pm

  6. is there a follow up of this article? now how can i read data from a column family, a complete row or some selected columns?

    Agito

    June 4, 2011 at 11:57 pm

  7. Your entire blog, “Connecting to Cassandra – 1 animesh kumar” was
    in fact definitely worth commenting down here in the comment section!
    Just simply desired to point out u did a fantastic work.
    Thanks for your effort ,Sterling

  8. Good Morning,
    I read your post for the first time, it’s great. Here I am trying to connect to the Cassandra database through Java code using Apache Thrift / Astyanax, but I failed to do it in NetBeans. I also followed your code, but I can’t understand how I should configure it in NetBeans; please guide me through the proper steps. Is there any jar file I should attach to the library in NetBeans? I have already downloaded Apache Thrift 0.9.0.
    I’ll be very grateful to you.
    Thank You

    Pankaj kumar

    April 4, 2013 at 9:36 am

    • Pankaj: You should try CQL-3 with Cassandra-1.2, you will love it.

      Animesh

      April 29, 2013 at 8:13 am

  9. can you give me your gmail-id or phone number.

    Pankaj kumar

    April 4, 2013 at 9:52 am

  10. […] anismiles.wordpress.com […]

    Cassandra | Annotary

    July 30, 2013 at 7:45 pm

