Accessing Cassandra from Pharo

NoSQL databases are the topic of the day everywhere on the web.

So this is a good time to publish a tutorial on accessing a Cassandra database from a Pharo Smalltalk image using the Thrift interface (there isn’t a high-level client for accessing Cassandra from Pharo yet). The following instructions were tested on a Debian GNU/Linux Squeeze (testing) amd64 laptop.

Install the required dependencies

As root:

aptitude install libboost-dev automake libtool flex bison pkg-config g++ build-essential ruby-dev python-dev
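Before moving on, it can help to confirm that the build tools actually ended up on your PATH. The following is only a minimal sketch: missing_tools is a hypothetical helper name, and the command names passed to it are my assumption about what the packages above install, so adjust the list to your system.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names; an empty result means everything was found.
missing_tools automake libtoolize flex bison pkg-config g++ make
```

Anything printed is a tool the Thrift build will probably complain about later.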

Create a working directory

As a normal user, create a working directory (I use one under my home directory):

mkdir /home/miguel/cassandra
cd /home/miguel/cassandra

Get the Thrift svn trunk source code

The current tar.gz package on the Thrift download page doesn’t include the necessary fixes.

svn co http://svn.apache.org/repos/asf/incubator/thrift/trunk thrift

Update: When this post was originally written, my patch for generating correct Smalltalk code wasn’t part of a released version of Thrift, which is why you had to get it from subversion trunk. It has since been integrated and proper releases are out, so there is no need to get Thrift from svn; you can just get the tar.gz package from the Thrift download page (currently version 0.4.0):

http://incubator.apache.org/thrift/download/

Uncompress the tar.gz and you’ll get a folder named (in my case):

thrift-0.4.0/

Get the cassandra code

Go to http://cassandra.apache.org and download version 0.5.1 of Cassandra (this is the mirror I used; yours will likely be different):

wget http://www.devlib.org/apache/cassandra/0.5.1/apache-cassandra-0.5.1-bin.tar.gz
tar zxf apache-cassandra-0.5.1-bin.tar.gz

Get a Pharo image

Go to http://www.pharo-project.org/pharo-download/ and download a Pharo dev or a PharoCore image. I use a PharoCore RC3 image:

wget https://gforge.inria.fr/frs/download.php/26668/PharoCore-1.0-10515rc3.zip
unzip PharoCore-1.0-10515rc3.zip

You now have Thrift, Cassandra and Pharo ready to use.

Compile the Thrift source code

cd thrift-0.4.0/   # or: cd thrift/ if you checked out svn trunk
./bootstrap.sh
./configure
make

Generate the Smalltalk Thrift code for accessing Cassandra

cd ..
# With an svn trunk checkout, use ./thrift/compiler/cpp/thrift instead.
./thrift-0.4.0/compiler/cpp/thrift --gen st apache-cassandra-0.5.1/interface/cassandra.thrift

This will generate the file:

gen-st/cassandra.st

in the /home/miguel/cassandra directory (your working directory).

You now have two Smalltalk files:

thrift-0.4.0/lib/st/thrift.st (thrift/lib/st/thrift.st for an svn trunk checkout)
gen-st/cassandra.st
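If you want to double-check that both files are really there before opening Pharo, something like the following would do. This is only a sketch: check_files is a hypothetical helper name, and the Thrift directory in the usage line depends on whether you used the release tarball or an svn checkout.

```shell
# Succeed only if every given file exists and is non-empty; report the rest.
check_files() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            printf 'missing or empty: %s\n' "$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage, from the working directory (release tarball layout assumed):
# check_files thrift-0.4.0/lib/st/thrift.st gen-st/cassandra.st
```

An empty cassandra.st usually means the Thrift compiler failed on the previous step, so look at its output first.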

Load the Smalltalk Thrift code in the Pharo image

Open the Pharo image and file in the two files above, in that order (first thrift.st, then cassandra.st).

Start and test the Cassandra server

If you already have a Cassandra node, skip this step. If you are just testing, stay with me.

cd apache-cassandra-0.5.1/

Edit conf/log4j.properties, change the line:

log4j.appender.R.File=/var/log/cassandra/system.log

to:

log4j.appender.R.File=/home/miguel/cassandra/var/log/cassandra/system.log

Edit conf/storage-conf.xml, change the lines:

<CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/var/lib/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>/var/lib/cassandra/staging</StagingFileDirectory>

to:

<CommitLogDirectory>/home/miguel/cassandra/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/home/miguel/cassandra/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/home/miguel/cassandra/var/lib/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>/home/miguel/cassandra/var/lib/cassandra/staging</StagingFileDirectory>
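If you would rather not edit both files by hand, the same substitutions can be scripted. The following is a sketch under some assumptions: relocate_conf is a hypothetical helper name, the prefix must not contain the characters | or &, and the final mkdir is an extra precaution of mine rather than something the steps above require.

```shell
# Point Cassandra's data and log paths at directories under PREFIX.
# Usage: relocate_conf CONF_DIR PREFIX
relocate_conf() {
    conf_dir=$1
    prefix=$2

    # storage-conf.xml: rewrite element values that start with /var/lib/cassandra.
    sed "s|>/var/lib/cassandra|>${prefix}/var/lib/cassandra|" \
        "$conf_dir/storage-conf.xml" > "$conf_dir/storage-conf.xml.new" &&
        mv "$conf_dir/storage-conf.xml.new" "$conf_dir/storage-conf.xml" || return 1

    # log4j.properties: rewrite the log file location.
    sed "s|=/var/log/cassandra|=${prefix}/var/log/cassandra|" \
        "$conf_dir/log4j.properties" > "$conf_dir/log4j.properties.new" &&
        mv "$conf_dir/log4j.properties.new" "$conf_dir/log4j.properties" || return 1

    # Create the target directories up front (an extra step, see above).
    mkdir -p "$prefix/var/lib/cassandra" "$prefix/var/log/cassandra"
}

# Usage, from the working directory:
# relocate_conf apache-cassandra-0.5.1/conf /home/miguel/cassandra
```

The patterns are anchored on the preceding > and = characters so that running the function a second time does not prepend the prefix again.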

Then start the Cassandra server:

./bin/cassandra -f

Connect with the client provided by Cassandra (the server listens on port 9160):

./bin/cassandra-cli --host localhost --port 9160

Insert a value:

set Keyspace1.Standard1['jsmith']['first'] = 'John'

Read back the value:

get Keyspace1.Standard1['jsmith']

Connect from Pharo to the Cassandra server

Open a workspace and try inserting 10000 values into the Cassandra server:


"Insert 10000 values"
[| cp result client |
    client := CassandraClient binaryOnHost: 'localhost' port: 9160.
    cp := ColumnPath new
        columnFamily: 'Standard1';
        column: 'col1'.
    1 to: 10000 do: [ :i |
        result := client insertKeyspace: 'Keyspace1'
            key: 'row', i asString
            columnPath: cp
            value: 'v', i asString
            timestamp: 1
            consistencyLevel: ((Cassandra enums at: 'ConsistencyLevel') at: 'QUORUM').]] timeToRun

Select the code and “print it”. It took 7326 milliseconds on my laptop.

Now read the values from the Cassandra server:


"Read 10000 values"
[| cp result client |
    client := CassandraClient binaryOnHost: 'localhost' port: 9160.
    cp := ColumnPath new
        columnFamily: 'Standard1';
        column: 'col1'.

    1 to: 10000 do: [ :i |
        result := client getKeyspace: 'Keyspace1'
            key: 'row', i asString
            columnPath: cp
            consistencyLevel: ((Cassandra enums at: 'ConsistencyLevel') at: 'QUORUM').]] timeToRun

Select it and "print it". It took 7977 milliseconds to read back the 10000 values.
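To put those two timings in perspective, here is a small sketch that turns a count and an elapsed time in milliseconds into operations per second. ops_per_second is a hypothetical helper name; the figures are the ones I measured above, with a single client issuing one request at a time against a single local node.

```shell
# Operations per second, rounded to the nearest integer.
# Usage: ops_per_second COUNT ELAPSED_MS
ops_per_second() {
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.0f\n", n * 1000 / ms }'
}

ops_per_second 10000 7326   # inserts: prints 1365
ops_per_second 10000 7977   # reads: prints 1254
```

So that is roughly 1365 inserts and 1254 reads per second on my laptop; your numbers will differ.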

Read a value from the cassandra-cli interface:

get Keyspace1.Standard1['row999']

You should get:

cassandra> get Keyspace1.Standard1['row999']
=> (column=col1, value=v999, timestamp=1)
Returned 1 results.

That is it. Adapt the code to your needs.

Cheers

