Accessing Cassandra from Pharo

NoSQL databases are the topic of the day everywhere on the web.

So this is a good time for a tutorial on accessing a Cassandra database from a Pharo Smalltalk image using the Thrift interface (there isn’t a high-level Cassandra client for Pharo yet). The following instructions were tested on a Debian GNU/Linux Squeeze (testing) amd64 laptop.

Install the required dependencies

As root:

aptitude install libboost-dev automake libtool flex bison pkg-config g++ build-essential ruby-dev python-dev

Create a working directory

As a normal user, create a working directory (I use one under my home directory):

mkdir /home/miguel/cassandra
cd /home/miguel/cassandra

Get the Thrift svn trunk source code

The current tar.gz package on the Thrift download page doesn’t include the necessary fixes, so check out the trunk instead:

svn co http://svn.apache.org/repos/asf/incubator/thrift/trunk thrift

Get the Cassandra code

Go to http://cassandra.apache.org and download version 0.5.1 of Cassandra (this is the mirror I was given; yours will likely be different):

wget http://www.devlib.org/apache/cassandra/0.5.1/apache-cassandra-0.5.1-bin.tar.gz
tar zxf apache-cassandra-0.5.1-bin.tar.gz

Get a Pharo image

Go to http://www.pharo-project.org/pharo-download/ and download a Pharo dev or a PharoCore image. I use a PharoCore RC3 image:

wget https://gforge.inria.fr/frs/download.php/26668/PharoCore-1.0-10515rc3.zip
unzip PharoCore-1.0-10515rc3.zip

You now have Thrift, Cassandra and Pharo ready to use.

Compile the Thrift source code

cd thrift/
./bootstrap.sh
./configure
make

Generate the Smalltalk Thrift code for accessing Cassandra

cd ..
./thrift/compiler/cpp/thrift --gen st apache-cassandra-0.5.1/interface/cassandra.thrift

This will generate the file:

gen-st/cassandra.st

in the /home/miguel/cassandra directory (your working directory).

You now have two Smalltalk files:

thrift/lib/st/thrift.st
gen-st/cassandra.st
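
Before opening the image, it can be worth checking that both files really exist and are not empty. This is only a minimal sketch; check_st_sources is a name I made up for illustration, and it assumes you run it from the working directory:

```shell
#!/bin/sh
# Sketch: confirm both Smalltalk sources exist and are non-empty.
# check_st_sources is an illustrative helper name, not part of Thrift.
# It uses paths relative to the current directory.
check_st_sources() {
    for f in thrift/lib/st/thrift.st gen-st/cassandra.st; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
    echo "both files present"
}
```

Call check_st_sources from /home/miguel/cassandra (or your own working directory). If it reports a missing file, rerun the Thrift build or the code generation step.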

Load the Smalltalk Thrift code in the Pharo image

Open the Pharo image and file-in the two previous files in that order (first thrift.st and then cassandra.st)

Start and test the Cassandra server

If you already have a Cassandra node, skip this step. If you are testing, stay with me.

cd apache-cassandra-0.5.1/

Edit conf/log4j.properties and change the line:

log4j.appender.R.File=/var/log/cassandra/system.log

to:

log4j.appender.R.File=/home/miguel/cassandra/var/log/cassandra/system.log

Edit conf/storage-conf.xml and change the lines:

<CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/var/lib/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>/var/lib/cassandra/staging</StagingFileDirectory>

to:

<CommitLogDirectory>/home/miguel/cassandra/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/home/miguel/cassandra/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/home/miguel/cassandra/var/lib/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>/home/miguel/cassandra/var/lib/cassandra/staging</StagingFileDirectory>
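
If you prefer not to edit the two files by hand, the same change can be scripted with sed. This is only a sketch under some assumptions: WORKDIR and relocate_paths are names I made up, WORKDIR must not contain the characters | or &, and the paths are assumed to appear right after a > or = character, as they do in the lines shown above.

```shell
#!/bin/sh
# Sketch: point Cassandra's log and data paths at the working directory.
# WORKDIR and relocate_paths are illustrative names, not part of Cassandra.
WORKDIR="${WORKDIR:-$HOME/cassandra}"

# Rewrite paths that start with /var/lib/cassandra or /var/log/cassandra
# so they live under $WORKDIR. Only paths that directly follow '>' (XML)
# or '=' (properties) are touched, so running it twice changes nothing more.
# A .bak copy holding the file's previous contents is kept next to it.
relocate_paths() {
    file="$1"
    cp "$file" "$file.bak" || return 1
    sed -e "s|\([>=]\)/var/lib/cassandra|\1$WORKDIR/var/lib/cassandra|g" \
        -e "s|\([>=]\)/var/log/cassandra|\1$WORKDIR/var/log/cassandra|g" \
        "$file.bak" > "$file"
}
```

From the apache-cassandra-0.5.1 directory you would run relocate_paths conf/storage-conf.xml and then relocate_paths conf/log4j.properties. Compare each file with its .bak copy afterwards to confirm that only the expected lines changed.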

Then start the Cassandra server:

./bin/cassandra -f

Connect with the client that ships with Cassandra (the server listens on port 9160):

./bin/cassandra-cli --host localhost --port 9160

Insert a value:

set Keyspace1.Standard1['jsmith']['first'] = 'John'

Read back the value:

get Keyspace1.Standard1['jsmith']

Connect from Pharo to the Cassandra server

Open a workspace and try inserting 10000 values into the Cassandra server:


"Insert 10000 values"
[| cp result client |
    client := CassandraClient binaryOnHost: 'localhost' port: 9160.
    cp := ColumnPath new
        columnFamily: 'Standard1';
        column: 'col1'.
    1 to: 10000 do: [ :i |
        result := client insertKeyspace: 'Keyspace1'
            key: 'row', i asString
            columnPath: cp
            value: 'v', i asString
            timestamp: 1
            consistencyLevel: ((Cassandra enums at: 'ConsistencyLevel') at: 'QUORUM').]] timeToRun

Select the code and “print it”. It took 7326 milliseconds on my laptop.

Now read the values from the Cassandra server:


"Read 10000 values"
[| cp result client |
    client := CassandraClient binaryOnHost: 'localhost' port: 9160.
    cp := ColumnPath new
        columnFamily: 'Standard1';
        column: 'col1'.
    1 to: 10000 do: [ :i |
        result := client getKeyspace: 'Keyspace1'
            key: 'row', i asString
            columnPath: cp
            consistencyLevel: ((Cassandra enums at: 'ConsistencyLevel') at: 'QUORUM').]] timeToRun

Select it and “print it”. It took 7977 milliseconds to read back the 10000 values.

Read a value from the cassandra-cli interface:

get Keyspace1.Standard1['row999']

You should get:

cassandra> get Keyspace1.Standard1['row999']
=> (column=col1, value=v999, timestamp=1)
Returned 1 results.

That is it. Adapt the code to your needs.

Cheers

