Mar 31 2010

Accessing Cassandra from Pharo

NoSQL databases are the topic of the day everywhere on the web.

So this is a good time to publish a tutorial on accessing a Cassandra database from a Pharo Smalltalk image using the Thrift interface (there isn’t a high-level client for accessing Cassandra from Pharo yet). The following instructions were tested on a Debian GNU/Linux Squeeze (testing) amd64 laptop.

Install the required dependencies

As root:

aptitude install libboost-dev automake libtool flex bison pkg-config g++ build-essential ruby-dev python-dev

Create a working directory

As a normal user, create a working directory (I use one under my home directory):

mkdir /home/miguel/cassandra
cd /home/miguel/cassandra

Get the thrift svn trunk source code.

The current tar.gz package on the Thrift download page doesn’t include the necessary fixes.

svn co http://svn.apache.org/repos/asf/incubator/thrift/trunk thrift

Get the cassandra code

Go to http://cassandra.apache.org and download version 0.5.1 of Cassandra (this is the mirror I was given; yours will likely be different):

wget http://www.devlib.org/apache/cassandra/0.5.1/apache-cassandra-0.5.1-bin.tar.gz
tar zxf apache-cassandra-0.5.1-bin.tar.gz

Get a Pharo image

Go to http://www.pharo-project.org/pharo-download/ and download a Pharo dev or a PharoCore image. I use a PharoCore RC3 image:

wget https://gforge.inria.fr/frs/download.php/26668/PharoCore-1.0-10515rc3.zip
unzip PharoCore-1.0-10515rc3.zip

You now have Thrift, Cassandra and Pharo ready to use.

Compile the Thrift source code

cd thrift/
./bootstrap.sh
./configure
make

Generate the Smalltalk Thrift code for accessing Cassandra

cd ..
./thrift/compiler/cpp/thrift --gen st apache-cassandra-0.5.1/interface/cassandra.thrift

This will generate the file:

gen-st/cassandra.st

in the /home/miguel/cassandra directory (your working directory).

You now have two Smalltalk files:

thrift/lib/st/thrift.st
gen-st/cassandra.st

Load the Smalltalk Thrift code in the Pharo image

Open the Pharo image and file-in the two previous files in that order (first thrift.st and then cassandra.st)

Start and test the Cassandra server

If you already have a Cassandra node, skip this step. If you are testing, stay with me.

cd apache-cassandra-0.5.1/

Edit conf/log4j.properties, change the line:

log4j.appender.R.File=/var/log/cassandra/system.log

to:

log4j.appender.R.File=/home/miguel/cassandra/var/log/cassandra/system.log
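
The path edits in this section (the log file here and the data directories in conf/storage-conf.xml) could also be scripted rather than done by hand. What follows is only a sketch under assumptions: relocate_paths is a hypothetical helper name, it assumes a sed that accepts -i.bak (GNU and BSD sed both do), and it assumes the new prefix contains no | or & characters. It only rewrites paths that directly follow > or =, so running it twice does not prefix a path twice.

```shell
# Hypothetical helper (not part of Cassandra): put a prefix in front of the
# default /var/lib/cassandra and /var/log/cassandra paths in a config file.
# $1 = config file, $2 = new prefix without a trailing slash.
relocate_paths() {
    sed -i.bak \
        -e "s|\([>=]\)/var/lib/cassandra|\1$2/var/lib/cassandra|g" \
        -e "s|\([>=]\)/var/log/cassandra|\1$2/var/log/cassandra|g" \
        "$1"
}

# Example use from inside apache-cassandra-0.5.1/ (a .bak copy is kept):
# relocate_paths conf/storage-conf.xml /home/miguel/cassandra
# and likewise for the log4j properties file.
```

The .bak copies make it easy to compare the result with the original file before starting the server.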

Edit conf/storage-conf.xml, change the lines:

<CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/var/lib/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>/var/lib/cassandra/staging</StagingFileDirectory>

to:

<CommitLogDirectory>/home/miguel/cassandra/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/home/miguel/cassandra/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/home/miguel/cassandra/var/lib/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>/home/miguel/cassandra/var/lib/cassandra/staging</StagingFileDirectory>

Then start the Cassandra server:

./bin/cassandra -f

Connect with the Cassandra provided client (Cassandra started on port 9160):

./bin/cassandra-cli --host localhost --port 9160

Insert a value:

set Keyspace1.Standard1['jsmith']['first'] = 'John'

Read back the value:

get Keyspace1.Standard1['jsmith']

Connect from Pharo to the Cassandra server

Open a workspace and try inserting 10000 values in the Cassandra server:


"Insert 10000 values"
[| cp result client |
 client := CassandraClient binaryOnHost: 'localhost' port: 9160.
 cp := ColumnPath new
     columnFamily: 'Standard1';
     column: 'col1'.
 1 to: 10000 do: [ :i |
     result := client insertKeyspace: 'Keyspace1'
         key: 'row', i asString
         columnPath: cp
         value: 'v', i asString
         timestamp: 1
         consistencyLevel: ((Cassandra enums at: 'ConsistencyLevel') at: 'QUORUM'). ]] timeToRun

Select the code and “print it”. It took 7326 milliseconds on my laptop.

Now read the values from the Cassandra server:


"Read 10000 values"
[| cp result client |
 client := CassandraClient binaryOnHost: 'localhost' port: 9160.
 cp := ColumnPath new
     columnFamily: 'Standard1';
     column: 'col1'.

 1 to: 10000 do: [ :i |
     result := client getKeyspace: 'Keyspace1'
         key: 'row', i asString
         columnPath: cp
         consistencyLevel: ((Cassandra enums at: 'ConsistencyLevel') at: 'QUORUM'). ]] timeToRun

Select it and “print it”. It took 7977 milliseconds to read back the 10000 values.

Read a value from the cassandra-cli interface:

get Keyspace1.Standard1['row999']

You should get:

cassandra> get Keyspace1.Standard1['row999']
=> (column=col1, value=v999, timestamp=1)
Returned 1 results.

That is it. Adapt the code to your needs.

Cheers


Sep 22 2009

Deploying Seaside: Prepare the images

You now have a working Squeak VM installation. Next we will create the directories we’ll use.

Create directories

export WORK=/home/miguel/work
mkdir -p $WORK

export DEPLOY=/home/miguel/example
mkdir -p $DEPLOY/{pharo,magma,backup,logs,scripts,website}

We will use two directories; put them wherever you want. I chose to put them in my home directory, but you can use any other location if you wish. The important thing is to have the environment variables correctly assigned. This will ease the following steps.

The work directory holds temporary files while we set up the deploy directory. At the end it will be discarded.

The deploy directory is what we will populate with the images and other useful scripts to host our Seaside application and data. As you can see, it has directories for the images (pharo), for the database (magma), for the Magma backups (backup), for the logs (logs; currently these only hold the output of the nohup commands used to start the images), for the scripts (scripts) and for the static content of your application (website), which you should wisely have the webserver serve instead of the Seaside server. More on this later.
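
Before going on, it may help to confirm that the layout is complete. The following is a small sketch, not part of the original setup: check_deploy_layout is a hypothetical helper name, and the directory names are the ones created by the mkdir command above.

```shell
# Hypothetical helper: report any missing subdirectory of the deploy directory.
# $1 = deploy directory (for example "$DEPLOY")
check_deploy_layout() {
    missing=0
    for d in pharo magma backup logs scripts website; do
        if [ ! -d "$1/$d" ]; then
            echo "missing: $1/$d"
            missing=1
        fi
    done
    # Exit status 0 when every directory exists, 1 otherwise
    return $missing
}

# Example: check_deploy_layout "$DEPLOY" && echo "layout complete"
```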

Download PharoCore

Now go to the Pharo download page, look for the section “Sources files” and download the Sources zip file. At the moment it is:

SqueakV39.sources.zip

Now go to the bottom of the page and follow the link that says “Pharo-core images and other files”. Find the most recent PharoCore zip file. Currently it is:

PharoCore-1.0-10451-BETA.zip

but any newer one will do.

Unzip these two zip files. The PharoCore zip will create a directory, and inside it will be the image and changes files. The Sources zip contains one file. Now, copy these three files to the $WORK directory:

cp PharoCore-1.0-10451-BETA.image $WORK
cp PharoCore-1.0-10451-BETA.changes $WORK
cp SqueakV39.sources $WORK

So far so good. You have a PharoCore image with its changes file and a sources file. This, together with the virtual machine, gives you a complete Pharo environment to work with. You can try it:

cd $WORK
$VM PharoCore-1.0-10451-BETA.image

You should see the PharoCore image running. Quit the image WITHOUT saving.

Save scripts

Save the following scripts to the $WORK directory.

magma-image.st:

"Install Magma Server on a PharoCore image"

"Set some preferences"
Preferences enable: #fastDragWindowForMorphic.
Preferences disable: #windowAnimation.
Preferences enable: #updateSavesFile.

"Update from the Pharo update stream (only works/recommended for PharoCore)"
Utilities updateFromServer.

"Your name, like MiguelCoba or VincentVanGogh; don't use spaces or accents, just ASCII"
Author fullName: 'FirstnameLastname'.

"Install Installer"
ScriptLoader
    loadLatestPackage: 'Installer-Core' fromSqueaksource: 'Installer'.

"RFB"
Installer lukas project: 'unsorted';
    install: 'RFB'.

"Magma server"
Installer ss project: 'Magma';
    install: '1.0r42 (server)'.

"Configure the packages"
RFBServer current
    initializePreferences;
    allowEmptyPasswords: false;
    allowLocalConnections: true;
    allowRemoteConnections: false;
    allowInteractiveConnections: true;
    connectionTypeDisconnect;
    configureForMemoryConservation;
    setFullPassword: 'useyourownpasswordhere'.

"Save with a new name"
SmalltalkImage current saveAs: 'magma'.
SmalltalkImage current snapshot: true andQuit: true.

This script, when executed on a PharoCore image, sets some preferences, applies updates from the Pharo project if available and then installs some packages directly from their repositories. The packages installed are RFBServer (a VNC server for Squeak and Pharo) and the Magma server. Finally it configures the packages and saves the image with a new name: magma. Be sure to change your full name and the RFBServer password before saving the file.

magma-run.st:

"When a file named magma.shutdown is found in the same directory as the image,
this process is triggered and the image is shut down without saving"
[
    [
        [ 60 seconds asDelay wait.
        (FileDirectory default fileOrDirectoryExists: 'magma.shutdown')
            ifTrue: [ SmalltalkImage current snapshot: false andQuit: true ].
        (FileDirectory default fileOrDirectoryExists: 'magma.startvnc')
            ifTrue: [ Project uiProcess resume. RFBServer start: 0 ].
        (FileDirectory default fileOrDirectoryExists: 'magma.stopvnc')
            ifTrue: [ RFBServer stop. Project uiProcess suspend ].
        ] on: Error do: [ :error | error asDebugEmail ]
    ] repeat
] forkAt: Processor systemBackgroundPriority.
"To save CPU cycles"
Project uiProcess suspend.

I will explain this script later.

seaside-image.st:

"Install Seaside on a PharoCore image"

"Install packages"

"Comanche"
Installer ss project: 'KomHttpServer';
    install: 'DynamicBindings';
    install: 'KomServices';
    install: 'KomHttpServer'.

"Seaside"
Installer ss project: 'Seaside';
    answer: '.*username.*' with: 'admin';
    answer: '.*password.*' with: 'seaside';
    install: 'Seaside2.8a1';
    install: 'Scriptaculous'.

"Seaside Jetsam"
Installer ss project: 'Jetsam';
    install: 'Seaside28Jetsam-kph.67'.

"Seaside helper"
Installer ss project: 'MagmaTester';
    answer: 'username' with: 'admin';
    answer: 'password' with: 'seaside';
    install: 'Magma seasideHelper'.

"SeasideProxyTester"
Installer ss project: 'SeasideExamples';
    install: 'SeasideProxyTester'.

"Configure the packages"
"Start Seaside"
WAKom startOn: 9001.

"Unregister example apps"
WADispatcher default trimForDeployment.

"Unregister deployed apps"
WADispatcher default
    unregister: (WADispatcher default entryPointAt: '/browse');
    unregister: (WADispatcher default entryPointAt: '/config').

"Save with a new name"
SmalltalkImage current saveAs: 'seaside'.
SmalltalkImage current snapshot: true andQuit: true.

This script installs Seaside 2.8, Magma seasideHelper and the SeasideProxyTester app that we will use to test the setup. It also starts Seaside on port 9001, unregisters unnecessary apps from the Seaside dispatcher and saves the image as seaside.

seaside-run.st:

"When a file named seaside.shutdown is found in the same directory as the image,
this process is triggered and the image is shut down without saving"
[
    [
        [ 60 seconds asDelay wait.
        (FileDirectory default fileOrDirectoryExists: 'seaside.shutdown')
            ifTrue: [ SmalltalkImage current snapshot: false andQuit: true ]
        ] on: Error do: [ :error | error asDebugEmail ]
    ] repeat
] forkAt: Processor systemBackgroundPriority.
"To save CPU cycles"
Project uiProcess suspend.

I will explain this script later.

start_app.sh:

#!/bin/sh

HOME="/srv/example"
NOHUP="/usr/bin/nohup"
VM="/opt/pharo/squeak -mmap 100m -vm-sound-null -vm-display-null"
IMAGES="$HOME/pharo"
SCRIPTS="$HOME/scripts"
LOGS="$HOME/logs"
START_PORT=9001
END_PORT=9004

# Delete command files

[ -f $IMAGES/magma.shutdown ] && rm $IMAGES/magma.shutdown
[ -f $IMAGES/magma.startvnc ] && rm $IMAGES/magma.startvnc
[ -f $IMAGES/magma.stopvnc ] && rm $IMAGES/magma.stopvnc
[ -f $IMAGES/seaside.shutdown ] && rm $IMAGES/seaside.shutdown

# Start the Magma image
echo "Starting Magma image"
$NOHUP $VM $IMAGES/magma.image $SCRIPTS/magma-run.st >> $LOGS/magma.nohup &

# To give Magma time to open the repository
sleep 5

# Start the Seaside images
for PORT in `seq $START_PORT $END_PORT`; do
    echo "Starting Seaside image on port: $PORT"
    $NOHUP $VM $IMAGES/seaside.image $SCRIPTS/seaside-run.st port $PORT >> $LOGS/seaside$PORT.nohup &
done

I will explain this script later.
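
The two run scripts above check once a minute for a .shutdown file next to the image, so stopping the images comes down to creating those files. A minimal sketch of that, assuming the images live in the directory that start_app.sh calls $IMAGES; request_shutdown is a hypothetical helper name, not something shipped with Pharo, Magma or Seaside.

```shell
# Hypothetical companion to start_app.sh: ask the running images to quit.
# $1 = directory that holds magma.image and seaside.image
request_shutdown() {
    # Every Seaside image polls for the same file, so one file stops them all
    touch "$1/seaside.shutdown"
    # The Magma image polls for its own file
    touch "$1/magma.shutdown"
}

# Example: request_shutdown /srv/example/pharo
```

Since the images poll every 60 seconds, they should notice the files within about a minute and quit without saving. start_app.sh deletes the leftover files the next time it runs.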

Prepare images

Make sure that the previous scripts are saved to the $WORK directory. Then build the images:

cd $WORK

# Build magma image from PharoCore image
$VM PharoCore-1.0-10451-BETA.image $WORK/magma-image.st

# Build seaside image from the magma image
$VM magma.image $WORK/seaside-image.st

The first command takes the PharoCore image and, using the scripts given, builds the magma image. The second command then uses the magma image to build the seaside image.

The build scripts are based on the scripts included in the pharo-dev and pharo-web images created by Damien Cassou.

The magma-run.st and seaside-run.st scripts are based on the ones Ramon Leon posted on his blog.


Sep 22 2009

Deploying Seaside: Install the Squeak VM

We are going to deploy a Seaside application that uses Magma seasideHelper to connect to a Magma database on a production server. The production server will have an image for the Magma server and several Seaside images with the application code. All the images will be PharoCore images with just the necessary packages loaded and nothing else. The users’ sessions will be load balanced and proxied by a web server. The Magma image, the Seaside images and the webserver will be hosted on the production server. For this tutorial we will use the SeasideProxyTester application from the SeasideExamples project on SqueakSource.com. You can use your own code, provided that you make a few minor modifications that I will detail later.

Although this tutorial hosts everything on a single production machine, you can use as many servers as you want. For example, you can use a server for Magma, a group of servers for the Seaside images and a hardware load balancer instead of the webserver. It is up to you. I want to keep things simple here, so I show a setup that involves the minimal set of parts but still explains how to deploy a Seaside app to production.

You can follow these instructions directly on the production server and get exactly the same results, but I recommend doing it first on your development machine and, when it is working correctly, stopping it, zipping it, copying it to the production server, unzipping it and starting it there. But do as you prefer. I will do it on my development machine, and the last part, copying to the production server, is left as an exercise for you (but it is very simple, really).

I will work with a development machine that has Debian GNU/Linux Testing (Squeeze), but my production server has Debian GNU/Linux 5.0 (Lenny). The instructions apply to both unchanged.

You can install and configure your production server following these instructions. So, on to the first step.

Install the Squeak VM

The version of Squeak installed on Debian Lenny and Debian Squeeze with the following command:

# aptitude install squeak-vm

installs a VM that is not closure enabled (for explanations, search the Squeak or Pharo mailing lists). As we are going to use PharoCore, we need a VM that supports closures. So we won’t install Squeak from the Debian repositories (at least until they ship a closure VM, in which case I will update these instructions); instead we will download the VM from the Pharo project download page. There you’ll find a section named Virtual Machines. Download the Linux/Unix VM. Currently it is

pharo-vm-0.15.2d-linux.zip

but any newer version will be ok.

Now, as the root user, copy the zip file to /opt, unzip it and make a symbolic link:

cd /opt
unzip pharo-vm-0.15.2d-linux.zip
ln -s pharo-vm-0.15.2d-linux pharo

Back as your normal user, test it:

export VM=/opt/pharo/squeak
$VM -help

This will show the help of the squeak vm executable. Read it carefully to understand the options we will use to start our images.


Sep 18 2009

Deploying Seaside applications

This is the first of several posts about Seaside.

The goal: to deploy a Seaside application to the real world.

Ingredients:

I must say that this is not the only way to do this kind of deployment. Ramon Leon’s post explains another setup, and the Seaside book also has a section on scaling Seaside. This is the way I do it.

Some things to discuss first. I will be using Debian GNU/Linux Squeeze, but the procedure is the same for other versions of Debian and for other distributions. I think that this setup can be done on Windows too, but I don’t plan to check.

I will be using Seaside 2.8 because it is the current stable version. When the 3.0 release is announced I will update the instructions accordingly. The Magma version will be 1.0r42 (server), which was recently released and which, as of this version, also runs in Pharo. I will be using Magma seasideHelper for easy integration of Magma and Seaside; although it isn’t being maintained at the moment, the version used here works very well.

I will use lighttpd because that is the webserver I use on my sites, but the equivalent Apache configuration is very easy and sometimes shorter than the ones I show. nginx or pound can be used too. The webserver will be used to proxy the requests from the users to the Seaside server images. It will be doing load balancing too.

In order to test the setup, a simple Seaside application has been developed and uploaded to the SeasideExamples project on SqueakSource. It can be used to verify that the webserver is doing sticky-session load balancing. This is necessary because, in a setup of proxied Seaside images, a session must always be routed to the image where it was created. This is not necessary for applications deployed to Gemstone/S or GLASS, because there the session data is persisted to the OODB and restored on subsequent requests no matter which stone handles the request. But in a setup that doesn’t use Gemstone/S, the proxy/load balancing server must guarantee that a session is always routed to the same image. Of course, on the first request of a session, the server chooses a random image to handle the session so that the users are distributed uniformly among the images.

The setup will be tested with ab, the tool from the Apache project for load testing websites. This is a very simple test, as it only requests the first page of the application. This shows how many new sessions can be created in the cluster of Seaside images. It doesn’t test navigation inside the application. Of course, the same setup can be tested with tools like Selenium, JMeter, httperf, siege or jcrawler. This is left as an exercise for the reader. One cool feature of ab is that it can output the results in a format understood by GNUPlot so that they can be plotted and visualized more easily.

Scripts to automate image setup are provided too. They update a PharoCore image and prepare Magma and Seaside images with just the required packages. They also configure an RFBServer (VNC server) so you can connect to the Magma image to do routine tasks such as database backups (at least until you integrate those tasks into your app so you can launch them from Seaside without entering the image).
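
As a rough idea of what such a test run looks like, here is a sketch of an ab invocation wrapped in a function. The function name bench_first_page, the URL and the request counts in the example are assumptions for illustration only; -n (total requests), -c (concurrent requests) and -g (write a gnuplot data file) are standard ab options.

```shell
# Hypothetical wrapper around ab (ApacheBench).
# $1 = URL of the application's first page
# $2 = total number of requests
# $3 = number of concurrent requests
# $4 = file where ab writes gnuplot-readable data
bench_first_page() {
    ab -n "$2" -c "$3" -g "$4" "$1"
}

# Example: bench_first_page http://localhost/ 1000 10 results.tsv
```

Because each request fetches only the first page, the numbers reflect how fast new sessions can be created, not how the application behaves during navigation.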

Finally a script to start the images and procedures to stop them on demand are provided.

So, a lot of steps, but at the end you’ll have a fully deployed app on a server, using a domain name and capable of scaling by adding more images or physical servers behind the proxy/load balancing server. Also, with the current high availability features of Magma, you can increase persistence performance by adding more Magma servers to the setup.

Ready? Let’s go…  to the Seaside.


Sep 16 2009

Pharo project

Pharo is a new FOSS Smalltalk. It is a fork of Squeak and has had a lot of improvements since. It was forked after lengthy and sometimes harsh discussions on the Squeak mailing list. Looking back it seems that it was a good decision, but at the time it must have been hard to decide to fork a project that has so many hackers/programmers contributing to it and spending effort and energy coding in it. This of course gave the forkers the stigma of traitors, even if not in those words. The fact is that they forked it. The reasons? Among other things, the perceived impossibility of reaching a real agreement on important aspects and the sometimes inflexible positions with respect to changes affecting more than the superficial levels of Squeak. I don’t claim that these are the main or principal reasons for forking Squeak, but to me they are the most visible ones that improved in the Pharo fork. I’ll explain, but from now on I will call it Pharo, as that is its name. The fork part, well, becomes less important each day.

So, the number of changes that Pharo has experienced is amazing; just check the bug tracker and you’ll see that 1125 bugs have been reported to date. Of those, more than 900 have been fixed and closed. In fact, for the upcoming 1.0 release only 25 remain open. Most of these bugs show that there really were a lot of things that the Squeak community wanted to change. Pharo is the place where this is happening. The most visible of all these things is the user interface. There are a lot of opinions here, but to many people the colorful, sometimes childish and maybe playschool look & feel of Squeak was a factor against the adoption of Squeak in more “enterprise” (whatever that means) environments. Pharo doesn’t look like a toy anymore. It has a look more akin to the MacOS X look & feel, and that by itself avoids pointless questions the first time you open an image in front of potential new users.

But the changes are not only on the surface but also in the very foundations of the image. Pharo had full closures well before Squeak decided to include them (in fact they are not included in the official image, but in the ongoing unstable image, the trunk image) and, until Squeak releases a new version, Pharo will be the only one of the two with full closures available to the masses. Another remarkable thing is the policy of unloading all but the fundamental packages from the image and loading them only when necessary. This makes the Pharo image very slim (8 MB in the current PharoCore image) and ideally suited for deploying your apps. Among the other things purged from the image is the etoys part, which not only lacked a maintainer but had already been forked years ago to build its own distribution at squeakland.org.

But I don’t want to bash Squeak here, as they are currently having a lot of activity and a lot of things will improve. What I want to say is that Pharo is a refreshing, clean, lean, small and constructive option for developing Smalltalk projects.

Pharo has two versions, PharoCore and Pharo proper. Both of them are Pharo, but they have different audiences.

PharoCore is the image that the Pharo core developers work with and improve. This is the minimal image that runs your Smalltalk application (after loading the necessary dependencies first). This is also the image that receives updates by means of the Software Update option on the menu inside the image. The tests are also run on this image, and the results are published on the Pharo page for comparison and regression testing. This is the image that the Pharo project works on every day.

Pharo is an image built using PharoCore as a base and loading several handy and useful packages in order to create an appealing, nice, useful product for people to use. There are several distributions of Pharo that load different packages aimed at different people. There is pharo-dev, which includes syntax coloring, autocompletion, several browsers for the source code and, in general, packages aimed at developers of Smalltalk applications. There is also pharo-web, which is a pharo-dev that also adds packages such as Seaside and Pier and is intended for web developers. These two are built by Damien Cassou. Another new distribution is the one from Torsten Bergmann, the Pharo ready-made setup, which tries to ease the user experience by requiring just one archive download to test and try Pharo (currently only for Windows users). Also in the works is a one-click distribution that can run on Windows, GNU/Linux and MacOS X. All of these can be downloaded from the Pharo download page.

PharoCore is, as I already said, perfect for deploying applications to production, as it has a very small footprint. In subsequent posts I will show how to deploy an app in a PharoCore image with just the required dependencies and nothing else.

Despite the number of changes it has integrated, Pharo is very stable. As the 1.0 release approaches, the Pharo core developers are focusing on bug fixing and aren’t allowing drastic new changes. Those are reserved for the 1.1 release, work on which will begin the day after the 1.0 announcement.

Another good sign is that the Seaside core developers are already working daily on Pharo, and some of them even have production sites running on Pharo without any problem. So if you are developing Seaside applications, as I am, this shows that the integration of Seaside with Pharo is very well tested.

Also, a book for Pharo, called Pharo by Example, will be announced in the coming days, by the same authors as Squeak by Example.

Don’t wait for the upcoming announcement of the 1.0 release of Pharo; go there right now and become a Pharoer.