Oct 2 2009

Deploying Seaside: Adding SSL to your site

Well, let’s add SSL to your site. This step is tricky because you must have a domain registered in your name and a public IP in order to get an SSL certificate. Here we’ll generate a self-signed certificate instead and configure lighttpd to use it to encrypt all the traffic between the webserver and the web browser clients.

As you will see, you don’t have to configure SSL on the Pharo image. In fact, Seaside doesn’t know anything about SSL or encryption. The webserver is responsible for isolating the Seaside images: the web browser clients don’t even know about them, as they only interact with the webserver, which proxies each request to the Seaside images. The only thing Seaside must do is guarantee that every link it generates specifies the https protocol. But that is only HTML generation, not encryption. The encryption is done by the webserver using the SSL certificate.

We are going to show the process with the seaside.example.com. The procedure is the same for the magma.example.com but remember, each certificate must use its own IP. So you can’t test both on 127.0.0.1 for example. In a production site with several hosted sites, each one will have its own public IP.

First the prerequisites. Be sure to have a lighttpd with SSL support. As root execute:

laptop:~# lighttpd -v
lighttpd/1.4.23 (ssl) - a light and fast webserver
Build-Date: Aug 17 2009 21:46:24

The (ssl) indicates that lighttpd has SSL support compiled in.

Then install, as root, OpenSSL, if you don’t already have it:

# aptitude install openssl

Now as root, create and install the self-signed certificate:

# openssl req -new -x509 -keyout /etc/lighttpd/seaside.example.com.pem -out /etc/lighttpd/seaside.example.com.pem -days 365 -nodes

Answer the questions:

Generating a 1024 bit RSA private key
…………………….++++++
……………++++++
writing new private key to '/etc/lighttpd/seaside.example.com.pem'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:MX
State or Province Name (full name) [Some-State]:Mexico City
Locality Name (eg, city) []:Mexico City
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Example Corp
Organizational Unit Name (eg, section) []:TI
Common Name (eg, YOUR name) []:seaside.example.com
Email Address []:you@example.com

Change the permissions to something more secure, like 440:

# chmod 440 /etc/lighttpd/seaside.example.com.pem
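Because the openssl command above writes the key and the certificate to the same file, the resulting pem should contain both blocks, and after the chmod it should only be readable by its owner and group. A minimal sketch of how that could be checked follows; the function names are made up for this example and are not part of lighttpd or OpenSSL.

```python
import re
import stat


def pem_labels(pem_text):
    """Return the labels of every BEGIN block found in the text of a pem file."""
    return re.findall(r"-----BEGIN ([A-Z0-9 ]+)-----", pem_text)


def is_combined_pem(pem_text):
    """True when the text holds both a private key and a certificate."""
    labels = pem_labels(pem_text)
    has_key = any(label.endswith("PRIVATE KEY") for label in labels)
    has_cert = "CERTIFICATE" in labels
    return has_key and has_cert


def mode_is_restrictive(mode):
    """True when the permission bits grant nothing beyond 440 (r--r-----)."""
    perms = stat.S_IMODE(mode)
    return perms & ~0o440 == 0
```

On a real system you would feed it the file contents and `os.stat(path).st_mode` for /etc/lighttpd/seaside.example.com.pem.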

Now configure Seaside to emit correct URLs. Be sure that the images are shut down. Open the seaside image:

/opt/pharo/squeak /srv/example/pharo/seaside.image

Open the initialize method on the class side of SPTApplication and change:

"Server protocol"
application preferenceAt: #serverProtocol put: #http.

to:

"Server protocol"
application preferenceAt: #serverProtocol put: #https.

Change:

"Server port"
application preferenceAt: #serverPort put: 80.

to:

"Server port"
application preferenceAt: #serverPort put: 443.

and change:

"Base URL for resources: images, styles, etc"
application preferenceAt: #resourceBaseUrl put: 'http://seaside.example.com/resources/'.

to:

"Base URL for resources: images, styles, etc"
application preferenceAt: #resourceBaseUrl put: 'https://seaside.example.com/resources/'.

Now open a workspace and reinitialize the application by executing:

SPTApplication initialize.

That is all on the Seaside side. Save the image and quit.

Now to configure the webserver. Change the host line for seaside on lighttpd.conf from:

$HTTP["host"] == "seaside.example.com" {
server.document-root = "/srv/example/website/"

to:

$HTTP["host"] == "seaside.example.com" {
$HTTP["scheme"] == "http" {
url.redirect = ( "^/(.*)" => "https://seaside.example.com/$1" )
}
}
$SERVER["socket"] == "127.0.1.1:443" {
ssl.engine = "enable"
ssl.pemfile = "/etc/lighttpd/seaside.example.com.pem"

server.name = "seaside.example.com"
server.document-root = "/srv/example/website/"

Be sure to use your own IP (unless you’re testing on localhost like me) and the correct path to the pem file. Also note that this setup redirects every request arriving on http to the https port, so the WHOLE application will be served over https. This may or may not be what you want. If you only want part of your site under https, you must configure lighttpd accordingly and make sure that the application emits https URLs only where you need them. That is up to you.
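To make the redirect rule concrete, here is a small sketch that mimics what the url.redirect line does with an incoming request. It is only a model of the rule written in Python, not lighttpd code, and the function name is an assumption of this example.

```python
import re

# Same pattern and target as the url.redirect rule; \1 plays the role of $1.
REDIRECT_PATTERN = re.compile(r"^/(.*)")
REDIRECT_TARGET = r"https://seaside.example.com/\1"


def redirect_location(scheme, path):
    """Return the Location to redirect to, or None when no redirect applies."""
    if scheme != "http":
        # Requests already on https are served directly.
        return None
    match = REDIRECT_PATTERN.match(path)
    if match is None:
        return None
    return match.expand(REDIRECT_TARGET)
```

For example, a plain http request for /resources/style.css would be sent to https://seaside.example.com/resources/style.css, while the same request over https is left alone.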

Restart lighttpd:

# /etc/init.d/lighttpd restart

and point your browser to:

http://seaside.example.com

it should redirect to:

https://seaside.example.com

Of course, as you are using a self-signed certificate, the web browser will show a warning about the certificate verification. Accept it unless you don’t trust yourself :).

After that you should see the summary page of the seaside.example.com and everything should work as before, just encrypted.

Really easy, don’t you think?


Oct 2 2009

Deploying Seaside: load testing results

I ran a series of tests with different values for the parameters. All of them are for reference purposes only. As with any benchmark, a lot of factors affect the results. The advice is: try to isolate your environment so that the results are meaningful.

My machine is

  • Intel(R) Core(TM)2 Duo CPU T6400 @ 2.00GHz
  • Cache: 2048 KB
  • RAM: 4GB

with a lot of processes running and the same machine hosting lighttpd, JMeter and the images.

There are two series of tests. The first one tests the seaside.example.com. That is the application that stores everything in memory in each image, so it will be very fast.
The second one tests the magma.example.com, so each request accesses the Magma database. As we increase the number of images, the database becomes the bottleneck. Keep that in mind when comparing results. Also, don’t bash Magma, as Magma is very fast. It is just that the SeasideMagmaTester isn’t optimized yet. It is just a simple application for measuring a *simple* Magma-Seaside integration. In a real production environment you’ll put the database on the most powerful server (at least for write performance), and for read performance you can add several servers to the Magma node and get a lot of reads/sec. But that is another problem. We just want to test the SeasideProxyTester as is. Optimize your application as you see fit.

First the seaside.example.com results:

Magma Images | Seaside Images | mmap (MB) | JMeter Users | JMeter Ramp-Up (seconds) | JMeter Loop counter | Samples (Requests) | Throughput (Req/sec) | Error % (requests that failed)
1 | 1 | 100 | 10 | 10 | 500 | 5010 | 113 | 0
1 | 1 | 100 | 100 | 100 | 500 | 49971 | 114 | 1.85
1 | 1 | 100 | 400 | 100 | 500 | 200400 | 117 | 60.03
1 | 2 | 100 | 10 | 10 | 500 | 5010 | 140 | 0
1 | 2 | 100 | 100 | 100 | 500 | 50100 | 137 | 0.97
1 | 2 | 100 | 400 | 100 | 500 | 200400 | 158 | 39.56
1 | 10 | 100 | 10 | 10 | 500 | 5010 | 154 | 0
1 | 10 | 100 | 100 | 100 | 500 | 50100 | 154 | 0
1 | 10 | 100 | 400 | 100 | 500 | 200400 | 170 | 0.43
1 | 30 | 100 | 10 | 10 | 500 | 5010 | 118 | 0
1 | 30 | 100 | 100 | 100 | 500 | 50100 | 129 | 0
1 | 30 | 100 | 400 | 100 | 500 | 200400 | 118 | 0
1 | 30 | 100 | 4000 | 100 | 100 | 600600 | 187 | 47.49
1 | 2 | 100 | 600 | 30 | 1000 | 600600 | 205 | 76.12

Now the magma.example.com results:

Magma Images | Seaside Images | mmap (MB) | JMeter Users | JMeter Ramp-Up (seconds) | JMeter Loop counter | Samples (Requests) | Throughput (Req/sec) | Error % (requests that failed)
1 | 1 | 100 | 10 | 10 | 500 | 5010 | 50 | 0
1 | 1 | 100 | 100 | 100 | 500 | 50100 | 75 | 1.11
1 | 1 | 100 | 400 | 100 | 500 | 200400 | 102.9 | 78.32
1 | 2 | 100 | 10 | 10 | 500 | 5010 | 57 | 50
1 | 2 | 100 | 100 | 100 | 500 | 50100 | 120 | 73.21
1 | 2 | 100 | 400 | 100 | 500 | 197935 | 160 | 91.64
1 | 10 | 100 | 10 | 10 | 500 | 5010 | 44 | 89.28
1 | 10 | 100 | 100 | 100 | 500 | 50100 | 167 | 97.24
1 | 10 | 100 | 400 | 100 | 500 | 200400 | 206 | 95.59
1 | 30 | 100 | 10 | 10 | 500 | 5010 | 45 | 89.38
1 | 30 | 100 | 100 | 100 | 500 | 50100 | 150 | 98.72
1 | 30 | 100 | 400 | 100 | 500 | 179686 | 255 | 99.99
1 | 30 | 100 | 100 | 100 | 2 | 300 | 3 | 0

Those results, as I said, are just a reference. YMMV.
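When reading the tables, keep in mind that a high throughput with a high error percentage is not good news. One rough way to compare rows is to estimate how many requests per second actually succeeded. The sketch below assumes that the Throughput column counts failed requests too; that is my reading of the JMeter Summary Report and not something measured separately, and the function name is made up for the example.

```python
def successful_rate(throughput, error_percent):
    """Estimate the successful requests/second from a table row.

    Assumes the throughput figure includes the failed requests.
    """
    return throughput * (1 - error_percent / 100.0)


# Two rows from the seaside.example.com table, both with 400 JMeter users:
one_image = successful_rate(117, 60.03)   # 1 Seaside image
ten_images = successful_rate(170, 0.43)   # 10 Seaside images
```

Under that assumption, the single image serves roughly 47 good requests per second at 400 users, while ten images serve roughly 169.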

Comments:

The seaside.example.com results vary a lot. With one Seaside image, you get 113 requests/second. That’s a lot of requests. Really. I hope someday I have a site that receives that number of requests. But keep in mind that the seaside.example.com application is just storing the counters in memory. Also, the server (that is, my laptop) is just handling 1 process for the Magma image (not used), 1 process for the Seaside image (heavily used), 1 process for the lighttpd server (heavily used) and 1 process for JMeter. Not a lot of work for the CPU and the Linux process scheduler.
But if you look at the results for 2 Seaside images, the best you get is 140 requests/second without errors. That is unexpected, because if 1 image can handle 113, 2 images should handle at least 200 requests/second. It is even more noticeable when you use 10 or 30 Seaside images: the best you get without errors is 154 requests/second. As I said before, a lot of things affect these results. First, my CPU isn’t as powerful as the ones in real servers. My laptop is also running a lot of other processes (web browser, gaim, JMeter in GUI mode, the GNOME desktop, the wireless, the music). On a dedicated server more resources are reserved for the Seaside images. In the worst case, with 30 Seaside images (each of them doing a lot of work by itself), the laptop CPU is doing a lot of process context switching to give each image a slice of processor time. Each image, in turn, is doing its own process scheduling between Komanche, Seaside and the other processes that run in a Pharo image. If you consider this, you can explain why the requests/second don’t scale linearly as you increase the number of images. The best approach appears to be to use different servers for the webserver and for the images. Also, distributing the images over two or more small servers (like my laptop) can get the best out of the images and the balancer.
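The lack of linear scaling can be put into a number. This small sketch compares a measured throughput with the ideal of "N images handle N times what one image handles", using the error-free figures quoted above. The function name is just for illustration.

```python
def scaling_efficiency(single_image_rate, images, measured_rate):
    """Fraction of the ideal (linear) throughput that was actually reached."""
    ideal_rate = single_image_rate * images
    return measured_rate / ideal_rate


two_images = scaling_efficiency(113, 2, 140)    # about 0.62
ten_images = scaling_efficiency(113, 10, 154)   # about 0.14
```

So on this laptop two images reach about 62% of the ideal and ten images only about 14%, which matches the explanation that everything is fighting for the same two cores.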

Now for the Magma results. They are ugly and disappointing. But remember, the application isn’t optimized yet. It is just the simplest way of getting Magma and Seaside working together. For example, if your app reads a lot more than it writes to the database (as most applications do, unless you are storing the results of subatomic collisions ;)) you can add more read-only servers to a Magma node to improve the read performance. Besides, you can use different read strategies and use Magma Collections to store your data. The PROBLEM WITH THIS PARTICULAR APPLICATION is that all the images are trying to write to the same slots of the dictionary that holds the counters. When you have a lot of processes trying to write, this necessarily results in a lot of commit errors. Suppose session 1 reads the current counter value in order to increase it. Before it can commit the new value (current value + 1), the Pharo scheduler switches to another session on the same image, or the OS scheduler switches to another Pharo image. The newly scheduled session (session 2) reads the current value (not yet updated by the first session) and, if it isn’t unscheduled like session 1, successfully commits the new value. Some time later session 1 gets scheduled again and resumes from the exact place where it left off. So it updates the value and sends the commit to Magma. Magma notes that the value has changed since it was read, marks the object as dirty (that is, the client must do an abort to get the new value) and reports a Magma commit conflict. The error reaches the final user and is counted by JMeter as an error because of the 503 status in the response headers.
In a real application, the common scenario is that each user writes to its own section of the database or to different parts of a common collection. This is handled very well by Magma, even better if you use Magma Collections. So in a more realistic scenario you won’t have that many commit conflicts, if any. But that is Magma optimization, and you know your own application better than anybody. Maybe Chris Muller (Magma creator) or Keith Hodges (Magma seasideHelper creator) can replicate these results and suggest better ways to test Magma and use Magma seasideHelper. I repeat: the apparent errors are a consequence of the application tested and not of Magma. How do you know? Because in every case we get a response from the Magma server, that is, a commit conflict error response. So the server is alive and healthy, responding appropriately to every request made by a Seaside image. Keep that in mind before bashing Magma.
A better way to test this application is to give each session its own counter in the database (as if each user were getting its own private data), with all of those private counters held in a Magma Collection (that is, a collection of user data). This way each session updates its own data and, by its very nature, won’t produce commit errors. But that is left as an exercise for the reader.
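The commit conflict described above can be reproduced with a toy model. The sketch below is NOT Magma’s API; it is a hypothetical in-memory store that refuses a commit when the value was changed after it was read, which is enough to show why a shared counter conflicts and per-session counters don’t.

```python
class CommitConflict(Exception):
    """Raised when the value changed between the read and the commit."""


class VersionedStore:
    """Toy store: every key holds a value plus a version number."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        return self._data.get(key, (0, 0))

    def commit(self, key, new_value, version_read):
        _, current_version = self._data.get(key, (0, 0))
        if current_version != version_read:
            raise CommitConflict(key)
        self._data[key] = (new_value, current_version + 1)


def shared_counter_demo():
    store = VersionedStore()
    value1, version1 = store.read("counter")   # session 1 reads, then is unscheduled
    value2, version2 = store.read("counter")   # session 2 reads the same value
    store.commit("counter", value2 + 1, version2)   # session 2 commits first
    try:
        store.commit("counter", value1 + 1, version1)  # session 1 resumes
        return "no conflict"
    except CommitConflict:
        return "conflict"


def per_session_counter_demo():
    store = VersionedStore()
    value1, version1 = store.read("session-1")
    value2, version2 = store.read("session-2")
    store.commit("session-2", value2 + 1, version2)
    store.commit("session-1", value1 + 1, version1)  # different key, no clash
    return store.read("session-1")[0], store.read("session-2")[0]
```

With the same interleaving of reads and commits, the shared counter ends in a conflict while the two private counters both reach 1.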

So, go test your apps.


Oct 1 2009

Deploying Seaside: load testing the setup

I have used several tools to load test websites and webapps: ab, siege, httperf. They are very good tools but they are designed to test static URLs like:

http://mysite.com/somepath

http://mysite.com/someotherpath?parameter=value

and not the kind of URLs generated by Seaside.

Some of them can be given a list of URLs and will hammer your server with requests to each of those URLs. Others can also accept and resend cookies with each request. Some have a concept of a session (mostly with a cookie holding the session key value) that can be used to send a series of requests as part of the same session. Few of them can follow a redirect automatically. Well, it isn’t a fault of the tools; they were simply created to test static web pages.

Of course there are commercial tools that (I think) can be scripted to simulate a real session by logging in to the webapp, navigating it, filling in forms and fields, and running all that thousands of times. I have never used them.

Another, more recent tool is Selenium, which builds test cases by using a web browser. You start Selenium and it records all the actions you perform on a website/webapp and can then repeat them with just a click. Everything runs in the web browser, so it is very slow when you want to repeat the test 1000 times. But it is great for testing complex scenarios with several link and button clicks and form submissions.

We can use ab or httperf on the SeasideProxyTester applications. This can be useful even if they can’t navigate the application or “click” the links on the application. If you run a command like:

ab -c 10 -n 100 http://seaside.example.com/

it will send 100 requests in total to exactly the same URL, with up to 10 of them running concurrently. The problem with this kind of request is that you aren’t testing your application correctly. Here you are testing how many sessions Seaside can create under load. Why? As this URL doesn’t have a _s parameter (the session key), each time a request like this arrives at Seaside, a new session is created for it. And then it is forgotten and never used again. Also, this command will hit a different Seaside image each time: as ab isn’t accepting the cookie, the webserver does round-robin balancing for each request that doesn’t include the cookie.
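The balancing behaviour just described (sticky when the cookie is present, round-robin otherwise) can be modelled in a few lines. This is only a sketch of the idea, not how lighttpd implements it; the class name and the backend names other than app9003 are assumptions for the example.

```python
import itertools


class Balancer:
    """Pick a backend image: honour the 'server' cookie, else round-robin."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._round_robin = itertools.cycle(self._backends)

    def choose(self, cookies):
        server = cookies.get("server")
        if server in self._backends:
            # The client sent the cookie back: stay on the same image.
            return server
        # No (valid) cookie, as with ab: take the next image in turn.
        return next(self._round_robin)


balancer = Balancer(["app9001", "app9002", "app9003"])
without_cookie = [balancer.choose({}) for _ in range(4)]
with_cookie = [balancer.choose({"server": "app9003"}) for _ in range(3)]
```

Requests without the cookie wander over all the images, and each one creates a fresh Seaside session; requests carrying the cookie always land on the same image.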

With the following httperf command, things are a little better:

httperf --hog --session-cookie --server=seaside.example.com --wsess=10,100,1 --rate 2

This creates 10 sessions at a rate of 2 per second, each of which sends 100 requests spaced out by 1 second. It also accepts and holds a cookie containing the session key for the session. This command is better in that the requests are spread more uniformly over a period of time instead of all being fired at the start, as the ab command does. The problem is that it can only hold one cookie, and the SeasideProxyTester is already using a cookie to store the server that is handling the request. So the httperf command shown isn’t really sending requests as part of the same Seaside session, but it does guarantee that the same httperf session always sends its requests to the same Seaside image, by sending back the cookie value received on the first request. This is, as I said, a little better than ab, but not what we want.

The tool I will be using is JMeter from the Apache Software Foundation. This tool can be scripted and made to follow links in the application being tested. Also, it can graph the results or save them to a file. It can handle cookies and check for strings in the response for testing purposes. Let’s begin testing the SeasideProxyTester apps.

Download JMeter and unzip it. Change to the unzipped directory and run:

miguel@laptop:~$ cd jakarta-jmeter-2.3.4
miguel@laptop:~/jakarta-jmeter-2.3.4$ ./bin/jmeter

JMeter opens with two panels. To the left the tree of elements that you use to test your site. On the right, the panel where you configure the component selected on the left. Initially you have two components, a Test Plan and a Workbench. We will not use the Workbench.

Select the Test Plan and in the right panel configure:

Name: SeasideProxyTester

In JMeter the changes are saved when you select a different element. So if you select the Workbench, the Test Plan name will change.

Now, select the SeasideProxyTester test plan and right-click on it to open a pop up menu. Select Add -> Thread Group. Configure the following options:

Name: Users

Number of threads (users): 1

Ramp-Up Period (in seconds): 1

Loop Count (leave the Forever checkbox unchecked): 1

The Thread Group simulates the users that will use the webapp. This configuration simulates 1 user that will be created over a timespan of 1 second. That is, after 1 second, you’ll have 1 user ready to do what the Test Plan specifies. For the moment we’ll work with one user. Also, the plan will be run just once (Loop Count). When the Test Plan is ready we will increase the number of users to put load on the application.
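To get a feeling for how the number of threads and the ramp-up period interact, here is a small sketch. It assumes that JMeter spreads the thread starts evenly over the ramp-up period, which is how its documentation describes it; the function name is made up for this example.

```python
def thread_start_offsets(users, ramp_up_seconds):
    """Seconds after the start of the test at which each user is launched.

    Assumes an even spread: one new user every ramp_up_seconds / users.
    """
    delay = ramp_up_seconds / users
    return [index * delay for index in range(users)]


single_user = thread_start_offsets(1, 1)   # the configuration above
four_users = thread_start_offsets(4, 2)    # a new user every half second
```

With 1 user and a 1 second ramp-up the only user starts immediately; with more users, a longer ramp-up spreads the initial burst of requests over time.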

Next, select the Users thread group and right-click Add -> Config element -> HTTP Cookie Manager. Nothing to configure for this element.

Select the Users thread group and right-click Add -> Config element -> HTTP Request Defaults.  Configure:

Server Name or IP: seaside.example.com

Path: /

This will establish the default parameters for HTTP requests so that you don’t have to set them everywhere on your Test Plan (useful if you are doing a lot of different HTTP requests).

Again, right-click on the Users thread group and Add -> Listener -> View Results Tree. Nothing to configure.

This element will show the individual requests done by JMeter for you to review. DON’T include this element on test plans that make a lot of requests or your client machine will be using a lot of RAM just to update this element. We’ll use it only for testing that the plan is correctly configured. When doing the real load testing, we’ll remove it.

On the Users thread group do Add -> Listener -> Summary Report. Nothing to configure.

This element will show the number of requests made, the time taken to receive responses, errors and other useful data obtained when executing the Test Plan. This is the element that will tell us how many requests our application is capable of processing.

Let’s add the elements that the Test Plan will execute to test our web application. Select the Users thread group and Add -> Sampler -> HTTP Request. Configure:

Name: Index page

Uncheck: Redirect automatically.

Check: Follow redirects.

This establishes that the Test Plan must make one request to the “/” of the seaside.example.com server (specified by the HTTP Request Defaults).

At this step you can already run the Test Plan. Either press Ctrl + R or use the menu Run -> Start. Now select the View Results Tree.

You’ll see that JMeter has made two requests although the Test Plan only specifies one request. The first one is to:

GET http://seaside.example.com/

[no cookies]

Request Headers:
Connection: keep-alive

The full dialog for the first request is:

Thread Name: Users 1-1
Sample Start: 2009-09-30 11:15:41 CDT
Load time: 6
Latency: 6
Size in bytes: 0
Sample Count: 1
Error Count: 0
Response code: 302
Response message: Found

Response headers:
HTTP/1.1 302 Found
Server: KomHttpServer/7.1.2 (unix)
Location: http://seaside.example.com/?_s=A1oTuGM8OQmGMbZ5&_k=aI9HNGnu&_c
Date: Wed, 30 Sep 2009 11:15:41 GMT
Session: A1oTuGM8OQmGMbZ5
Set-Cookie: server=app9003; path=/
Content-type: text/html;charset=utf-8
Content-length: 0
HTTPSampleResult fields:
ContentType: text/html;charset=utf-8
DataEncoding: utf-8

Here we can see the redirect sent by the Seaside image that processed the request (the one at port 9003, as we can see from the Set-Cookie header). So JMeter makes a second request:

GET http://seaside.example.com/?_s=A1oTuGM8OQmGMbZ5&_k=aI9HNGnu&_c

Cookie Data:
$Version=0; server=app9003; $Path=/

Request Headers:
Connection: keep-alive

Check that the cookie is being added to all the subsequent requests by JMeter (just as the web browser does). This will result in the request being forwarded to the same image that received the first request.

The full dialog for the second request is:

Thread Name: Users 1-1
Sample Start: 2009-09-30 11:15:41 CDT
Load time: 7
Latency: 7
Size in bytes: 1516
Sample Count: 1
Error Count: 0
Response code: 200
Response message: OK

Response headers:
HTTP/1.1 200 OK
Expires: Wed, 11 Jun 1980 12:00:00 GMT
Server: KomHttpServer/7.1.2 (unix)
Pragma: no-cache
Date: Wed, 30 Sep 2009 11:15:41 GMT
Session: A1oTuGM8OQmGMbZ5
Cache-Control: no-cache, must-revalidate
Content-type: text/html;charset=utf-8
Content-length: 1516
HTTPSampleResult fields:
ContentType: text/html;charset=utf-8
DataEncoding: utf-8
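The two pieces of information that matter in the exchange above are the session parameters in the Location header and the server cookie. As a minimal sketch, this is how they could be pulled out of those header values with the Python standard library; the function names are made up for the example.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def seaside_keys(location):
    """Return the (_s, _k) values found in a redirect Location URL."""
    query = urlsplit(location).query
    # keep_blank_values keeps valueless parameters such as _c.
    params = parse_qs(query, keep_blank_values=True)
    return params.get("_s", [None])[0], params.get("_k", [None])[0]


def backend_from_set_cookie(header_value):
    """Return the value of the 'server' cookie from a Set-Cookie header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    morsel = cookie.get("server")
    return morsel.value if morsel is not None else None


keys = seaside_keys("http://seaside.example.com/?_s=A1oTuGM8OQmGMbZ5&_k=aI9HNGnu&_c")
backend = backend_from_set_cookie("server=app9003; path=/")
```

Here keys holds the session key and the continuation key from the redirect, and backend is the image that the webserver will keep routing this client to.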

So far everything is OK. Now let’s add elements that will follow the links on the SeasideProxyTester summary page.

We have received a page with the HTML of the response. Inside this HTML there is a link that makes a new request in the same Seaside session (it carries the _s and _k parameters). We must find this link and make a request for it. After that, Seaside will respond with a new HTML page with a new link on it, this time with a different value for _k (the _s session key stays the same for the whole session). We must again find the link and make a request for it. We must do this every time we need to follow the link. For this to work we need to add a post-processor (an element that triggers after each sampler execution, in this case our HTTP requests). This post-processor will find the link and save it to a variable in order to make the next request to the application. Then we will set up a loop controller to make the session counter increase in the SeasideProxyTester.

Select the Users thread Group and  Add -> Post processor -> Regular Expression Extractor. Configure:

Uncheck: Body

Check: Body (unescaped)

Reference Name: newRequestURL

Regular Expression: \?(_s=[^&]+?)&(_k=[^&]+)&2

Template: ?$1$&$2$&2

Match No. (0 for Random): 1

Default Value: REGEX_FAILED
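To see what this extractor does, the same regular expression and template can be tried outside JMeter. The sketch below is a Python approximation of the configuration above; the sample HTML and its _k value are invented for the example, and html.unescape stands in for the "Body (unescaped)" option.

```python
import html
import re

LINK_PATTERN = re.compile(r"\?(_s=[^&]+?)&(_k=[^&]+)&2")
DEFAULT_VALUE = "REGEX_FAILED"

# Hypothetical response body: in the HTML source the & is written as &amp;
SAMPLE_BODY = (
    '<a href="http://seaside.example.com/'
    '?_s=A1oTuGM8OQmGMbZ5&amp;_k=bX3kLm9Q&amp;2">'
    "Make a new request inside this session</a>"
)


def extract_new_request_url(body, unescape=True):
    """Mimic the extractor: match the link and apply the ?$1$&$2$&2 template."""
    text = html.unescape(body) if unescape else body
    match = LINK_PATTERN.search(text)
    if match is None:
        return DEFAULT_VALUE
    return "?{}&{}&2".format(match.group(1), match.group(2))


new_request_url = extract_new_request_url(SAMPLE_BODY)
raw_body_result = extract_new_request_url(SAMPLE_BODY, unescape=False)
```

Against the unescaped body the link is found and newRequestURL becomes the query string to request next; against the raw body the &amp; entities get in the way and you get the REGEX_FAILED default, which is why Body (unescaped) is checked.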

Then add a loop element to queue several requests one after another. Select the Users thread group and Add -> Logic Controller -> Loop Controller. Configure:

Loop count (leave Forever unchecked): 10

Select the Loop Controller and Add -> Sampler -> HTTP Request. Configure:

Name: Follow link

Path: /${newRequestURL}

Uncheck: Redirect automatically

Check: Follow redirects

That is it. Run it again and you’ll see that now there are eleven requests: one for the Index page, which is responsible for following the redirect and accepting the server cookie, and ten that simulate clicking the link “Make a new request inside this session” ten times. Review the View Results Tree to see the individual requests.

What have we done? We have simulated a user accessing the seaside.example.com and receiving a redirect and a cookie as response. After following the redirect, and using the cookie in every subsequent request, the user follows a link ten times, simulating ten clicks. The View Results Tree page shows the responses and that the last HTML has a counter of 11 in the number of requests of the session.

So we are ready for the last step. Load testing the application.

First, remove the View Results Tree. We don’t want the JMeter client to waste time allocating memory just to store and update this element. Those CPU cycles and memory should be used to stress the application. Now update the Users thread group:

Number of threads: 100

Ramp-Up Period: 20

And update the Loop Controller:

Loop Count: 500

And run it again. This time, after 20 seconds you’ll have 100 users (threads) created, each of them running the Test Plan, that is, requesting the Index Page and then following 500 times the link.

This will put a real load on your server by doing 100 * 501 = 50100 requests.
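The arithmetic generalizes to any combination of users and loop count: each user makes one request for the Index page plus one per loop iteration. A tiny helper (the name is made up for this example) makes it easy to predict the number of samples before a run:

```python
def expected_samples(users, loop_count):
    """Total requests in a full run: (1 index page + loop_count links) per user."""
    return users * (loop_count + 1)


this_run = expected_samples(100, 500)
```

If the Summary Report shows fewer samples than this at the end of a run, some threads did not finish their loops.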

Try changing the parameters involved:

  • Number of images started (scripts/start_app.sh)
  • Number of concurrent users created by JMeter
  • Total time to create the users (so that the load is uniformly distributed over a bigger timespan)
  • Total number of requests made by each user.
  • Changing the number of processes running on the production server.
  • Distributing the images (magma and seaside) on different machines.
  • Putting the webserver on a different machine.
  • Starting several instances of JMeter on several distinct machines.

Well, you get the idea. Only after varying the environment will you find the correct and optimal setup.

This test can be reproduced by everyone so that you can check your setup and compare it to others.
Finally, read the posts on Dale Henrichs’ blog about Seaside scaling, but using GemStone/S 64.

In the next post I will show the results on my machine both for the magma.example.com and for the seaside.example.com.