open thought and learning

Tuning a JVM for Berkeley DB Java Edition


For those who have not heard of Berkeley DB (often called BDB): it is a transactional key/value storage engine, lightweight and highly performance-oriented, with no SQL engine on top. Compared to its native (C) edition, the Java Edition has quite a few differences and is useful when the store needs to be embedded in a Java application.

The aim is to keep the database resident in RAM as much as possible, so that all query responses are fast. With that in mind, here’s my take on tuning the JVM that hosts BDB:

  • JVM heap size should be around the same size as the data store
  • Use the Concurrent Mark/Sweep GC algorithm to keep GC pauses low
  • Since most of the objects are going to live ‘forever’, it makes sense to have a large tenured generation
  • If the DB size can vary, refrain from giving Xmx and Xms the same values; leave a wide gap so the JVM can grow the heap as your data grows

This is what CATALINA_OPTS might look like (includes a lot of debug flags as well):

CATALINA_OPTS="-server -Xms1024m -Xmx4096m -XX:+UseMembar -XX:+PrintGCDetails \
-XX:+PrintGCApplicationStoppedTime -XX:NewRatio=4 -XX:+UseConcMarkSweepGC \
-verbose:gc -Xloggc:/appl/tomcat/logs/gcdata.txt"

-XX:+UseMembar is there to accommodate the high I/O waits I had been seeing – I think there’s a problem on Linux with the JDK’s use of memory barriers. I read about the bug here.

BDB Java Edition is not a replacement for a traditional database, but a means to get almost immediate results for things like look-up data, subscriptions and frequently-used information. There are quite a number of online resources available to help you set it up and use it – native or Java, whichever your flavor is.

memcached is another such tool that is useful for improving the performance of the application-database connection. More on it in another post some other time.


Written by mohitsuley

August 8, 2008 at 9:30 pm

Making SSI work on a JSP response


If you need to parse SSI from a JSP response, there are two simple ways to do it:

1. Use the SSIServlet and handle it within Tomcat.
2. If you have a separate web server like Apache in front of Tomcat, and you want that web server to do it, the plot thickens.

If you ask, ‘why, when you are already using Java? You can do all that you can do with SSI in a JSP, right?‘, you might be surprised. Let’s just say the reason is out-of-scope for this post.

So, you have a three-tier architecture with web servers spread across the world and app/DB servers local to certain data-centers. Naturally, you might want to ‘assimilate’ content on the web servers (closest to local users based on 3DNS/similar) where it’s already present instead of shuttling bytes back-and-forth between the web and app layers. That’s the reason. And did I say earlier it was out of scope? My bad.

The way you would do it: set up an Apache Location to match, add an AddOutputFilterByType statement with the MIME type text/x-server-parsed-html, and finally, in the JSP itself, set that same MIME type on the response using response.setContentType().
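On the JSP side, that can be as small as a one-line page directive at the top of the page (a sketch – an equivalent response.setContentType() scriptlet works too):

```jsp
<%@ page contentType="text/x-server-parsed-html" %>
```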

Your Location section might look like this:

<Location /application/ssiparser>
    Options +Includes
    AddOutputFilterByType INCLUDES;DEFLATE text/x-server-parsed-html
</Location>

In an ideal scenario, everything should have been hunky-dory, but life isn’t so simple. At least it didn’t happen so easily for me.

What I had done earlier, for performance, was add a CompressionFilter on Tomcat to gzip all of its responses so that the app-to-web hop improves as well. This meant that by the time a response reached Apache it was already gzipped, and SSI parsing was not possible. Mind you, this is Apache 2.0.x and not 2.2.x, where you can actually set up FilterDeclare and such.

There are two ways to get around this problem:

1. Get the CompressionFilter to exclude the Location you have set up for SSI, and pass INCLUDES;DEFLATE to AddOutputFilterByType so that Apache compresses after parsing.
2. Or, unset the Accept-Encoding header on the request first so that the CompressionFilter never sees gzip advertised and doesn’t compress the response at all, leaving it parseable.

The problem with (2) is that you end up sending uncompressed data between Tomcat and Apache. Option (1) would be the right way to go.

(1) will entail a change on the web.xml for your application.

(2) will look like this:

<Location /application/ssiparser>
    Options +Includes
    RequestHeader unset Accept-Encoding
    RequestHeader set Accept-Encoding deflate
    AddOutputFilterByType INCLUDES;DEFLATE text/x-server-parsed-html
</Location>

The JSP will start with:

<!--#include virtual="/static/content/news.html"-->
<!--#include virtual="/static/content/weather.html"-->
<!--#include virtual="/static/content/media.html"-->

Most folks do not upgrade Apache the way they do other software, just because it's so damn stable and fulfills requirements very well. However, if you need to work with filters and play around with them, 2.2 is the way to go.

Written by mohitsuley

August 7, 2008 at 1:15 am

OpenDeploy rollback across a WAN


While working on Interwoven OpenDeploy I came across the following problem:

Large deployments or file pushes spanning a WAN or a continent would sometimes time out or roll back. The problem showed up when there was a significant difference in size between the file lists.

This is what happens:

  1. OD starts n threads based on the n lists of files to be deployed.
  2. Thread 1 finishes and the remaining n-1 threads continue file transfer.
  3. After exactly 5 minutes, thread 1 times out (shows a TCP packet with RST flag set on tcpdump) and after all threads finish, the deployment fails and rolls back the transaction.

Root cause:
Some network device along the path times out idle TCP sessions after 300 seconds and sends an RST, essentially dropping the connection. When this happens, OpenDeploy considers the transaction corrupt and rolls it back.

Possible fixes:
  1. Get the firewall team to extend the timeout to something more reasonable (perhaps closer to the default tcp_keepalive_time of 7200 seconds?) – not practical if a number of teams are involved.
  2. Lower tcp_keepalive_time to ~200 seconds so the OS sends keepalives before the device's idle timeout fires.
  3. If the keepalive change alone does not help, try http://libkeepalive.sourceforge.net . Works like a charm!
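If you control the application code, the per-socket equivalent of the libkeepalive trick looks like this – a Python sketch using Linux-specific socket options, with illustrative timeout values:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Turn on keepalive probes for this connection.
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Linux-specific: override the system-wide tcp_keepalive_time (7200s default)
# for this socket only -- start probing after 200s of idleness,
# then probe every 30s, giving up after 5 failed probes.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 200)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)

print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))  # nonzero: keepalive is on
```

libkeepalive does essentially this via LD_PRELOAD for programs whose code you cannot change.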

Generally speaking, and not being ‘opendeploy-centric’, I did learn the importance of keepalive packets and how the default value of 7200 seconds might not be practical when an application talks to servers across network borders.

Thanks to my colleague Prajal Sutaria for working on this!

Written by mohitsuley

August 6, 2008 at 9:04 pm

Posted in linux, sysadmin


Caching problems with SAML


Anyone who has worked with SAML knows how effective and simple it is to achieve federated services with your own authentication mechanism. What needs to be remembered, though, is that end-users might very well be behind firewalls. And with those come proxies; and those proxies open up Pandora’s box, a.k.a. the cache.

Proxies can cache the POST response from the authentication user agent and make user1 see a page that says ‘Welcome user2’. Do a forced refresh (Ctrl-F5, Cmd-R) on the browser, and you can see your own ID again.

A few safeguards:

1. Ensure proxies don’t cache any content for your authentication domain.
2. Append a ‘random’ value such as a timestamp to the URL using Javascript (to make it unique).
3. Force both the content provider’s web server and the user agent’s web server to set Cache-Control to max-age=0 and proxy-revalidate.
4. Make sure you’re sending an invalidation string in the response as well.
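For (3), the Apache side can be as simple as the following (requires mod_headers; the /auth path is a placeholder, and Pragma is a belt-and-braces extra for older HTTP/1.0 proxies):

<Location /auth>
    Header set Cache-Control "max-age=0, proxy-revalidate"
    Header set Pragma "no-cache"
</Location>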

Flushing caches in a company with ~100 proxy servers is not a realistic option. The onus should lie on the development and sysadmin teams to make sure important pages are non-cacheable. ‘Never trust proxy servers’ is the motto here.

Written by mohitsuley

August 1, 2008 at 4:18 pm

Posted in linux, sysadmin


Pinging hostnames from /etc/hosts


Problem statement: ping a user-defined hostname mapped to a valid IP address.
Solution: simple – put it in the /etc/hosts file and you’re done.

Still can’t do it? Check nsswitch.conf. This is what should be there: hosts: files dns
So, with the right /etc/nsswitch.conf and /etc/hosts, should it work?

root@treebeard:~# cat /etc/hosts localhost treebeard mithrandir

root@treebeard:~# ping mithrandir
PING mithrandir ( 56(84) bytes of data.
64 bytes from mithrandir ( icmp_seq=1 ttl=64 time=0.092 ms
64 bytes from mithrandir ( icmp_seq=2 ttl=64 time=0.067 ms

It works!

root@treebeard:~# sudo su - mohit
mohit@treebeard:~$ ping mithrandir
ping: unknown host

It seems when I switch to a non-root user, entries in /etc/hosts fail to take effect.

The problem is with the read permissions on /etc/nsswitch.conf. I hadn’t noticed that it was world-unreadable.

root@treebeard:~# chmod o+r /etc/nsswitch.conf
root@treebeard:~# sudo su - mohit
mohit@treebeard:~$ ping mithrandir
PING mithrandir ( 56(84) bytes of data.
64 bytes from mithrandir ( icmp_seq=1 ttl=64 time=0.092 ms
64 bytes from mithrandir ( icmp_seq=2 ttl=64 time=0.067 ms

Worked, finally. The weird thing is I would have expected ping to complain that it couldn’t read a file, but there was nothing of the sort. This means you can actually force non-root users to stick to DNS resolution while all daemons and root-owned processes leverage /etc/hosts.
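The failure mode is easy to reproduce without touching the real nsswitch.conf – a quick sketch in Python using a throwaway file as a stand-in:

```python
import os
import stat
import tempfile

# Stand-in for /etc/nsswitch.conf so we don't touch the real one.
fd, path = tempfile.mkstemp()
os.write(fd, b"hosts: files dns\n")
os.close(fd)

os.chmod(path, 0o640)  # world-unreadable, like the broken nsswitch.conf
world_readable_before = bool(os.stat(path).st_mode & stat.S_IROTH)

os.chmod(path, 0o644)  # the chmod o+r fix
world_readable_after = bool(os.stat(path).st_mode & stat.S_IROTH)

os.remove(path)
print(world_readable_before, world_readable_after)  # False True
```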

A bad idea, I’d say – a ticking time bomb. I ran into this while configuring two nodes for a 10g RAC cluster: the DB runs as a non-root user, and the DBA had a tough time getting the private interconnect working – thanks to nsswitch.conf.

Lesson learnt.

Written by mohitsuley

August 1, 2008 at 4:26 am

Posted in linux, sysadmin


My mod_python 101


After having built mod_python.so and doing a LoadModule python_module modules/mod_python.so, I expected everything to work fine, assuming I did a SetHandler python-program and a PythonHandler helloworld within an Apache virtualhost.

I could then get Hello World! on the page with the following snippet:

$ cat helloworld.py
from mod_python import apache

def handler(req):
    req.content_type = "text/plain"
    req.write("Hello World!")
    return apache.OK

That’s when the problems began.

1. I couldn’t put another .py file in the same directory and have it run successfully – a 404 was returned
2. All HTML files returned a 404 as well
3. There was nothing in the error logs

After much reading around, here’s the final configuration that worked:

LoadModule python_module modules/mod_python.so
PythonDebug On
AddHandler python-program .py
PythonHandler mod_python.publisher
PythonPath "['/appl/python2.5/lib/','/appl/webdocs/','/appl/python2.5/bin/python/','/appl/python2.5/lib/python2.5/site-packages/mod_python/','/appl/python2.5/lib/python2.5/site-packages/'] + sys.path"

And the URL that worked was http://myhost/helloworld.py/handler

Apparently, mod_python has three kinds of default handlers – cgihandler, publisher and PSP.

cgihandler – works with existing CGI scripts by creating a simulated CGI environment
psp – Python Server Pages, just like JSP
publisher – the preferred way to serve Python pages, mapping URLs to page/method

publisher is recommended for newer applications, and as you move forward and your application increases in complexity and entry points, you can make your own handlers.
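To make the page/method idea concrete, here is a toy sketch of the dispatch publisher performs – not the real mod_python code, just the idea of mapping the path segment after the .py file to a callable in the module:

```python
import types

def publish(module, path_info):
    # Hypothetical simplification: take the first path segment after the
    # module (defaulting to "index"), look it up on the module, call it.
    name = path_info.strip("/").split("/")[0] or "index"
    func = getattr(module, name, None)
    if not callable(func):
        return 404, "Not Found"
    return 200, func()

# Fake 'helloworld' module standing in for helloworld.py
helloworld = types.ModuleType("helloworld")
helloworld.handler = lambda: "Hello World!"

print(publish(helloworld, "/handler"))   # (200, 'Hello World!')
print(publish(helloworld, "/missing"))   # (404, 'Not Found')
```

This is why the working URL was /helloworld.py/handler rather than /helloworld.py alone.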

Like I said, 101 and basics, but what the heck, I get to write something here. Itchy fingers.

More later.

Written by mohitsuley

July 31, 2008 at 1:28 am

Posted in code, linux


Automator on Google Code


Finally, I have a home for Automator on Google Code at sysadmin-automator. I will start uploading existing scripts tomorrow.

Written by mohitsuley

July 27, 2008 at 3:56 am

Posted in linux, sysadmin
