Oct 8, 2012

Setting ulimits for jsvc

One of my Thrift based Java services was recently rolled out to a new set of clients, resulting in a 2000% increase in requests. This was all planned for and the transition was 'fairly' smooth. However, I noticed that the service would run out file descriptors periodically. This would result in the following exception:

  java.io.IOException: Too many open files

Now when we had planned to deploy the service we had considered the number of concurrent connections and set the open file limit for the user under which the service would run. So, the presence of this error was quite confusing. I initially suspected a file leak so started graphing the number of files the process had open using a simple script to capture the data - note that as I'm using jsvc, it's fairly easy to identify the process ID:

  #!/bin/bash
  PID=`cat /var/run/service-name/service-name.pid`
  echo `date` $PID `ls -la /proc/$PID/fd | wc -l`

Sure enough I found a leak but it was tiny and wouldn't account for the resource usage that I was seeing. I identified the source of the leak by correlating it with events on other monitoring graphs and fixed it anyway. But the graph showed something far more interesting - the file descriptors would max out consistently at ~1000 - a value far below our configured limit. I then remembered that jsvc is actually started by the root user and then drops down to the service user - perhaps it was using the open file limit for root - it looked suspiciously so:

  root@server:/ ulimit -n
  1024

This was exactly the same behaviour I had seen when configuring large pages - which was unfortunately forgotten! Resource limits for jsvc based processes must be set for the user with which jsvc is started - not the user it downgrades to. This is a known issue that has been raised with Apache. For a quick workaround I set the limit in my init.d script like but this feels a bit hidden and I worry that this approach would not work well when running multiple services with their own particular resource limits.

  ulimit -n 15000

Fortunately I don't have to worry about that right now but it would be more convenient to assign a user per service and then have user based limits. Hopefully I'll now remember this jsvc behaviour so I don't get caught out by it a third time.

No comments:

Post a Comment