java.io.IOException: Too many open files
Now when we had planned to deploy the service we had considered the number of concurrent connections and set the open file limit for the user under which the service would run. So, the presence of this error was quite confusing. I initially suspected a file leak so started graphing the number of files the process had open using a simple script to capture the data - note that as I'm using jsvc, it's fairly easy to identify the process ID:
#!/bin/bash
PID=`cat /var/run/service-name/service-name.pid`
echo `date` $PID `ls -la /proc/$PID/fd | wc -l`
Sure enough I found a leak but it was tiny and wouldn't account for the resource usage that I was seeing. I identified the source of the leak by correlating it with events on other monitoring graphs and fixed it anyway. But the graph showed something far more interesting - the file descriptors would max out consistently at ~1000 - a value far below our configured limit. I then remembered that jsvc is actually started by the root user and then drops down to the service user - perhaps it was using the open file limit for root - it looked suspiciously so:
root@server:/ ulimit -n
1024
This was exactly the same behaviour I had seen when configuring large pages - which was unfortunately forgotten! Resource limits for jsvc based processes must be set for the user with which jsvc is started - not the user it downgrades to. This is a known issue that has been raised with Apache. For a quick workaround I set the limit in my init.d script like but this feels a bit hidden and I worry that this approach would not work well when running multiple services with their own particular resource limits.
ulimit -n 15000
Fortunately I don't have to worry about that right now but it would be more convenient to assign a user per service and then have user based limits. Hopefully I'll now remember this jsvc behaviour so I don't get caught out by it a third time.
No comments:
Post a Comment