Tuesday 17 September 2013

Apache Web Server, Tomcat AJP: ajp_read_header: ajp_ilink_receive failed

Problem:

Apache Web Server was configured to proxy requests over AJP to a web application running on Tomcat (7.0.39). The applications were installed on virtual machines in a cloud environment. After load tests lasting more than 24 hours had completed, the system became unresponsive: every request made to Apache returned an HTTP 503 status code. The Tomcat logs showed no errors, and requests could still be sent directly to Tomcat over the HTTP channel. CPU and memory consumption were also very low. The Apache log files showed errors of the following nature:


[error] ajp_read_header: ajp_ilink_receive failed
[error] (70007)The timeout specified has expired: proxy: read response failed from 10.1.3.3:8009 (10.1.3.3)



Tomcat could no longer handle any more requests from Apache over AJP and required a restart.

Analysis:

Running 'netstat' for port 8009 on the app server VM showed 200 connections still in an ESTABLISHED state and a further 100 connections in a CLOSE_WAIT state.
From this initial analysis, a number of questions arose:

  • How did the number of AJP connections grow so large?
  • Why did the number of connections not close after a period of inactivity?
  • What was preventing Tomcat from accepting any more requests?
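The netstat check above can be repeated quickly during a load test with a small script. The following is a minimal sketch in Python, assuming the column layout of Linux `netstat -ant` output. The function name `count_states`, the sample text and the client address 10.1.3.2 are made up for illustration and are not taken from the original environment.

```python
from collections import Counter


def count_states(netstat_output: str, port: int = 8009) -> Counter:
    """Count TCP states for connections whose local address ends in :port."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout (Linux `netstat -ant`):
        # Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skips the header line and non-TCP rows
        local_address, state = fields[3], fields[5]
        if local_address.rsplit(":", 1)[-1] == str(port):
            states[state] += 1
    return states


# Hypothetical sample output; only 10.1.3.3:8009 comes from the Apache log above.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:8009            0.0.0.0:*               LISTEN
tcp        0      0 10.1.3.3:8009           10.1.3.2:51234          ESTABLISHED
tcp        0      0 10.1.3.3:8009           10.1.3.2:51235          ESTABLISHED
tcp        1      0 10.1.3.3:8009           10.1.3.2:51236          CLOSE_WAIT
tcp        0      0 10.1.3.3:8080           10.1.3.2:40000          ESTABLISHED
"""

if __name__ == "__main__":
    for state, count in sorted(count_states(SAMPLE).items()):
        print(state, count)
```

On the real server, the same function could be fed the output of `netstat -ant` captured with `subprocess.run`. A count of ESTABLISHED connections that stays at the pool size after load has stopped is the symptom described in this post.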
Reading the AJP documentation confirmed that by default the AJP connection pool is configured with a size of 200 and an accept count (request queue when all connections are busy) of 100. These defaults match the 200 ESTABLISHED and 100 CLOSE_WAIT connections reported by netstat. To confirm the findings, Tomcat was configured with a smaller AJP connection pool (20) and, as expected, the errors in Apache occurred sooner and more frequently.
To address the issue, Apache (MaxClients) and Tomcat (maxConnections) were both configured to support 25 concurrent requests. This worked: Apache no longer returned 503 responses and the log files no longer showed the ajp_ilink_receive errors. The test was then repeated after increasing the connection pool to 50. A one-hour load test showed the servers working well, with improved response times and no 503 responses.
However, after the test completed, the Tomcat VM still showed 50 connections in an ESTABLISHED state. A further read of the Tomcat AJP documentation revealed that connections remain open indefinitely until the client closes them. The next thing to try was the 'keepAliveTimeout' setting on the AJP connector. This closed the connections after a period of inactivity and therefore seems to have resolved the issue. Ideally, the number of AJP connections should grow as load increases and then shrink back to an optimal number when load decreases; 'keepAliveTimeout' instead closes every connection that has been inactive for the configured period.

Solution:

Configure Apache 'MaxClients' to be equal to the Tomcat AJP 'maxConnections' configuration.
Configure Tomcat AJP 'keepAliveTimeout' to close connections after a period of inactivity.
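
The two settings above live in different files, so they can drift apart over time. The following is a minimal sketch in Python of a consistency check, not a definitive tool. The function names, the sample configuration snippets and the values 50 and 60000 (milliseconds) are assumptions made for illustration. The check also takes only the first MaxClients line it finds, which is too simple for an httpd.conf that defines several MPM blocks.

```python
import re
import xml.etree.ElementTree as ET


def apache_max_clients(httpd_conf: str):
    """Return the first MaxClients value in the config text, or None."""
    match = re.search(r"^\s*MaxClients\s+(\d+)", httpd_conf, re.MULTILINE)
    return int(match.group(1)) if match else None


def ajp_connector_settings(server_xml: str) -> dict:
    """Return the attributes of the first AJP connector, or {} if none."""
    root = ET.fromstring(server_xml)
    for connector in root.iter("Connector"):
        if connector.get("protocol", "").startswith("AJP"):
            return dict(connector.attrib)
    return {}


def check_alignment(httpd_conf: str, server_xml: str) -> list:
    """Return a list of problems; an empty list means the settings line up."""
    problems = []
    max_clients = apache_max_clients(httpd_conf)
    ajp = ajp_connector_settings(server_xml)
    max_connections = ajp.get("maxConnections")
    if max_clients is None:
        problems.append("MaxClients is not set in the Apache config")
    if max_connections is None:
        problems.append("maxConnections is not set on the AJP connector")
    if max_clients is not None and max_connections is not None:
        if max_clients != int(max_connections):
            problems.append(
                f"MaxClients ({max_clients}) != maxConnections ({max_connections})"
            )
    if "keepAliveTimeout" not in ajp:
        problems.append("keepAliveTimeout is not set on the AJP connector")
    return problems


# Hypothetical snippets; the values are illustrative, not recommendations.
SAMPLE_HTTPD = """\
<IfModule mpm_worker_module>
    MaxClients 50
</IfModule>
"""

SAMPLE_SERVER_XML = """\
<Server>
  <Service name="Catalina">
    <Connector port="8009" protocol="AJP/1.3"
               maxConnections="50" keepAliveTimeout="60000" />
  </Service>
</Server>
"""

if __name__ == "__main__":
    print(check_alignment(SAMPLE_HTTPD, SAMPLE_SERVER_XML) or "settings line up")
```

The actual values should come from load testing, as in the analysis above; the script only reports when the two sides disagree or when the timeout is missing.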

References:
Tomcat AJP: http://tomcat.apache.org/tomcat-7.0-doc/config/ajp.html
Apache MPM Worker: http://httpd.apache.org/docs/2.2/mod/worker.html

11 comments:

  1. This comment has been removed by the author.

  2. I fixed this error as follows:

    AddDefaultCharset Off
    Order deny,allow
    Allow from all

    ProxyPass / ajp://backend.example.com:8009/ retry=0 timeout=10 ttl=60
    ProxyPassReverse / ajp://backend.example.com:8009/

    Replies
    1. Worked, you saved me. I didn't use the beginning, only the ProxyPass part; I think it was mainly the "retry=0" on ProxyPass that worked. Thank you!

  3. AddDefaultCharset Off
    Order deny,allow
    Allow from all

    ProxyPass / ajp://backend.example.com:8009/ retry=0 timeout=10 ttl=60
    ProxyPassReverse / ajp://backend.example.com:8009/

  4. In which Apache module does "MaxClients" need to be defined?

  5. If we do not configure "maxConnections", it takes the default value of maxThreads (which is 200), and "keepAliveTimeout" takes the default value of connectionTimeout (which is 60 seconds). What, then, is the impact of configuring these parameters?

    Replies
    1. Very late to the party, but according to the documentation, connectionTimeout is -1 (infinite) by default, not 60 seconds (at least in the current version).

  6. The impact will depend on your application, and these settings should be configured accordingly. Recently I've moved away from the AJP connector to the HTTP connector (NIO), as it makes it easier to debug issues.

  7. It was of great help!!
    Thanks a ton
