User sessions in J2EE and LAMP stacks have traditionally been handled in memory by the application server handling the user request. Because of that, load balancers have been configured to use sticky sessions. By sticky sessions we mean that once the user has visited the site, they will be assigned an app server and will return to that server for subsequent requests. The load balancers typically handle that by referencing the user's session cookie.
Elastic cloud environments differ from traditional server configurations in that they have a variable number of servers based on traffic loads whereas traditional configurations had a fixed number of servers. When traffic volumes decline it is necessary to vaporize servers. In doing so, we would lose user sessions (essentially forcing a logout) unless we come up with a new strategy for session management.
After much research, it is clear that the best approach to solving this problem is through externalizing the user session so that it can survive server reductions. This has some interesting implications.
- Sessions must be serializable. By serializable we mean that the data must be transferable to another machine by encoding it into a JSON string or byte stream.
- Servers that are crashing or misbehaving can be vaporized without affecting the user experience.
- Traffic loads can be spread across new servers added to the cluster.
- When traffic wanes, we can easily remove machines from the cluster without affecting user sessions.
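As a rough illustration of the serializability requirement above, here is a minimal Python sketch that round-trips session data through JSON. The helper names and the sample session are made up for illustration; a Tomcat app would rely on Java serialization or the session manager's own serializer instead.

```python
import json
import threading


def serialize_session(session: dict) -> bytes:
    """Encode session data as UTF-8 JSON bytes that any server can read."""
    return json.dumps(session, sort_keys=True).encode("utf-8")


def deserialize_session(payload: bytes) -> dict:
    """Rebuild the session data on whichever server handles the next request."""
    return json.loads(payload.decode("utf-8"))


def is_serializable(session: dict) -> bool:
    """Return True if the session can leave the process as JSON."""
    try:
        serialize_session(session)
        return True
    except TypeError:
        return False


# Plain data (strings, numbers, lists, dicts) survives the round trip.
session = {"user_id": 42, "cart": ["sku-1", "sku-2"], "locale": "en_US"}
restored = deserialize_session(serialize_session(session))

# Process-local resources such as locks cannot be shipped to another machine.
bad_session = {"user_id": 42, "lock": threading.Lock()}
```

The practical consequence is that anything tied to one process (locks, open connections, file handles) has to stay out of the session.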
While I initially had concerns about the performance implications of managing sessions outside of the application server, my concerns have been allayed by performance tests indicating that there is negligible performance degradation and, in some cases, performance can actually improve. The recommended approach is essentially to:
- Disable sticky sessions at the Elastic Load Balancer.
- Create a memcached node or cluster using Elasticache.
- Use memcached-session-manager (http://code.google.com/p/memcached-session-manager) to manage sessions.
- Read and write all session data to the Elasticache nodes instead of the local Tomcat instance.
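The effect of these steps can be sketched in a few lines of Python. This is only a simulation under stated assumptions: `ExternalSessionStore` is an in-memory stand-in for the memcached cluster, and `AppServer` is a hypothetical app server, not Tomcat or memcached-session-manager code.

```python
import json
from typing import Dict, Optional


class ExternalSessionStore:
    """In-memory stand-in for the shared cache (memcached in this article)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, session_id: str) -> Optional[dict]:
        raw = self._data.get(session_id)
        return None if raw is None else json.loads(raw.decode("utf-8"))

    def set(self, session_id: str, session: dict) -> None:
        self._data[session_id] = json.dumps(session).encode("utf-8")


class AppServer:
    """Hypothetical app server that keeps no session state of its own."""

    def __init__(self, name: str, store: ExternalSessionStore) -> None:
        self.name = name
        self.store = store

    def handle(self, session_id: str, updates: Optional[dict] = None) -> dict:
        # Read the session from the shared store on every request.
        session = self.store.get(session_id) or {}
        if updates:
            session.update(updates)
        session["last_server"] = self.name
        # Write it back so the next server sees the latest state.
        self.store.set(session_id, session)
        return session


store = ExternalSessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)

server_a.handle("sid-1", {"user_id": 42})
del server_a  # the server goes away; the session does not
session_seen_by_b = server_b.handle("sid-1")
```

Because neither server holds the session itself, the load balancer is free to send each request to any of them.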
While I have not done extensive research on PHP implementations, the approach is the same and there is documentation here (http://www.dotdeb.org/2008/08/25/storing-your-php-sessions-using-memcached/) that will get you up and running.
There were a few other alternative stores such as Redis, DynamoDB, and even databases such as MySQL, Oracle, and SQL Server. So why did I choose memcached and Elasticache?
- Memcached is old, crusty, good technology that everybody understands well.
- Elasticache provides managed memcached, so Amazon handles patching, relaunching failed instances, etcetera.
- Elasticache provides support for clusters of servers for greater fault tolerance. (the software above mentions this here https://github.com/magro/memcached-session-manager/wiki/ConfigureElastiCache)
- Elasticache should be cost effective. Even at 500K user sessions @ 5K of memory per session, the cluster would need about 2.5G. This should enable us to run on a medium instance which is $0.155/hr at spot rates which is $112/month. Smaller sites could either turn off autoscaling or use a small instance at $0.075/hr which comes to $54/month.
- PHP has good support for storing sessions in memcached.
- Memcached automatically purges old sessions so we don't have to worry about cleanup and memory leaks from expired sessions. It essentially uses a "Least Recently Used" purging strategy, while session handlers "touch" sessions to keep them fresh each time a user accesses the session.
- Dynamo is cost effective but you have to set the write capacity. Once you exceed that, requests get denied. That scares me because once we hit the peak, the system is broken. We would have to set all of the thresholds high and then the cost benefit goes away.
- Evan (of BeanStalk fame) has a blog entry on Dynamo and Tomcat at the URL below, but the software is very new, has little support, and a lot of people are complaining about problems. http://blogs.aws.amazon.com/application-management/post/Tx1YJTV0E8W5XKI/Elastic-Beanstalk-and-the-DynamoDB-Session-Manager-for-Tomcat
- The implementation cited by Evan has an issue whereby you have to periodically go in and dump old sessions and I believe this locks the table temporarily and causes slowdowns.
- DynamoDB is a promising new tech but it is untested from our perspective. Once we have a greater level of comfort with the reliability of DynamoDB, we would be more willing to entertain this option.
- Redis is actually my first choice because it is AWESOME! Unfortunately, I couldn't find a good set of tools and instructions to help persist Tomcat sessions into Redis.
- Once these tools are available and more stable, this would be my first choice.
- I actually think that using a SQL store may make sense in some cases. I know that .NET frameworks tend to use this approach by default. I am not opposed to this approach if it makes sense for the agency.
- SQL databases tend to be slow. Where DynamoDB, Redis, and memcached can return a response in a few millis, a DB call would likely be a minimum of 30ms.
- Storing sessions in the app database can add unexpected load to it. In addition, the app would need to make changes to add schema support for the sessions.
- Mechanisms have to be put in place to periodically clear out the expired session data.
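The Elasticache sizing and cost figures in the list above can be checked with a little arithmetic. The sketch below assumes decimal units (1 GB = 1,000,000 KB) and a 30-day month; the hourly rates are the ones quoted above and the function names are made up for illustration.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def cache_memory_gb(sessions: int, kb_per_session: float) -> float:
    """Memory needed for all sessions, in decimal gigabytes."""
    return sessions * kb_per_session / 1_000_000


def monthly_cost(hourly_rate_usd: float) -> float:
    """Cost of one node running around the clock for a month."""
    return round(hourly_rate_usd * HOURS_PER_MONTH, 2)


needed_gb = cache_memory_gb(500_000, 5)  # 500K sessions at 5K each
medium_monthly = monthly_cost(0.155)     # medium instance
small_monthly = monthly_cost(0.075)      # small instance
```

With these assumptions the session data comes to about 2.5 GB, and the two rates work out to roughly $112 and $54 per month.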
In order to test performance, I devised a small web app that did some session management. When the user first hits the site, a session is created with a 2K payload. One time in every five requests, the payload is changed and pushed into the user session. This simulates a change to the session data that would likely occur on occasion. I then ran 1000 back-to-back curl requests to the server to simulate load.
As you can see from the numbers below, using memcached-session-manager with external session management is actually faster than Tomcat native sessions. The likely reason is that disk access on EBS volumes can take up to 10ms while network latency in AWS can be as low as 1ms.
The main issue with my test is that it did not exercise concurrency. To do this would require more testing effort. I recommend that this more in-depth testing be done before major apps are on-boarded with this approach.
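A harness for the more in-depth test might look something like the Python sketch below. It is not the tool used for the numbers in this post (those came from sequential curl requests). `FakeSessionApp` is a hypothetical in-process stand-in for the test web app, and in a real run `send_request` would issue an HTTP request that carries the session cookie.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

PAYLOAD_BYTES = 2048  # 2K session payload, as in the test described above
CHANGE_EVERY = 5      # the payload changes on one request in five


class FakeSessionApp:
    """Hypothetical in-process stand-in for the test web app."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.requests = 0
        self.payload_changes = 0
        self.session = None

    def handle(self) -> None:
        with self._lock:
            self.requests += 1
            if self.session is None:
                # First hit: create the session with a 2K payload.
                self.session = {"payload": "x" * PAYLOAD_BYTES}
            elif self.requests % CHANGE_EVERY == 0:
                # Every fifth request: replace the payload in the session.
                self.session["payload"] = "y" * PAYLOAD_BYTES
                self.payload_changes += 1


def average_latency_ms(send_request: Callable[[], None],
                       total_requests: int = 1000,
                       workers: int = 1) -> float:
    """Time each request and return the mean latency in milliseconds."""

    def timed(_: int) -> float:
        start = time.perf_counter()
        send_request()
        return (time.perf_counter() - start) * 1000.0

    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(timed, range(total_requests)))
    return sum(latencies) / len(latencies)


# workers=1 mimics back-to-back requests; more workers exercise concurrency.
sequential_ms = average_latency_ms(FakeSessionApp().handle, workers=1)
concurrent_ms = average_latency_ms(FakeSessionApp().handle, workers=8)
```

Raising `workers` is the simplest way to see whether the session store holds up when requests overlap.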
- Native Tomcat
  - By default, Tomcat stores sessions to disk.
  - ELB stickiness set to true.
- memcached async
  - Using memcached-session-manager with async session writes set to on.
  - ELB stickiness set to false.
- memcached sync
  - Using memcached-session-manager with async session writes set to off.
  - Timeout set to 100ms.
  - ELB stickiness set to false.
Approach | Requests | Avg. time per request (ms) |
---|---|---|
Native Tomcat | 1000 | 124 |
memcached async | 1000 | 112 |
memcached sync | 1000 | 121 |
- Recommendation is to use memcached sessions with non-sticky ELBs.
- Recommendation is to use a cluster of memcached nodes when appropriate. Otherwise a single instance will suffice.
- Recommendation is to use synchronous writes by default, but fall back to asynchronous when performance is critical.
- You can add and remove servers at will without interruption to users.
- You can actually do a new deployment and retain sessions.
- Adding new machines will always improve performance for all users, new and existing.
- Because Java serialization is used, classes whose serialVersionUID changes between deployments will cause session deserialization errors. If session objects change across deployments, the cache should be cleared.
The recommended async solution is configured through the Tomcat context.xml.
You should have the following files in your project:
.ebextensions/memcached-session-manager-1.7.0.jar
.ebextensions/spymemcached-2.10.2.jar
.ebextensions/memcached-session-manager-tc7-1.7.0.jar
.ebextensions/sessions.config
.ebextensions/context.xml
The JAR files can be found on the net.
sessions.config should look like this...

    container_commands:
      copy-jars:
        command: "cp .ebextensions/*.jar /usr/share/tomcat7/lib/"
      update-tomcat:
        command: "cp .ebextensions/context.xml /usr/share/tomcat7/conf/context.xml"

And context.xml will look like the following if you use the recommended solution.

    <?xml version='1.0' encoding='utf-8'?>
    <!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements. See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License. You may obtain a copy of the License at
      http://www.apache.org/licenses/LICENSE-2.0
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->
    <!-- The contents of this file will be loaded for each web application -->
    <Context>
        <!-- Default set of monitored resources -->
        <WatchedResource>WEB-INF/web.xml</WatchedResource>
        <Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
                 memcachedNodes="n1:your-elasticache-server-name-here.amazonaws.com:11211"
                 memcachedProtocol="binary"
                 sticky="false"
                 sessionBackupAsync="true"
                 requestUriIgnorePattern=".*\.(gif|jpg|jpeg|png|wmv|avi|mpg|mpeg|mp4|htm|html|js|css|mp3|swf|ico|flv)$"
                 />
    </Context>

If you want session writes to run synchronously, you can do it this way.

    <?xml version='1.0' encoding='utf-8'?>
    <!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements. See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License. You may obtain a copy of the License at
      http://www.apache.org/licenses/LICENSE-2.0
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->
    <!-- The contents of this file will be loaded for each web application -->
    <Context>
        <!-- Default set of monitored resources -->
        <WatchedResource>WEB-INF/web.xml</WatchedResource>
        <Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
                 memcachedNodes="n1:connor-cache.ekhzte.cfg.usw2.cache.amazonaws.com:11211"
                 memcachedProtocol="binary"
                 sticky="false"
                 sessionBackupAsync="false"
                 sessionBackupTimeout="100"
                 requestUriIgnorePattern=".*\.(gif|jpg|jpeg|png|wmv|avi|mpg|mpeg|mp4|htm|html|js|css|mp3|swf|ico|flv)$"
                 />
    </Context>

And here is the sessions.config file that will tell Elastic Beanstalk where to put your junk.

    container_commands:
      copy-jars:
        command: "cp .ebextensions/*.jar /usr/share/tomcat7/lib/"
      update-tomcat:
        command: "cp .ebextensions/context.xml /usr/share/tomcat7/conf/context.xml"

This example used a single memcached instance. Elasticache supports clusters of servers, and Amazon has forked spymemcached-2.10.2.jar to handle this. If you plan to use a cluster, you would need to use the forked version and change the pool configuration in the context.xml file.
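As a hedged illustration of what a multi-node pool setting might look like, the sketch below builds a `memcachedNodes` value in the comma-separated `id:host:port` form that memcached-session-manager documents for explicit node lists. The hostnames and the helper function are hypothetical. Whether an Elasticache cluster should instead be addressed through its configuration endpoint is not covered here, so check the wiki page linked earlier before relying on this.

```python
from typing import Sequence


def memcached_nodes_attribute(hosts: Sequence[str], port: int = 11211) -> str:
    """Build a comma-separated list of id:host:port entries, one per node."""
    if not hosts:
        raise ValueError("at least one memcached host is required")
    return ",".join(f"n{i}:{host}:{port}" for i, host in enumerate(hosts, start=1))


# Hypothetical node hostnames, for illustration only.
nodes = memcached_nodes_attribute(["cache-1.example.com", "cache-2.example.com"])
```

The resulting string would go into the `memcachedNodes` attribute of the `Manager` element in context.xml.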
Thanks for writing this. Since you went through the effort of publishing this code, I thought I'd share what I ended up doing to solve the same problem. I tried Amazon's DynamoDB solution but found intractable bugs and other issues with that code, so I could not use it. I've got an Elastic Beanstalk cluster of Tomcat instances that had session state. Here's what I did:
First I rewrote my code to push most of the session state either to the browser or the database, save one last bit of session state needed by my servlet code: The user ID associated with a signed in session.
To externalize this last bit of state, I created a dedicated session-state server, also running on Tomcat, that essentially plays the role of one memcached node - it's an in-memory hash table mapping HttpSession.getId() values to my userIds, fronted by a very simple web service to set (sign in), clear (sign out), and look up (is signed in) values.
Inside Tomcat, I also store as HttpSession state the userId. In a Servlet listener, before my Get/Post function is called, I check to see whether the session has a userId. If not, I make a call out to my dedicated session state server to see if it has one. If so, I get it and populate the tomcat session state with the same Id. If not, I assume the user was signed out and I redirect to my sign out logic. This lazily loads the session state into tomcat servers as needed. Also here's a tip: At the same time, I store the userId and the request and response objects in the thread local store so all my Java code has access to that without having to pass that down, but I digress.
I also track last update time for sessions in the state server and added a task to expire idle sessions after 30 minutes.
With all this, I could turn off sticky sessions.
To keep the session state server from being a single point of failure, the tomcat servers use multicast to update the session state servers, which eliminates cache inconsistency, and an elastic load balancer balances the TCP calls to the web service that looks up the session state.
All in, it took me about 2 days to write and deploy everything from beginning to end, and I believe it's significantly simpler and probably faster than memcached, at least for my purposes.
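The session-state server described in the comment above could be sketched roughly as follows. This is a Python approximation, not the commenter's code: the class and function names are made up, the web service layer and multicast replication are left out, and treating a successful lookup as activity is an assumption.

```python
import time
from typing import Callable, Dict, Optional, Tuple

IDLE_TIMEOUT_SECONDS = 30 * 60  # idle sessions expire after 30 minutes


class SessionStateServer:
    """In-memory map from session ID to (user ID, last update time)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def sign_in(self, session_id: str, user_id: str) -> None:
        self._entries[session_id] = (user_id, self._clock())

    def sign_out(self, session_id: str) -> None:
        self._entries.pop(session_id, None)

    def lookup(self, session_id: str) -> Optional[str]:
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        user_id, last_update = entry
        if self._clock() - last_update > IDLE_TIMEOUT_SECONDS:
            del self._entries[session_id]
            return None
        # Assumption: a successful lookup counts as activity.
        self._entries[session_id] = (user_id, self._clock())
        return user_id

    def expire_idle(self) -> int:
        """Periodic task: drop idle sessions and return how many were removed."""
        now = self._clock()
        stale = [sid for sid, (_, ts) in self._entries.items()
                 if now - ts > IDLE_TIMEOUT_SECONDS]
        for sid in stale:
            del self._entries[sid]
        return len(stale)


def resolve_user(local_session: dict, session_id: str,
                 state_server: SessionStateServer) -> Optional[str]:
    """Lazily load the user ID into an app server's local session."""
    user_id = local_session.get("user_id")
    if user_id is None:
        user_id = state_server.lookup(session_id)
        if user_id is not None:
            local_session["user_id"] = user_id
    return user_id  # None means: treat the user as signed out
```

An app server would call `resolve_user` before handling each request and redirect to its sign-out logic when it returns `None`.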