Tribune DataViz

Matters of interest, from the data reporters and developers across Tribune Publishing

The impossibility of running a private staging environment behind an Amazon EC2 Elastic Load Balancer (ELB)

with 11 comments

As part of advocating best practices, we encourage anyone developing software to have a proper staging environment. What that means, in a nutshell, is an environment that is in every way identical to your production environment, but is not accessible by the public. This allows the developer to thoroughly test a site (bug testing, usability testing, load testing) before it is deployed to the public, and to reasonably expect the behavior of both sites to be identical.

Normally, this is a straightforward task, but we have discovered a fundamental problem when trying to set up a staging environment behind an Elastic Load Balancer, Amazon’s fully automated pseudo-server that magically routes traffic amongst any number of application servers.

Here is the rub: requests to the app servers will appear to come from the ELB. (The original client’s IP address is added as an X-Forwarded-For header.) ELBs do not exist within a security group; they are a completely independent entity. Thus the ELB must be specified as allowed within the staging security group in order for traffic to reach the server. In a production environment this isn’t an issue, since traffic from any IP address is allowed on port 80. However, this causes a big problem for staging, because we only allow traffic from specific IPs. If we add the ELB’s IP to the security group then we have effectively punched a hole in our firewall: all traffic appears to come from the ELB, so all traffic is allowed.

Our resident artist, Brian, has illustrated it thusly:

This means that we cannot control access to our staging environment at the security group level. Of course, we could use Apache filtering on the X-Forwarded-For header, but this (and similar solutions) must be applied at the server level and would therefore cause our staging and production environments to diverge.
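To make the server-level option concrete, here is a minimal sketch of the same idea as a Python WSGI middleware rather than Apache configuration. It is an illustration only, not something we run: the class name, the allow-list, and the decision to trust only the last X-Forwarded-For entry (the one appended by the load balancer) are all assumptions, and the check is only meaningful if the instance cannot be reached except through the ELB.

```python
from ipaddress import ip_address, ip_network


class ForwardedForFilter:
    """Hypothetical WSGI middleware: reject requests whose forwarded
    client IP is not inside an allowed network."""

    def __init__(self, app, allowed_networks):
        self.app = app
        # e.g. ["203.0.113.0/24"]; the values are placeholders, not real office IPs
        self.allowed = [ip_network(n) for n in allowed_networks]

    def client_ip(self, environ):
        # WSGI exposes the X-Forwarded-For header under this key.
        header = environ.get("HTTP_X_FORWARDED_FOR", "")
        parts = [p.strip() for p in header.split(",") if p.strip()]
        if not parts:
            return None
        try:
            # Assumption: the last entry is the one added by the load balancer.
            return ip_address(parts[-1])
        except ValueError:
            return None

    def __call__(self, environ, start_response):
        ip = self.client_ip(environ)
        if ip is None or not any(ip in net for net in self.allowed):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return self.app(environ, start_response)
```

Wrapping an application would look like `application = ForwardedForFilter(application, ["203.0.113.0/24"])`. Note that this is exactly the kind of staging-only code that makes the two environments diverge.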

What options does this leave us if we insist on maintaining parity between production and staging? Well, we could dispense with the ELB and implement our own load balancing, but that requires making a change to a production setup that is working fine. More to the point, this seems like an EC2 design flaw: none of us can contrive a reason why the ELBs aren’t embedded in security groups.

And while we are on the subject of ELB problems: why can’t they have Elastic IP addresses? Even if we could just add the ELB to the security group, it wouldn’t be stable because its IP address is not static. This is another problem with Amazon’s implementation.

(Incidentally, there isn’t even a straightforward way to determine the appropriate internal IP of the ELB. It is not the same IP that nslookup resolves, either within Amazon’s network or from outside. The only way I have found it is by examining the traffic logs on the application server.)
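Examining the logs by hand gets tedious, so here is a rough sketch of how the tally could be automated. It assumes Apache's common or combined log format, where the first field of each line is the remote address; behind an ELB, the addresses that dominate the count should be the load balancer's internal IPs. The function name and the sample lines are invented for illustration.

```python
from collections import Counter


def count_remote_addrs(lines):
    """Count how often each remote address (the first whitespace-separated
    field) appears in an iterable of access-log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


# Made-up sample lines; real logs would be read from the server's access log.
sample = [
    '10.1.2.3 - - [18/Mar/2010:09:57:00 -0500] "GET / HTTP/1.1" 200 512',
    '10.1.2.3 - - [18/Mar/2010:09:57:01 -0500] "GET /about/ HTTP/1.1" 200 734',
    '10.4.5.6 - - [18/Mar/2010:09:57:02 -0500] "GET / HTTP/1.1" 200 512',
]

for addr, hits in count_remote_addrs(sample).most_common():
    print(addr, hits)
```

Since a file object iterates line by line, an open log file can be passed in directly in place of `sample`.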

It is probably reasonable to assume that Elastic Load Balancers have these problems because Amazon implemented them in the simplest way possible and that precluded access to all the rest of the magic in their infrastructure. Certainly, they work fine without all that in the typical case. However, the need for a controlled environment that is functionally identical to our production environment (especially for load testing) really makes this a significant issue.

This all leaves us feeling a bit exasperated. Does anyone have a better solution for these issues? How do you handle your EC2 staging environment?


Written by Christopher Groskopf

March 18, 2010 at 9:57 am

Posted in Infrastructure

11 Responses


  1. I’ve discussed this with people at Amazon. I think they understand the shortcoming and are looking at appropriate solutions.

    Steve Evans

    March 18, 2010 at 10:29 am

    • Steve, let me know if you hear any sort of ETA on a solution to this problem, though I strongly suspect it’s not a high enough priority to get much attention.

      Christopher Groskopf

      March 18, 2010 at 11:38 am

  2. ELB is a strange beast. Yes, it’s got limitations that don’t make it appropriate for all environments, and its reliance on DNS and lack of security group integration mean it’s not appropriate for private applications.

    I wish I had a good recommendation for you. The best you can do today is choose between two not-perfect options:

    1. Keep your application hard to find. Don’t put a CNAME entry for the ELB into your DNS; instead, use the ELB’s DNS name directly. And run your application on oddball ports. Obscurity. It’s not security, but it’s the best ELB will allow you to do today.

    2. Avoid ELB, and use a software load balancer (such as HAProxy) running on an EC2 instance. This will give you the security group control. If you configure HAProxy using the “least connection” algorithm then you will approximate ELB’s behavior. The downside is that HAProxy won’t scale elastically, so if you do heavy load testing you might overload it. It’s quite resilient, though; you’ll need to work hard to stress it.

    I’d love to hear what you try and the results.

    Shlomo Swidler

    March 18, 2010 at 11:14 am

    • Shlomo,

      Thanks for the suggestions.

      Regarding (1): it’s always an option. Our staging URLs aren’t public knowledge, but given the sort of work we are doing, relying on obscurity really isn’t a good habit to get into. More to the point, running on oddball ports again causes our config to diverge in a somewhat ugly way (though admittedly it’s not a functional divergence).

      As for (2), I don’t mind running our own LB, but as I mentioned in the post I feel pretty ridiculous updating a production rig that works fine simply for the sake of our testing environment. Given the other limitations of ELBs I suppose it may just be a matter of time before we have to do this anyway, but it rubs me the wrong way for this to be the reason.

      Thanks again for the feedback,

      Christopher Groskopf

      March 18, 2010 at 11:37 am

      • What Amazon is doing is scaling. Most developers and server admins don’t have a good understanding of scaling a network and what it takes to support large datacenters. Amazon cannot scale while providing source IP addresses, so that means inserting an ELB in front of the server, and traffic must go in and out of the ELB. The reason you see the ELB’s IP and not the client’s IP address is that the ELB can be anywhere, and EC2 can move and grow the ELB without affecting you, the client.
        So they give you the X-Forwarded-For header, and you can use that for access control. Everyone is clamoring about spoofing that header, and yes, it can be spoofed. But remember, the ELB inserts the source address in this field, so you should trust the last IP in X-Forwarded-For, because it cannot be spoofed: it is inserted by the ELB.


        February 9, 2011 at 9:49 pm

  3. This problem also prevents applications from reliably determining the client IP address, since anyone can connect directly to the application and forge an X-Forwarded-For header.

    Michael Leonhard

    June 27, 2010 at 10:15 pm

  4. I know this article is a little dated, but did you guys ever work this out? I am in the same boat.


    February 12, 2011 at 5:49 am

  5. As of 24 May 2011 you can configure your instances to accept traffic from one (or many) ELB(s). Each ELB has its own security group, visible in the ELB’s description, and you can add it to your member instances. Check out the docs for more details:

    Shlomo Swidler

    May 29, 2011 at 8:27 am

    • That’s great, thanks for the update Shlomo!

      Christopher Groskopf

      May 30, 2011 at 9:58 am

    • This does not fully solve the problem. At this time, traffic destined to the ELB URL continues to be unfiltered. Restricting traffic on the front end of your ELB is simply not an option.

      So that…if you’re wanting to load balance, say, a MySQL cluster, doing so via the ELB URL opens up your MySQL completely to the outside world.

      Perry Whelan

      August 22, 2011 at 9:44 am

  6. I’ll bump this, now what say you all about the Internal ELB available via EC2 today? I’m crossing my fingers that anyone would reply to this 4 year old post.


    March 7, 2015 at 9:46 pm
