HALB Appliance
EricT
10-07-2008, 05:07 PM
This thread is dedicated to questions and comments related to the HALB - Session-aware HTTP Load Balancer Appliance.
The HALB data sheet can be found at:
AppLogic 2.4.x: http://doc.3tera.net/AppLogic24/CatSwitchesHalb.html
AppLogic 2.7.x: http://doc.3tera.net/AppLogic27/CatSwitchesHalb.html (page will be published concurrently with 2.7 release)
Please post any questions and/or comments here.
PeterNic
01-10-2009, 07:17 PM
FYI: Early notes and suggestions, some of which are still quite valid, are also available in the original thread at http://forum.3tera.com/showthread.php?p=942#poststop.
David Crane
10-12-2009, 06:44 AM
There are a couple of more advanced HAproxy usages described in http://haproxy.1wt.eu/download/1.3/doc/architecture.txt that do not seem possible with the out-of-the-box HALB. These would be nice to have:
----------------------------
(1) Is it possible to achieve a rolling restart, or "Soft-stop for application maintenance"? It seems that we could do "Soft-stop using backup servers" if this HAproxy option could be set in /appliance/haproxy.cfg:
option persist
HALB already has the option redispatch. By combining that with option persist, we should be able to gently force any HTTP server appliance out of the HAproxy rotation for new sessions, as long as iptables is available on that server. This type of rolling restart is described at http://www.igvita.com/2008/12/02/zero-downtime-restarts-with-haproxy/
The option persist is important so that existing user sessions do continue to be served by the same server even after we use iptables to have HAproxy drop it. We could then wait for a while before restarting the server to pick up the newly deployed application update.
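A rough sketch of the iptables step on the web server, in shell. It assumes HAproxy's health check targets a separate check port that an iptables NAT rule redirects to the real HTTP port, so deleting the rule makes the checks fail without touching live traffic. The port numbers and function names are hypothetical and not part of HALB, and by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: CHECK_PORT and APP_PORT are hypothetical values, and the
# redirect-rule approach is an assumption, not something HALB sets up.
CHECK_PORT="${CHECK_PORT:-81}"   # port the health check would hit (assumed)
APP_PORT="${APP_PORT:-80}"       # port the web server really listens on
DRY_RUN="${DRY_RUN:-1}"          # 1 = print the command, 0 = run it (needs root)

# Build the iptables NAT rule; $1 is -A (add rule) or -D (delete rule).
redirect_rule() {
    echo "iptables -t nat $1 PREROUTING -p tcp --dport $CHECK_PORT -j REDIRECT --to-port $APP_PORT"
}

# Run a command string, or just print it when DRY_RUN=1.
run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# Take this server out of rotation: health checks start failing, so no new
# sessions should arrive, while traffic on APP_PORT itself is untouched.
drain()   { run_cmd "$(redirect_rule -D)"; }

# Put the server back: health checks succeed again.
undrain() { run_cmd "$(redirect_rule -A)"; }
```

The idea would be to call drain, wait for existing sessions to finish, restart the application, then call undrain.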
----------------------------
(2) Is it possible to achieve "HTTP load-balancing with cookie prefixing and high availability"? It seems almost, by setting these HALB properties:
mode: synch
cookie_name: JSESSIONID
If that is done on 2 HALB instances, each configured identically, they would be redundant to each other. All that remains is some way for incoming traffic to fail over, but INSSLR doesn't seem to support that, having only a single http output terminal.
PavelGeorgiev
10-12-2009, 08:19 PM
David,
Regarding your second suggestion:
You can achieve that by having two instances of INSSLR configured to serve the same IP address (using failover), each connected to its own HALB. You can also configure INSSLR to do checks on some URL on the backend. This tests not only the load balancer but also the actual backend servers behind it. If you craft the test page yourself, you can have it verify various subsystems of your application (like database, storage server, etc.), not just the load balancer.
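As an illustration only, the logic behind such a test page could look like the shell sketch below. The function name and the idea of passing each subsystem check as a command string are assumptions made for this example; they are not part of INSSLR or HALB.

```shell
#!/bin/sh
# health_status: run each check command given as an argument.
# Prints "200 OK" if every check succeeds, otherwise prints
# "503 Service Unavailable" and stops at the first failure.
health_status() {
    for check in "$@"; do
        sh -c "$check" >/dev/null 2>&1 || {
            echo "503 Service Unavailable"
            return 1
        }
    done
    echo "200 OK"
}
```

Each argument would be a command that exits non-zero when its subsystem is unhealthy, for example a database ping or a test that a storage mount is present.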
The use case that you refer to would provide just HALB redundancy, while if you put two INSSLR appliances each connected to its own HALB, you would have better redundancy.
Ideally, you would just run two copies of the whole application, use INSSLR for gateway redundancy, MYSQLR for database redundancy and NASR for storage, and run them on separate servers/grids.
David Crane
02-11-2010, 08:50 PM
Is there a way to do a graceful restart of haproxy? The /appliance/haproxy_restart.pl script does look like it would supply the necessary -sf flag to the haproxy executable. But I cannot figure out how to call haproxy_restart.pl (I believe the tmpwatch remover deleted the necessary /tmp/haproxy.pid file long ago). It also doesn't seem to be called in a straightforward manner from any of the /etc/init.d service scripts.
Our haproxy has been running around the clock for 115 days now and has performed flawlessly. However, memory usage on the box has been slowly growing from 44MB to 191 MB. At this rate the total RSS memory would exceed 95% of the 256MB allocated to the component by the end of the month.
Date  | Process Mem | Shared Mem | File Cache | I/O Buffers | Free Memory
------+-------------+------------+------------+-------------+------------
10/17 | 17% (44M)   | 0% (10K)   | 9% (23M)   | 2% (4M)     | 72% (185M)
2/11  | 75% (192M)  | 0% (10K)   | 10% (27M)  | 2% (6M)     | 12% (31M)
David Crane
02-12-2010, 06:00 AM
I'm sure I can just restore the /tmp/haproxy.pid file to run /appliance/haproxy_restart.pl. I'll test that in QA. If I do a weekly restart, tmpwatch won't remove it. Its admin console reports the pid:
General process information
HAProxy pid = 2249 (nbproc = 1)
uptime = 118d 6h26m02s
system limits : memmax = 680 MB ; ulimit-n = 21004
That memmax setting looks odd, since this host only has 256MB. Someone might be interested in our load statistics for this run:
Proxy instance load : 1 conns (maxconn=10492), 60006247 total conns
Definitely a cool piece of software. Thanks for bundling it into AppLogic.
David Crane
02-13-2010, 10:35 AM
Now I realize that /appliance/haproxy_restart.pl is an angel process always running that will restart the haproxy server based on a timestamp file. I am able to do this now.
First, I had to reconstruct /tmp/haproxy.pid using the pid I found on the web console (halb-stats.php) or through:
ps -ef | grep haproxy
Then I was able to force a restart with:
touch /appliance/haproxy.timestamp
As long as I do that weekly, that touch will restart haproxy, resetting its RSS memory as desired. It also resets all of the counters on the web console. A weekly restart is necessary because the /etc/cron.daily/tmpwatch is using 240 hours (10 days) as the remove threshold.
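The two steps above could be wrapped in a small script along these lines. This is a sketch: the function names are made up for illustration, and the paths are simply the ones mentioned in this thread, overridable through environment variables.

```shell
#!/bin/sh
# Paths as mentioned in this thread; override them for testing.
PIDFILE="${PIDFILE:-/tmp/haproxy.pid}"
STAMP="${STAMP:-/appliance/haproxy.timestamp}"

# Recreate the pid file if tmpwatch removed it. $1 is the haproxy pid,
# e.g. as read from the stats page or from ps. An existing, non-empty
# pid file is left alone.
restore_pidfile() {
    [ -s "$PIDFILE" ] || echo "$1" > "$PIDFILE"
}

# Ask the haproxy_restart.pl watcher for a restart by touching its
# timestamp file.
request_restart() {
    touch "$STAMP"
}
```

For example, assuming pidof is available on the appliance, a weekly cron job could run: restore_pidfile "$(pidof haproxy)" && request_restart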
PeterNic
02-13-2010, 04:48 PM
David, thanks for the info. I am glad it is functioning well for you -- I am much happier with the HAProxy-based HALB than I was with HLB, too.
The problem with the pid file is that it is in the temp directory which is being cleaned periodically; newer versions of the appliance have this problem fixed. Our support may be able to help you with a recommended change (e.g., branch - fix the script/config - keep in your user catalog until you are ready to upgrade your grid and get the new catalog); this way you won't need the weekly exercise just to keep the pid file alive.
Have you seen the appliance run completely out of memory, or are you just concerned about the increasing amount of memory in use? The latter may be benign, as the Linux memory manager optimizes its allocations when no other apps are running in that appliance's OS. Do you want us to look more carefully into this?
Best regards,
-- Peter
David Crane
03-11-2010, 08:25 PM
It's definitely running out of memory, although truthfully I cannot tell where. This is from top, clearly showing the system is running out of memory:
Mem: 262300k total, 258872k used, 3428k free, 4588k buffers
Swap: 0k total, 0k used, 0k free, 16152k cached
That's 91% of physical memory now, up from 27% when we started it 145 days ago. And those are of course consistent with what /usr/bin/free and /proc/meminfo show. However, no process seems to have a large RSS allocation. There is a very large virtual allocation for something called /appliance/vmad, 115m, but its actual resident set is only 1720 (which makes no sense to me).
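The 91% figure matches the top output when buffers and cache are excluded from used memory: (258872 - 4588 - 16152) / 262300 is about 0.908. A small shell sketch of that arithmetic follows; the function names are made up for illustration.

```shell
#!/bin/sh
# Percentage of RAM in use once buffers and page cache are excluded.
# Arguments are in kB: total, free, buffers, cached.
mem_used_pct() {
    awk -v t="$1" -v f="$2" -v b="$3" -v c="$4" \
        'BEGIN { printf "%.0f\n", (t - f - b - c) * 100 / t }'
}

# Same calculation on the live values from /proc/meminfo.
mem_used_pct_now() {
    awk '/^MemTotal:/ { t = $2 }
         /^MemFree:/  { f = $2 }
         /^Buffers:/  { b = $2 }
         /^Cached:/   { c = $2 }
         END { printf "%.0f\n", (t - f - b - c) * 100 / t }' /proc/meminfo
}

# With the numbers from top above:
#   mem_used_pct 262300 3428 4588 16152   # prints 91
```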
I will be restarting the haproxy instance this weekend, so you should look at this soon if you want to do forensics. If the memory doesn't recover from the haproxy restart, I will restart the entire component (comp restart www:main.load). I do have a spreadsheet that shows the steady growth, if you are interested. (Contact me at davidc at donorschoose.org, since I might not check back here in a timely manner.)
I strongly suspect that I really need to upgrade to AppLogic 2.8.9. You backported HALB for us. I'm sure you're aware we're on AppLogic 2.1.1 still, with a grid uptime of 745 days. Not too shabby, but showing its age. I totally forgot our 2 year anniversary! (Now that's shabby.)
PeterNic
03-11-2010, 08:45 PM
David,
Thanks -- and happy 2nd birthday to your AppLogic grid's uptime and to your "cloudification". You guys are doing great work at Donorschoose.org and we're honored to be a small part of it.
Can you please PM me with the name of the app and appliance -- I'll have someone look at the memory. Normal buffering/caching behavior may actually use up the majority of the memory, in a way that it can be released when needed. It doesn't look like this is the case here, so I would like one of our guys to take a quick look.
(Also, can you please capture /proc/meminfo and /usr/bin/free for us before and after the restart?)
Thanks for your continued support!
-- Peter