GSC/cPanel blues - failure to start or login


PeterNic
08-19-2007, 12:33 PM
Hi all,

I want to share some of the common problems our customers report with cPanel. These problems are easily fixable, and, overall, cPanel is working quite well in AppLogic.

The problems are not cPanel-specific, so if you are seeing a similar problem with any software you install -- especially on a GSC-type application -- do try some of the following fixes.


1. cPanel application fails to start - due to long startup sequence

cPanel comes with a lot of services that are installed and started during boot. On some larger cPanel installations, startup takes longer than the boot timeout; AppLogic then decides that the boot has failed or hung and shuts down the cPanel appliance.

Quick workaround: use the --debug option to start the app, then use continue:

app start mycpanel1 --debug
app continue mycpanel1


(Note: when using --debug in AppLogic 1.2, beware of defect SCR 1418, described in http://support.3tera.net/showthread.php?t=38. Do not use the restart operation there; use stop and start instead. This problem has been resolved in AppLogic 2.0.2 and later.)

There are two possible fixes; choose whichever is more appropriate for your needs:

- Increase the boot timeout: open the application, right-click on the cPanel appliance, select Attributes, check 'boot time override' and specify a larger timeout value in seconds (e.g., 600). Verify that the application now starts successfully. With this method you still see a failure on the shell if cPanel fails to start for some other reason.
- Move the reporting of 'boot completed' earlier in the boot process, so that the appliance reports 'boot completed' before it tries to start the numerous slow-starting services installed by cPanel. This frees your grid shell sooner (the app start command completes faster) and also lets the appliance start even after certain failures, so it is easier to log in and fix them. The drawback is that unless you check cPanel's startup yourself, you may miss a later-stage startup failure.
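
A rough sketch of the second option, under stated assumptions: REPORT_CMD is a placeholder for whatever command your appliance uses to report 'boot completed' (this post does not give its syntax), and SLOW_SERVICES and START_LOG are hypothetical names. The point is only the ordering: report first, then start the slow services and log any failures locally, because the grid will no longer see them.

```shell
#!/bin/sh
# Sketch only. All three variables are assumptions, not AppLogic settings:
#   REPORT_CMD    - command that reports 'boot completed' (placeholder)
#   SLOW_SERVICES - space-separated list of init scripts to start afterwards
#   START_LOG     - local file that records late start failures
REPORT_CMD="${REPORT_CMD:-echo boot-completed}"
SLOW_SERVICES="${SLOW_SERVICES:-}"
START_LOG="${START_LOG:-/tmp/late-start.log}"

start_late_services() {
    # Report success before the slow part of the boot begins.
    $REPORT_CMD || return 1

    failed=0
    for svc in $SLOW_SERVICES; do
        # Keep going after a failure, but leave a trace in the local log.
        if ! "$svc" start >>"$START_LOG" 2>&1; then
            echo "FAILED: $svc" >>"$START_LOG"
            failed=$((failed + 1))
        fi
    done

    # Non-zero exit status if any service failed to start.
    [ "$failed" -eq 0 ]
}
```

After each boot, check the file named by START_LOG; that check replaces the failure you would otherwise have seen on the grid shell.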


2. cPanel application fails to start - due to incorrect firewall settings

AppLogic appliances need to be able to communicate with the controller; if you have installed a firewall that prevents this communication, the appliance will not be able to report it has started successfully, in which case AppLogic will shut it down after the boot timeout.

Quick workaround: same as in case #1 above

Solution: make sure that the eth1 (actually, the 'default') interface is not firewalled.
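
If the firewall on the appliance is plain iptables, the rules might look like the sketch below. This is an assumption about your setup: a firewall package installed alongside cPanel may keep its own configuration and overwrite hand-added rules, in which case the exemption belongs in that package's configuration. The helper name open_interface is made up for this example; by default it only prints the commands, so you can review them first.

```shell
#!/bin/sh
# open_interface [iface] [print|apply]
# Prints (default) or applies iptables rules that accept all traffic on the
# given interface. eth1 comes from the post above; verify that it really is
# the 'default' interface on your appliance before applying anything.
open_interface() {
    iface=${1:-eth1}
    mode=${2:-print}

    for rule in "INPUT -i $iface" "OUTPUT -o $iface"; do
        if [ "$mode" = "apply" ]; then
            # -I inserts at the top of the chain, ahead of DROP/REJECT rules.
            # $rule is left unquoted on purpose so it splits into arguments.
            iptables -I $rule -j ACCEPT || return 1
        else
            echo "iptables -I $rule -j ACCEPT"
        fi
    done
}
```

Running open_interface eth1 prints the two commands; open_interface eth1 apply runs them (root required). Rules added this way do not survive a reboot unless you also save them.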

3. cPanel application fails to start - due to missing agent

AppLogic appliances use a small VM agent to send the started_ok event to AppLogic, so that AppLogic knows that the appliance has completed its startup successfully.

There are two steps:

- the vmad agent must be started as soon as the network is available during boot
- when the appliance considers itself started OK, it should send an event using the vme utility with the event ID started_ok. The upside of learning about this is that you can now instrument the appliance to report all kinds of failures and events: the vme utility can be used to report the reason for a start failure, as well as to send critical notifications to the grid dashboard, which, in turn, can be sent as e-mail notifications.
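
For the first step, an early init script needs some way to wait until the network is there. The sketch below is a generic polling helper, not part of AppLogic: wait_for and iface_present are names invented for this example, and treating "the interface exists under /sys/class/net" as "the network is available" is a simplifying assumption. The actual vmad and vme command lines are not given in this post, so they are left out.

```shell
#!/bin/sh
# wait_for <timeout-seconds> <command> [args...]
# Re-runs the command once per second until it succeeds or the timeout
# expires. Returns 0 on success, 1 on timeout.
wait_for() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Crude availability check (assumption): the interface is known to the kernel.
iface_present() {
    [ -e "/sys/class/net/$1" ]
}

# Hypothetical usage in an early init script:
#   if wait_for 30 iface_present eth1; then
#       : start the vmad agent here
#   fi
```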


4. Cannot ssh into cPanel from the controller - due to incorrect firewall settings

The incoming TCP port used by ssh (22 by default) must be open on eth1 (the default interface) of the appliance; otherwise the controller will not be able to ssh in.

Workaround: ssh through the external (public) IP address

Solution:

- if you can log in through the external (public) IP address, you can fix the firewall from there
- if you can't log in, stop the cPanel application, mount its volume and edit the firewall configuration using the volume access methods described in the documentation (and elsewhere on this forum)


5. Cannot ssh into cPanel from the controller - due to terminals added to cPanel

Some older cPanel templates (and GSC, on which cPanel templates are usually based) lack a critical network setting that is required for appliances with terminals to work. The lack of this setting does not cause any problems until the first terminal is added to GSC or cPanel (usually going to a MySQL server appliance or a backup appliance).

See http://support.3tera.net/showthread.php?p=179 for details

6. cPanel cannot communicate with other appliances - due to terminals added to cPanel

The reason is the same as #5 above. The fix is the same, too.

7. cPanel appliance is slow, logs a big banner during boot - due to incompatible glibc library

AppLogic appliances use a Xen domU kernel. Many older glibc libraries use a method of accessing application memory that is incompatible with the Xen kernel; to allow such appliances to work, Xen takes some extra virtualization steps, which result in slower performance.

Look at the boot log (/var/log/messages) during the early stages of the boot. If you see a big, star-bordered banner stating that the tls library should be disabled, you are likely getting slightly worse performance than you could.

Solutions:
- follow that banner's advice, and disable the TLS library (mv /lib/tls /lib/tls.disabled). Note that this disables the NPTL threading library, reverting to the older linuxthreads threading package.
- use a virtualization-friendly glibc (post to the forum if you need more details)

Note: sometimes your cPanel/GSC starts off with the TLS library OK, but a later update or upgrade replaces it with the broken library. Be sure to check for the banner after any significant update.
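
A small check like the following could be run after each update. It is only a sketch: check_tls is a made-up helper name, and it assumes that the banner text mentions the /lib/tls path (as the mv command above suggests), so it simply searches the log for that path instead of matching the banner itself.

```shell
#!/bin/sh
# check_tls [tls_dir] [log_file]
# Returns 0 if nothing suspicious is found, 1 otherwise.
# Defaults match the paths mentioned in this post.
check_tls() {
    tls_dir=${1:-/lib/tls}
    log_file=${2:-/var/log/messages}
    status=0

    # The directory being present means the TLS library is active
    # (or was put back by an update).
    if [ -d "$tls_dir" ]; then
        echo "$tls_dir is present"
        status=1
    fi

    # Assumption: the Xen banner names the tls directory in its text.
    if [ -r "$log_file" ] && grep -qF -- "$tls_dir" "$log_file"; then
        echo "$log_file mentions $tls_dir (possible Xen banner)"
        status=1
    fi

    return "$status"
}
```

Keep in mind that the log may still contain the banner from boots that happened before you applied a fix, so a hit in the log is a reason to look at the timestamps, not proof that the problem is back.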

---
Regards,
-- Peter