error logs filling up on instances


Anthony
08-14-2007, 08:46 AM
While developing, I noticed that instances such as the HLB, IN, and OUT began running out of space on their /boot volumes (No space left on device). It happened to them one by one as I stopped and started the app.

As a temporary fix, I would simply branch the HLB and IN and then resize the boot volume, but when it happened to the OUT as well, I decided to find out why. I tracked it down to the error logs growing large, as much as 22 MB. The logs in question are located at /var/log/cce/core/error.log and /var/log/cce/um-error.log.

The contents of error.log are repeating this:
[Tue Aug 14 07:10:06 -0700 2007][2806] [cc_base.c:cc_base_reply:128] Failed to send pool for app [23:0:0]
[Tue Aug 14 07:10:06 -0700 2007][2806] [um_app.c:um_send_pool:197] Failed passing pool address over [/tmp/cce-2807_ca9d4] returns [No such file or directory]

The contents of um-error.log are repeating this:
[Tue Aug 14 07:09:58 -0700 2007][3150] Failed to get the init info. Error code [510]
[Tue Aug 14 07:09:59 -0700 2007][3149] Failed to get the init info. Error code [510]
[Tue Aug 14 07:10:00 -0700 2007][2807] Failed to get the init info. Error code [510]
[Tue Aug 14 07:10:03 -0700 2007][3151] Failed to get the init info. Error code [510]
[Tue Aug 14 07:10:03 -0700 2007][3098] Failed to get the init info. Error code [510]
[Tue Aug 14 07:10:04 -0700 2007][3150] Failed to get the init info. Error code [510]
[Tue Aug 14 07:10:05 -0700 2007][3134] Failed receiving signal information from CCE core. Using default values. Error code [510]

Just wanted to let everyone know about this. It may cause additional downtime on a production app when all you meant to do was a simple restart. A possible fix would be for 3tera to add a script that clears part of the error logs upon component shutdown but leaves just enough for troubleshooting. Version is 2.0.2. Thanks.
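
For anyone who wants to try the "clear part of the log but keep enough for troubleshooting" idea themselves, here is a rough Python sketch. It is not anything 3tera ships; the function name `trim_log` and the 64 KB default are my own choices, and it assumes nothing else is writing to the file at the moment it runs (for example, because the component is shutting down).

```python
import os

def trim_log(path, keep_bytes=64 * 1024):
    """Shrink the file at `path` to roughly its last `keep_bytes` bytes.

    Returns True if the file was trimmed, False if it was already small enough.
    Hypothetical helper: assumes no other process writes to the file meanwhile.
    """
    size = os.path.getsize(path)
    if size <= keep_bytes:
        return False
    with open(path, "r+b") as f:
        # Read only the tail we want to keep.
        f.seek(size - keep_bytes)
        tail = f.read()
        # The tail usually starts mid-line; drop that partial first line.
        newline = tail.find(b"\n")
        if newline != -1:
            tail = tail[newline + 1:]
        # Overwrite the start of the file with the tail, then cut off the rest.
        f.seek(0)
        f.write(tail)
        f.truncate()
    return True
```

Calling `trim_log("/var/log/cce/core/error.log")` from a shutdown hook would leave the most recent entries in place and free the rest of the space.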

PeterNic
08-14-2007, 10:14 AM
Anthony,

Thank you for posting this to the forum, and sorry for the trouble. This is a bug in the 2.0.2 release, and you may be able to alleviate it. Is the mon terminal of each affected appliance connected to a MON appliance? If so, is the MON appliance running?

I'll also check whether a hotfix for 2.0.2 is available. If not, you can branch the offending appliances, set up a cron job to delete the logs periodically, and put the branched appliances in the user catalog.
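
As a rough illustration of the cron workaround, the Python sketch below empties the two logs named earlier in the thread once they grow past a size limit. The function name `empty_large_logs` and the 1 MB threshold are assumptions for the example. It truncates the files in place rather than unlinking them, since a process that still holds a deleted file open keeps its disk space allocated until it closes the file; this also assumes the writer appends to the log.

```python
import os

# Log paths reported earlier in this thread.
LOGS = ["/var/log/cce/core/error.log", "/var/log/cce/um-error.log"]

def empty_large_logs(paths, max_bytes=1024 * 1024):
    """Truncate every file in `paths` larger than `max_bytes`.

    Returns the list of paths that were emptied. Hypothetical helper;
    the 1 MB default threshold is an arbitrary choice.
    """
    emptied = []
    for path in paths:
        try:
            if os.path.getsize(path) > max_bytes:
                os.truncate(path, 0)  # keep the file, drop its contents
                emptied.append(path)
        except OSError:
            continue  # missing or inaccessible log: skip it
    return emptied

if __name__ == "__main__":
    empty_large_logs(LOGS)
```

Saved under a name of your choosing and scheduled with a crontab entry such as `*/10 * * * * python /path/to/script.py` (the path is a placeholder), this would check the logs every ten minutes.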

This issue is resolved in all appliances of the upcoming 2.0 production release.

Regards,
-- Peter

Anthony
08-14-2007, 10:38 AM
Ah, I see it in Bugzilla now.

The MON appliance isn't running, but the mon terminals are connected to it.

It's no problem; at least I know what's going on now. Thanks.