Restart of NET appliance fails


kapow
03-25-2008, 09:53 AM
In our application, we have a standard NET appliance which is not branched.

Today, after some config changes in other appliances, I restarted the application and the NET appliance failed to start. This application has been restarted hundreds of times with no issues until now.

In /var/log/appliance/log, I see this:

WARNING: Can't read module /lib/modules/2.6.16.33-xenU/kernel/net/netfilter/x_tables.ko: Input/output error
WARNING: Can't read module /lib/modules/2.6.16.33-xenU/kernel/net/netfilter/xt_CLASSIFY.ko: Input/output error
WARNING: Can't read module /lib/modules/2.6.16.33-xenU/kernel/net/netfilter/xt_CONNMARK.ko: Input/output error
WARNING: Can't read module /lib/modules/2.6.16.33-xenU/kernel/net/netfilter/xt_MARK.ko: Input/output error
FATAL: Error inserting ip_tables (/lib/modules/2.6.16.33-xenU/kernel/net/ipv4/netfilter/ip_tables.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Failed to set up backup rule set (exit code 0)
iptables failed to start

Listing /lib/modules/2.6.16.33-xenU/kernel/net/netfilter shows:

drwxr-xr-x 2 root root 1024 Aug 3 2007 .
drwxr-xr-x 7 root root 1024 Aug 3 2007 ..
-rw-r--r-- 1 root root 7659 Aug 2 2007 nfnetlink.ko
-rw-r--r-- 1 root root 11985 Aug 2 2007 nfnetlink_log.ko
-rw-r--r-- 1 root root 13394 Aug 2 2007 nfnetlink_queue.ko
?--------- ? ? ? ? ? x_tables.ko
?--------- ? ? ? ? ? xt_CLASSIFY.ko
-rw-r--r-- 1 root root 3106 Aug 2 2007 xt_comment.ko
-rw-r--r-- 1 root root 3744 Aug 2 2007 xt_connbytes.ko
-rw-r--r-- 1 root root 3183 Aug 2 2007 xt_connmark.ko
?--------- ? ? ? ? ? xt_CONNMARK.ko
-rw-r--r-- 1 root root 3700 Aug 2 2007 xt_conntrack.ko
-rw-r--r-- 1 root root 4909 Aug 2 2007 xt_dccp.ko
-rw-r--r-- 1 root root 3552 Aug 2 2007 xt_helper.ko
-rw-r--r-- 1 root root 3253 Aug 2 2007 xt_length.ko
-rw-r--r-- 1 root root 3947 Aug 2 2007 xt_limit.ko
-rw-r--r-- 1 root root 3529 Aug 2 2007 xt_mac.ko
-rw-r--r-- 1 root root 3126 Aug 2 2007 xt_mark.ko
?--------- ? ? ? ? ? xt_MARK.ko
-rw-r--r-- 1 root root 3635 Aug 2 2007 xt_NFQUEUE.ko
-rw-r--r-- 1 root root 3478 Aug 2 2007 xt_NOTRACK.ko
-rw-r--r-- 1 root root 3202 Aug 2 2007 xt_pkttype.ko
-rw-r--r-- 1 root root 3185 Aug 2 2007 xt_realm.ko
-rw-r--r-- 1 root root 4606 Aug 2 2007 xt_sctp.ko
-rw-r--r-- 1 root root 3403 Aug 2 2007 xt_state.ko
-rw-r--r-- 1 root root 3554 Aug 2 2007 xt_string.ko
-rw-r--r-- 1 root root 3881 Aug 2 2007 xt_tcpmss.ko
-rw-r--r-- 1 root root 5157 Aug 2 2007 xt_tcpudp.ko
[test-kapow-net:main.net netfilter]# df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda1 79327 64481 10750 86% /
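
The "?---------" rows in the listing are what ls prints when it can read a name from the directory but cannot stat the entry itself. A minimal Python sketch of the same check is below, in case it helps to scan other directories on the volume. The function name and the injectable stat parameter are illustrative assumptions, not part of any appliance tooling.

```python
import errno
import os


def find_unreadable_entries(directory, stat=os.lstat):
    """Return (name, errno_name) pairs for entries whose metadata cannot be read.

    `stat` defaults to os.lstat; it is a parameter only so the check can be
    exercised without a damaged disk (an assumption made for this sketch).
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            stat(os.path.join(directory, name))
        except OSError as exc:
            # errno.errorcode maps numeric codes to names such as "EIO".
            broken.append((name, errno.errorcode.get(exc.errno, str(exc.errno))))
    return broken
```

Calling find_unreadable_entries on the netfilter directory would be expected to report the four modules named in the WARNING lines. It only identifies damaged entries and does not repair anything.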

My first thought was a corrupt volume, so I ran "volume repair --all", but it reported that no volumes were in need of repair.

I could easily remove the appliance and put a new instance of NET in, but before I do that, I would like to know why this happened.

My grid provider has also been alerted.

PeterNic
04-04-2008, 07:42 PM
Kapow,

This looks like a corrupted filesystem. The volume repair fixes the mirrors at the block level; it does not check or repair the filesystem on top of them.

Considering that this is the boot volume, the easiest way to fix it is simply to do "app stop" and "app clean" -- the latter will force all appliance class volumes to be refreshed from the catalog. You may want to grab /var/log/* from the appliance before destroying it.

Regards,
-- Peter
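
For grabbing /var/log/* from a volume that is already returning I/O errors, a copy that skips unreadable files instead of aborting can be useful. The following is a rough Python sketch under that assumption. The function name is hypothetical, and the destination (and how to get the files off the appliance afterwards) is left to whatever access is available.

```python
import os
import shutil


def salvage_tree(src, dst):
    """Copy every readable file under src into dst; return the paths that failed."""
    failed = []
    for root, _dirs, files in os.walk(src):
        # Mirror the directory layout of src underneath dst.
        out_dir = os.path.join(dst, os.path.relpath(root, src))
        os.makedirs(out_dir, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            try:
                shutil.copy2(source, os.path.join(out_dir, name))
            except OSError:
                # Unreadable file (for example EIO): record it and keep going.
                failed.append(source)
    return failed
```

Something like salvage_tree("/var/log", destination) would be run before "app stop" and "app clean". The returned list shows which log files could not be recovered.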