Grid Controller not recognising appliance?


NVickers
03-10-2009, 04:58 PM
Hi,

I've run through the Windows Installation Reference guide for configuring a Windows appliance (Windows Server 2003 Standard R2, 32-bit).

After completing the iso2class procedure, the controller times out waiting for the appliance to boot. The following error is recorded in the log:

[233:error s:81]: srv2: VM 'vm.srv2.win03_install.main.iso2class' failed to boot within boot timeout period.

However, the appliance's graphical login console shows the normal Windows login screen! After the start timeout, the controller reports the start failure and then takes an age to stop the appliance. In the meantime I logged in and saw all the new hardware requests for the additional vNICs - which completed OK.

To make matters a little more interesting, I noticed that when telling the Windows VM to shut down via the graphical console, the VM would only reboot, not shut down.

The APK was Server_windows_1.0.6.msi.

Any ideas?

Nathan.

PeterNic
03-10-2009, 08:22 PM
Nathan,

I may not have the full answer, but I can at least explain what's going on.

When AppLogic starts an appliance, it lets the appliance boot and then waits for the appliance to report that it has finished the boot process successfully (or not). If the appliance doesn't say anything, AppLogic will eventually time it out -- and the start/stop timeouts are huge to prevent spurious failures.

This is why even though you can connect to the graphical console, AppLogic eventually shuts it down. Similar reasons for the long timeout on shutdown.
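As a rough illustration of the wait-then-time-out behaviour described above, here is a minimal Python sketch. It is not AppLogic's code: the function name `wait_for_boot`, the state strings, and the use of a `threading.Event` to stand in for the appliance's "boot finished" report are all assumptions made for the example.

```python
import threading

def wait_for_boot(boot_reported: threading.Event, timeout_s: float) -> str:
    """Return 'running' if the appliance reported in before the timeout, else 'failed'.

    boot_reported is a hypothetical stand-in for the report the appliance
    sends once it has finished booting.
    """
    # Event.wait returns True if the flag was set, False if the timeout expired.
    if boot_reported.wait(timeout=timeout_s):
        return "running"
    return "failed"

if __name__ == "__main__":
    # An appliance that reports success shortly after boot.
    reported = threading.Event()
    threading.Timer(0.05, reported.set).start()
    print(wait_for_boot(reported, timeout_s=1.0))

    # An appliance that reaches a login screen but never reports in.
    silent = threading.Event()
    print(wait_for_boot(silent, timeout_s=0.1))
```

The second case matches the symptom in this thread: the guest OS is up and usable, but because nothing ever reports back, the start is still counted as failed.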

You can avoid the auto-stop on failed start by using the --debug option on start: "start app myapp --debug" -- it will still time out, but it will go to the failed state and not try to stop the appliance. This will give you time to do the fixes in the appliance (install the APK, etc.).

You can force-kill the appliance by using --force on stop; this works as a power-off. Ideally, if you install the APK correctly, you won't have to --force it -- in Windows, the shutdown request is initiated through the appliance, so if you have the APK installed, it should shut down normally.

Finally, if you log into the appliance and shut it down, it reboots instead of staying down. This is an artifact of the high-availability provisions -- under normal circumstances, you control appliances through the AppLogic interface (GUI or shell), so if an appliance goes down on its own, this is most likely due to a kernel crash or a similar reason, and the appliance is restarted. There is another provision that will eventually make it stop: the anti-flapping provision prevents automatic restart of the appliance if it fails more than 3 times in any 24-hour period. (To prevent this from being awkward when you shut down the appliance from inside as a human, we're thinking of providing a signal to AppLogic that this is an expected shutdown/reboot, so it won't be treated as a failure.)

All of these funky behaviors are most likely due to the APK not being installed or not working properly (or the network not working correctly to allow the APK to work).
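The "more than 3 failures in any 24-hour period" rule can be pictured as a sliding-window counter. The following Python sketch is only an illustration of that idea, not AppLogic's implementation; the class name `FlapGuard`, its parameters, and the assumption that failures are recorded in chronological order are all hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta

class FlapGuard:
    """Allow automatic restarts until failures exceed a limit within a time window."""

    def __init__(self, max_failures: int = 3,
                 window: timedelta = timedelta(hours=24)):
        self.max_failures = max_failures
        self.window = window
        self._failures = deque()  # failure timestamps, oldest first

    def record_failure(self, when: datetime) -> bool:
        """Record a failure; return True if an automatic restart is still allowed."""
        self._failures.append(when)
        # Drop failures that have aged out of the sliding window.
        while self._failures and when - self._failures[0] >= self.window:
            self._failures.popleft()
        # "More than 3 times" means the fourth failure in the window blocks restart.
        return len(self._failures) <= self.max_failures

if __name__ == "__main__":
    guard = FlapGuard()
    start = datetime(2009, 3, 10, 12, 0)
    # Four failures, one hour apart.
    decisions = [guard.record_failure(start + timedelta(hours=h)) for h in range(4)]
    print(decisions)  # [True, True, True, False]
```

Under this model, shutting Windows down from inside the guest three times in a day would still trigger restarts, and the fourth would leave the appliance stopped.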

I hope the above helps. Let me know if you could move further with the Windows appliance setup. If not, let's connect through the helpdesk.

Best regards,
-- Peter

LeoKalev
03-13-2009, 03:50 AM
NVickers wrote: "... In the meantime I logged in and saw all the new hardware requests for the additional vNICs - which completed OK."

If you have installed the paravirtual (PV) drivers (e.g., Halsign), they will not auto-start until you log in interactively and click through all the "new hardware detected" dialogs. This may also explain why your appliance did not connect to AppLogic on the first start. If you are using the PV drivers, the new OS must be restarted at least once manually, and you have to log in via the graphical console (as Administrator) before it starts working normally.

This should be explained in the AppLogic documentation, I think.

PeterNic
03-13-2009, 09:29 AM
Nathan, were you able to get the Windows appliance working properly?

maiko
03-18-2009, 11:19 PM
Hi,

I'm in a sort of similar situation...

On an AppLogic 2.4.8 grid, I created an ISO image of Windows Server 2003 (32-bit) and finished installing Cygwin (full) and the APK.

After that, I set the Field Engineering Code to "off" in order to change the appliance from "managed" to "unmanaged". The application then failed to start, and I got the following error message:

[233:error s:81]: srv1: VM 'vm.srv1.my_new_windows.main.iso2class' failed to boot within boot timeout period.

# I also tried Windows Server 2008, but the result was the same.


I found that the Cygwin sshd service hadn't been started in the application, and when I tried "cygrunsrv -S sshd", I received the following message:

cygrunsrv: Error starting a service: QueryServiceStatus: Win32 error 1062:

Is this related to the issue we are talking about?

Regards,

-- Maiko
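For readers decoding the cygrunsrv message above: Win32 error 1062 is ERROR_SERVICE_NOT_ACTIVE ("The service has not been started"). The small Python sketch below extracts the code from such a message and looks it up in a short table of common Win32 service error codes. The helper name `explain` and the choice of codes in the table are assumptions for illustration; the table is far from complete.

```python
import re

# A few common Win32 service-related error codes (as defined in winerror.h).
SERVICE_ERRORS = {
    1053: "ERROR_SERVICE_REQUEST_TIMEOUT: the service did not respond in time",
    1056: "ERROR_SERVICE_ALREADY_RUNNING: an instance of the service is already running",
    1058: "ERROR_SERVICE_DISABLED: the service is disabled",
    1060: "ERROR_SERVICE_DOES_NOT_EXIST: the service is not installed",
    1062: "ERROR_SERVICE_NOT_ACTIVE: the service has not been started",
    1069: "ERROR_SERVICE_LOGON_FAILED: the service did not start due to a logon failure",
}

def explain(message: str) -> str:
    """Find 'Win32 error <n>' in a message and return a short description."""
    match = re.search(r"Win32 error (\d+)", message)
    if match is None:
        return "no Win32 error code found"
    code = int(match.group(1))
    return SERVICE_ERRORS.get(code, f"unknown code {code}")

if __name__ == "__main__":
    msg = "cygrunsrv: Error starting a service: QueryServiceStatus: Win32 error 1062:"
    print(explain(msg))
```

Seeing 1062 from QueryServiceStatus right after a start request suggests the service is not running when its status is checked, so the Windows event log is a reasonable next place to look for the underlying cause.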

NVickers
03-24-2009, 05:51 AM
PeterNic wrote: "Nathan, were you able to get the Windows appliance working properly?"

Yes, I was, thank you Peter, although I must admit it took several attempts and required piecing together the various documentation pages that cover creating a Windows appliance - I found I needed to change the order of the steps a bit.

I also had my browser crash on me and I lost the iso2class window, and had to finish the process manually.

PeterNic
03-26-2009, 03:13 AM
Nathan,

I am glad it worked out.

If you have suggestions on how to improve the docs (e.g., combine two docs, expand certain steps, etc.) -- I will be grateful.

Best regards,
-- Peter

NVickers
03-30-2009, 10:06 PM
Hi Peter,

I'd be happy to do that, and I'll come back to you via the helpdesk. I can't promise it will be this week however.

Cheers,
Nathan.

PeterNic
03-30-2009, 10:14 PM
Much appreciated.

-- Peter