SOS: Production VM not starting!


Nikolaos Milas

10 Dec 2012

10:57 p.m.

I am using a VM with CentOS 5.8 x86_64 under KVM. I only have console access to the VM through a virtual console (web based).

Tonight, after a routine "yum update", I did a "shutdown -r now" due to kernel update and the VM won't start. See console screenshot vm1.png:

https://vmail.noa.gr/files/vm1.png

There is an error (which I haven't seen before):

type=1404 audit (...): selinux=0 auid=... ses= ...

(see vm1.png above)

(The system cannot load even with the old kernel; the same error occurs.)

Note that SELinux is disabled on this system. I booted in rescue mode, and auto mount was unsuccessful (see https://vmail.noa.gr/files/vm2.png). The log of the rescue process shows an error (see https://vmail.noa.gr/files/vm4.png).

However, later I successfully mounted it using:

mount -t ext3 /dev/vda3 /mnt/sysimage

I then did a umount and:

fsck.ext3 /dev/vda3

which found it clean (see https://vmail.noa.gr/files/vm3.png).

What is wrong there?

Can you please guide me on how to make it work again?

This is a production ftp machine. Please help to revive.

Thanks, Nick

Eero Volotinen

10 Dec

11:07 p.m.

2012/12/11 Nikolaos Milas nmilas@noa.gr:

...

I am using a VM with CentOS 5.8 x86_64 under KVM. I only have console access to the VM through a virtual console (web based).

Tonight, after a routine "yum update", I did a "shutdown -r now" due to kernel update and the VM won't start. See console screenshot vm1.png:

https://vmail.noa.gr/files/vm1.png

There is an error (which I haven't seen before):

type=1404 audit (...): selinux=0 auid=... ses= ...

Is this really an error? I

...

(see vm1.png above)

(The system cannot load even with the old kernel; the same error occurs.)

Note that SELinux is disabled on this system. I booted in rescue mode, and auto mount was unsuccessful (see

maybe you need to disable selinux before trying to mount rescue environment?

...

https://vmail.noa.gr/files/vm2.png). The log of the rescue process shows some error (see https://vmail.noa.gr/files/vm4.png)

How about installing new vm and just copying files and settings to it?

Can you boot this vm to single user mode ?

-- Eero

Nikolaos Milas

11:24 p.m.

On 11/12/2012 1:07 a.m., Eero Volotinen wrote:

...

Is this really an error? I

Thanks for replying.

Don't know, but it hangs there forever (at least it appears so - haven't waited more than half an hour, but it's already too much).

...

maybe you need to disable selinux before trying to mount rescue environment?

Hmm, selinux is already disabled. How can I adjust selinux settings in the rescue environment?

...

How about installing new vm and just copying files and settings to it?

I would like to avoid it, if possible. I would still need to somehow make this one visible on the network, to be able to copy large data files.

My next bet would be to restore from backup, but I would rather make the current VM work.

...

Can you boot this vm to single user mode ?

No, it gets stuck at the same point as well.

Any ideas why it keeps waiting forever at that point?

Thanks, Nick

Nikolaos Milas

11 Dec

1:01 a.m.

On 11/12/2012 1:24 a.m., Nikolaos Milas wrote:

...

Any ideas why it keeps waiting forever at that point?

After having left it alone for an hour or so, I found it had booted successfully. Didn't find anything serious in /var/log/messages.

I still wonder what caused that delay.

So, red alarm is over.

Regards, Nick

Markus Falb

2:33 a.m.

On 11.12.2012 02:01, Nikolaos Milas wrote:

...

On 11/12/2012 1:24 a.m., Nikolaos Milas wrote:

...
Any ideas why it keeps waiting forever at that point?

After having left it alone for an hour or so, I found it had booted successfully. Didn't find anything serious in /var/log/messages.

I had a look at your screenshot. Output stops at the moment init is taking over. I suspect that console output is going elsewhere, maybe to a serial console. That way it could well be that the machine is doing something but you just can not see it.

My first bet would have been a fsck
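
One way to test the serial-console theory, sketched here on the assumption that the guest exposes the usual /proc/cmdline, is to list the console= arguments the kernel was booted with. When several are given, the last one becomes /dev/console, which is where init writes its messages. The helper name show_consoles is made up for illustration.

```shell
# Hypothetical helper: print every console= argument found in a kernel
# command line file (defaults to /proc/cmdline on the running system).
show_consoles() {
    grep -o 'console=[^ ]*' "${1:-/proc/cmdline}"
}

# Example: inspect the running kernel. The last line printed names the
# console that receives init's output.
# show_consoles
```

If nothing is printed, no console= argument was passed and the kernel is using its default console, which would make the serial-console explanation less likely.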

-- Kind Regards, Markus Falb

Nikolaos Milas

12 Dec

10:51 a.m.

On 11/12/2012 4:33 a.m., Markus Falb wrote:

...

I had a look at your screenshot. Output stops at the moment init is taking over. I suspect that console output is going elsewhere, maybe to a serial console. That way it could well be that the machine is doing something but you just can not see it.

My first bet would have been a fsck

Thanks,

I think you are probably right. This VM features a large (virtual) data hard disk, and I found that it was mounted (in /etc/fstab) with autocheck options. Therefore, to avoid this problem in the future, I changed its last two fields (dump and fsck pass number) to "0 0".
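
For readers checking their own systems, here is a minimal sketch that lists which /etc/fstab entries still have a non-zero sixth field (the fsck pass number) and are therefore eligible for a check at boot. The function name list_autocheck is invented for this example.

```shell
# Hypothetical helper: list fstab entries whose sixth field (fs_passno)
# is non-zero, i.e. filesystems that fsck may check at boot time.
# Reads /etc/fstab unless another file is given as the first argument.
list_autocheck() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 != 0 { print $1, $2, "passno=" $6 }' \
        "${1:-/etc/fstab}"
}

# Example: show what would still be checked on this machine.
# list_autocheck
```

An entry that ends in "0 0" does not appear in the output, since a pass number of 0 tells fsck to skip that filesystem at boot.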

I had already suspected this (an auto fsck) might be the case, but in such cases in the past (with other VMs), the process was visible in the virtual console, while in this case apparently it was not.

However, I did not find in /var/log/messages any instance of fsck checks during loading.

Thanks again.

Regards, Nick

Markus Falb

5:35 p.m.

On 12.12.2012 11:51, Nikolaos Milas wrote:

...

On 11/12/2012 4:33 a.m., Markus Falb wrote:

...
I suspect that console output is going elsewhere, maybe to a serial console. That way it could well be that the machine is doing something but you just can not see it.

My first bet would have been a fsck

...

However, I did not find in /var/log/messages any instance of fsck checks during loading.

You will never find fscks in /var/log/messages.

fsck happens too early in the boot process; syslog is not yet running. There is a mechanism to log this early stuff though. What you could have seen at the console while booting is also in /var/log/boot.log. This works on CentOS 6.

Sadly, boot.log on my CentOS 5 machines is empty and so will be yours. https://bugzilla.redhat.com/show_bug.cgi?id=223446
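
Since the logs are no help on CentOS 5, the ext3 superblock is another place to look: tune2fs -l reports when the filesystem was last checked and how close it is to the next scheduled check. A small sketch, with fsck_schedule as a made-up helper name and /dev/vda3 taken from earlier in the thread (the data disk's device name is not given there):

```shell
# Hypothetical helper: keep only the fsck-related lines from the output
# of "tune2fs -l", which is expected on standard input.
fsck_schedule() {
    grep -E '^(Mount count|Maximum mount count|Last checked|Check interval|Next check after):'
}

# Example (run as root; replace the device with the data disk):
# tune2fs -l /dev/vda3 | fsck_schedule
```

A "Last checked" timestamp that matches the time of the slow boot would support the theory that an automatic fsck was running unseen.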

-- Kind Regards, Markus Falb

Nikolaos Milas

7:19 p.m.

On 12/12/2012 7:35 p.m., Markus Falb wrote:

...

Sadly, boot.log on my CentOS 5 machines is empty and so will be yours.

Yes, I had checked already, it's always 0 size...

Thanks for your info.

Nick

Gordon Messmer

5:37 a.m.

On 12/10/2012 05:01 PM, Nikolaos Milas wrote:

...

I still wonder what caused that delay.

What does "getenforce" output? It sort of looks like you went from an SELinux-disabled configuration to an enforcing or permissive configuration and required a relabel.
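
A rough sketch of how one might check both things this theory depends on: the mode configured for the next boot, and whether a relabel has been requested. It assumes the standard /etc/selinux/config file and the /.autorelabel flag file; configured_selinux_mode is a name made up here.

```shell
# Hypothetical helper: print the SELINUX= value from an SELinux config
# file (defaults to /etc/selinux/config), i.e. the mode used at next boot.
configured_selinux_mode() {
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}"
}

# Example on the affected VM:
# getenforce                      # mode of the running system
# configured_selinux_mode         # mode configured for the next boot
# [ -e /.autorelabel ] && echo "relabel requested at next boot"
```

If both commands report "disabled" and /.autorelabel does not exist, a relabel is an unlikely cause of the delay.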

Nikolaos Milas

10:54 a.m.

On 12/12/2012 7:37 a.m., Gordon Messmer wrote:

...

On 12/10/2012 05:01 PM, Nikolaos Milas wrote:

...
I still wonder what caused that delay.

What does "getenforce" output? It sort of looks like you went from an SELinux-disabled configuration to an enforcing or permissive configuration and required a relabel.

Thank you for helping find the cause of this behavior.

SELinux was always disabled (and still is) on that VM:

# getenforce
Disabled

Any other ideas would be appreciated.

Regards, Nick

discuss@lists.centos.org