On 30/03/13 7:18, Joakim Ziegler wrote:
On 29/03/13 10:38, Gordon Messmer wrote:
On 03/29/2013 01:23 AM, Joakim Ziegler wrote:
Immediately after getting dropped to rdshell, I looked around in /dev, which brought me a few surprises...
/dev/mapper contains only "control", that is, "vg_resolve02-lv_root" is missing.
Did you get to look at or for /dev/vg_resolve02 as well?
/dev/root is a symlink to /dev/dm-0
Does /dev/dm-0 exist?
Does the system boot if you just "exit" from the rdshell? What about if you "vgchange -a y" without changing the symlink?
I checked this a bit more thoroughly. The status is as follows:
When I boot up and get dropped to rdshell, none of /dev/root, /dev/vg_resolve02, or /dev/dm-0 exists. Just exiting at this point drops me back into rdshell. Waiting a few minutes makes no difference.
Doing lvm vgscan finds the volume group, but creates no device nodes. Just exiting at this point drops me back into rdshell as well.
When I do lvm vgchange -ay, /dev/dm-0 is created, /dev/root is created as a symlink to it, and /dev/vg_resolve02/ (with lv_root inside it) and /dev/mapper/vg_resolve02-lv_root appear as well. I don't need to change the symlink or do anything else: if I exit after doing lvm vgchange -ay, the system boots normally.
That means /dev/root already is correct, so the only thing I'm actually changing to make the system boot is to scan for volume groups and activate them.
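A minimal sketch of the manual sequence described above, for anyone wanting to reproduce the before/after check. The helper name report_nodes and its optional directory argument are made up for illustration; the node names are the ones from this thread. It assumes the rdshell provides a POSIX-style shell with the [ builtin, which may not hold for every initramfs.

```shell
#!/bin/sh
# Report which of the device nodes discussed in this thread are present.
# The optional first argument is a hypothetical override for the /dev
# directory, so the function can be tried outside the initramfs.
report_nodes() {
    dev_root="${1:-/dev}"
    for node in dm-0 root vg_resolve02/lv_root mapper/vg_resolve02-lv_root; do
        # -L also catches symlinks whose target does not exist yet.
        if [ -e "$dev_root/$node" ] || [ -L "$dev_root/$node" ]; then
            echo "present: $node"
        else
            echo "missing: $node"
        fi
    done
}

# In the rdshell, the manual sequence from this thread would then be:
#   report_nodes          # before activation: nodes reported missing
#   lvm vgscan            # finds the volume group, creates no nodes
#   lvm vgchange -ay      # activates the logical volumes
#   report_nodes          # after activation: nodes reported present
#   exit                  # let the boot continue
```

The lvm commands are left as comments so that pasting the function has no side effects until they are typed by hand.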
The big question then becomes: Why do I have to do this manually? How do I get Dracut (I assume this is Dracut's job) to do this automatically?
udev should be doing this. And... I was just looking at this again, because the last time I came up with nothing useful. Look at /usr/share/dracut/modules.d/90lvm/64-lvm.rules. If I'm reading this correctly, udev will look for dm-0 in /sys and will not run lvm_scan if it's found. I wonder if it's possible that the /sys nodes are getting set up, but device-mapper isn't setting up the nodes in /dev?
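One way to test the hypothesis above (entries present under /sys, nodes missing under /dev) would be a small comparison like the following. This is only a sketch: the function name dm_missing_nodes and its two directory arguments are invented for illustration, and it assumes device-mapper devices show up as dm-* entries under /sys/block.

```shell
#!/bin/sh
# List device-mapper devices that have an entry under /sys/block
# but no matching node under /dev.
# Both arguments are hypothetical overrides, useful for trying the
# function against a scratch directory; defaults are /sys/block and /dev.
dm_missing_nodes() {
    sys_block="${1:-/sys/block}"
    dev_root="${2:-/dev}"
    for entry in "$sys_block"/dm-*; do
        # If the glob matched nothing, the literal pattern remains; skip it.
        [ -e "$entry" ] || continue
        name="${entry##*/}"
        [ -e "$dev_root/$name" ] || echo "$name"
    done
}
```

Running dm_missing_nodes with no arguments in the rdshell prints nothing if every dm-* entry in /sys/block has a node in /dev, or if there are no dm-* entries at all. Any name it prints would point to the /sys-without-/dev situation.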
It turns out I was wrong about dm-0 already existing; it's created on vgchange -ay. I'm looking at the file you mention, but I'm afraid I don't know LVM well enough to make much sense of it. From what I can tell, it calls lvm_scan for each device, and there's an lvm_scan.sh in there that looks like it should be doing lvchange -ay, but if dm-0 doesn't already exist, I don't think this will do anything. Am I wrong?
I'm really at a loss... it seems like a much simpler explanation is that the devices take so long to detect that init gives up. By the time you run vgchange, they've had the time they need. But that idea is inconsistent with the fact that your dmesg output shows what I assume are the correct devices and partition tables.
You could try adding "rdinitdebug rdudevdebug" to your kernel command line, but you're going to see a LOT of output, and it's only really going to be meaningful if you've read the /init script that Dracut creates, and understand more or less what it's doing, particularly in the "main_loop" section.
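As a rough illustration of the suggestion above, the helper below appends the two options to a kernel command line string without duplicating any that are already there. The function name add_debug_args is made up, and the sample command line in the comment is hypothetical; only the two option names come from this thread. The result would still have to be put on the kernel line by hand, in the boot loader's edit mode or its config file.

```shell
#!/bin/sh
# Append "rdinitdebug" and "rdudevdebug" to a kernel command line,
# skipping any option that is already present.
add_debug_args() {
    line="$1"
    for arg in rdinitdebug rdudevdebug; do
        case " $line " in
            *" $arg "*) ;;                 # already present, leave as is
            *) line="$line $arg" ;;
        esac
    done
    printf '%s\n' "$line"
}

# Hypothetical example (this root= value is an assumption, not taken
# from the actual boot configuration):
#   add_debug_args "ro root=/dev/mapper/vg_resolve02-lv_root"
```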
I can try this, but it might be a bit beyond my area of expertise, I'm afraid.
If I were to just try a brute force approach, what RPM packages should I reinstall/update to get all this stuff reinstalled as it was the first time I installed the system?
Just bumping this up, any ideas about this? It's a little annoying not having this box boot by itself...