Friday 30 July 2010

LAP - Actual Build - Part 9a, What Went Wrong

As I have remarked a couple of times, the first build of this did not boot. As a result, I decided to use Grub 0.97 instead of 1.97. The failure may not have been the fault of Grub 1.97, and I am therefore going to repeat all the Grub 1.97 instructions here, as I may want to review this in the future to try to get it to work.

There are 3 reasons, to my mind, why a boot might fail. The first is the Grub bootloader. If this is configured incorrectly, then you do not pass go. The second is the Kernel. If you are trying to boot off a USB Key but you have not configured USB Storage support in the Kernel, it ain't going to work with even the best grub configuration. The third potential issue is if you haven't configured your /etc/fstab file to actually mount [/] properly.

When I tried to boot from the USB Key the boot process would continually fail, and complain that the 'root=' that I was trying to mount could not be found. No matter what combination of options I put into the grub configuration nothing seemed to work. Irritatingly, sometimes it failed saying sdb1 could not be found, and then listed the possible partitions -INCLUDING FUCKING sdb1. However, sometimes it did not. One of the things that really fucks me off is an intermittent problem, and here I had one. So I went looking for solutions. I went in the reverse order of the potential problems in the previous paragraph.

I had started with an fstab which used the UUID for the USB Key. That's the long-string-of-letters-and-numbers you find if you look in /dev/disk/by-uuid. However, due to the colossal fuckup that was trying to boot, I reverted to mounting the drive by reference to its label, which I called 'amiga', way back when I formatted it in the configuration stages. Why is neither of these options in the book? The book supposes that you are installing to a hard disk drive. We are installing to a USB Key. As we have already seen, if we shift from PC to PC the /dev/sd{a-d}1 name changes. So, if we boot on a PC using an [fstab] file which just references the sdb1 location, for instance, then we'll be fucked if the Key does not turn up at exactly the same location. One way we can avoid this is by referring to disks by their UUID. This is the random string and is allocated at the time of (I think) formatting the disk. The UUID is stored in the filesystem on the USB Key and will follow it from machine to machine, so referring to it directly should always work. That may, however, have contributed to the grand boot fuckup, as I shall now be calling it. So I switched the fstab to use the label instead. This did not immediately fix the problem, but it is now working and, being a strong proponent of the 'if it ain't broke don't fix it' school, I am not fucking around with the fstab file now.
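A minimal sketch of what the label-based entry could look like, written in the heredoc style the book uses. It writes to a hypothetical example file rather than the real /etc/fstab. The ext3 type and the mount options are assumptions, not necessarily what is on the Key, and the UUID line uses a placeholder rather than a real UUID.

```shell
# Hypothetical output file; copy it over /etc/fstab only once it looks right.
FSTAB="${FSTAB:-./fstab.example}"

cat > "$FSTAB" << "EOF"
# file system   mount-point  type  options   dump  fsck-order
# Mount root by label ('amiga' is the label given when the Key was formatted).
# The ext3 type is an assumption; use whatever the Key was formatted with.
LABEL=amiga     /            ext3  defaults  1     1
# UUID form, with a placeholder where the real string would go:
# UUID=<uuid-from-/dev/disk/by-uuid>  /  ext3  defaults  1  1
EOF
```

Neither form names a /dev/sdX device, which is the whole point when the Key moves between machines.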

So I had tried fstab. My next move was to try the kernel. I actually went through:

<*> ATA/ATAPI/MFM/RLL support  --->
<*> Serial ATA (prod) and Parallel ATA (experimental) drivers  --->
Sonics Silicon Backplane  ---> (apart from debugging)
USB

*** NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may ***
*** also be needed; see USB_STORAGE Help for more info ***
<*>   USB Mass Storage support

and activated every single option I could find. I then compiled what must have been a massive Kernel file. Did it work? Did it fuck.
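Rather than switching on everything, it may be enough to check that the options named in the note above are built in. With no initial ramdisk, a driver built as a module (=m) cannot help mount [/]. The function below is a rough sketch: the name check_builtin is made up for illustration, and the list covers only the three options named above. The USB host controller drivers for your hardware would need the same treatment.

```shell
# check_builtin CONFIG_FILE
# Reports whether each option needed to reach a USB Key is built in (=y).
# Returns 0 if all of them are, 1 if any is missing or only a module.
check_builtin() {
    config_file="$1"
    missing=0
    for opt in CONFIG_SCSI CONFIG_BLK_DEV_SD CONFIG_USB_STORAGE; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt is built in"
        else
            echo "$opt is NOT built in"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it with the path to the kernel's .config file, for example check_builtin .config from the top of the kernel source tree.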

I then tried to get to grips with the settings for Grub 1.97. Few of them made any sense to me. I couldn't work out what the 'search' line in the configuration file was supposed to do. It looked quite unlike anything in the 0.97 version of grub. I suspected that the problem was the reassignment of drive identities when booting from the USB Key. This is relatively easy to fix in grub 0.97, but most definitely NOT in 1.97. So I decided to use Grub 0.97. The only way I could think to do this (because there is no automatic package management in LFS) was to format the disk and start again. This took a few hours. They were not particularly happy hours. Especially when it changed precisely fuck all about the boot problem.

However, I now at least had a simple grub file to work with. The first thing I could fix was the [root (hdX,X)] line. I knew that using the same [hdX] as the Key was using during the install was a bad idea, because it would be reassigned on boot. I changed this to [root (hd0,0)], and this let me at least start the boot. I then got the same error messages as before. I thought that the [root=] setting on the [kernel] line may have had a problem with finding the USB Key because of the stupid messages like 'sdb1 cannot be found, try sdb1 instead'. Fucking ridiculous. I thought if I set it to find the disk by UUID or Label instead it might fix this. It turns out that you can't do that without configuring an [init]ial [r]am[d]isk. That does not sound like fun, so I turned to the Kernel Documentation instead to try and find an alternative. In there I came across the [rootdelay] option. This tells the Kernel to wait for a specified number of seconds before trying to mount the root filesystem. In particular it lets us wait for the USB Keys to show up. 10 seconds seems to work fine. Once I had put this setting in the grub.lst file, it ACTUALLY FUCKING BOOTED LINUX!
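Put together, the two fixes described above might look something like the sketch below. It writes a grub 0.97 style entry to a hypothetical example file, not to the real configuration file. The title, the timeout and root=/dev/sdb1 are assumptions (sdb1 is simply the device the error messages kept naming). The kernel path is the image name that grub-mkconfig reports in this post, and it assumes /boot is a plain directory on the Key's first partition.

```shell
# Hypothetical output file; copy the entry into the real grub 0.97 config by hand.
GRUB_CFG="${GRUB_CFG:-./grub-0.97.example}"

cat > "$GRUB_CFG" << "EOF"
default 0
timeout 5

title Linux From Scratch (USB Key)
# The Key is the boot device, so grub sees it as the first disk.
root (hd0,0)
# rootdelay=10 makes the Kernel wait 10 seconds before mounting root,
# which gives the USB Key time to show up. root=/dev/sdb1 is an assumption.
kernel /boot/vmlinux-2.6.32.8-lfs-6.6 root=/dev/sdb1 rootdelay=10
EOF
```

The [root (hd0,0)] line is grub's own idea of where to find its files, while [root=] is what the Kernel mounts as [/]. The two are unrelated namings of the same Key.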

POSSIBLY, if hd0,0 and rootdelay=10 had been set AND fstab had been correct, it might have worked with grub 1.97. Maybe. So, in case I want to revert to grub 1.97 at some point in the future, I am going to repeat its install commands here. Firstly you need to install the package - run these commands from the /sources directory.

tar -xzvf archive/grub-1.97.2.tar.gz
cd grub-1.97.2
mkdir build 
cd build 
../configure --prefix=/usr --sysconfdir=/etc --disable-grub-emu --disable-grub-emu-usb --disable-grub-fstest --disable-efiemu 

The [sysconfdir] presumably tells the install where to put the configuration files for grub.  According to the book, the [disable] options block program features that we have no need for in LFS.  Fair enough.

make -j2
make install
cd ../..
rm -rvf grub-1.97.2

There is much more to do to install Grub to the boot sector of the disk in question. 

First we set up a [device.map] file, which will be no fucking use once we reboot from the USB Key:

grub-mkdevicemap --device-map=device.map

Let's have a look at the contents:

cat device.map

I got:

(hd0) /dev/sda 
(hd1) /dev/sdb 
(hd2) /dev/sdc

[sda] is the harddisk in my laptop. [sdb] is the USB Key which the LiveCD is running from. [sdc] is the USB Key which has Linux From Scratch installed on it. To install the operative files for grub (as opposed to installing the package which we did earlier) I ran this command:

grub-install --grub-setup=/bin/true /dev/sdc

The [grub-setup=/bin/true] option makes sure that the actual MBR remains untouched for the time being. We can now automatically make the configuration file for grub:

grub-mkconfig -o /boot/grub/grub.cfg

This produced these messages:

Generating grub.cfg ... 
Found linux image: /boot/vmlinux-2.6.32.8-lfs-6.6 
done

I then ran the following command to install Grub to the MBR of the USB Key:

grub-setup /dev/sdc 

And that should be that. Only it won't be. We'll have to change the grub.cfg and probably device.map to reflect the reality of what will happen when the USB Key is the boot device.  And I am bollocksed if I am going to work out how to do that when 0.97 just bloody works.
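For whenever the urge to revisit 1.97 strikes, here is an untested sketch of a hand-written grub.cfg entry carrying over the two fixes that worked with 0.97. It writes to a hypothetical example file only. Note that Grub 1.97 numbers partitions from 1, so the first partition of the first disk is (hd0,1) rather than (hd0,0). The title, the timeout, insmod ext2 (which assumes an ext2 or ext3 filesystem) and root=/dev/sdb1 are all assumptions. The 'search' line that grub-mkconfig generates appears to be there to point root at whichever device holds the filesystem with a given UUID, so it is left out here in favour of a fixed root.

```shell
# Hypothetical output file; nothing here touches /boot/grub/grub.cfg.
GRUB2_CFG="${GRUB2_CFG:-./grub.cfg.example}"

cat > "$GRUB2_CFG" << "EOF"
set default=0
set timeout=5

menuentry "Linux From Scratch (USB Key)" {
    # Filesystem driver; assumes the Key is ext2 or ext3.
    insmod ext2
    # First disk as seen at boot, first partition (partitions count from 1).
    set root=(hd0,1)
    # root=/dev/sdb1 is an assumption; rootdelay=10 as with grub 0.97.
    linux /boot/vmlinux-2.6.32.8-lfs-6.6 root=/dev/sdb1 ro rootdelay=10
}
EOF
```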
