I’m trying to boot Linux from U-Boot on an embedded ARM board, using a root filesystem on a remote machine served via NFS. It appears that the Ethernet connection is not coming up correctly, which results in a failure to mount the NFS share. However, I know that the Ethernet hardware works, because U-Boot loads the kernel via TFTP.
How can I debug this? I can try tweaking the kernel, but that means recompiling the kernel for every iteration, which is slow. Is there a way that I can make the kernel run without being able to mount an external filesystem?
You can compile an initrd/initramfs image into the kernel (General Setup -> Initial RAM filesystem and RAM disk (initramfs/initrd) support -> Initramfs source file(s)). You specify a file in a special format, like this one (my init for x86):
    dir /bin 0755 0 0
    file /bin/busybox /bin/busybox 0755 0 0
    file /bin/lvm /sbin/lvm.static 0755 0 0
    dir /dev 0755 0 0
    dir /dev/fb 0755 0 0
    dir /dev/misc 0755 0 0
    dir /dev/vc 0755 0 0
    nod /dev/console 0600 0 0 c 5 1
    nod /dev/null 0600 0 0 c 1 3
    nod /dev/snapshot 0600 0 0 c 10 231
    nod /dev/tty1 0600 0 0 c 4 0
    dir /etc 0755 0 0
    dir /etc/splash 0755 0 0
    dir /etc/splash/natural_gentoo 0755 0 0
    dir /etc/splash/natural_gentoo/images 0755 0 0
    file /etc/splash/natural_gentoo/images/silent-1680x1050.jpg /etc/splash/natural_gentoo/images/silent-1680x1050.jpg 0644 0 0
    file /etc/splash/natural_gentoo/images/verbose-1680x1050.jpg /etc/splash/natural_gentoo/images/verbose-1680x1050.jpg 0644 0 0
    file /etc/splash/natural_gentoo/1680x1050.cfg /etc/splash/natural_gentoo/1680x1050.cfg 0644 0 0
    slink /etc/splash/tuxonice /etc/splash/natural_gentoo 0755 0 0
    file /etc/splash/luxisri.ttf /etc/splash/luxisri.ttf 0644 0 0
    dir /lib64 0755 0 0
    dir /lib64/splash 0755 0 0
    dir /lib64/splash/proc 0755 0 0
    dir /lib64/splash/sys 0755 0 0
    dir /proc 0755 0 0
    dir /mnt 0755 0 0
    dir /root 0770 0 0
    dir /sbin 0755 0 0
    file /sbin/fbcondecor_helper /sbin/fbcondecor_helper 0755 0 0
    slink /sbin/splash_helper /sbin/fbcondecor_helper 0755 0 0
    file /sbin/tuxoniceui_fbsplash /sbin/tuxoniceui_fbsplash 0755 0 0
    file /sbin/tuxoniceui_text /sbin/tuxoniceui_text 0755 0 0
    dir /sys 0755 0 0
    file /init /usr/src/init 0755 0 0
I haven’t used it on ARM, but it should work.
/init is the file where you put your startup commands. The rest are the various files it needs (busybox, etc.).
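If you would rather not maintain that list by hand, a small host-side script can generate a stripped-down version of it. This is only a sketch in Python, not part of the kernel tooling: the helper names and the two source paths are made up for illustration, and it assumes you already have a statically linked busybox cross-compiled for the ARM target. The output uses the same dir/file/nod/slink line format as the listing above.

```python
# Host-side sketch: emit a minimal initramfs description in the
# "dir / file / nod / slink" list format used by the kernel build.
# All helper names and example paths here are hypothetical.


def dir_entry(path, mode=0o755, uid=0, gid=0):
    return f"dir {path} {mode:04o} {uid} {gid}"


def file_entry(path, source, mode=0o755, uid=0, gid=0):
    # 'source' is the location of the file on the build host.
    return f"file {path} {source} {mode:04o} {uid} {gid}"


def nod_entry(path, dev_type, major, minor, mode=0o600, uid=0, gid=0):
    # dev_type is 'c' (character device) or 'b' (block device).
    return f"nod {path} {mode:04o} {uid} {gid} {dev_type} {major} {minor}"


def slink_entry(path, target, mode=0o777, uid=0, gid=0):
    return f"slink {path} {target} {mode:04o} {uid} {gid}"


def minimal_initramfs_list(busybox_src, init_src):
    """Return the text of a bare-bones list: busybox, a console, and /init."""
    lines = [
        dir_entry("/bin"),
        file_entry("/bin/busybox", busybox_src),
        slink_entry("/bin/sh", "/bin/busybox"),
        dir_entry("/dev"),
        nod_entry("/dev/console", "c", 5, 1),
        nod_entry("/dev/null", "c", 1, 3),
        dir_entry("/proc"),
        dir_entry("/sys"),
        dir_entry("/mnt"),
        file_entry("/init", init_src),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder paths: substitute your own cross-built busybox and init script.
    print(minimal_initramfs_list("/path/to/arm/busybox", "/path/to/init"), end="")
```

Redirect the output to a file and point "Initramfs source file(s)" at it. If your /init just mounts /proc and /sys and then starts a busybox shell on the console, you should get a prompt without any NFS mount, and can poke at the network interface from there.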
A few things that come to mind:
- Use tcpdump, wireshark or other Ethernet packet inspector to see whether the board is sending packets to the wrong address or not sending anything at all.
- What do you have on the serial console (if there is one)?
- Try connecting a remote kernel debugger.
- Try running inside a simulator, if you have a simulator that you can reproduce your problem in.
- Instead of just fetching a kernel, put a boot-and-root filesystem in flash memory, or load a root filesystem to a RAM disk.
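For the packet-inspection suggestion, tcpdump or wireshark is the obvious choice, but if neither is installed on the NFS server, a rough stand-in can be written with the Python standard library alone. The sketch below is Linux-only, needs root to open the raw socket, and only decodes plain (untagged) Ethernet frames carrying ARP or IPv4. The function names are my own, and the interface name and MAC address you pass in are whatever applies to your setup.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # "every protocol" for AF_PACKET sockets on Linux
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}


def format_mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def summarize_frame(frame):
    """Return (src_mac, dst_mac, protocol, detail) for one Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    proto = ETHERTYPES.get(ethertype, f"0x{ethertype:04x}")
    payload = frame[14:]
    detail = ""
    if ethertype == 0x0800 and len(payload) >= 20:
        # IPv4 header: source address at bytes 12..15, destination at 16..19.
        src_ip = socket.inet_ntoa(payload[12:16])
        dst_ip = socket.inet_ntoa(payload[16:20])
        detail = f"{src_ip} -> {dst_ip}"
    elif ethertype == 0x0806 and len(payload) >= 28:
        # ARP over Ethernet/IPv4: sender IP at bytes 14..17, target IP at 24..27.
        sender_ip = socket.inet_ntoa(payload[14:18])
        target_ip = socket.inet_ntoa(payload[24:28])
        detail = f"target {target_ip}, sender {sender_ip}"
    return format_mac(src), format_mac(dst), proto, detail


def watch_board(ifname, board_mac):
    """Print every frame whose source MAC is board_mac (runs until interrupted)."""
    board_mac = board_mac.lower()
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.htons(ETH_P_ALL)) as sock:
        sock.bind((ifname, 0))
        while True:
            src, dst, proto, detail = summarize_frame(sock.recv(65535))
            if src == board_mac:
                print(f"{src} -> {dst} {proto} {detail}")
```

Call watch_board() with the server's interface name and the board's MAC address, then reset the board. No output at all after the kernel starts suggests the board is not transmitting. ARP or IPv4 lines with unexpected addresses point at a wrong IP configuration instead.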
This post is regarding the network issues brought up in the question, not about kernel debugging.
If your switch supports Spanning Tree Protocol (STP), keep in mind that STP may not activate the Ethernet port on the switch for 6 seconds or more while STP does its work. This delay may start over every time the host resets its Ethernet port, which can happen multiple times between power-up, the DHCP request, the moment the kernel loads the network drivers, etc. This can interfere with NFS boots for diskless systems, DHCP, kickstart, etc., and has caused plenty of headaches for many sysadmins. For some examples, see RedHat Bug 189795 – DHCP timeouts during Kickstart, and this PXE Guide.
Most high-end switches, such as Cisco switches and HP ProCurve switches, do support STP, and it’s enabled for all ports out of the box.
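One rough way to check whether a given switch port has this kind of delay is to time it from an ordinary Linux machine: plug the machine into the port and measure how long it takes until some other host behind the switch answers a ping. The sketch below does that in Python. The function names are my own, it assumes the machine has a static address on the same subnet as the target (otherwise you are also timing DHCP), and it relies on the usual Linux ping options -c (count) and -W (reply timeout in seconds).

```python
import subprocess
import time


def seconds_until(probe, timeout=90.0, interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True.

    Returns the elapsed seconds, or None if the timeout expires first.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


def ping_once(address):
    """Send one ICMP echo request and wait up to one second for the reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def time_first_reply(address):
    """Start this right after plugging the cable into the switch port."""
    return seconds_until(lambda: ping_once(address))
```

The number includes link negotiation and ARP as well, so treat it as an upper bound rather than an exact STP figure. If it is consistently much longer on the switch than on a direct cable between the two machines, the switch port is the likely suspect.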