What the heck is a bus error?

Lincoln Peters lincoln_peters at hotmail.com
Sat Mar 23 22:11:45 PST 2002


>From: ME <dugan at passwall.com>
>Reply-To: <talk at nblug.org>
>To: talk at nblug.org
>Subject: Re: What the heck is a bus error?
>Date: Sat, 23 Mar 2002 17:38:30 -0800 (PST)
>
>On Sat, 23 Mar 2002, Lincoln Peters wrote:
> > It's 64MB of RAM, not 32MB.  And these errors appear as early as
> > "Mounting proc filesystem"!
>
>I am sure you have done this, but the NFS-exported /proc is empty, right?
>And the dev directory (unless using devfs) has the necessary devices,
>right?

Yes and yes.

>
>One of the other problems with exporting your own root for other machines
>to use, is the /var/lock and other files normally set/reset on each
>boot. It is really better to have a separate export that is clean/free of
>these problems and allow each station to have its own /var, and /tmp as rw
>or if you have enough memory (you probably don't) then use a RAMDisk for
>/tmp and /var. (The exported /tmp, /var/lock, and /proc dirs for each
>station should be empty when they try to start in most cases.)

Palantir is *not* exporting its own root filesystem.  Thanks to a hardware 
donation by Tom Rowe, I was able to put a nice SCSI setup in Palantir.  The 
filesystem for the netbooting computers is on a SCSI hard disk mounted on 
the server as /netboot.

>
>(Imagine exporting your own root as rw and then when the client boots, it
>clobbers all your /tmp files and destroys all your /var/lock files. I am
>sure there are likely other areas that I am forgetting at this moment.)

I actually don't need to worry about that until I have more than one 
netbooting workstation.  At that point, I'll probably use tmpfs filesystems 
for all the necessary directories.

>
> > I haven't even made a netbooting machine start up to the point where it
> > can run anything beyond what's shown in /etc/rc.d/rc.sysinit!
> >
> > Is the mount command really that big a memory user?
>
>Nope, not by itself. Not even close. I was assuming you had 32MB on
>clients and may have been running close to tolerance.
>
> > Even if it's not related to this problem, it does sound like a good idea
> > to re-compile the /sbin programs to use shared libraries and conserve some
> > memory.  Anyone know how to do that on RedHat 7.2?  Should I just
> > download tarballs of all the programs in /sbin and manually re-compile them?
>
>You should not need to do this. I was thinking you may have had some
>scripts running at startup, sucking up memory (such as the locate database
>update, or a few cronjobs with anacron or others or perhaps multiple
>parallel copies of the same admin tool being called into existence.)

It's nice to have that cleared up.

>
> > I would try that if the errors were not occurring so early at boot time.
> > I do have the system set up for virtual memory, but the NTFS module is not
> > yet available.  However, I should have enough memory to get a usable system
> > at runlevel 1 (so I can install the module), shouldn't I?
>
>If you are getting these errors as early as mounting of proc, then any
>libs/so used by mount may be corrupted or damaged (as suggested by the
>other poster).

If so, they would have to have been corrupted after I copied the setup on
Isildur into /netboot on Palantir.  I ran a badblocks test on all hard
drives before I did anything with them, but I'll run another scan to see if
any of them might have been damaged after delivery.

Could the libs have been corrupted if the network between Isildur and 
Palantir was unreliable?  Or would there have been checksums to keep that 
from happening (NFS apparently runs on UDP)?
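
Whatever the transport guarantees, one way to test for this directly is to
compare checksums of the original files against the copies.  The following is
only a minimal sketch in Python, assuming both trees can be reached from one
machine (for example, with one of them mounted over the network); the function
names and any paths passed to them are hypothetical and not part of the actual
setup.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only real file contents count.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(original_root, copy_root):
    """Return (mismatched, missing) relative paths, each list sorted."""
    original = tree_digests(original_root)
    copy = tree_digests(copy_root)
    mismatched = sorted(
        p for p in original if p in copy and original[p] != copy[p]
    )
    missing = sorted(p for p in original if p not in copy)
    return mismatched, missing
```

Calling compare_trees() on the original library directory and its copy under
/netboot would list any files whose contents differ, plus any that never made
it across.  An empty result would point away from corruption during the copy.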

>
>Use ldd to test the different bins you are trying to start that exhibit
>these problems and see what libs/so are used by the bins (if any).

I'll have to wait until Monday before I can do that kind of a test on it.  I 
was only able to do the badblocks test because I'm running SSH on Palantir 
and wrote down its IP address.
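
For reference, the ldd check suggested above could be scripted roughly like
this.  It is a sketch only: it assumes ldd is on the PATH and that its output
uses the usual "name => path (address)" layout.  The strings used to detect a
static binary are my understanding of what glibc's ldd prints, so treat them
as assumptions, and the function names are made up for illustration.

```python
import subprocess


def parse_ldd_output(text):
    """Parse ldd output into {library name: resolved path, or None}."""
    libs = {}
    for line in text.splitlines():
        line = line.strip()
        if "=>" not in line:
            # Lines without "=>" (e.g. the dynamic loader itself) are skipped.
            continue
        name, _, target = line.partition("=>")
        name = name.strip()
        target = target.strip()
        if target.startswith("not found"):
            # The library is needed but could not be located.
            libs[name] = None
            continue
        # Drop the trailing load address, e.g. "(0x40020000)".
        path = target.split("(")[0].strip()
        if path:
            libs[name] = path
        # Entries with no path at all (virtual objects) are left out.
    return libs


def shared_libs(binary):
    """Run ldd on a binary; return None if it appears to be static."""
    result = subprocess.run(["ldd", binary], capture_output=True, text=True)
    combined = result.stdout + result.stderr
    # Assumed messages for binaries that are not dynamically linked.
    if "not a dynamic executable" in combined or "statically linked" in combined:
        return None
    return parse_ldd_output(result.stdout)
```

Running shared_libs() on each binary that fails (mount, for instance) gives
the list of libraries to check for corruption.  A result of None suggests the
binary is static, and a None value for a library means ldd could not find it.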

>
>If you get bus errors and seg faults with any items that are statically
>linked, then you can look into hardware problems, or corrupt binaries, or
>the above mentioned suggestions with looking at exporting an empty /proc.

Maybe I should also check the NT workstations for hardware problems.
