Broken NFS server (or client?)

Sat Aug 11 22:14:24 PDT 2001

>From: ME <dugan at passwall.com>
>Reply-To: <talk at nblug.org>
>To: talk at nblug.org
>Subject: Re: Broken NFS server (or client?)
>Date: Sat, 11 Aug 2001 11:35:54 -0700 (PDT)
>
>On Fri, 10 Aug 2001, Lincoln Peters wrote:
> > Here's something interesting: I recompiled my server's kernel and
> > rebooted.  When the server came back up, I tried to mount the root
> > filesystem on my non-diskless client, and got a "Connection refused"
> > error.  Then I started the user-space nfsd, tried again, and got the
> > error: "Wrong fs, bad option, bad superblock on MELAMPUS:/, or too many
> > mounted filesystems".  I don't know if that's an improvement or not.
>
>Yep, good information. :-)
>
> > ># cat /proc/filesystems
> > >and see what you have? if nfs does not show up on the client, see if
> > >there is a module and install it. Check on server, and also may want to
> > >install all nfs modules if they are not installed but available for
> > >that kernel.
> >

On the diskless client, all networking features, including support for NFS, 
are compiled straight into the kernel (not as modules).  On the non-diskless 
client, NFS support is compiled as modules.

> > It doesn't show up on either test client, but I didn't expect it to,
> > considering the error messages.

I should add something here: I have a directory of MP3 files shared on the 
non-diskless client using NFS, and I successfully loopback-mounted it to 
another path.  Then I saw nfs in /proc/filesystems.
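That behaviour fits NFS being built as a module: the type would only be listed once the module is loaded, which mounting does as a side effect. A minimal sketch of the check, assuming a POSIX shell with awk; the has_fs name and its optional second argument are made up for illustration:

```shell
# Succeed if the named filesystem type is registered with the kernel.
# The optional second argument (a file to read instead of
# /proc/filesystems) is only there so the check can be tried on a saved copy.
has_fs() {
    fs_name=$1
    fs_list=${2:-/proc/filesystems}
    # Lines look like "nodev<TAB>nfs" or "<TAB>ext2"; the type is the last field.
    awk -v want="$fs_name" '
        $NF == want { found = 1 }
        END { exit !found }' "$fs_list"
}

# Example (as root on the client); modprobe only helps if NFS was
# built as a module for the running kernel:
#   has_fs nfs || modprobe nfs
```

On the diskless client, where NFS is compiled in, the type should be listed without loading anything.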

>
>I do not have a 2.4 series kernel here, but would expect that you should
>see "nfs" in the list of filesystems that can be mounted. Did you
>configure your client machines to have support for client based NFS?

Yes, I enabled 'IP: Kernel-level autoconfiguration', 'DHCP Support', 'NFS 
client support', and 'Root filesystem on NFS'.  I have tried both with and 
without NFSv3 on both the client and server, but neither worked.

>
> > portmap is running, but the other services are not.
> >
> > The more I try to find out what's wrong, the more it looks like the NFS
> > server is not working.
>
>Well, if you are using the kernel based NFS server, you may only expect to
>see portmapper running. I might experiment with a 2.4 series kernel and
>NFS3 in kernel land at work on Sunday. I am going to go this route soon
>anyway for a bunch of netbooting diskless clients.

What kernel version are they using now at the Schultz Information Center?

> >
> > The test client that has a hard disk is running kernel 2.4.3 (from Red
> > Hat 7.1).
>
>I don't run RH 7.1, but I'll see what I can find in duplicating the
>problems you have been experiencing.

Since I'm not using a Red Hat 7.1 kernel for the diskless client, that 
probably won't be necessary.  I'm only using the Red Hat box to 
troubleshoot the diskless box.

>
...
>Na, no, I was looking for you to get the IP address of the disk based
>client that is running linux. It was mostly to check the issues of
>/etc/exports and a lack of allowed hosts in the listing.

192.168.0.2
I've tried setting permissions for my non-diskless client based on IP 
address and based on hostname, but in both cases, I get the same error.

>
> > >Now can try one of these:
> > ># exportfs -ar
> > >(Above should work for you, but if you can't find it/get it to work,
> > >then...)
> >
> > When I ran that command, there was no output of any kind.  Does that
> > mean anything?
>
>It should do that when it succeeds. If you want, you can try doing it
>again and then type
># echo $?
>
>if it gives a 0 ("zero") then it thinks it worked. A non-zero value may
>mean that it encountered a problem.
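
The same exit-status check can be wrapped up so the result is spelled out. This is only a sketch, assuming a POSIX shell; check_status is a made-up helper name, not part of nfs-utils:

```shell
# Run a command quietly and report whether it claimed success.
check_status() {
    "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok: $* exited with 0"
    else
        echo "problem: $* exited with $status"
    fi
    return "$status"
}

# Example (as root on the server):
#   check_status exportfs -ar
```

A zero only says the command itself did not notice a problem; it does not prove that a client can mount anything.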

It's a zero.  So exportfs reported success, but the mount still fails.  The
error message on the non-diskless client was a bit longer, though:
call_verify: server accept status: 1
call_verify: server accept status: 1
call_verify: server accept status: 1
RPC: garbage, exit EIO
nfs_get_root: getattr error = 5
call_verify: server accept status: 1
call_verify: server accept status: 1
call_verify: server accept status: 1
RPC: garbage, exit EIO
nfs_get_root: getattr error = 5
nfs_read_super: get root inode failed
nfs warning: mount version older than kernel
call_verify: server accept status: 1
call_verify: server accept status: 1
call_verify: server accept status: 1
RPC: garbage, exit EIO
nfs_get_root: getattr error = 5
nfs_read_super: get root inode failed
mount: wrong fs type, bad option, bad superblock on MELAMPUS:/,
       or too many mounted file systems
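
The numbers in that log can be decoded, with some caution. In the ONC RPC specification (RFC 1831) the accept status of a reply is an enum in which 1 is PROG_UNAVAIL, meaning whatever answered does not serve the requested program. Error 5 on Linux is EIO, which matches the "exit EIO" lines. Whether that points at the server or the client here is still a guess. A small sketch that tallies the call_verify lines from a saved log; both function names are made up for illustration:

```shell
# Map an ONC RPC accept_stat value (RFC 1831) to its symbolic name.
rpc_accept_stat_name() {
    case "$1" in
        0) echo SUCCESS ;;
        1) echo PROG_UNAVAIL ;;
        2) echo PROG_MISMATCH ;;
        3) echo PROC_UNAVAIL ;;
        4) echo GARBAGE_ARGS ;;
        5) echo SYSTEM_ERR ;;
        *) echo UNKNOWN ;;
    esac
}

# Read kernel log text on stdin and count the "server accept status" lines
# per status value. Other lines are ignored.
summarize_accept_status() {
    sed -n 's/^call_verify: server accept status: \([0-9][0-9]*\)$/\1/p' |
        sort -n | uniq -c |
        while read -r count code; do
            echo "$count x status $code ($(rpc_accept_stat_name "$code"))"
        done
}

# Example:
#   dmesg | summarize_accept_status
```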

>
>Actually, since you are using the kernel based nfs, I am not sure if this
>is the approved procedure now.
>
> > >shut down your nfs service and related services in the right order,
> > >and then restart them in the right order for that box. (Or you can
> > >just reboot the whole dang box if you would prefer.)
> >
> > Since I'm using a kernel-based NFS server, wouldn't I have to reboot
> > anyway?
>
>I don't know because I am still using the proc-land nfsd. I was waiting
>for kernel based nfs to get out of experimental, and then I was waiting
>for Solar Designer's Nonexecutable stack patch to be ported to 2.4. :-/
>
>It is likely his patch will come out soon, so I am going to start research
>with migration to 2.4 for our servers this weekend and target a migration
>for some servers before the new semester begins.
>
> > How about: (netstat -an)
> > Active Internet connections (servers and established)
> > Proto	Recv-Q	Send-Q	Local address	Foreign address		State
> > (a bunch of unrelated entries, followed by...)
> > tcp	0	0	0.0.0.0:111	0.0.0.0:*		LISTEN
> > udp	0	0	0.0.0.0:111	0.0.0.0:*
> >
> > (later on...)
> > Active UNIX domain sockets (servers and established)
> > Proto	RefCnt	Flags	Type	State	I-Node	Path
> > unix	2	[ ]	DGRAM		766
> > unix	2	[ ]	DGRAM		735
> >
>
>Even with a kernel based NFS server, I would expect to see the network
>ports open for service with a netstat. Hmm.
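
A rough way to test that expectation against netstat output is sketched below. It assumes NFS would be on its conventional port 2049, with portmap on 111 as in the listing above, and that the local address is the fourth column as in that listing. The port_is_open name is made up:

```shell
# Read `netstat -an` output on stdin; succeed if some tcp or udp socket
# has a local address ending in :PORT.
port_is_open() {
    awk -v port="$1" '
        $1 ~ /^(tcp|udp)/ && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }'
}

# Example (on the server):
#   netstat -an | port_is_open 2049 || echo "nothing bound to port 2049"
```

Running `rpcinfo -p` on the server is a complementary check, since it lists which RPC programs have actually registered with the portmapper.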
>
> > >Also, what version of nfs-utils and mount do you have installed on the
> > >client/server?
> >
> > On the server, nfs-utils-0.3.1-5 (from Red Hat 7.1).  The diskless test
> > client does not have nfs-utils, but the other test client had the same
> > version of nfs-utils as the server.
> >
> > >
> > >Also, as another thought, could you show me your /etc/hosts.allow and
> > >/etc/hosts.deny?
> > >
> >
> > On all systems except the router/firewall, those two files are empty.
>
>Well, there goes that thought. :-/
>
> > The /etc/hosts file on the server contains entries for all of the
> > computers on my network except for the diskless test client (and a few
> > Windows NT and 98 machines).  I need to find a way to discover the IP
> > address of the test client before I can enter it in.  Actually, I need
> > to see if the system is making DHCP requests at all.
>
>OK, then find the IP address of the disk based test system and edit your
>/etc/hosts to add a name like test1.netboot.yourdomain.com and then make
>the change to /etc/exports mentioned before to the /etc/exports file. It
>is less important to test now that we know your /etc/hosts.deny is empty.
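
To double-check that the exports file really names the client, something like the following could be used. It is only a sketch: exports_lists_client is a made-up helper, it matches plain host names or addresses exactly, and it does not understand wildcards, netgroups or subnet masks. The sample entries in the comments are assumptions too, apart from the address and host name taken from this thread.

```shell
# Succeed if FILE (default /etc/exports) has an entry for PATH that names
# CLIENT exactly, ignoring the "(options)" part after the host.
#
# Assumed sample entries:
#   /etc/hosts:    192.168.0.2   test1.netboot.yourdomain.com
#   /etc/exports:  /   test1.netboot.yourdomain.com(rw,no_root_squash)
exports_lists_client() {
    path=$1 client=$2 file=${3:-/etc/exports}
    awk -v path="$path" -v client="$client" '
        /^[ \t]*#/ { next }              # skip comment lines
        $1 == path {
            for (i = 2; i <= NF; i++) {
                host = $i
                sub(/\(.*/, "", host)    # drop "(rw,...)" options
                if (host == client) found = 1
            }
        }
        END { exit !found }' "$file"
}

# Example (on the server):
#   exports_lists_client / 192.168.0.2 || echo "client not listed for /"
```

After editing /etc/exports, the change still has to be re-exported, for instance with the exportfs -ar mentioned earlier.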

That change to /etc/hosts and /etc/exports did not seem to help.  I still 
get the same error.

>
>This is puzzling. I would continue to focus on these two things:
>1) clients should have nfs listed as a filesystem when cat-ing
>/proc/filesystems

nfs appeared on the non-diskless client.  I can't check the diskless client
because it won't start up, although I would assume that it's there, since
the kernel was explicitly compiled with NFS support.

>2) the server not showing the service ports for NFS being open brings up a
>question: is it really available for service?

I can try recompiling the kernel on the server WITHOUT the NFS server and 
running a user-space NFS server.  Maybe that would work, or at least provide 
some clues.
