[CentOS] NFS help

Mon Oct 24 17:32:56 UTC 2016
Matt Garman <matthew.garman at gmail.com>

On Sun, Oct 23, 2016 at 8:02 AM, Larry Martell <larry.martell at gmail.com> wrote:
>> To be clear: the python script is moving files on the same NFS file
>> system?  E.g., something like
>>
>>     mv /mnt/nfs-server/dir1/file /mnt/nfs-server/dir2/file
>>
>> where /mnt/nfs-server is the mount point of the NFS server on the
>> client machine?
>
> Correct.
>
>> Or are you moving files from the CentOS 7 NFS server to the CentOS 6 NFS client?
>
> No the files are FTP-ed to the CentOS 7 NFS server and then processed
> and moved on the CentOS 6 NFS client.


I apologize if I'm being dense here, but I'm now more confused about
this data flow.  Your use of "correct" and "no" seems inconsistent
with your explanation.  Sorry!

At any rate, what I was looking at was whether there's any way to
simplify this process and cut NFS out of the picture.  If you only
need to push these files around, what about rsync?

> The problem doing that is the files are processed and loaded to MySQL
> and then moved by a script that uses the Django ORM, and neither
> django, nor any of the other python packages needed are installed on
> the server. And since the server does not have an external internet
> connection (as I mentioned in my reply to Mark) getting it set up
> would require a large amount of effort.

...right, but I'm pretty sure rsync should be installed on the server;
I believe it's installed by default in all but the "minimal" setup
profile.  Either way, it's trivial to install, as I don't think it has
any dependencies.  You can download the rsync rpm from
mirror.centos.org, scp it to the server, and install it via yum.  And
Python is definitely installed (it's a requirement for yum), and Perl
is probably there as well, so with rsync plus some basic Perl/Python
scripting you can create your own mover script.
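
For example (the hostnames and paths below are placeholders I've made
up, not anything from your setup), the whole dance might look like:

    # fetch the rsync rpm from mirror.centos.org on a box that has internet
    # access, copy it over, and install it locally (no repo access needed)
    scp rsync-*.el7.x86_64.rpm root@nfs-server:/tmp/
    ssh root@nfs-server 'yum -y localinstall /tmp/rsync-*.el7.x86_64.rpm'

    # after that, pushing files off the server is a one-liner
    rsync -av /data/ftp-incoming/ user@c6-client:/data/incoming/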

Actually, rsync may not even be necessary; scp may be sufficient for
your purposes, and scp should definitely already be installed.
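
Something along these lines (again, made-up paths and hosts) would get
a file from A to B with NFS completely out of the loop:

    # -p preserves the modification time on the copy
    scp -p /data/ftp-incoming/somefile.dat user@c6-client:/data/incoming/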


> Also, we have this exact same setup on over 10 other systems, and it
> is only this one that is having a problem. The one difference with
> this one is that the server is CentOS7 - on all the other systems both
> the NFS server and client are CentOS6.

From what you've described so far, with what appears to be a
relatively simple config, C6 or C7 "shouldn't" matter.  However, under
the hood, C6 and C7 are quite different.

> The python script checks the modification time of the file, and only
> if it has not been modified in more than 2 minutes does it process it.
> Otherwise it skips it and waits for the next run to potentially
> process it. Also, the script can tell if the file is incomplete in a
> few different ways. So if it has not been modified in more than 2
> minutes, the script starts to process it, but if it finds that it's
> incomplete it aborts the processing and leaves it for next time.

This script runs on C7 or C6?
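
FWIW, that "untouched for 2 minutes" test is easy to reproduce by hand
if you want to sanity-check what the script sees.  Assuming the files
land somewhere like /data/ftp-incoming (a path I'm making up):

    # list files whose mtime is more than 2 minutes old, i.e. processing candidates
    find /data/ftp-incoming -type f -mmin +2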

> The hardware is new, and is in a rack in a server room with adequate
> and monitored cooling and power. But I just found out from someone on
> site that there is a disk failure, which happened back on Sept 3. The
> system uses RAID, but I don't know what level. I was told it can
> tolerate 3 disk failures and still keep working, but personally, I
> think all bets are off until the disk has been replaced. That should
> happen in the next day or 2, so we shall see.

OK, depending on the RAID scheme and how it's implemented, there could
be disk timeouts causing things to hang.
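
If it happens to be Linux md software RAID (just a guess on my part;
it could equally be a hardware controller), the array state is quick
to check:

    # shows each md array and whether any member disk is failed or rebuilding
    cat /proc/mdstat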


> I've been watching and monitoring the machines for 2 days and neither
> one has had a large CPU load, nor has either been using much memory.

How about iostat?  Also, good old "dmesg" can show whether the failed
drive is causing timeouts on that system.
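
For example (exact device names will vary on your hardware):

    # per-device utilization and wait times, refreshed every 5 seconds
    iostat -x 5

    # look for disk resets, timeouts, or I/O errors around the times things hang
    dmesg | grep -iE 'timeout|error|reset' | tail -50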


> None on the client. On the server it has 1 dropped Rx packet.
>
>> Do
>>> "ethtool <interface>" on both machines to make sure both are linked up
>>> at the correct speed and duplex.
>
> That reports only "Link detected: yes" for both client and server.

OK, but ethtool should also say something like:

...
Speed: 1000Mb/s
Duplex: Full
...

That's what you'd expect for a 1 Gbps link.  If Duplex is reported as
"half", that is definitely a problem.  Running netperf would give
further confirmation of whether or not your network is performing as
expected.
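
For instance, assuming the interface is eth0 (substitute whatever "ip
link" shows on your machines, and put the server's real IP in place of
the placeholder below):

    # full link details, including the Speed and Duplex lines above
    ethtool eth0

    # quick TCP throughput test; requires netserver to be running on the far end
    netperf -H 192.0.2.10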


> sar seems to be running, but I can only get it to report on the
> current day. The man page shows start and end time options, but is
> there a way to specify the start and end date?

If you want to report on a day in the past, you have to pass the file
argument, something like this:

sar -A -f /var/log/sa/sa23 -s 07:00:00 -e 08:00:00

That would show you yesterday's data between 7am and 8am.  The files
under /var/log/sa/saXX each cover one day; by default, XX is the day
of the month.
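
Since the network is one of the suspects here, the per-interface
counters for the same window may also be worth a look:

    # NIC throughput and packet/error rates for the 23rd, 7am-8am
    sar -n DEV -f /var/log/sa/sa23 -s 07:00:00 -e 08:00:00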