On Mon, Oct 24, 2016 at 5:25 PM, Matt Garman matthew.garman@gmail.com wrote:
On Mon, Oct 24, 2016 at 2:42 PM, Larry Martell larry.martell@gmail.com wrote:
At any rate, what I was looking at was seeing if there was any way to simplify this process, and cut NFS out of the picture. If you need only to push these files around, what about rsync?
It's not just moving files around. The files are read, and their contents are loaded into a MySQL database.
On what server does the MySQL database live?
The C6 host, same one that the script runs on. We can of course access the MySQL server from the C7 host, assuming the needed packages are there.
This site is not in any way connected to the internet, and you cannot bring in any computers, phones, or media of any kind. There is a process to get machines or files in, but it is onerous and time consuming. This system was set up and configured off site and then brought on site.
But clearly you have a means to log in to both the C6 and C7 servers, right? Otherwise, how would you be able to see these errors, check top/sar/free/iostat/etc?
And if you are logging in to both of these boxes, I assume you are doing so via ssh?
Or are you actually physically sitting in front of these machines?
The machines are on a local network. I access them with putty from a windows machine, but I have to be at the site to do that.
If you have ssh access to these machines, then you can trivially copy files to/from them. If ssh is installed and working, then scp should also be installed and working. Even if you don't have scp, you can use tar over ssh to the same effect. It's ugly, but doable, and there are examples online for how to do it.
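For reference, the tar-over-ssh idea can be wrapped in a few lines of Python. This is only a sketch, assuming Python 3 and working ssh logins between the hosts; the function names, the host string, and the directories in the usage comment are made up for illustration and are not from this setup.

```python
import shlex
import subprocess


def build_commands(src_dir, remote, dest_dir):
    """Return the local tar command and the ssh command that unpacks remotely."""
    # Local side: archive the contents of src_dir and write it to stdout.
    pack_cmd = ["tar", "-C", src_dir, "-cf", "-", "."]
    # Remote side: read the archive from stdin and unpack it into dest_dir.
    # dest_dir is quoted because ssh passes the command to a remote shell.
    unpack_cmd = ["ssh", remote, "tar -C " + shlex.quote(dest_dir) + " -xf -"]
    return pack_cmd, unpack_cmd


def tar_over_ssh(src_dir, remote, dest_dir):
    """Copy src_dir to dest_dir on remote by piping tar through ssh."""
    pack_cmd, unpack_cmd = build_commands(src_dir, remote, dest_dir)
    packer = subprocess.Popen(pack_cmd, stdout=subprocess.PIPE)
    unpacker = subprocess.Popen(unpack_cmd, stdin=packer.stdout)
    # Close our copy of the pipe so tar sees a broken pipe if ssh exits early.
    packer.stdout.close()
    unpack_rc = unpacker.wait()
    pack_rc = packer.wait()
    if pack_rc != 0 or unpack_rc != 0:
        raise RuntimeError(
            "transfer failed: tar exited %d, ssh exited %d" % (pack_rc, unpack_rc)
        )


# Hypothetical usage (user, host, and paths are placeholders):
# tar_over_ssh("/srv/ftp/incoming", "user@c6-host", "/data/incoming")
```

The destination directory has to exist on the remote side already; tar will not create it.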
Also: you made a couple of comments about these machines; it looks like the C7 box (FTP server + NFS server) is running bare metal (i.e. not a virtual machine). The C6 instance (NFS client) is virtualized.
Correct.
What host is the C6 instance?
Is the C6 instance running under the C7 instance? I.e., are both machines on the same physical hardware? If that is true, then your "network" (at least the one between C7 and C6) is basically virtual, and to have issues like this on the same physical box is certainly indicative of a mis-configuration.
Yes, the C6 instance is running on the C7 machine. What could be mis-configured? What would I check to find out?
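One thing that can be checked without installing anything is whether the kernel is counting errors or drops on the interfaces involved, on both the host and the guest. A minimal sketch, assuming Python 3 is available and that /proc/net/dev has the usual Linux layout (two header lines, then 16 counters per interface); the function names are made up for illustration.

```python
def parse_net_dev(text):
    """Map interface name -> dict of rx/tx error and drop counters."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        counters[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
        }
    return counters


def interfaces_with_problems(counters):
    """Return the sorted names of interfaces with any nonzero error/drop count."""
    return sorted(name for name, c in counters.items() if any(c.values()))


def read_net_dev(path="/proc/net/dev"):
    """Read and parse the live counters on a Linux host."""
    with open(path) as handle:
        return parse_net_dev(handle.read())


# Hypothetical usage on either machine:
# print(interfaces_with_problems(read_net_dev()))
```

The counters are cumulative since boot, so it is more telling to read them twice, before and after a run of the script, and compare.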
To run the script on the C7 NFS server instead of the C6 NFS client, many python libs will have to be installed. I do have someone off site working on setting up a local yum repo with what I need, and then we are going to see if we can zip and email the repo and get it on site. But none of us are sys admins and we don't really know what we're doing, so we may not succeed, and it may take longer than I will be here in Japan (I am scheduled to leave Saturday).
Right, but my point is you can write your own custom script(s) to copy files from C7 to C6 (based on rsync or ssh), do the processing on C6 (DB loading, whatever other processing), then move back to C7 if necessary. You said yourself you are a programmer not a sysadmin, so change the nature of the problem from a sysadmin problem to a programming problem.
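To make that concrete, here is a rough sketch of such a wrapper, assuming Python 3 and rsync on both hosts. The paths and host name in the usage comment, and the `process` callback standing in for the existing DB-loading step, are placeholders and not taken from the real system.

```python
import subprocess


def rsync_cmd(src, dest):
    """Build an rsync-over-ssh command; -a preserves times and permissions."""
    return ["rsync", "-a", "-e", "ssh", src, dest]


def pull_process_push(remote_src, work_dir, process, remote_done=None,
                      run=subprocess.check_call):
    """Copy files from C7 to a local directory, process them, optionally copy back.

    `process` is whatever already loads the files into MySQL; it receives the
    local directory. `run` exists so the commands can be inspected in a dry run.
    """
    # 1. Pull the new files over ssh instead of reading them through NFS.
    run(rsync_cmd(remote_src, work_dir))
    # 2. Do the existing processing against purely local files.
    process(work_dir)
    # 3. Optionally push the results back to the C7 side.
    if remote_done is not None:
        run(rsync_cmd(work_dir.rstrip("/") + "/", remote_done))


# Hypothetical usage (all names are placeholders):
# pull_process_push("user@c7-host:/srv/ftp/incoming/", "/data/work",
#                   load_into_mysql, remote_done="user@c7-host:/srv/ftp/done/")
```

Because `subprocess.check_call` raises on a nonzero exit status, a failed transfer stops the run before any processing starts.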
Yes, that is a potential solution I had not thought of. The issue with this is that we have the same system installed at many, many sites, and they all work fine. It is only this site that is having an issue. We really do not want to have different SW running at just this one site. Running the script on the C7 host is a change, but at least it will be the same software as every place else.
I'm certain I'm missing something, but the fundamental architecture doesn't make sense to me given what I understand of the process flow.
Were you able to run some basic network testing tools between the C6 and C7 machines? I'm interested specifically in netperf, which does round trip packet testing, both TCP and UDP. I would look for packet drops with UDP, and/or major performance outliers with TCP, and/or any kind of timeouts with either protocol.
netperf is not installed.
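Without netperf, a crude TCP round-trip check can still be put together from the Python standard library alone. This is only a sketch and not a replacement for netperf (it does no UDP or throughput testing). It assumes Python 3 on both hosts and a free TCP port that is not firewalled; all function names and the address and port in the usage comments are made up for illustration.

```python
import socket
import threading
import time


def serve_echo(listener):
    """Accept one connection and echo back whatever arrives until it closes."""
    conn, _ = listener.accept()
    try:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)
    finally:
        conn.close()


def start_echo_server(host, port):
    """Listen on host:port and serve a single client in a background thread."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    thread = threading.Thread(target=serve_echo, args=(listener,))
    thread.daemon = True
    thread.start()
    return listener, thread


def measure_rtt(host, port, count=100, payload=b"x" * 64, timeout=5.0):
    """Return round-trip times in seconds for `count` echoed messages.

    A stalled connection raises socket.timeout after `timeout` seconds.
    """
    rtts = []
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        # Send each small message immediately instead of letting TCP batch it.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendall(payload)
            received = 0
            while received < len(payload):
                chunk = sock.recv(len(payload) - received)
                if not chunk:
                    raise RuntimeError("connection closed by peer")
                received += len(chunk)
            rtts.append(time.perf_counter() - start)
    finally:
        sock.close()
    return rtts


def summarize(rtts):
    """Return (min, median, max) of the measured round-trip times."""
    ordered = sorted(rtts)
    return ordered[0], ordered[len(ordered) // 2], ordered[-1]


# Hypothetical usage (port and address are placeholders):
# on C7:  listener, thread = start_echo_server("0.0.0.0", 5001); thread.join()
# on C6:  print(summarize(measure_rtt("192.0.2.10", 5001, count=1000)))
```

On a virtual network inside one physical box the round trips should be consistently small, so a maximum far above the median, or any timeout, would point in the same direction as the NFS errors.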
How is name resolution working on both machines? Do you address machines by hostname (e.g., "my_c6_server"), or explicitly by IP address? Are you using DNS or are the IPs hard-coded in /etc/hosts?
Everything is by ip address.
To me it still "smells" like a networking issue...