I have a brand new poweredge 2900 with 10 SAS drives configured in two arrays via the built-in PERC 5 raid controller as:
raid 1: 2x73GB
raid 10: 8x300GB
It's got 4GB of RAM, and it's intended to be an NFS filestore.
For some strange reason, logging in with ssh works great, it returns a prompt, all seems well.
I go to run a simple command like 'top' or 'yum -y install <package>', and my xterm/ssh session just locks. In some cases, it's drawn half of the top screen and hung, in other cases, it doesn't even do that. Kill the xterm window, bring a new one up, right back in, try it again, it repeats.
What's interesting to me is that I have all kinds of other 'lesser' systems running CentOS 4.4, and I have none of these issues with them. My ~1.1TB raid 10 drive is sliced up into 4 parts, with the big one being about 950GB. Near as I can figure, I haven't hit any limitations, but I'm stumped by something that I *think* is probably either relatively trivial, or just a straight out hardware incompatibility. One thought is that it could be related to the Gb ethernet devices (bge).
Commands like 'ifconfig -a' work great. 'dmesg | grep eth0' locks up the session.
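One way to narrow this down (a sketch; the file path and the driver guess are assumptions, not something established in this thread): small-output commands work while big-output commands hang, so test whether it's the volume of data written over the ssh connection that triggers the hang.

```shell
#!/bin/sh
# Sketch: separate "the command hangs" from "writing lots of output
# over the ssh connection hangs". Run from the problem ssh session.
dmesg > /tmp/dmesg.out       # all output stays server-side
wc -c /tmp/dmesg.out         # one short line; safe to print over ssh
# If this completes while a plain 'dmesg' hangs the session, suspect
# the NIC/driver (large TCP writes), not the commands themselves.
```
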
This is relatively frustrating. Googling doesn't seem to net any real results, and I can't seem to find anything relevant in the logs.
One more relevant bit to add, this behavior does not exist from the console.
Peter
-- Peter Serwe <peter at infostreet dot com> http://www.infostreet.com
"The only true sports are bullfighting, mountain climbing and auto racing." -Ernest Hemingway
"Because everything else requires only one ball." -Unknown
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Hi,
Have you tried booting with the non-SMP kernel to see if the same problem occurs?
If you have an Intel NIC handy, you could try it to see if that eliminates the problem. I recently had trouble with the sky2 module, but in that case I would lose all network connectivity until the module was reloaded. I have had no problems since switching to the Intel NIC.
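Before swapping cards, it can help to confirm which kernel module actually binds each interface. A minimal sketch (it reads the sysfs `driver` symlink, so it needs no ethtool; virtual interfaces like `lo` simply show no driver):

```shell
#!/bin/sh
# Sketch: list each network interface and the kernel module driving
# it, via the /sys/class/net/<iface>/device/driver symlink.
REPORT=""
for dev in /sys/class/net/*; do
    name=$(basename "$dev")
    drv=$(readlink "$dev/device/driver" 2>/dev/null)
    REPORT="$REPORT$name -> ${drv##*/}
"
done
printf '%s' "$REPORT"
```
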
Hope that helps.
A.
Just out of curiosity, are you running the 64-bit or the 32-bit version of CentOS?
The 32-bit version usually gives fewer problems.
On Sunday 24 December 2006 00:37, Brent Rynn wrote:
Just out of curiosity, are you running the 64-bit or the 32-bit version of CentOS?
The 32-bit version usually gives fewer problems.
I'll take that flame bait... This is just FUD and BS, except when it comes to running some binary third-party software (which wasn't the case here). I/we run hundreds of 64-bit CentOS-4 machines, several of them Dell 2950/1950s (the rack versions of the 2900).
/Peter
Simply put, there is less support for 64-bit systems. Sometimes this can cause unexpected, weird problems. A person needs to be aware there is less support and therefore potential for conflicts. This particular problem, from what is described, perhaps needs a new card or a driver.
One can't assume the hardware and drivers will be exactly the same on all units of a server model like the 2900 (although they usually are the same or very close, conflicts can arise). Dell is known to change certain hardware due to supply and cost issues.
There is nothing wrong with CentOS in 64-bit, and it does indeed make sense in certain situations to deploy it now. When 64-bit is deployed, you need to be aware that the mainstream is still 32-bit and you may run into unique problems with that setup.
Brent Rynn wrote:
Just out of curiosity, are you running the 64-bit or the 32-bit version of CentOS?
The 32-bit version usually gives fewer problems.
Hi,
Have you tried booting with the non-SMP kernel to see if the same problem occurs?
If you have an Intel NIC handy, you could try it to see if that eliminates the problem. I recently had trouble with the sky2 module, but in that case I would lose all network connectivity until the module was reloaded. I have had no problems since switching to the Intel NIC.
Hope that helps.
A.
I'm going to snip the top/bottom posting and try and reply to both.
I'm relatively certain that the CentOS-4.4-i386-bin[1-4]of4.iso images are the 32-bit versions. I'm tempted to try the 64-bit version for grins, just to see if the behavior is different.
I will power the machine down and take a stab at it with an Intel NIC.
Peter
First I would try the NIC. If for some reason that doesn't work, it wouldn't hurt to try the 64-bit version.
I believe there is a physical limit of 4GB of RAM on a 32-bit system, so you may want to consider going 64-bit anyway for this server.
If none of this fixes it, you've got a bad motherboard or CPU, and it's time for Dell to replace it.
Brent wrote:
First I would try the NIC. If for some reason that doesn't work, it wouldn't hurt to try the 64-bit version.
I believe there is a physical limit of 4GB of RAM on a 32-bit system, so you may want to consider going 64-bit anyway for this server.
If none of this fixes it, you've got a bad motherboard or CPU, and it's time for Dell to replace it.
At this point, I've tried swapping the NICs; it doesn't seem to change the behavior.
I'm currently downloading the iso images for 4.4-ia64 just to see if that helps, but I'm not necessarily getting a warm and fuzzy that's going to do anything for me.
I also found that the system couldn't find the label of my almost-1TB RAID partition when I rebooted (I have one partition that's egregiously larger than the rest), and it choked, dropping me into a repair shell. After mounting the rest of the partitions by label, I changed /etc/fstab to reference the actual devices instead of bothering with the labels, and everything came up normally.
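For illustration, the fstab change described above might look like this (the device name, mount point, and filesystem type here are hypothetical examples, not taken from Peter's system):

```shell
#!/bin/sh
# Hypothetical before/after /etc/fstab lines for switching from
# LABEL= mounts to a direct device reference (names are examples).
FSTAB=$(cat <<'EOF'
# before: mount by label -- chokes at boot if the label can't be read
#LABEL=/data        /data        ext3    defaults        1 2
# after: reference the device node directly
/dev/sdb1           /data        ext3    defaults        1 2
EOF
)
printf '%s\n' "$FSTAB"
```
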
A straight 4GB of RAM, right on the limit, shouldn't be an issue; allegedly you can use the kernel-hugemem.i686 kernel if you exceed that, but at 4GB I should be fine with the regular 2.6.9-42.0.3.EL-smp-i686 kernel I'm running.
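A quick sanity check along those lines (a sketch; nothing here is specific to the PE2900) confirms which kernel flavor is booted and how much RAM the kernel actually sees:

```shell
#!/bin/sh
# Sketch: report booted kernel, architecture, and kernel-visible RAM.
# On an i686 kernel with more than ~4GB you'd expect to need hugemem.
MEM_KB=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "kernel=$(uname -r) arch=$(uname -m) MemTotal=${MEM_KB}kB"
```

If MemTotal reports noticeably less than the installed 4GB, the 32-bit kernel isn't addressing all of it, which is a separate clue from the ssh hang.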
I'm not precisely certain, but it seems to me that this might be an issue with the storage component more than anything else. Anybody have any anecdotal data on stripe/block sizing on the Dell boxes and its possible effects on performance?
I still don't see any issues from the console, but I'm taking a stab in the dark here. The only differences between the two machines I have sitting in front of me are that one's a dual-Celeron PE1950 with less storage and SATA drives (same amount of RAM, recognized fine by the 2.6.9-42.0.3.EL-smp-i686 kernel) and the other is a dual Xeon with the same RAM, a lot more storage, and SAS instead of SATA drives on a beefier RAID controller.
Other than that, there are probably all kinds of detailed differences in a bunch of the other tiny, usually (relatively) insignificant pieces of hardware that I usually don't run into when running various i386/686 *nix operating systems.
Peter
On Tue, 26 Dec 2006 at 6:10pm, Peter Serwe wrote
At this point, I've tried swapping the NICs; it doesn't seem to change the behavior.
I'm currently downloading the iso images for 4.4-ia64 just to see if that helps, but I'm not necessarily getting a warm and fuzzy that's going to do anything for me.
You're downloading the wrong image. You want x86_64. ia64 is for the Itanic.
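A way to check what the CPU supports before downloading media (a sketch; the `lm` flag test is the standard x86 long-mode check, not something stated in this thread):

```shell
#!/bin/sh
# Sketch: distinguish x86_64-capable CPUs from 32-bit-only ones.
# The 'lm' (long mode) flag in /proc/cpuinfo means the CPU can run
# x86_64 media; ia64 (Itanium) is a different architecture entirely.
if grep -qw lm /proc/cpuinfo; then
    VERDICT="x86_64-capable: use the x86_64 ISOs"
else
    VERDICT="32-bit only: stick with the i386 ISOs"
fi
echo "$VERDICT (currently running: $(uname -m))"
```
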
Peter Serwe wrote:
Not answering your question, but I have to ask: what does 'ifconfig -a' do? I checked 'man ifconfig' and it does not show an -a switch. Looked it up on the Internet, and still can't find an -a switch.
It seems like this is a NIC issue or a motherboard I/O issue. Do you have another NIC you can test with?
Not answering your question, but I have to ask: what does 'ifconfig -a' do? I checked 'man ifconfig' and it does not show an -a switch. Looked it up on the Internet, and still can't find an -a switch.
Hi,
From the ifconfig man page:
If no arguments are given, ifconfig displays the status of the currently active interfaces. If a single interface argument is given, it displays the status of the given interface only; if a single -a argument is given, it displays the status of all interfaces, even those that are down. Otherwise, it configures an interface.
A.
It seems like this is a NIC issue or a motherboard I/O issue. Do you have another NIC you can test with?
-- Damon L. Chesser damon@damtek.com damon@okfairtax.org
Andrew Bogecho wrote:
Ahhh, I figured it meant -a as in "all", but I was looking for a list of switches like:
-a, --all      lists all interfaces
-b, --bummer   this switch breaks everything
etc.
but I did not READ the man page, and neither did I see a list of switches. Figures. Thanks :)
Damon L. Chesser wrote:
Not answering your question, but I have to ask: what does 'ifconfig -a' do? I checked 'man ifconfig' and it does not show an -a switch. Looked it up on the Internet, and still can't find an -a switch.
It seems like this is a NIC issue or a motherboard I/O issue. Do you have another NIC you can test with?
I think that's hilarious about the -a switch. Perhaps if you typed it on the command line you'd see. :)
For some reason, CentOS 4.4's man page doesn't have the "-a" switch.
Excerpt from man 8 ifconfig under FreeBSD 6.2: (not relevant per se to CentOS, but documents the flag)
Optionally, the -a flag may be used instead of an interface name. This flag instructs ifconfig to display information about all interfaces in the system. The -d flag limits this to interfaces that are down, and -u limits this to interfaces that are up. When no arguments are given, -a is implied.
And output from 'ifconfig --help' on the CentOS 4.4 command line:
root@cfcu alias# ifconfig --help
Usage: ifconfig [-a] [-i] [-v] [-s] <interface> [[<AF>] <address>]
The very top line of the --help usage guide. (with no further explanation).
Also, -a is not implied in CentOS's ifconfig, you get more complete output on a CentOS box with it.
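To see the difference concretely (a sketch; interface counts will vary by machine, and it falls back gracefully if net-tools isn't installed):

```shell
#!/bin/sh
# Sketch: count interfaces reported with and without -a. The -a run
# should report at least as many, since it includes down interfaces.
if command -v ifconfig >/dev/null 2>&1; then
    UP=$(ifconfig | grep -c '^[A-Za-z0-9]')
    ALL=$(ifconfig -a | grep -c '^[A-Za-z0-9]')
    MSG="active: $UP  all (-a): $ALL"
else
    MSG="ifconfig (net-tools) not installed here"
fi
echo "$MSG"
```
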
I have to look around for a NIC. Not being able to use the GB NICs the system came with will be mildly problematic at best. I need the I/O throughput, and I don't have any spare GB NICs lying around, although I do have a dual-port Intel I can test with for a few minutes.
Peter
On Tue, 2006-12-26 at 10:31 -0800, Peter Serwe wrote:
I surmised the meaning of the -a switch by running it, and I also did 'ifconfig --help' and found, like you, that it was listed but no meaning given. I was just baffled by a switch not listed. Learn something new every day.
Have you checked support.dell.com for a NIC driver? I don't know about the Linux versions, as I have not worked on Dell Linux boxes, but the Windows versions have updated drivers there for all the components, like TOE (not applicable to you, as the TOE feature does not work in Linux). Perhaps you just need an updated driver for the gigabit NIC?
Damon L. Chesser wrote:
I surmised the meaning of the -a switch by running it, and I also did 'ifconfig --help' and found, like you, that it was listed but no meaning given. I was just baffled by a switch not listed. Learn something new every day.
Have you checked support.dell.com for a NIC driver? I don't know about the Linux versions, as I have not worked on Dell Linux boxes, but the Windows versions have updated drivers there for all the components, like TOE (not applicable to you, as the TOE feature does not work in Linux). Perhaps you just need an updated driver for the gigabit NIC?
Checked for an updated driver, didn't see anything. Swapped in the Intel Pro 1000/100/10, didn't see a change in the behavior. Going to try the x86_64 version, and if that doesn't work, probably just go back and run it on FreeBSD. Not that I can't be bothered to figure out what the problem is; I'm just at a loss, and not getting a response out of linux-poweredge yet either. We'll see, the jury's still out :)
Peter
On Wed, 2006-12-27 at 10:22 -0800, Peter Serwe wrote:
Peter,
Do you have Gold support? Those guys are very good at what they do. If you do, I would think that the linux guys at Dell know the answer to this question. If you do not, perhaps the basic support guys will know the answer?
Damon L. Chesser wrote:
Peter,
Do you have Gold support? Those guys are very good at what they do. If you do, I would think that the linux guys at Dell know the answer to this question. If you do not, perhaps the basic support guys will know the answer?
No gold, just the silver 7x24.
Peter
On Thu, 2006-12-28 at 10:12 -0800, Peter Serwe wrote:
No gold, just the silver 7x24.
Peter
Peter,
Silver sits side by side with Gold. Have you asked them? If anybody has seen this, they would know about the solution. The difference between Silver and Gold is not the skill level; it's the service level allowed, the latitude of movement to solve the issues.
Damon L. Chesser wrote:
Interestingly, the behavior completely disappeared with the installation of CentOS 4.4 x86_64. Somehow it didn't occur to me that using the distribution matched precisely to the hardware would be required, or that the normal i386 distribution wouldn't just run on it.
:)
Peter
Peter Serwe spake the following on 12/28/2006 3:21 PM:
It usually does run OK. But there must be some hardware glitch between your system and an i686 install.