OK, so I'm trying out CR for 7.5.1804, and I have upgraded three machines with it. One is my laptop, a Dell Precision M6700, which worked fine. Another is a Dell R715 server, and that worked fine. The third machine will not successfully reboot with the 3.10.0-862.el7.x86_64 kernel, throwing endless errors of the form "qla2xxx [0000:03:07.0]-5046:4: Async-gpdb failed - hdX-xxx portid=xxxxxx status=38 mb1=0 mb2=0 mb6=0 mb7". That's as much of the line as fits on the screen, and it's not recorded in /var/log/messages. C7 by default doesn't implement persistent journal storage, so journalctl --list-boots only lists the current boot.
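For anyone collecting these messages for a bug report, here is a minimal Python sketch that pulls the fields out of a line shaped like the one quoted above. The pattern is inferred from that single truncated sample, not from the driver source; the function name `parse_qla_line` and the group names (`pci`, `msg_id`, `host`) are hypothetical labels chosen for illustration, and reading the number after the message ID as a host number is an assumption.

```python
import re

# Layout inferred from the one truncated console line quoted in this thread.
# Group names are illustrative; "host" in particular is an assumption.
QLA_LINE = re.compile(
    r"qla2xxx \[(?P<pci>[0-9a-fA-F:.]+)\]-(?P<msg_id>[0-9a-fA-F]+):(?P<host>\d+): "
    r"(?P<text>.*)"
)
KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def parse_qla_line(line):
    """Return a dict of fields from one qla2xxx message, or None if no match."""
    m = QLA_LINE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Collect key=value pairs; a key cut off by the screen edge is skipped.
    fields["values"] = dict(KEY_VALUE.findall(fields["text"]))
    return fields


sample = ("qla2xxx [0000:03:07.0]-5046:4: Async-gpdb failed - hdX-xxx "
          "portid=xxxxxx status=38 mb1=0 mb2=0 mb6=0 mb7")
info = parse_qla_line(sample)
print(info["pci"], info["msg_id"], info["values"]["status"])
```

The trailing `mb7` has no value because the console truncated the line, so it is left out of `values` rather than guessed at.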
The system is a bit old, but is only used for development: a Dell PowerEdge SC1425 with two single-core 3.2GHz Xeons (obviously 64-bit capable, as it is already running stock CentOS 7, and kernel 3.10.0-693.21.1.el7.x86_64 runs fine). The system is connected to two EMC Clariion arrays via a dual-port Fibre Channel card. lspci tells me:
03:07.0 Fibre Channel: QLogic Corp. ISP2312-based 2Gb Fibre Channel to PCI-X HBA (rev 02)
03:07.1 Fibre Channel: QLogic Corp. ISP2312-based 2Gb Fibre Channel to PCI-X HBA (rev 02)
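When checking several machines for the same adapter, the lspci listing can be filtered mechanically. The following is a rough Python sketch that assumes the default one-line-per-device lspci text format shown above (slot, device class, then description); `find_fc_hbas` is a hypothetical helper name, and the sample text is just the two lines from this message.

```python
def find_fc_hbas(lspci_text):
    """Return (slot, description) pairs for devices whose class is 'Fibre Channel'."""
    hbas = []
    for line in lspci_text.splitlines():
        # First token is the PCI slot, e.g. "03:07.0".
        slot, sep, rest = line.strip().partition(" ")
        if not sep:
            continue  # blank or malformed line
        # The device class is everything up to the first ": ".
        dev_class, sep, description = rest.partition(": ")
        if sep and dev_class == "Fibre Channel":
            hbas.append((slot, description))
    return hbas


sample = """\
03:07.0 Fibre Channel: QLogic Corp. ISP2312-based 2Gb Fibre Channel to PCI-X HBA (rev 02)
03:07.1 Fibre Channel: QLogic Corp. ISP2312-based 2Gb Fibre Channel to PCI-X HBA (rev 02)
"""
for slot, description in find_fc_hbas(sample):
    print(slot, description)
```

On a live system the text would come from running lspci and capturing its output; the parsing is kept separate here so it can be tried against pasted output from any machine.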
Yeah, I know, not new, and maybe even deprecated (I actually don't see that adapter in the deprecated list for RHEL 7.5, but I do see ISP24xx ones there), but it worked fine with the 7.4.1708 kernel. I set the system to boot the older kernel until I can more fully troubleshoot. I have a lot invested in ISP24xx-based cards, which are on the deprecated list, and I am still running Fibre Channel, so I will have to find another solution before 7.x EOL.
So I consider this a regression; for the moment I'll boot the older kernel, but I know that isn't sustainable.
On 04/30/2018 07:51 AM, Lamar Owen wrote:
Lamar,
We need to see if this behavior is duplicated by using the RHEL-7.5 kernel.
Do you have the ability to test that?
If the behavior is the same as RHEL-7.5, then the mainline CentOS kernel will be that way too, BUT we would be glad to try to get something into the CentOS Plus kernel, especially if you can provide a config patch that works.
If it works in RHEL and not in CentOS, then we will start troubleshooting to find how we broke it.
Thanks, Johnny Hughes
If you're running McAfee products, it will kernel panic at boot.
On Mon, Apr 30, 2018, 12:36 PM Johnny Hughes <johnny@centos.org> wrote:
CentOS-devel mailing list CentOS-devel@centos.org https://lists.centos.org/mailman/listinfo/centos-devel
On 04/30/2018 05:25 PM, Lamar Owen wrote:
On 04/30/2018 12:36 PM, Johnny Hughes wrote:
If it works in RHEL and not in CentOS, then we will start troubleshooting to find how we broke it.
The RHEL kernel has the same issue. I'm going to file it upstream and see what happens.
Thanks Lamar.
If we need a config file change or patch to make it work, and if they are not fixing it upstream, then we can try to add something to the plus kernel.
On 05/01/2018 07:58 AM, Johnny Hughes wrote:
If we need a config file change or patch to make it work, and if they are not fixing it upstream, then we can try to add something to the plus kernel.
I'll see what I can figure out as time permits; things are very swamped at $dayjob right now. I have several qla2xxx cards available to me, and I'll check through them on a development machine that's not quite as old as that SC1425, without touching one of the production machines with connections to the arrays.