Jonas Björklund wrote:
Hello,
On Tue, 26 Sep 2006, Michael Kress wrote:
I tried the "in-kernel" driver, i.e. I changed the config to build 3w-9xxx into the 2.6.16 kernel rather than as a module. I must admit that the mkfs went through this time, but as soon as I run some additional 'dd' or 'cp' it takes less than 10 seconds to hit this crash (sorry, I have no more information than what's on the screen, because the machine crashes and, for obvious reasons, writes no logs). Do you have any idea?
Do you have a 64-bit PCI or a 32-bit PCI?
Hi,
It's 64-bit 133 MHz PCI-X. See http://www.supermicro.com/products/motherboard/Xeon800/E7520/X6DH8-G2+.cfm
There must be something different about the way the kernel and its components are put together, and unfortunately I don't have the knowledge to find it. Under the kernel that came with CentOS (2.6.9) the controller works perfectly; only the 2.6.16 kernel that comes with Xen causes trouble. Are there any more debug options I could activate to provide more details?
I don't want to move away from Xen. I've already tried the OpenVZ kernel, which works perfectly under high I/O load, but Xen appeals to me more. I hope this messy technical detail doesn't force me to switch to a different product.
Thanks for any more hints!
ciao - Michael
Joshua Baker-LePain wrote:
What version of the 3w-9xxx driver comes with the xen kernel? The 4.4 CentOS update kernel contains a fairly recent version, and I wouldn't be surprised if 2.6.16 has an older, less stable one. If that's the case, it's rather easy to update 3w-9xxx.
Michael Kress wrote:
Hello,
Xen as of version 3.0.2 includes kernel 2.6.16, which seems to include driver version 2.26.02.005:
3w-9xxx.c:#define TW_DRIVER_VERSION "2.26.02.005"
Does this help you find out more?
Thanks
Regards - Michael
Joshua Baker-LePain wrote:
Yep -- that's a rather old version, which is especially bad for the 9550 series boards. CentOS 4.4 has version 2.26.04.010.
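The version comparison in this exchange can be sketched in a few lines of Python. This is only an illustration: the helper names `driver_version` and `version_key` are made up here, and treating each dotted field as an integer is an assumption about how the 3w-9xxx version strings are ordered, not something 3ware documents in this thread.

```python
import re

# Matches the version macro exactly as it was quoted in the thread.
VERSION_RE = re.compile(r'#define\s+TW_DRIVER_VERSION\s+"([0-9.]+)"')

def driver_version(source_text):
    """Return the TW_DRIVER_VERSION string found in the text, or None."""
    match = VERSION_RE.search(source_text)
    return match.group(1) if match else None

def version_key(version):
    """Turn '2.26.02.005' into (2, 26, 2, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

# The grep output posted for the Xen 2.6.16 kernel tree.
xen_line = '3w-9xxx.c:#define TW_DRIVER_VERSION "2.26.02.005"'
xen_version = driver_version(xen_line)
centos_version = "2.26.04.010"  # version reported for CentOS 4.4

is_older = version_key(xen_version) < version_key(centos_version)
print(xen_version, "older than", centos_version, "->", is_older)
```

In practice the first argument would be the contents of 3w-9xxx.c from whichever kernel tree is being checked.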
Michael Kress wrote:
Well, I downloaded the driver from 3ware, but the newest one I could find was v2.26.02.007. Version 2.26.04.010 must be something specific to CentOS (which is what makes CentOS so stable with that device ;-) )!? BTW, I feel like one of those pilgrims travelling to magic fountains to fetch the water because it relieves back pain. :)
Sep 27 21:30:53 matrix kernel: 3ware 9000 Storage Controller device driver for Linux v2.26.02.007.
Sep 27 21:30:53 matrix kernel: ACPI: PCI Interrupt 0000:04:01.0[A] -> GSI 48 (level, low) -> IRQ 21
Sep 27 21:30:53 matrix kernel: 3w-9xxx: scsi2: Found a 3ware 9000 Storage Controller at 0xda300000, IRQ: 21.
Sep 27 21:30:53 matrix kernel: 3w-9xxx: scsi2: Firmware FE9X 3.04.00.005, BIOS BE9X 3.04.00.002, Ports: 4.
Sep 27 21:30:53 matrix kernel: Vendor: AMCC Model: 9550SXU-4L DISK Rev: 3.04
And I have disappointing news: the kernel dump now arrives even earlier, while running mkfs.ext3.
Where else can I look? Should I unpack kernel-2.6.9-42.0.2.EL.src.rpm, apply the patches, and take the driver source from there to include in the Xen kernel, or won't that work because of CentOS-specific patches? Thanks - Michael
In reply to Michael Kress's message of 9/27/2006 12:44 PM:
You can find the newer driver under the Fedora Core 4 level. I don't think 3ware has officially released it as an add-on for other distros, but Red Hat seems to have picked it up and added it to their kernel.
Joshua Baker-LePain wrote:
Download the driver for the 9.3.0.4 driver set for "Linux 2.6 (for supported distros)". That'll get you the source for 2.26.04.009.
Michael Kress wrote:
Hi!
OK, I tried 2.26.04.009 from "Linux 2.6 (for supported distros)". The mkfs.ext3 got a little further than before, but I still got my kernel dump. :-| Moreover, I now get a warning about a 'struct device_driver shutdown method'.
Sep 27 23:31:33 matrix kernel: 3ware 9000 Storage Controller device driver for Linux v2.26.04.009.
Sep 27 23:31:33 matrix kernel: Warning: PCI driver 3w-9xxx has a struct device_driver shutdown method, please update!
Sep 27 23:31:33 matrix kernel: ACPI: PCI Interrupt 0000:04:01.0[A] -> GSI 48 (level, low) -> IRQ 21
Sep 27 23:31:33 matrix kernel: scsi2 : 3ware 9000 Storage Controller
Sep 27 23:31:33 matrix kernel: 3w-9xxx: scsi2: Found a 3ware 9000 Storage Controller at 0xda300000, IRQ: 21.
Sep 27 23:31:33 matrix kernel: 3w-9xxx: scsi2: Firmware FE9X 3.04.00.005, BIOS BE9X 3.04.00.002, Ports: 4.
Sep 27 23:31:33 matrix kernel: Vendor: AMCC Model: 9550SXU-4L DISK Rev: 3.04
The dump:
[<c0164ac3>] cache_alloc_refill+0x63/0x230
[<c0105148>] hypervisor_callback+0x2c/0x34
[<c014007b>] module_get_kallsym+0xb/0xd0
[<c023d8fa>] force_evtchn_callback+0xa/0x10
[<c0149895>] buffered_rmqueue+0x285/0x2b0
[<c0149a62>] get_page_from_freelist+0xc2/0xf0
[<c0149ae7>] __alloc_pages+0x57/0x320
[<c0146b88>] generic_file_buffered_write+0x148/0x6c0
[<c0186d50>] file_update_time+0x50/0xd0
[<c014740d>] __generic_file_aio_write_nolock+0x30d/0x580
[<c02f8997>] tcp_v4_rcv+0x877/0xa10
[<c01d270e>] blk_run_queue+0x4e/0x70
[<c028cd50>] usb_hcd_irq+0x30/0x70
[<c01476d0>] generic_file_aio_write_nolock+0x50/0xd0
[<c0142e43>] __do_IRQ+0xb3/0x110
[<c01478c3>] generic_file_write_nolock+0xa3/0xd0
[<c0137170>] autoremove_wake_function+0x0/0x60
[<c0108eb1>] monotonic_clock+0x51/0xa0
[<c032ff84>] schedule+0x3f4/0x770
[<c0172f18>] blkdev_file_write+0x38/0x40
[<c01695e6>] vfs_write+0x1c6/0x1d0
[<c01696c1>] sys_write+0x51/0x80
[<c0104f85>] syscall_call+0x7/0xb
(sorry, the addresses may be off: this was OCR'd from a screenshot of the eric remote console card).
Any more hints? Would another approach work, e.g. somehow applying the Xen patches to the CentOS / upstream kernel and using that one?
regards - Michael
-- Michael Kress, kress@hal.saar.de http://www.michael-kress.de / http://kress.net P E N G U I N S A R E C O O L
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Have you tried updating the firmware on the card/trying the latest kernel/etc?
Using a new driver with old firmware is not a good idea.
Justin.
Justin Piszcz wrote:
Sep 27 23:31:33 matrix kernel: 3w-9xxx: scsi2: Firmware FE9X 3.04.00.005, BIOS BE9X 3.04.00.002, Ports: 4.
Have you tried updating the firmware on the card/trying the latest kernel/etc?
Using a new driver with old firmware is not a good idea.
As you can see in the quoted part above, the firmware _is_ current; there's no newer firmware available as far as I can tell from 3ware's homepage. BTW, the support person at 3ware wrote these lines regarding my issue:
"Use the in-kernel driver in 2.6.16. Don't compile the driver source from 9.3.0.4. If you want to get latest source, use the upstream source from kernel 2.6.14 or higher."
(which is what I've already tried).
Regards - Michael
Hi,
The problem with hangs and kernel dumps on my 3ware 9550SXU-4LP is solved!! Both solutions from http://lists.xensource.com/archives/html/xen-devel/2005-07/msg00431.html helped me, i.e. adding either the boot parameter 'noirqbalance' or the boot parameter 'nousb'. Of course I prefer 'noirqbalance', so I can still use the keyboard via my eric remote console. Now I can run mkfs.ext3 and lots of
dd if=/dev/zero of=/tmp/deleteme-`echo $RANDOM` bs=1024 count=102400
in parallel without running into any trouble, apart from a small performance bottleneck.
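After rebooting, it can be worth confirming that the parameter actually reached the running kernel. The sketch below checks a kernel command line for a flag such as 'noirqbalance'. The function names are made up for illustration and the example command line is hypothetical. /proc/cmdline is the standard Linux location for the running kernel's command line, but where the parameter has to be placed in the boot loader entry of a Xen setup is not covered here.

```python
def has_boot_param(cmdline, name):
    """Return True if name appears as a bare flag or as name=value."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False

def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()

# Hypothetical command line, used so the example runs anywhere.
example = "ro root=/dev/sda1 noirqbalance"
print(has_boot_param(example, "noirqbalance"))
print(has_boot_param(example, "nousb"))
```

On the real machine, `has_boot_param(read_cmdline(), "noirqbalance")` would perform the same check against the booted kernel.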
That must be some issue with my Supermicro X6DH8-G2, as this person had similar trouble: http://lists.xensource.com/archives/html/xen-devel/2005-07/msg00440.html
In any case, I'm going to stay with driver version v2.26.04.010, which ships with kernel-2.6.9-42.0.2.EL.src.rpm. Thanks for your help!
Michael
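For anyone who wants to repeat the parallel write load from the message above without juggling several dd processes, here is a rough Python sketch. It is an approximation under stated assumptions: it uses threads in one process rather than separate dd processes, the function names are invented for this example, and the small default in the demo is chosen only to keep it quick (the thread used bs=1024 count=102400, i.e. 100 MiB per file).

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def write_zero_file(path, block_size=1024, count=102400):
    """Write count blocks of zero bytes, similar to dd if=/dev/zero."""
    block = bytes(block_size)
    with open(path, "wb") as handle:
        for _ in range(count):
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())  # force the data out to the device
    return os.path.getsize(path)

def parallel_write_test(directory, jobs=4, block_size=1024, count=102400):
    """Run several writers at once and return the size of each file written."""
    paths = [os.path.join(directory, f"deleteme-{i}") for i in range(jobs)]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        sizes = list(pool.map(
            lambda p: write_zero_file(p, block_size, count), paths))
    for path in paths:
        os.remove(path)  # clean up the scratch files
    return sizes

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        # Small count for a quick demo; raise it to stress the controller.
        print(parallel_write_test(scratch, jobs=4, count=1024))
```

To exercise the RAID controller itself, the directory passed in has to live on a filesystem backed by that controller.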
Hi,
I'm sorry, I mixed up the mailing list addresses. Anyway, it's somewhat CentOS-related, and the thread touches on the question "why on earth does my 3ware 9550SX-LP controller work under CentOS but not under Xen?" ;-) See http://lists.xensource.com/archives/html/xen-users/2006-09/msg00760.html for the discussion thread. BTW, does anybody have an idea about this issue? Regards Michael