I am running irqbalance with default configuration on an Atom 330 machine. This CPU has 2 physical cores + 2 SMT (aka Hyperthreading) cores.
As shown below the interrupt for the eth0 device is always on CPUs 0 and 1, with CPUs 2 and 3 left idle. But why?
Maybe irqbalance prefers physical cores? My understanding, though, is that the even-numbered CPUs are the physical cores, with the odd-numbered ones being the SMT cores. If this understanding is correct, it means that irqbalance is toggling between a single physical core and its SMT sibling.
Any thoughts on why irqbalance is not using all 4 CPUs to distribute the eth0 interrupts?
Thanks.
---------------------
# cat /proc/interrupts
            CPU0        CPU1   CPU2   CPU3
  0:         717         568      0      0   IO-APIC-edge      timer
  1:           0           2      0      0   IO-APIC-edge      i8042
  8:          26          33      0      0   IO-APIC-edge      rtc0
  9:           0           0      0      0   IO-APIC-fasteoi   acpi
 12:           1           3      0      0   IO-APIC-edge      i8042
 17:       11409       11661      0      0   IO-APIC-fasteoi   ahci
 18:         264         424      0      0   IO-APIC-fasteoi   snd_hda_intel
 20:           0           0      0      0   IO-APIC-fasteoi   ohci_hcd:usb2
 21:           0           0      0      0   IO-APIC-fasteoi   ohci_hcd:usb3
 22:           0           0      0      0   IO-APIC-fasteoi   ehci_hcd:usb1
 24:   686292451   685980205      0      0   PCI-MSI-edge      eth0
---------------------
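For anyone who wants to compare the per-CPU distribution without reading the columns by eye, here is a minimal Python sketch that parses /proc/interrupts-style text and sums the counts per CPU. The function names and the SAMPLE text (three rows copied from the listing above) are illustrative assumptions, not part of irqbalance or any existing tool.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text.

    Returns (cpus, table) where cpus is the list of column names and
    table maps each IRQ label to {"counts": [...], "desc": "..."}.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        # Some rows (e.g. ERR, MIS) carry fewer numeric columns than CPUs.
        for field in fields[:len(cpus)]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = {
            "counts": counts,
            "desc": " ".join(fields[len(counts):]),
        }
    return cpus, table


def per_cpu_totals(cpus, table):
    """Sum interrupt counts per CPU column across all IRQ rows."""
    totals = [0] * len(cpus)
    for row in table.values():
        for i, count in enumerate(row["counts"]):
            totals[i] += count
    return dict(zip(cpus, totals))


# Illustrative sample: three rows taken from the listing in this thread.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:        717        568          0          0   IO-APIC-edge      timer
 17:      11409      11661          0          0   IO-APIC-fasteoi   ahci
 24:  686292451  685980205          0          0   PCI-MSI-edge      eth0
"""

if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on non-Linux systems
    cpus, table = parse_interrupts(text)
    for cpu, total in per_cpu_totals(cpus, table).items():
        print(cpu, total)
```

Running it before and after a burst of network traffic and comparing the totals would show which CPUs are actually servicing the eth0 interrupt.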
Steve Snyder wrote:
I am running irqbalance with default configuration on an Atom 330 machine. This CPU has 2 physical cores + 2 SMT (aka Hyperthreading) cores.
As shown below the interrupt for the eth0 device is always on CPUs 0 and 1, with CPUs 2 and 3 left idle. But why?
Maybe irqbalance prefers physical cores? My understanding, though, is that the even-numbered CPUs are the physical cores, with the odd-numbered ones being the SMT cores. If this understanding is correct, it means that irqbalance is toggling between a single physical core and its SMT sibling.
Any thoughts on why irqbalance is not using all 4 CPUs to distribute the eth0 interrupts?
I believe the hyperthreading cores are enumerated after the 'real' cores, i.e. logical CPUs 0 and 1 are the 'real' cores and logical CPUs 2 and 3 are the HT cores. You can see this by using 'lstopo', which is part of the hwloc package ('yum install hwloc').
I suspect interrupts only have meaning on the real cores, hence you are not seeing any on CPUs 2 and 3.
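If hwloc is not installed, the enumeration can also be checked from sysfs. The following is a rough Python sketch, assuming a Linux kernel that exposes /sys/devices/system/cpu/cpuN/topology/thread_siblings_list; the helper names are made up for illustration. It prints which logical CPUs share a physical core, which would confirm or refute the guess above about how the CPUs are numbered.

```python
import glob
import os


def parse_cpu_list(s):
    """Expand a kernel CPU list string such as "0,2" or "0-3" into ints."""
    cpus = []
    for part in s.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def sibling_groups(sysfs_root="/sys/devices/system/cpu"):
    """Return the sorted, de-duplicated groups of SMT sibling CPUs.

    Each group is a tuple of logical CPU numbers that share one core.
    Returns an empty list if the sysfs files are not present.
    """
    pattern = os.path.join(
        sysfs_root, "cpu[0-9]*", "topology", "thread_siblings_list"
    )
    groups = set()
    for path in glob.glob(pattern):
        with open(path) as fh:
            groups.add(tuple(parse_cpu_list(fh.read())))
    return sorted(groups)


if __name__ == "__main__":
    for group in sibling_groups():
        print("logical CPUs sharing a core:", group)
```

On a 2-core, 4-thread part, two groups of two CPUs each would be expected; whether they come out as (0, 1) and (2, 3) or as (0, 2) and (1, 3) depends on how the kernel enumerated them.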
James Pearson
Maybe this utility will be useful to you.
http://www.open-mpi.org/projects/hwloc/ Portable Hardware Locality (hwloc)
"The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently.
The democratization of multicore processors and NUMA architectures leads to the spreading of complex hardware topologies into the whole server world. Nowadays every single cluster node may contain tens of cores, hierarchical caches, and multiple memory nodes, making its topology far from flat. Such complex and hierarchical topologies have a strong impact on application performance. The developer must take hardware affinities into account when trying to exploit the actual hardware performance. (...)"