Hi there,
I stumbled upon a thread on debian-devel about hyperthreading problems on Intel Skylake and Kaby Lake processors (discovered in Q2 2016 by the OCaml developers, fixed by Intel in microcode updates in May 2017 for some, but not all, processors [1]). The "safe" solution is to disable hyperthreading in the BIOS, but there's a Perl script that tells you whether your CPU is affected, or affected and already patched in microcode. [2]
Considering that (probably) not all processors are fixed, would this warrant notifying our users via the centos-announce mailing list?
Best regards, Laurențiu
[1] https://lists.debian.org/debian-devel/2017/06/msg00308.html
[2] https://lists.debian.org/debian-user/2017/06/msg01011.html
On 27/06/17 10:08, Laurentiu Pancescu wrote:
Hi there,
I stumbled upon a thread on debian-devel about hyperthreading problems on Intel Skylake and Kaby Lake processors (discovered in Q2 2016 by the OCaml developers, fixed by Intel in microcode updates in May 2017 for some, but not all, processors [1]). The "safe" solution is to disable hyperthreading in the BIOS, but there's a Perl script that tells you whether your CPU is affected, or affected and already patched in microcode. [2]
Considering that (probably) not all processors are fixed, would this warrant notifying our users via the centos-announce mailing list?
Best regards, Laurențiu
[1] https://lists.debian.org/debian-devel/2017/06/msg00308.html
[2] https://lists.debian.org/debian-user/2017/06/msg01011.html
I have a potentially affected system. I've filed a bug with Red Hat to request microcode_ctl be updated to include the latest microcode:
https://bugzilla.redhat.com/show_bug.cgi?id=1465631
I can confirm the issue is not fixed in the current RHEL7.4beta microcode_ctl package.
In the meantime I've manually applied the microcode update on my affected system.
phil
On Tue, Jun 27, 2017 at 09:34:52PM +0100, Phil Perry wrote:
I have a potentially affected system. I've filed a bug with Red Hat to request microcode_ctl be updated to include the latest microcode:
https://bugzilla.redhat.com/show_bug.cgi?id=1465631
I can confirm the issue is not fixed in the current RHEL7.4beta microcode_ctl package.
The microcode update is already being worked on (no ETA yet): https://bugzilla.redhat.com/show_bug.cgi?id=1456339
In the meantime I've manually applied the microcode update on my affected system.
... https://downloadcenter.intel.com/download/26798/Linux-Processor-Microcode-Da... does not mention any Xeon E5 v4
But there is this changelog from the debian team: http://metadata.ftp-master.debian.org/changelogs/non-free/i/intel-microcode/...
...
intel-microcode (3.20170511.1) unstable; urgency=medium

  * New upstream microcode datafile 20170511
    ...
    + This release fixes undisclosed errata on the desktop, mobile and server processor models from the Haswell, Broadwell, and Skylake families, including even the high-end multi-socket server Xeons
    + Likely fix the TSC-Deadline LAPIC errata (BDF89, SKL142 and similar) on several processor families
    + Fix erratum BDF90 on Xeon E7v4, E5v4(?) (closes: #862606)
    + Likely fix serious or critical Skylake errata: SKL138/144, SKL137/145, SLK149
  * Likely fix nightmare-level Skylake erratum SKL150. Fortunately, either this erratum is very-low-hitting, or gcc/clang/icc/msvc won't usually issue the affected opcode pattern and it ends up being rare.
    SKL150 - Short loops using both the AH/BH/CH/DH registers and the corresponding wide register *may* result in unpredictable system behavior. Requires both logical processors of the same core (i.e. sibling hyperthreads) to be active to trigger, as well as a "complex set of micro-architectural conditions"
...
I am worried by the "This release fixes undisclosed errata ... including even the high-end multi-socket server Xeons".
It may relate to https://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-upda...
...
BDF76 An Intel® Hyper-Threading Technology Enabled Processor May Exhibit Internal Parity Errors or Unpredictable System Behavior
Problem: Under a complex series of microarchitectural events while running Intel Hyper-Threading Technology, a correctable internal parity error or unpredictable system behavior may occur.
Implication: A correctable error (IA32_MC0_STATUS.MCACOD=0005H and IA32_MC0_STATUS.MSCOD=0001H) may be logged. The unpredictable system behavior frequently leads to faults (e.g. #UD, #PF, #GP).
Workaround: It is possible for the BIOS to contain a workaround for this erratum.
Status: For the Steppings affected, see the Summary Tables of Changes.
...
Tru
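Since both SKL150 and BDF76 as quoted above only come into play with hyperthreading active, it can be useful to check whether any core currently exposes sibling hyperthreads. The following is only a rough Python sketch, assuming a Linux system that exposes the usual sysfs topology files; the function names are made up for illustration and are not part of any existing tool.

```python
import glob


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0,4' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_active(sibling_lists):
    """True if any core reports more than one online logical CPU."""
    return any(len(parse_cpu_list(s)) > 1 for s in sibling_lists)


def read_sibling_lists():
    """Read thread_siblings_list for every CPU (empty list if sysfs is missing)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as fh:
            lists.append(fh.read())
    return lists
```

Calling smt_active(read_sibling_lists()) on the machine in question returns False once hyperthreading has been disabled in the BIOS, because each core then lists only itself as a sibling.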
On 27/06/17 23:17, Tru Huynh wrote:
But there is this changelog from the debian team: http://metadata.ftp-master.debian.org/changelogs/non-free/i/intel-microcode/...
FWIW, Fedora also updated its microcode to version 20170511 on May 23rd, according to their changelog. Not sure if it would work to just get the microcode files from their package (until RH releases an updated version). Disabling hyperthreading feels less experimental, at a moderate cost in performance.
Laurențiu
On 28/06/17 10:06, Laurentiu Pancescu wrote:
On 27/06/17 23:17, Tru Huynh wrote:
But there is this changelog from the debian team: http://metadata.ftp-master.debian.org/changelogs/non-free/i/intel-microcode/...
FWIW, Fedora also updated its microcode to version 20170511 on May 23rd, according to their changelog. Not sure if it would work to just get the microcode files from their package (until RH releases an updated version). Disabling hyperthreading feels less experimental, at a moderate cost in performance.
Laurențiu
I downloaded the latest microcode archive from Intel here:
https://downloadcenter.intel.com/download/26798/Linux-Processor-Microcode-Da...
and unpacked the intel-ucode folder into /usr/lib/firmware/intel-ucode/ overwriting the files provided by the microcode_ctl package.
To manually force an update without rebooting, do:
echo 1 > /sys/devices/system/cpu/microcode/reload
dmesg then shows the microcode has been updated:
[693680.818073] microcode: CPU0 sig=0x506e3, pf=0x2, revision=0xa0
[693680.818944] microcode: CPU0 updated to revision 0xba, date = 2017-04-09
[693680.818993] microcode: CPU1 sig=0x506e3, pf=0x2, revision=0xa0
[693680.819861] microcode: CPU1 updated to revision 0xba, date = 2017-04-09
[693680.819946] microcode: CPU2 sig=0x506e3, pf=0x2, revision=0xa0
[693680.820788] microcode: CPU2 updated to revision 0xba, date = 2017-04-09
[693680.820834] microcode: CPU3 sig=0x506e3, pf=0x2, revision=0xa0
[693680.821622] microcode: CPU3 updated to revision 0xba, date = 2017-04-09
I haven't rebooted, but I'm assuming the new microcode files will get written upon reboot.
I've not experienced any issues since manually updating as described above so I'm not waiting on Red Hat for an update.
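For anyone wanting to check the dmesg output on several machines, here is a small Python sketch that pulls the old and new revision per CPU out of lines like the ones above and decodes the sig value into family/model/stepping using the standard CPUID bit layout. It assumes the kernel prints the messages in exactly the format shown in this mail (other kernel versions may word them differently), and the function names are my own, purely for illustration.

```python
import re

# Patterns match the two message shapes quoted in this thread.
SIG = re.compile(
    r"microcode: CPU(\d+) sig=(0x[0-9a-fA-F]+), pf=(0x[0-9a-fA-F]+), "
    r"revision=(0x[0-9a-fA-F]+)"
)
UPDATED = re.compile(
    r"microcode: CPU(\d+) updated to revision (0x[0-9a-fA-F]+), date = (\S+)"
)


def decode_signature(sig):
    """Split a CPUID signature into (family, model, stepping)."""
    stepping = sig & 0xF
    model = (sig >> 4) & 0xF
    family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    if family in (6, 15):
        model |= ext_model << 4  # extended model only applies to these families
    if family == 15:
        family += ext_family
    return family, model, stepping


def summarize(log_text):
    """Map CPU number -> {'sig': ..., 'old': ..., 'new': ...} from dmesg text."""
    cpus = {}
    for m in SIG.finditer(log_text):
        entry = cpus.setdefault(int(m.group(1)), {})
        entry["sig"] = int(m.group(2), 16)
        entry["old"] = int(m.group(4), 16)
    for m in UPDATED.finditer(log_text):
        cpus.setdefault(int(m.group(1)), {})["new"] = int(m.group(2), 16)
    return cpus
```

Feeding it the output of dmesg gives one entry per CPU; an entry without a "new" key means no update message was logged for that CPU. For the signature in my log, decode_signature(0x506e3) yields family 6, model 94, stepping 3.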
On Wed, Jun 28, 2017 at 3:16 PM, Phil Perry phil@elrepo.org wrote:
On 28/06/17 10:06, Laurentiu Pancescu wrote:
On 27/06/17 23:17, Tru Huynh wrote:
But there is this changelog from the debian team: http://metadata.ftp-master.debian.org/changelogs/non-free/i/intel-microcode/unstable_changelog
FWIW, Fedora also updated its microcode to version 20170511 on May 23rd, according to their changelog. Not sure if it would work to just get the microcode files from their package (until RH releases an updated version). Disabling hyperthreading feels less experimental, at a moderate cost in performance.
Laurențiu
I have an up-to-date Fedora 25 system that is affected, and running the Perl program by Henrique Holschuh shows that my microcode is new enough, so I can confirm that the microcode in Fedora 25 is already corrected.
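As a rough illustration of the kind of check involved (this is not the Perl program itself, and I have not looked at how it decides), the running microcode revision can be read from /proc/cpuinfo and compared against a minimum. The Python sketch below assumes the x86 "microcode" field is present; the minimum revision is deliberately left as a parameter, since the right threshold depends on the processor model and has to come from Intel's or the distribution's advisories.

```python
def parse_cpuinfo(text):
    """Return one dict of fields per logical CPU from /proc/cpuinfo-style text."""
    cpus = []
    for block in text.strip().split("\n\n"):
        info = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                info[key.strip()] = value.strip()
        if info:
            cpus.append(info)
    return cpus


def outdated_cpus(cpus, min_revision):
    """List logical CPU numbers whose microcode revision is below min_revision.

    A CPU with no 'microcode' field is treated as revision 0, i.e. outdated.
    """
    stale = []
    for info in cpus:
        revision = int(info.get("microcode", "0x0"), 16)
        if revision < min_revision:
            stale.append(int(info.get("processor", "-1")))
    return stale


def read_cpuinfo(path="/proc/cpuinfo"):
    """Parse the live /proc/cpuinfo (Linux only)."""
    with open(path) as fh:
        return parse_cpuinfo(fh.read())
```

For example, outdated_cpus(read_cpuinfo(), 0xba) would use the revision that Phil's signature 0x506e3 CPUs were updated to earlier in this thread; that value is only an example and should not be taken as the correct threshold for other models.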