On Tue, Jun 27, 2017 at 09:34:52PM +0100, Phil Perry wrote:
I have a potentially affected system. I've filed a bug with Red Hat to request microcode_ctl be updated to include the latest microcode:
https://bugzilla.redhat.com/show_bug.cgi?id=1465631
I can confirm the issue is not fixed in the current RHEL7.4beta microcode_ctl package.
The microcode update is already being worked on, but there is no ETA yet:
https://bugzilla.redhat.com/show_bug.cgi?id=1456339
In the meantime I've manually applied the microcode update on my affected system.
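To confirm that a manually applied update actually took effect, one option is to compare the microcode revision the kernel reports before and after loading it. The following is only a minimal sketch: it assumes an x86 Linux system where /proc/cpuinfo exposes a "microcode" field per logical CPU, and the helper name microcode_revisions is made up for illustration.

```python
import re


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo text.

    Assumption: each logical CPU has a line such as "microcode : 0x...".
    More than one value in the result means the CPUs disagree.
    """
    revisions = set()
    for line in cpuinfo_text.splitlines():
        match = re.match(r"microcode\s*:\s*(0x[0-9a-fA-F]+)", line)
        if match:
            revisions.add(int(match.group(1), 16))
    return revisions


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        revs = microcode_revisions(f.read())
    # Print every distinct revision in hex, or a note if the field is absent.
    print(", ".join(hex(r) for r in sorted(revs)) or "no microcode field found")
```

Run it once before and once after the update; the reported revision should change if the new microcode was loaded.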
... https://downloadcenter.intel.com/download/26798/Linux-Processor-Microcode-Da... does not mention any Xeon E5 v4
But there is this changelog from the Debian team:
http://metadata.ftp-master.debian.org/changelogs/non-free/i/intel-microcode/...
...
intel-microcode (3.20170511.1) unstable; urgency=medium

  * New upstream microcode datafile 20170511
    ...
    + This release fixes undisclosed errata on the desktop, mobile and
      server processor models from the Haswell, Broadwell, and Skylake
      families, including even the high-end multi-socket server Xeons
    + Likely fix the TSC-Deadline LAPIC errata (BDF89, SKL142 and
      similar) on several processor families
    + Fix erratum BDF90 on Xeon E7v4, E5v4(?) (closes: #862606)
    + Likely fix serious or critical Skylake errata: SKL138/144,
      SKL137/145, SLK149
  * Likely fix nightmare-level Skylake erratum SKL150. Fortunately,
    either this erratum is very-low-hitting, or gcc/clang/icc/msvc
    won't usually issue the affected opcode pattern and it ends up
    being rare.
    SKL150 - Short loops using both the AH/BH/CH/DH registers and
    the corresponding wide register *may* result in unpredictable
    system behavior. Requires both logical processors of the same
    core (i.e. sibling hyperthreads) to be active to trigger, as
    well as a "complex set of micro-architectural conditions"
...
I am worried by the "This release fixes undisclosed errata ... including even the high-end multi-socket server Xeons".
It may relate to
https://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-upda...
...
BDF76 An Intel® Hyper-Threading Technology Enabled Processor May Exhibit
Internal Parity Errors or Unpredictable System Behavior

Problem: Under a complex series of microarchitectural events while running
Intel Hyper-Threading Technology, a correctable internal parity error or
unpredictable system behavior may occur.

Implication: A correctable error (IA32_MC0_STATUS.MCACOD=0005H and
IA32_MC0_STATUS.MSCOD=0001H) may be logged. The unpredictable system
behavior frequently leads to faults (e.g. #UD, #PF, #GP).

Workaround: It is possible for the BIOS to contain a workaround for this
erratum.

Status: For the Steppings affected, see the Summary Tables of Changes.
...
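If a machine check has been logged on a suspect box, the BDF76 signature quoted above can be tested against the raw IA32_MC0_STATUS value. This is a rough sketch, not a diagnostic tool: it assumes the usual IA32_MCi_STATUS layout (MCACOD in bits 15:0, MSCOD in bits 31:16, VAL in bit 63), and the function name is hypothetical. A match only means the logged error has the same codes as the erratum describes; it does not prove the erratum was the cause.

```python
def matches_bdf76_signature(mc0_status):
    """Check a raw 64-bit IA32_MC0_STATUS value against the BDF76 codes.

    Assumed layout: MCACOD = bits 15:0, MSCOD = bits 31:16, VAL = bit 63.
    """
    mcacod = mc0_status & 0xFFFF           # MCA error code
    mscod = (mc0_status >> 16) & 0xFFFF    # model-specific error code
    valid = bool((mc0_status >> 63) & 1)   # register holds a valid error
    return valid and mcacod == 0x0005 and mscod == 0x0001


if __name__ == "__main__":
    # Synthetic value built for illustration only, not taken from a real log.
    example = (1 << 63) | (0x0001 << 16) | 0x0005
    print(matches_bdf76_signature(example))
```

The status value itself would have to come from whatever the system logged (for example the kernel log or the machine check logging daemon in use).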
Tru