[Ci-users] other hardware outage for nodes behind ci.centos.org

Sat Jun 23 06:39:10 UTC 2018
Fabian Arrotin <arrfab at centos.org>

On 19/06/18 11:08, Fabian Arrotin wrote:
> On 16/06/18 08:39, Fabian Arrotin wrote:
>> Hi guys,
>> While investigating the hardware issue we had on one Seamicro chassis in
>> the last few days (see previous thread), we lost another one (completely at
>> this point) during the night, so I have disabled all of those nodes too
>> (which means 64 fewer bare-metal nodes in the pool).
>> I'll create a ticket with the DC to see whether the issue can be
>> investigated, and I'll keep the list informed about the status.
>> Thanks for your understanding,
> Just to let you know that we're still waiting on some input from the DC
> about that unreachable Seamicro chassis (Pufty), so I can't even give
> you an ETA on this.
> OTOH, we were able to get the previous chassis (Gusty) back online. I ran
> some parallel reinstalls over the whole weekend and yesterday, and it
> seems only one compute card (out of 64) really has a problem. That
> specific node/card is now isolated, and I put the chassis back in
> action in the duffy pool (so nodes were reinstalled, and I see some were
> even deployed for some CI projects today already).
> More information about the Pufty chassis when I have something to report.

[update] That "Pufty" chassis is now back online, but under investigation.
We're stress-testing it over the weekend (multiple reinstalls in parallel)
to see whether it works as it should. If it does, we'll add it back to the
CI node pool.

Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 56BEC54E | twitter: @arrfab
