[Ci-users] other hardware outage for nodes behind ci.centos.org

On 16/06/18 08:39, Fabian Arrotin wrote:
> Hi guys,
> 
> While investigating the hardware issue we had on one Seamicro chassis in
> the last few days (see previous thread), we lost another one (completely
> at this point) during the night, so I have disabled all of those nodes
> too (which means 64 fewer bare-metal nodes in the pool).
> 
> I'll create a ticket with the DC to see if it's possible to investigate
> the issue, and I'll keep the list informed about the status.
> 
> Thanks for your understanding,
> 

Just to let you know that we're still waiting on input from the DC about
the unreachable Seamicro chassis (Pufty), so I can't give you any ETA on
this yet.

OTOH, we were able to get the previous chassis (Gusty) back online. I
ran parallel reinstalls over the whole weekend and yesterday, and it
seems only one compute card (out of 64) really has a problem. That
specific node/card is now isolated, and I put the chassis back in action
in the duffy pool (so nodes were reinstalled, and I see some were even
deployed for some CI projects today already).

More information about the Pufty chassis when I have something to report.
-- 
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 56BEC54E | twitter: @arrfab
