Hi guys,
While investigating the hardware issue we had on one Seamicro chassis over the last few days (see previous thread), we lost another one (completely, at this point) during the night, so I have disabled all of its nodes too (that means 64 fewer bare-metal nodes in the pool).
I'll create a ticket with the DC to see if it's possible to investigate the issue, and I'll keep the list informed about the status.
Thanks for your understanding,
On 16/06/18 08:39, Fabian Arrotin wrote:
Just to let you know that we're still waiting on some input from the DC about that unreachable Seamicro chassis (Pufty), so I can't even give you an ETA on this.
OTOH, we were able to get the previous chassis (Gusty) back online. I did some parallel reinstalls over the whole weekend and yesterday, and it seems only one compute card (out of 64) really has a problem, so that specific node/card is now isolated and I've put the chassis back into action in the Duffy pool (nodes were reinstalled, and I see some were even deployed for CI projects today already).
More information about the Pufty chassis when I have something to report.
On 19/06/18 11:08, Fabian Arrotin wrote:
[update] The "Pufty" chassis is now back online, but still under investigation. We're "stress-testing" it over the weekend to see if it's working as it should (multiple reinstalls in parallel), and if that's OK, we'll add it back into the CI nodes pool.
Cheers,
Is this why I would be seeing issues getting Duffy nodes? I saw this on two separate Jenkins slaves.
-== @ri ==-
On 23/06/18 13:05, Ari LiVigni wrote:
Is this why I would be seeing issues getting Duffy nodes? I saw this on two separate Jenkins slaves.
No, as those nodes aren't actually used by Duffy, and there are more than enough nodes in the ready pool, so what's the error you're getting?
I wanted to get some Duffy resources to validate some playbook roles, and I used cico:
[fedora-atomic@slave03 ~]$ cico --api-key duffy.key node get
Starting new HTTP connection (1): admin.ci.centos.org
Resetting dropped connection: admin.ci.centos.org
The requested operation failed as no inventory is available.
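A minimal retry sketch around that same call (assuming the duffy.key file shown above, and assuming cico exits non-zero when allocation fails; the retry count and sleep values are arbitrary):

#!/bin/bash
# Retry Duffy node allocation via cico a few times, in case the
# "no inventory is available" condition is transient.
# Assumptions: duffy.key is in the current directory, cico is on PATH,
# and cico returns a non-zero exit code when allocation fails.
for attempt in 1 2 3 4 5; do
    if cico --api-key duffy.key node get; then
        echo "Node(s) allocated on attempt ${attempt}"
        exit 0
    fi
    echo "No inventory on attempt ${attempt}; sleeping 60s before retrying..."
    sleep 60
done
echo "Giving up after 5 attempts" >&2
exit 1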
Is there another way I should be allocating duffy resources?
-== @ri ==- My PGP fingerprint is F87F1EE7CD8BEE13