Hi all,
this Sunday, our Zabbix monitoring platform alerted us that many nodes in the community cage were unreachable. After some investigation, we found that a whole rack had lost connectivity to the network switches. This is causing issues for multiple workloads, including the Duffy CI infra environment.
We'll check with RH IT (who manage the network switches) how network connectivity can be restored, but if the cause is a faulty network switch, we'll have no ETA: as a reminder, everything in that cage runs on out-of-warranty hardware, and therefore without any SLA/SLE.
We'll keep you informed about the status when possible.
On 01/10/2023 22:24, Fabian Arrotin wrote:
Just to let you know that we were able (thanks to @Michael from the RH Network team) to identify and fix the issue, so https://duffy.ci.centos.org should now be back online and operational.