Due to hardware maintenance that needs to take place on the NFS storage node used by OpenShift (both the "legacy" cluster and the current one, OCP), we'll have to shut down the OpenShift clusters and then proceed with the hardware maintenance on the NFS server (which itself needs to be powered down; there is no way to do that "online").
The maintenance is scheduled for Wednesday, September 30th, 12:00 pm (noon) UTC. You can convert to local time with $(date -d '2020-09-30 12:00 UTC')
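For anyone who wants to check several time zones at once, here is a minimal sketch built on the same command. It assumes GNU date (for the -d option) and an installed time zone database; the zone list and the to_zone helper name are only illustrative and not part of the announcement.

```shell
#!/bin/sh
# Print one UTC timestamp in several time zones (GNU date assumed).
# to_zone is a hypothetical helper; the zone list is only an example.
to_zone() {
    # $1: value for the TZ variable, $2: timestamp understood by `date -d`
    TZ="$1" date -d "$2" '+%Y-%m-%d %H:%M %Z'
}

for zone in UTC Europe/Brussels America/New_York Asia/Kolkata; do
    printf '%-18s %s\n' "$zone" "$(to_zone "$zone" '2020-09-30 12:00 UTC')"
done
```

If a zone name is unknown on your system, date may silently fall back to UTC, so it is worth double-checking the abbreviation printed at the end of each line.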
The expected downtime is estimated at ~60 minutes: the time needed to shut down the machine, install new disks, restart the machine, and do some updates and tuning on the setup.
For more information, here are some relevant tickets that were created for the performance issues in OpenShift and NFS:
https://pagure.io/centos-infra/issue/53
https://pagure.io/centos-infra/issue/105
https://pagure.io/centos-infra/issue/85
https://pagure.io/centos-infra/issue/26
<subliminal message> PS: it's worth noting that while we'll investigate reports on the new OCP cluster, we'll probably not spend time investigating issues on the old/legacy one, which projects are supposed to migrate away from, as the legacy OpenShift setup will disappear soon (see https://pagure.io/centos-infra/issue/16) </subliminal message>
Thanks for your understanding and patience.
on behalf of the CI Infra team,
On 17/09/2020 16:30, Fabian Arrotin wrote:
Reminder! :-)
Also, because we need time to cleanly power down all nodes, we decided to start at 11:00 am UTC. That way we'll be ready when the on-site engineer starts un-racking the storage server for the hardware maintenance and puts it back online afterwards (we have a fixed appointment for that).
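To make the revised timeline concrete, here is a rough sketch that derives the window from the times given in this thread. It assumes GNU date, and it assumes the ~60 minute estimate counts from the 12:00 UTC hardware slot, which the announcement does not state explicitly; the variable and helper names are made up for illustration.

```shell
#!/bin/sh
# Timeline sketch (GNU date assumed). The times come from the thread;
# counting the 60-minute estimate from 12:00 UTC is an assumption.
shutdown_start='2020-09-30 11:00 UTC'   # clean power-down of the nodes begins
hw_start='2020-09-30 12:00 UTC'         # on-site hardware work begins
est_minutes=60                          # estimated duration of the hardware work

# Convert a timestamp to epoch seconds, and epoch seconds back to text.
epoch() { date -u -d "$1" +%s; }
fmt()   { date -u -d "@$1" '+%Y-%m-%d %H:%M UTC'; }

est_end=$(( $(epoch "$hw_start") + est_minutes * 60 ))
total=$(( (est_end - $(epoch "$shutdown_start")) / 60 ))

echo "Cluster shutdown starts:  $(fmt "$(epoch "$shutdown_start")")"
echo "Hardware work starts:     $(fmt "$(epoch "$hw_start")")"
echo "Estimated end:            $(fmt "$est_end")"
echo "Estimated unavailability: ${total} minutes"
```

Working in epoch seconds avoids the ambiguities of relative expressions in date strings, at the cost of two small helpers.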
I'd like to remind all projects still on the old OpenShift cluster that, despite our calls to have projects migrated, only very few did. So we (the CentOS CI infra team) will discuss how to deal with this but, at first sight, we'll just announce a date/deadline for decommissioning the old infra.
On Wed, Sep 30, 2020 at 1:50 AM Fabian Arrotin arrfab@centos.org wrote:
Due to some issues with the legacy cluster volume, this is taking longer than expected. We are working on it. Apologies for the inconvenience.
-- Fabian Arrotin The CentOS Project | https://www.centos.org gpg key: 17F3B7A1 | twitter: @arrfab
CI-users mailing list CI-users@centos.org https://lists.centos.org/mailman/listinfo/ci-users
Hi Folks,
Here's an update on where we are with the legacy (OKD 3.6) cluster. There were some integrity issues on the filesystem, so we decided to make sure we got a full xfs_repair done, and we will be syncing the data to a new volume before we bring the old cluster up again.
The good news is: we have a good xfs_repair.
The bad news is: the sync is going to take a number of hours yet. I don't have a good ETA for when the legacy cluster will come up again, but it may be into the morning US-time on Thursday.
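For readers unfamiliar with this kind of recovery, a repair-then-sync sequence could look roughly like the sketch below. This is not the exact procedure the team ran: the device and mount paths are hypothetical placeholders, and using rsync for the copy is an assumption, since the thread only says "sync". By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: all paths below are hypothetical placeholders, and rsync
# is an assumed choice of copy tool (the thread does not name one).
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
OLD_DEV="${OLD_DEV:-/dev/example/legacy-volume}"
OLD_MNT="${OLD_MNT:-/mnt/legacy-old}"
NEW_MNT="${NEW_MNT:-/mnt/legacy-new}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repair_and_sync() {
    run umount "$OLD_MNT"            # xfs_repair needs an unmounted filesystem
    run xfs_repair -n "$OLD_DEV"     # -n: check only, modify nothing
    run xfs_repair "$OLD_DEV"        # actual repair
    run mount -o ro "$OLD_DEV" "$OLD_MNT"
    # -a archive, -H hard links, -A ACLs, -X extended attributes
    run rsync -aHAX --numeric-ids "$OLD_MNT/" "$NEW_MNT/"
}

repair_and_sync
```

Mounting the repaired volume read-only before copying keeps the source stable during a long sync, which matters when the copy takes hours.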
This is another call for folks who would like to start a migration to the fancy OCP cluster: please file a ticket at https://pagure.io/centos-infra
The legacy cluster should be back online now.
Thank you all for your patience while we worked this through.
--Brian
On Wed, Sep 30, 2020, at 17:24, Brian Stinson wrote:
-- Brian Stinson brian@bstinson.com
On Wed, Sep 30, 2020, at 08:02, Vipul Siddharth wrote:
-- Vipul Siddharth He/His/Him Fedora | CentOS CI Infrastructure Team