[Ci-users] Infra : scheduled hardware maintenance (Openshift/NFS)

Thu Oct 1 04:19:34 UTC 2020
Brian Stinson <brian at bstinson.com>

The legacy cluster should be back online now.

Thank you all for your patience while we worked this through.

--Brian

On Wed, Sep 30, 2020, at 17:24, Brian Stinson wrote:
> Hi Folks,
> 
> Here's an update on where we are with the legacy (OKD 3.6) cluster. 
> There were some integrity issues on the filesystem, so we decided to 
> make sure we got a full xfs_repair done, and will be syncing the data 
> to a new volume before we bring up the old cluster again.
> 
> The good news is: we have a good xfs_repair.
> The bad news is: the sync is going to take a number of hours yet. I 
> don't have a good ETA for when the legacy cluster will come up again, 
> but it may run into Thursday morning, US time.
> 
> This is another call for folks who would like to start a migration to 
> the fancy OCP cluster: please file a ticket at 
> https://pagure.io/centos-infra
> 
> -- 
>   Brian Stinson
>   brian at bstinson.com
> 
> On Wed, Sep 30, 2020, at 08:02, Vipul Siddharth wrote:
> > On Wed, Sep 30, 2020 at 1:50 AM Fabian Arrotin <arrfab at centos.org> wrote:
> > >
> > > On 17/09/2020 16:30, Fabian Arrotin wrote:
> > > > > Due to hardware maintenance that needs to take place on the NFS
> > > > > storage node used by openshift (both the "legacy" cluster and the
> > > > > current one, ocp), we'll have to shut down the openshift cluster and
> > > > > then proceed with hardware maintenance on the NFS server (which itself
> > > > > needs to be powered down; there is no way to do that "online").
> > > >
> > > > > Migration is scheduled for Wednesday September 30th, 12:00 pm UTC.
> > > > > You can convert to local time with: date -d '2020-09-30 12:00 UTC'
> > > >
> > > > > The expected downtime is estimated at ~60 minutes: the time needed to
> > > > > shut down the machine, install new disks, restart the machine and also do
> > > > > some updates and tuning on the setup.
> > 
> > Due to some issues with the legacy cluster volume, this is taking longer
> > than expected.
> > We are working on it.
> > Apologies for the inconvenience.
> > 
> > > >
> > > > > For more information about this, here are some relevant tickets that
> > > > > were created for the perf issue in openshift and nfs:
> > > >
> > > > https://pagure.io/centos-infra/issue/53
> > > > https://pagure.io/centos-infra/issue/105
> > > > https://pagure.io/centos-infra/issue/85
> > > > https://pagure.io/centos-infra/issue/26
> > > >
> > > > <subliminal message>
> > > > > PS: worth noting that while we'll investigate reports on the new ocp
> > > > > cluster, we'll probably not spend time investigating the old/legacy
> > > > > one. Projects are supposed to migrate away from it soon, as the legacy
> > > > > openshift setup will disappear soon (see
> > > > > https://pagure.io/centos-infra/issue/16)
> > > > </subliminal message>
> > > >
> > >
> > >
> > > Reminder !  :-)
> > >
> > > > Also, because of the time needed to properly/cleanly power down all
> > > > nodes, we decided to start at 11:00 am UTC, to be ready when the on-site
> > > > engineer starts un-racking the storage server for hardware maintenance
> > > > and puts it back online afterwards (we have a fixed appointment for it).
> > >
> > > > I'd like to remind all projects still on the old openshift cluster that
> > > > despite our calls to have projects migrated, only very few did.
> > > > So we (the centos ci infra team) will have a discussion about how to
> > > > deal with this, but at first sight we'll just announce a date/deadline
> > > > for decommissioning the old infra.
> > >
> > > --
> > > Fabian Arrotin
> > > The CentOS Project | https://www.centos.org
> > > gpg key: 17F3B7A1 | twitter: @arrfab
> > >
> > > _______________________________________________
> > > CI-users mailing list
> > > CI-users at centos.org
> > > https://lists.centos.org/mailman/listinfo/ci-users
> > 
> > 
> > 
> > -- 
> > Vipul Siddharth
> > He/His/Him
> > Fedora | CentOS CI Infrastructure Team
> > 
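
The original announcement quoted above suggests converting the maintenance
start time to local time with GNU date. For readers on systems where
`date -d` is not available, the same conversion can be sketched in Python
using only the standard library. The names MAINTENANCE_START_UTC, to_local
and EXAMPLE_OFFSET are illustrative assumptions, not part of any CentOS
tooling, and the +02:00 offset is only an example zone.

```python
from datetime import datetime, timedelta, timezone

# Maintenance start as stated in the announcement: 2020-09-30 12:00 UTC.
MAINTENANCE_START_UTC = datetime(2020, 9, 30, 12, 0, tzinfo=timezone.utc)

# Hypothetical fixed offset used only to illustrate an explicit target zone.
EXAMPLE_OFFSET = timezone(timedelta(hours=2))


def to_local(moment_utc, tz=None):
    """Convert an aware datetime to tz, or to the system local zone if tz is None."""
    return moment_utc.astimezone(tz)


if __name__ == "__main__":
    # System local zone, roughly what `date -d '2020-09-30 12:00 UTC'` prints.
    print(to_local(MAINTENANCE_START_UTC).strftime("%a %b %d %H:%M:%S %Z %Y"))
    # Explicit zone, independent of the machine's settings.
    print(to_local(MAINTENANCE_START_UTC, EXAMPLE_OFFSET).isoformat())
```

Passing an explicit zone keeps the result independent of the machine it runs
on; leaving `tz` unset uses whatever the system considers local time.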