[Ci-users] Changes on CentOS CI and next steps

Wed Jun 15 11:49:34 UTC 2022
Camila Granella <cgranell at redhat.com>

*TL;DR: CentOS CI is going hardwareless, and if you want your project to
keep using it, we need your opt-in by August 2022. A Dojo Summer 2022
<https://wiki.centos.org/Events/Dojo/Summer2022> session on Thursday,
June 17th, will explain further technical details.*


Hello everyone,


As many of you know, since the beginning of this year we have been
reevaluating the future of CentOS CI, because the hardware it currently
runs on is out of warranty. CentOS CI was built on community donations of
hardware, which our team has maintained on a best-effort basis. Without
warranties, we have no means to replace a physical machine when it dies.
In addition, the upcoming data center changes require hardware to be in
warranty for supportability, so our current hardware will not be moved.
We decided to take this opportunity to modernize our infrastructure by
moving it to a hybrid cloud environment. Duffy CI will become the main
tool from now on, so that we can support the CI workflow and best
practices in the cloud. For this reason, the current hardware infra will
soon no longer be available. However, to keep providing resources and
supporting CI best practices for projects, our team is adapting Duffy CI
so that we can maintain most of the characteristics of our current,
physical-based offering.


At the technical level, what does that mean for you, CI tenants?

   - A new Duffy API service will replace the existing one. It will
     initially run in compatibility/legacy mode with the previous version,
     but you will need to adapt your workflow to the new API (more details
     below)
   - We will transition to AWS EC2 instances for the aarch64 and x86_64
     architectures by default, with a (limited) option to request “metal”
     instances for projects requiring virtualization for their tests (like
     KVM/vagrant/etc)
   - We will keep a (very small) Power9 infra “on-premise” (AWS does not
     support ppc64le) for the ppc64le tests (available through a dedicated
     VPN tunnel)
   - The existing OpenShift cluster will also be decommissioned and a new
     one (hosted in AWS, so without the option to run the kubevirt operator
     or VMs) will then be used (you will have to migrate from one to the
     other)
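
As a rough summary of the hosting plan in the list above, here is a minimal
Python sketch. The table and function names are hypothetical and only
illustrate the announcement; they are not part of Duffy or any other tool.

```python
# Hypothetical lookup table summarizing the hosting plan described above.
# These names are illustrative only and are not part of any Duffy API.
PLANNED_HOSTING = {
    "x86_64": "AWS EC2",
    "aarch64": "AWS EC2",
    "ppc64le": "on-premise Power9 (via dedicated VPN tunnel)",
}


def planned_hosting(arch: str) -> str:
    """Return where test nodes for `arch` are expected to be hosted."""
    try:
        return PLANNED_HOSTING[arch]
    except KeyError:
        raise ValueError(f"architecture not covered by the announcement: {arch!r}")


print(planned_hosting("ppc64le"))
```

A tenant could use a table like this to check which of their jobs will move
to EC2 and which will stay on the on-premise Power9 infra.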


With that being said, tenants can start preparing for the changes now. The
final deadline is the end of December 2022, at which point the Duffy API
legacy mode will be removed. You are required to opt in if you and/or your
team want to use Duffy CI. Projects will only be migrated if they reply to
this email confirming that they wish to proceed. Note that not opting in
means that your API key will not be migrated, so all your requests for
temporary/ephemeral nodes will be rejected by the new Duffy API.


The final decommission deadline for the current hardware infrastructure is
December 12th, 2022, and the new Duffy CI will go live in August 2022, so
please complete your migration by the end of CY22. Reminders of the
deadlines and of the opt-in requirement will be sent monthly, but your
opt-in confirmation is required by August 2022. As December approaches,
reminders will become more frequent so that we can ensure effective
communication throughout the process.


Here are the phases in which we will migrate the CI infra:


Phase 1 - Deploy Duffy V3 (August 2022)

   - Deploy in legacy/compatibility mode, so existing tenants (that opted
     in!) can still request duffy nodes the same way (for example with
     'python-cicoclient'): no change on the tenant side, and exactly the
     same hardware for tests (transparent migration)
   - The new Duffy API endpoint becomes available, and tenants can start
     adapting their workflows to point to the new API (a new ‘duffy-cli’
     tool is coming, with documentation)
   - Bare metal and VM options will already be available through the new
     API (x86_64, aarch64, ppc64le)


Phase 2 - Hybrid Cloud (October 2022)

   - The legacy/compatibility API endpoint will hand over EC2 instances
     instead of local seamicro nodes (VMs vs bare metal)
   - Bare metal options will be available through the new API only
   - Legacy seamicro and aarch64/ThunderX hardware is decommissioned
   - The only remaining "on-premise" option is ppc64le (local cloud)


Phase 3 - Decommission (December 2022)

   - The legacy/compatibility API is deprecated, and requests (even for
     EC2 instances) will no longer be accepted
   - All tenants that opted in will use only EC2 for aarch64/x86_64 and
     the on-premise cloud for ppc64le
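
To make the timeline above easier to check against your own planning, here
is a small Python sketch that maps a date to the migration phase. The
announcement only gives months, so using the first day of each month as the
phase start is an assumption made for this sketch, and the function names
are hypothetical.

```python
from datetime import date

# Phase start dates. Only the months come from the announcement; the first
# day of each month is an assumption made for this sketch.
PHASES = [
    (date(2022, 8, 1), "Phase 1 - Deploy Duffy V3"),
    (date(2022, 10, 1), "Phase 2 - Hybrid Cloud"),
    (date(2022, 12, 1), "Phase 3 - Decommission"),
]


def phase_on(day: date) -> str:
    """Return the name of the migration phase in effect on `day`."""
    current = "Before migration"
    for start, name in PHASES:
        if day >= start:
            current = name
    return current


def legacy_requests_expected_to_work(day: date) -> bool:
    """Legacy-style requests are expected to be accepted only before Phase 3."""
    return day < PHASES[-1][0]


print(phase_on(date(2022, 9, 15)))
print(legacy_requests_expected_to_work(date(2022, 12, 15)))
```

Any workflow that still relies on the legacy/compatibility endpoint should
be adapted to the new API before the last phase starts.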


OpenShift new deployment planning and timeline

To be defined (deadline for planning and timeline: end of June 2022)


Do not hesitate to reach out if you have any questions. There will be a
dedicated session about the future of CentOS CI infra at the next CentOS
Dojo, happening on June 17th (check Dojo Summer 2022
<https://wiki.centos.org/Events/Dojo/Summer2022>). That session will be
recorded and made available on YouTube afterwards, but if you have any
questions, feel free to join the CentOS Dojo and reach out to us!


Best regards,
-- 

Camila Granella

Associate Manager, Software Engineering

Red Hat <https://www.redhat.com/>