*TL;DR: CentOS CI is going hardwareless, and if you want your project to
keep using it, we need your opt-in by August 2022. A Dojo
Summer 2022 <https://wiki.centos.org/Events/Dojo/Summer2022> session
on Thursday, June 17th, will explain the technical details in more
depth.*
Hello everyone,
As many of you know, since the beginning of this year we have been
reevaluating the future of CentOS CI, because the hardware it currently
runs on is out of warranty. CentOS CI was built on community donations of
hardware, which our team has maintained on a best-effort basis. Without
warranties, when a physical machine dies we have no means to replace it.
In addition, the upcoming data center changes require hardware to be in
warranty for supportability, so our current hardware will not be moved.
We decided to take this opportunity to modernize our current
infrastructure by moving it to a hybrid cloud environment. Duffy CI will
become the main tool from now on, so that we can support CI workflows
and best practices in the cloud. For this reason, the current hardware
infra will soon no longer be available. However, to keep providing
resources and supporting CI best practices for projects, our team is
adapting Duffy CI so that we can maintain most of the characteristics of
our current, physical-based offering.
At the technical level, what does that mean for you, CI tenants?
- A new Duffy API service will replace the existing one. It will run in
  compatibility/legacy mode with the previous version at first, but you
  will need to adapt your workflow to the new API (more details below).
- We will transition to AWS EC2 instances for the aarch64 and x86_64
  architectures by default, with a (limited) option to request “metal”
  instances for projects requiring virtualization for their tests (like
  KVM/vagrant/etc).
- We will keep a (very small) Power9 infra “on-premise” for the ppc64le
  tests, available through a dedicated VPN tunnel, because AWS does not
  support ppc64le.
- The existing OpenShift cluster will also be decommissioned and replaced
  by a new one hosted in AWS (so without an option to run the kubevirt
  operator or VMs). You will have to migrate from one to the other.
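For tenants trying to judge whether a given node can run KVM-based tests (the "requiring virtualization" case in the list above), a rough check is to look for the CPU virtualization flags and for /dev/kvm. This is only a sketch: it assumes an x86_64 Linux node, where the relevant /proc/cpuinfo flags are vmx (Intel) or svm (AMD); aarch64 and ppc64le nodes do not expose these flags, and the function name is made up for illustration.

```python
import os


def has_hw_virt(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line of /proc/cpuinfo lists vmx or svm (x86_64 only)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and {"vmx", "svm"} & set(value.split()):
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            cpu_ok = has_hw_virt(fh.read())
    except OSError:
        # Not a Linux host, or /proc is not readable.
        cpu_ok = False
    print("CPU virtualization flags present:", cpu_ok)
    print("/dev/kvm present:", os.path.exists("/dev/kvm"))
```

If both checks come back negative on a default instance, the job most likely needs one of the (limited) metal instances.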
With that being said, tenants can start preparing for the changes now. The
final deadline is the end of December 2022, at which point the Duffy API
legacy mode will be removed. You are required to opt in if you and/or
your team want to use Duffy CI. Projects will only be migrated if they
reply to this email confirming that they wish to proceed. Note that not
opting in means that your API key will not be migrated, so all your
requests for temporary/ephemeral nodes will be rejected by the new Duffy
API.
The final decommission deadline for the current hardware infrastructure is
December 12th, 2022, and the new Duffy CI will go live in August 2022, so
please complete your migration by the end of CY22. Reminders of
deadlines and of the opt-in requirement will be sent monthly, but your
confirmation of opt-in is required by August 2022. As December
approaches, reminders will be sent more frequently so that we can
ensure effective communication throughout the process.
Here are the steps in which we will migrate CI Infra:
Phase 1 - Deploy Duffy V3 (August 2022)
- Deploy in legacy/compatibility mode, so existing tenants (that opted
  in!) can still request duffy nodes the same way (for example with
  'python-cicoclient'): no change on the tenants' side, and exactly the
  same hardware for tests (transparent migration)
- New Duffy API endpoint becomes available, and tenants can start adapting
  their workflows to point to the new API (new ‘duffy-cli’ tool coming,
  with documentation)
- Bare metal and VM options will already be available through the new
  API (x86_64, aarch64, ppc64le)
Phase 2 - Hybrid Cloud (October 2022)
- Legacy/compatibility API endpoint will hand over EC2 instances instead
  of local seamicro nodes (VMs vs bare metal)
- Bare metal options will be available through the new API only
- Legacy seamicro and aarch64/ThunderX hardware are decommissioned
- The only remaining "on-premise" option is ppc64le (local cloud)
Phase 3 - Decommission (December 2022)
- Legacy/compatibility API deprecated, and requests (even for EC2
  instances) will no longer be accepted
- All tenants that opted in will be using only EC2 for aarch64/x86_64 and
  on-premise cloud for ppc64le
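For teams that want to wire the timeline above into their own tooling (for example, to print a warning in CI logs), the three phases can be modelled as a small lookup. This is a sketch only: the announcement gives months, not days, so using the first day of each month as the start date is an assumption, and the function name is made up for illustration.

```python
from datetime import date

# Start dates are ASSUMED to be the first of the month; the announcement
# only states "August 2022", "October 2022" and "December 2022".
PHASES = [
    (date(2022, 8, 1), "Phase 1 - Deploy Duffy V3"),
    (date(2022, 10, 1), "Phase 2 - Hybrid Cloud"),
    (date(2022, 12, 1), "Phase 3 - Decommission"),
]


def phase_on(day: date) -> str:
    """Return the name of the migration phase in effect on the given day."""
    current = "Before Phase 1 (current infrastructure)"
    for start, name in PHASES:
        if day >= start:
            current = name
    return current


if __name__ == "__main__":
    print(phase_on(date.today()))
```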
OpenShift new deployment planning and timeline
To be defined (deadline for planning and timeline: end of June 2022)
Do not hesitate to reach out if you have any questions. There will also
be a dedicated session about the future of CentOS CI infra at the next
CentOS Dojo on June 17th (see Dojo Summer 2022
<https://wiki.centos.org/Events/Dojo/Summer2022>). That session will be
recorded and made available on YouTube afterwards, but if you have any
questions, feel free to join the CentOS Dojo and reach out to us!
Best regards,
--
Camila Granella
Associate Manager, Software Engineering
Red Hat <https://www.redhat.com/>
As you're all aware by now, the CentOS CI infra landscape will be
different in the next weeks/months.
One of the biggest changes will be the introduction of Duffy v3 as a
replacement for the Duffy API. While there will be a compatibility
layer (allowing you to use your existing "cico" workflow), the goal is
for each tenant to move to the new API, which then requires
either:
- calling the API endpoint directly (https://duffy.ci.centos.org/docs)
- using the new duffy client (pip3.8 install --user duffy[client]) with a
config file
We'd like to offer the duffy client natively in our cico-workspace
pod/container, which is automatically deployed for your Jenkins jobs on
the OpenShift cluster.
But if we trigger a rebuild, it will also come with something additional:
the current CentOS Stream 8 based container uses ansible 2.9.x, which is
now deprecated, and a container rebuild would automatically
install ansible-core 2.12.x (now present natively in el8).
As you're aware, ansible-core doesn't have all the modules that you
probably need for your workflow. Instead of us installing the
ansible meta-package (which then installs almost everything), you can
reply to this thread with a list of collections that you'd like to see
in the base container, and we'd then add these to the container image.
The fewer we have to pull, the better, to keep the container image as
light as possible.
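To help put together such a list, here is a rough sketch that scans playbook text for fully qualified module names (namespace.collection.module) and reports the collections they come from. It is a heuristic, not an official tool: it only sees tasks written with fully qualified names, so playbooks using short module names will need a manual check, and the function name is made up for illustration. ansible.builtin is skipped because it ships with ansible-core.

```python
import re
import sys
from typing import List

# Task keys such as "community.general.ufw:" or "- ansible.posix.firewalld:".
FQCN = re.compile(
    r"^[ \t]*(?:-[ \t]*)?([a-z_][a-z0-9_]*\.[a-z_][a-z0-9_]*)\.[a-z_][a-z0-9_]*[ \t]*:",
    re.MULTILINE,
)

# Shipped with ansible-core itself, so no extra collection is needed.
BUILTIN = {"ansible.builtin"}


def collections_used(playbook_text: str) -> List[str]:
    """Return the sorted collection names referenced by fully qualified modules."""
    found = {match.group(1) for match in FQCN.finditer(playbook_text)}
    return sorted(found - BUILTIN)


if __name__ == "__main__":
    # Usage: python3 collections_used.py playbook1.yml playbook2.yml ...
    names = set()
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as fh:
            names.update(collections_used(fh.read()))
    print("\n".join(sorted(names)))
```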
Thanks for your collaboration,
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Hello!
After reading & discussing the recent news regarding the move to AWS, I wonder if it would be possible to provide Fedora Rawhide images along with the C8S and C9S ones (and maybe active stable Fedora releases as well).
A bit of background: In the systemd project we have several jobs which use Vagrant to run Arch Linux VMs, in which we run tests alongside the C8S/C9S jobs, to cover issues with the latest-ish kernel and other software, and also to hunt down security issues with the latest versions of ASan and UBSan. However, all this is held together by a lot of duct tape and sheer willpower, and in the end it requires an EC2 Metal instance to run, due to the additional level of virtualization.
If we were able to provision Rawhide instances directly (which should help us achieve the same goal as the Arch Linux VMs we currently use), that could, in theory, allow us to drop the requirement for Metal instances completely.
As far as I know, there are Fedora Rawhide AMIs[0], which should make this much easier, but that's all I can say, since I have almost zero experience with AWS overall.
Thank you!
Cheers,
Frantisek
[0] https://fedoraproject.org/w/index.php?title=Test_Results:Current_Cloud_Test…
--
PGP Key ID: 0xFB738CE27B634E4B