As announced some months ago, we are moving all CI infra components to
new infrastructure (mostly AWS).
Regarding the Duffy API service itself, phase 1 was completed earlier
this month (August), and phase 2 will be announced for October.
The next service we'll migrate, next week, is
https://artifacts.ci.centos.org.
Migration is scheduled for "Monday September 5th, 6:00 am UTC time".
You can convert to local time with $(date -d '2022-09-05 06:00 UTC')
The expected "downtime" should be really small, as it will just be a
matter of pointing the DNS CNAME record to the new host and, if needed
(see below), a last rsync between hosts.
Worth knowing: CI tenants using that service in the past were relying
on direct rsync access (tcp/873, available only internally in the
dedicated CI VLAN) with an rsync secret.
As the service is now publicly available on AWS, we decided to disable
plain rsync and allow rsync over ssh (or sftp/scp if you want) instead,
reusing your existing project keypair.
More details available at
https://sigs.centos.org/guide/ci/#artifacts-storage
Don't forget to update your scripts before next Monday, or you won't be
able to push to the new storage server!
PS: while we consider all data "ephemeral" (and so to be discarded, as
there is also no backup at all for that temporary hosting solution), we
can still migrate your existing data from the old to the new server. For
that, please "opt in" on the existing ticket so that we can keep track
of the projects we need to migrate. See ticket
https://pagure.io/centos-infra/issue/906
Thanks for your understanding and patience.
on behalf of the Infra team,
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Hi,
We are currently suffering from flapping network connectivity to the
main DC where the majority of the CentOS Infra is hosted.
After some internal discussion, we confirmed that the upstream link
provider is aware of the issue and is looking for a fix (but no ETA).
Impacted services:
- centos ci
- git.centos.org
- cbs.centos.org
- mirror.centos.org (downstream consumers are having issues pulling content)
- mirror.stream.centos.org (same reason)
- buildlogs.centos.org (same reason)
We'll post an update when this is finally resolved.
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Hello everyone,
This is a friendly reminder of the current and upcoming status of CentOS CI
changes (check [1]).
Projects that opted-in for continuing on CentOS CI have been migrated, and
the new Duffy API is available. With that, *phase 0 has been completed*.
Regarding *phase 1*, we are still working on a permanent fix for the DB
Concurrency issues [2]. Also, as for our OpenShift new deployment, we have
a staging environment up and running, and it should be available at
the beginning of September 2022.
In October 2022 we will begin *phase 2*, when we will work through the
following items (these were also previously communicated in [1]):
- the legacy/compatibility API endpoint will hand over EC2 instances
instead of local seamicro nodes (VMs vs bare metal)
- bare-metal options will be available through the new API only
- legacy seamicro and aarch64/ThunderX hardware will be decommissioned
- the only remaining "on-premises" option will be ppc64le (local cloud)
Feel free to reach out if you have any questions or concerns.
The final deadline for decommissioning the old infrastructure (*phase 3*)
is *December 2022*. We will be communicating further until then, and
meanwhile, reach out to any of us in case you have any questions.
Regards,
[1] [ci-users] Changes on CentOS CI and next steps:
https://lists.centos.org/pipermail/ci-users/2022-June/004547.html
[2] DB Concurrency issues: https://github.com/CentOS/duffy/issues/523
--
Camila Granella
Associate Manager, Software Engineering
Red Hat <https://www.redhat.com/>
Hi,
We switched last Monday to the new Duffy API, and while we saw machines
being requested (from the previous seamicro pool but now also VMs from
AWS/EC2) and returned, from time to time tenants informed us of
transient errors.
Based on some troubleshooting, it seems that the Duffy API was
answering, within the same second, either different tenants with the
same nodes (so nodes being handed over to different tenants) or even the
same tenant with different session IDs (but the same hostname).
Nils (the Duffy code author) is busy looking at a fix today, and we'll
let you know when we'll be able to roll it out.
PS: that means we'll also have to stop the Duffy API and proceed with
some DB clean-up operations to restart from a clean/fresh situation.
Duffy will then consider deployed nodes as unused and will start
reinstalling them (to start from a clean situation). We'll let you know
when we proceed with that hotfix push.
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
As previously announced in June (see
https://lists.centos.org/pipermail/ci-users/2022-June/004547.html and
also https://www.youtube.com/watch?v=aqgs-3NnRmA), we'll migrate
the existing Duffy API service to the new Duffy v3, starting with phase 1.
Migration is scheduled for "Monday August 1st, 7:00 am UTC time".
You can convert to local time with $(date -d '2022-08-01 07:00 UTC')
The expected API downtime will be under 60 seconds (DNS TTL).
Worth knowing:
- the new Duffy API (in legacy mode) currently has fewer available
nodes, but that will be resolved during the day (see below)
- we'll have to wait ~6h (the default maximum "lease" time for a test
node) before the new Duffy API can reinstall the older seamicro nodes
and make them available in the new Duffy v3 pool (transitioning from the
old to the new pool)
- currently running jobs will continue to work, but when you hit the new
Duffy API endpoint (after the DNS switch) to "return nodes", it will
just answer that the session doesn't exist (on the new side); it's safe
to ignore that
As soon as the new Duffy API is available, you'll be able to switch your
workflow to it and use the new features.
Tenant/user documentation about how to interact with Duffy is available
at https://sigs.centos.org/guide/ci/ (see the Duffy part of that
documentation).
That means that switching to EC2 nodes will be directly possible.
During the day we'll tune the duffy nodepool configuration based on
usage metrics.
IMPORTANT remark: *only* projects/tenants that opted in during the last
45 days have been migrated (API and ssh keys), so if you haven't (yet),
you can still opt in; otherwise all your requests for Duffy ephemeral
nodes will be declined by the new Duffy API service next Monday.
In case of trouble, both Pedro (nick phsmoura) and myself (nick arrfab)
will also be present in the #centos-ci IRC channel on irc.libera.chat
the whole day.
Thanks for your understanding and patience.
on behalf of the Infra team,
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Hello everyone,
This is a friendly reminder to opt-in if you are interested in using CentOS
CI for your project after the changes we announced last month (check
"[ci-users] Changes on CentOS CI and next steps
<https://lists.centos.org/pipermail/ci-users/2022-June/004547.html>").
*Projects will be migrated only if they opt in*. Please *opt in by
August 2022*; you can do that by simply replying to this email
expressing your interest.
Also, we now have a deadline for the new OpenShift deployment, as per
our last email. It should be up and running at the *beginning of
September 2022*.
Feel free to reach out if you have any questions or concerns,
--
Camila Granella
Associate Manager, Software Engineering
Community Platform Engineering
Red Hat <https://www.redhat.com/>
As you're all aware by now, the CentOS CI infra landscape will be
different in the coming weeks/months.
One of the biggest changes will be the introduction of Duffy v3 as a
replacement for the Duffy API, and while there will be a compatibility
layer (allowing you to use your existing "cico" workflow), the goal is
that each tenant moves to the new API, which itself requires either:
- using the API endpoint directly (https://duffy.ci.centos.org/docs)
- using the new duffy client (pip3.8 install --user duffy[client]) with
a config file
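As a minimal install sketch (assuming pip for Python 3.8 is on your
PATH; the config-file details are in the Duffy docs, not shown here):

```shell
# Install the duffy client for the current user. Quoting "duffy[client]"
# avoids some shells (e.g. zsh) treating the brackets as a glob pattern.
pip3.8 install --user "duffy[client]"
# A config file with your tenant name and API key is still needed before
# the client can talk to https://duffy.ci.centos.org -- see
# https://sigs.centos.org/guide/ci/ for the details.
```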
We'd like to offer the duffy client natively in our cico-workspace
pod/container that is automatically deployed for your jenkins jobs on
the openshift cluster.
But if we trigger that, it will also come with something additional: the
current CentOS Stream 8 based container uses ansible 2.9.x, which is now
deprecated, and kicking off a container rebuild would automatically
install ansible-core 2.12.x (now present natively in el8).
As you're aware, ansible-core doesn't have all the modules that you
probably need for your workflow, so instead of installing the meta-pkg
ansible (which then installs almost everything), you can reply to this
thread with a list of collections that you'd like to see in the base
container (the less we have to pull in, the better, to keep the
container image as light as possible); we'd then add these to the
container image.
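For illustration only (the collection names below are examples, not a
confirmed list for the image), individual collections can be pulled on
top of ansible-core instead of the full meta-package:

```shell
# Hypothetical example: install just the collections a workflow needs,
# rather than the "ansible" meta-package that pulls in almost everything.
ansible-galaxy collection install ansible.posix community.general
```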
Thanks for your collaboration,
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Hello!
After reading & discussing the recent news regarding the move to AWS, I wonder if it would be possible to provide Fedora Rawhide images along with the C8S and C9S ones (and maybe active stable Fedora releases as well).
A bit of background: In the systemd project we have several jobs which utilize Vagrant to run Arch Linux VMs, in which we run tests alongside the C8S/C9S jobs, to cover issues with the latest-ish kernel and other software, and to also hunt down security issues with the latest versions of ASan and UBSan. However, all this is held together by a lot of duct tape and sheer willpower, and in the end it requires an EC2 metal instance to run, due to the additional level of virtualization.
If we were able to provision Rawhide instances directly (which should help us achieve the same goal as the Arch Linux VMs we currently use), that could, in theory, allow us to drop the requirement for Metal instances completely.
As far as I know, there are Fedora Rawhide AMIs[0], which should make this much easier, but that's all I can say, since I have almost zero experience with AWS overall.
Thank you!
Cheers,
Frantisek
[0] https://fedoraproject.org/w/index.php?title=Test_Results:Current_Cloud_Test…
--
PGP Key ID: 0xFB738CE27B634E4B
Just a quick reminder that CentOS Linux 8 is going EOL in December (see
https://www.centos.org/centos-linux-eol/)
So based on that plan, we'll also remove it from the available CI infra
at the same time as from the mirror network, and if you ask for CentOS 8
you'll get no node returned to run your tests on.
Start switching your tests to run on 8-stream instead, which has been
available since the beginning.
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Due to a storage migration (see
https://pagure.io/centos-infra/issue/534), we'll have to shut down and
restart the openshift CI cluster (OCP).
Migration is scheduled for "Tuesday January 18th, 9:00 am UTC time".
You can convert to local time with $(date -d '2022-01-18 09:00 UTC')
The expected "downtime" is estimated at ~60 minutes, the time needed to:
- shut down the openshift workers and control plane nodes
- do a last data sync between the old and new nfs storage nodes
- restart the openshift control plane and worker nodes
Thanks for your understanding and patience.
on behalf of the Infra team,
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab