Sending it here too, but if you have questions/comments, it would be
better to join the thread on the centos-devel list.
Thanks !
-------- Forwarded Message --------
Subject: [CentOS-devel] CentOS/Fedora authentication system merge
(Please Read)
Date: Wed, 27 Jan 2021 08:58:13 +0100
From: Fabian Arrotin <arrfab(a)centos.org>
Reply-To: The CentOS developers mailing list. <centos-devel(a)centos.org>
To: The CentOS developers mailing list. <centos-devel(a)centos.org>
# Introduction and background
As preannounced some time ago, the CentOS Board agreed to merge
the CentOS accounts (https://accounts.centos.org) with the Fedora FAS
(https://admin.fedoraproject.org/accounts/).
Both projects were running their own instance of FAS (running on
el6/CentOS 6, which is coming to EOL, so it needed to be migrated to a
new solution/platform). As a lot of contributors are common to both
projects, it made sense to "migrate and merge" both into one, so that a
single account can be used for both.
The AAA/Noggin team has worked over the last months on the new
authentication system that will be used as the foundation.
The core block will be (Free)IPA (https://www.freeipa.org , already
available in the distribution) and the community portal feature will be
provided by noggin (https://github.com/fedora-infra/noggin)
If you want to know more about noggin, consider watching the
presentation given at the last Fedora Nest event
(https://www.youtube.com/watch?v=x1SevUmkE60)
# What does it mean for you, contributors and SIG members ?
Fedora already had an IPA infra, but "hidden" behind FAS, so accounts
were already created in IPA backend.
For CentOS, we were just using plain FAS, so users only exist in our own
backend (fas db).
The "Merge" operation will go like this :
- Fedora will kick off the fas2ipa script
(https://github.com/fedora-infra/fas2ipa), synchronizing FAS attributes
back into IPA, including group memberships coming from FAS/Fedora
- Then the same process will be run, but importing from ACO
(https://accounts.centos.org) into the same IPA backend.
That's where the "fun" begins:
* If the same nick/account exists on both sides, the script considers
FAS as authoritative (remember, the FAS user *already* exists there, and
is only modified for group[s] membership and attributes)
* What is used to decide that the same nick/account is the same person ?
The email (validated when registering the account) will be used as
primary key. So that means that you should *now* verify/update your
email address in FAS and ACO so that they match
* in case of an email address mismatch, the ACO account isn't migrated
(group membership) but put in a queue to be verified
* in case of a matching email address, the existing account is added to
the imported ACO groups
The "open" question is what to do when the same account name in fact
belongs to different people (the question is being debated between
Fedora and CentOS through the AAA initiative)
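To make the rules above a bit more concrete, here is a minimal sketch in
Python. It is *not* the fas2ipa code: the `Account` class, the
`merge_aco_account` function and the in-memory dict standing in for the IPA
backend are all hypothetical, and comparing emails case-insensitively is an
assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical, simplified account record (not the real FAS/IPA schema)."""
    username: str
    email: str
    groups: set = field(default_factory=set)


def merge_aco_account(ipa_accounts, aco_account, review_queue):
    """Import one ACO account into ipa_accounts (a dict keyed by username).

    Returns "created", "merged" or "queued".
    """
    existing = ipa_accounts.get(aco_account.username)
    if existing is None:
        # No account with this nick yet: create it from the ACO data.
        ipa_accounts[aco_account.username] = Account(
            aco_account.username, aco_account.email, set(aco_account.groups)
        )
        return "created"
    # Assumption of this sketch: emails are compared case-insensitively.
    if existing.email.strip().lower() == aco_account.email.strip().lower():
        # Same nick and same email: same person. The existing (FAS) account
        # stays authoritative, only the ACO group memberships are added.
        existing.groups |= aco_account.groups
        return "merged"
    # Same nick but different email: nothing is migrated, the ACO account
    # is queued for manual verification.
    review_queue.append(aco_account)
    return "queued"
```

With this sketch, an ACO user whose nick and email both match an existing
account ends up with the union of both group sets, while a nick clash with a
different email changes nothing and only lands in the review queue.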
# What has been already done ?
You can publicly follow the status through the dedicated tracker (
https://github.com/orgs/fedora-infra/projects/6 ), but let me focus on
the CentOS side (as this is sent to centos-devel, so to CentOS
contributors)
In the last months, Fedora already deployed a staging (.stg.) IPA
instance, as well as a noggin community portal.
For CentOS, we deployed (to be able to test integration) the following
components in front of the Fedora IPA:
* https://accounts.stg.centos.org (using noggin, with a centos visual
theme applied)
* https://id.stg.centos.org (ipsilon, used for openid/openidc IdP)
We then reached out to some "key users" to validate that some
applications migrated to new authentication system were working fine.
We tested with :
* pagure (https://git.stg.centos.org)
* koji
* openshift/OCP
* some other apps using openid
In December 2020, there was a first run of the fas2ipa script, so
(consider this a snapshot) existing accounts in both FAS and ACO were
merged.
From that import, there were 123 accounts that were duplicates, but
as said, it can be that they are the same account but using different
email addresses.
# What do you have to do ?
You can try to log in through https://accounts.stg.centos.org and check
that it works.
Important remark: if you *didn't* have a FAS account, your account was
imported/created for the first time in IPA, so that means that you'll
have to use the "Forgot Password ?" feature on the portal to reset your
password (a mail will be sent to the email address tied to your account)
# When will the real migration happen ?
We'll wait for the AAA/noggin team to give us an estimated date, as
they'll migrate Fedora first.
Once that is done, we'll migrate ACO to the new setup (probably with the
fas2ipa script run during a week-end, but to be announced)
# How will that impact my workflow for CentOS as SIG member ?
Worth knowing that all deployed services using ACO will have to be
reconfigured for AAA.
That currently means :
* https://git.centos.org (and also the MQTT bus for git push notifications)
* https://cbs.centos.org (and also non public signing service)
* other small services using OpenID/OpenIDC for authentication
(https://blog.centos.org, some jenkins instances used by QA team, etc)
As said, we have already staged all changes to support the new auth in
our ansible roles.
Once we have rolled out these changes, the existing TLS certificate
that you use to authenticate to cbs.centos.org *will not* work
anymore (important)
That means that you'll have to retrieve a new TLS cert, signed by the
IPA CA cert.
How to do that ? I'll look into moving this to a known repository, but
for now, there is a copr repo that you can use :
https://copr.fedorainfracloud.org/coprs/arrfab/fasjson-client/
IMPORTANT : do *not* use this pkg now, or do this from another
workstation/vm/account/whatever : the new 'centos-cert' util would
replace your currently working TLS cert (from ACO). (Well, as fasjson
for prod *isn't* deployed yet, that would not work at all, but it would
once deployed.)
If you have questions, feel free to ask in this thread, or join
#fedora-aaa on Freenode.
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
_______________________________________________
CentOS-devel mailing list
CentOS-devel(a)centos.org
https://lists.centos.org/mailman/listinfo/centos-devel
Hi folks,
We are planning to update all the (compatible) plugins installed on
the ci.centos.org Jenkins instance.
I will do it tomorrow morning (Dec 18th) at 9am UTC. I will start
preparing for shutdown then, and all the jobs triggered after that will
go in the queue until the instance has updated the plugins and
restarted.
I would also like to open a question to all of you who are using the
OCP4 cluster. How would you want to manage Jenkins updates for your
namespace? There are multiple options: automatic updates whenever we
update the cluster (the default that we use), timed/self-triggered
updates, or updates on change in tags.
I am interested in hearing your thoughts.
Thank you
--
Vipul Siddharth
He/His/Him
Fedora | CentOS CI Infrastructure Team
Due to an upgrade of some network switches in the DC hosting some
community projects (including but not limited to CentOS), a large
majority of our infra will not be reachable.
Migration is scheduled for """"Tuesday November 10th, 2:00 am UTC time"""".
You can convert to local time with $(date -d '2020-11-10 14:00 UTC')
We unfortunately can't announce/give you any expected downtime as it can
last for several hours (info I received through invite) but we'll try to
restore all services/connectivity as soon as possible.
Impacted services in that DC :
- *all*
Non impacted services (easier to just mention short list of things not
in that DC, so items not listed below *will* be down) :
- https://www.centos.org
- https://forums.centos.org
- https://lists.centos.org
- mirrorlist.centos.org
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Hi folks,
Something to be aware of : https://pagure.io/centos-infra/issue/146
"'
There will be a planned outage to try and update the switches in the
CentOS main cage. The outage window will be for 4 hours from 14:00 UTC
until 18:00 UTC but should only be short bursts of switch outages.
Affected services:
CentOS CI
CentOS build systems
Other central services
If the outage has issues which can not be fixed within the 4 hour
window, the backup window for an outage is 2020-12-01 at the same time
area.
"'
Thank you. If you have any questions, please add them as comments on the
ticket.
--
Vipul Siddharth
He/His/Him
Fedora | CentOS Infrastructure Team
Hi!
To speed up some of the testing we do on bare-metal machines provisioned
through Duffy, I would like to pull pre-built images from the OpenShift
registry. The images are built through a BuildConfig and placed in an
ImageStream.
Now, it seems that the Duffy provisioned bare-metal systems cannot pull
from the internal OpenShift registry:
[root@n46 ~]# podman pull image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:5000/ceph-csi/ceph-csi:test
Trying to pull image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:5000/ceph-csi/ceph-csi:test...
Get https://image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:…: dial tcp 172.19.0.254:5000: connect: no route to host
Error: error pulling image "image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:5000/ceph-csi/ceph-csi:test": unable to pull image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:5000/ceph-csi/ceph-csi:test: unable to pull image: Error initializing source docker://image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:5000/ceph-csi/ceph-csi:test: error pinging docker registry image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:5000: Get https://image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org:…: dial tcp 172.19.0.254:5000: connect: no route to host
I wonder if this is intentional, or if this is a little too strict? If
this can not be allowed through the firewall, what is the recommendation
to use these images? Maybe we should deploy our own registry and push
the images there...
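For anyone wanting to reproduce this without podman, below is a small sketch
of a plain TCP reachability check in Python. The `can_reach` helper is
hypothetical and only uses the standard library; it tells you whether a TCP
connection can be opened at all, not whether the registry would accept a
pull.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and connects; the socket is
        # closed again as soon as the "with" block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "no route to host", refused connections, timeouts and
        # name resolution failures.
        return False
```

Calling `can_reach("image-registry.openshift-image-registry.svc.apps.ocp.ci.centos.org", 5000)`
from a Duffy node would be expected to return False as long as the
"no route to host" error above persists.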
Thanks!
Niels
Yesterday (Saturday) evening we got zabbix notifications that some nodes
in the CI environment were unreachable. After a quick look, I discovered
that it was an embedded network switch in a chassis hosting multiple
nodes (including but not limited to the jenkins node behind
ci.centos.org) that went nuts.
I tried a remote "hardware reset" and nodes were back online after ~10min.
But this morning (Sunday), I saw through zabbix that the same issue
happened again. Within the hour I did another "hardware reset", but
this time even that doesn't work anymore.
So that means that we have a network switch not working anymore.
As that chassis (like almost *all* equipment in CI) *isn't* under
warranty, we'll see on Monday what can be done and how we prioritize
moving services elsewhere (which probably means powering down other
services, depending on the priorities that will be given). As you can
understand, we can't give any ETA at this point.
Thanks for your understanding,
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
We had a kernel panic on the storage box used as NFS server for
openshift (both okd and ocp), and the machine doesn't come back online
due to an md device refusing to start.
The machine is now in single-user mode so we can analyze the situation
and try to fix it.
We'll send more details and progress when possible
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
Due to hardware maintenance that needs to take place on the NFS
storage node used by openshift ("legacy" and current one - ocp ), we'll
have to shut down the openshift cluster, and then proceed with the
hardware maintenance on the NFS server (which itself needs to be powered
down, no way to actually do that "online")
Migration is scheduled for """"Wednesday September 30th, 12:00 pm UTC
time"""".
You can convert to local time with $(date -d '2020-09-30 12:00 UTC')
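If GNU date isn't at hand, the same conversion can be sketched in Python
with only the standard library. The `utc_to_local` helper name is just for
illustration; with no time zone given it falls back to whatever zone the
machine is configured for.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(utc_text, tz=None):
    """Parse 'YYYY-MM-DD HH:MM' as UTC and convert it to another zone.

    With tz=None, the system's local time zone is used.
    """
    moment = datetime.strptime(utc_text, "%Y-%m-%d %H:%M")
    moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(tz)


if __name__ == "__main__":
    # Maintenance start in the local zone of this machine.
    print(utc_to_local("2020-09-30 12:00").strftime("%A %Y-%m-%d %H:%M %Z"))
    # Same moment at a fixed UTC+2 offset, chosen only as an example.
    print(utc_to_local("2020-09-30 12:00", timezone(timedelta(hours=2))))
```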
The expected "downtime" is estimated at ~60 minutes, the time needed to
shut down the machine, install new disks, restart the machine and also
do some updates and tuning on the setup.
For more information about this, here are some relevant tickets that
were created for the perf issue in openshift and nfs :
https://pagure.io/centos-infra/issue/53
https://pagure.io/centos-infra/issue/105
https://pagure.io/centos-infra/issue/85
https://pagure.io/centos-infra/issue/26
<subliminal message>
PS : worth noting that while we'll investigate reports on the new ocp
cluster, we'll probably not spend time investigating the old/legacy
one. Projects are supposed to migrate away from it soon, as the legacy
openshift setup will disappear soon (see
https://pagure.io/centos-infra/issue/16)
</subliminal message>
Thanks for your understanding and patience.
on behalf of the CI Infra team,
--
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab