Hello,
We're trying to revive CentOS CI for systemd, but unfortunately
we encountered an issue[0] which prevents the system from
booting. That wouldn't be so bad; however, debugging such issues
without a serial console is a pain. I'm not sure if I overlooked
something on the CI wiki, but is there any way to connect to
the machine by other means, either to debug it while it's
still alive or at least to get the filesystem contents so they can
be properly analyzed?
Thanks!
[0] https://github.com/systemd/systemd-centos-ci/issues/18#issuecomment-4402058…
--
GPG key ID: 0xFB738CE27B634E4B
Hi folks,
I've raised a request in the bug tracker[1]
for creating an account on ci.centos.org,
so that we can start working on migrating the
end-to-end tests for the `Openshift Do` project there.
It is a high-priority issue for us now.
As of now, it seems like the bug tracker is down.
It would be very helpful if someone could respond to the request.
[1] https://bugs.centos.org/view.php?id=15460
Thanks,
sgk
Odo developer,
OpenShift Dev-Experience
Mail me at sgk(a)redhat.com
Encrypt using GPG 4096R/FBB26E60
Hello,
I've been looking at why all the recent jobs in
https://ci.centos.org/job/user-cont-conu-pr/
are pending.
The first pending job in the queue says
'userspace-containerization-ci-slave06 is offline', so I checked
https://ci.centos.org/computer/userspace-containerization-ci-slave06/
but don't see anything I could do there to resolve it.
I can 'ssh userspace-containerization@slave06.ci.centos.org' just fine,
so it doesn't seem to be offline.
What can I do to debug or resolve it?
Thanks,
Jiri
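One way to see why Jenkins itself considers a node offline is to read the node's JSON status from the Jenkins remote API. The following is a minimal sketch, not a confirmed procedure for ci.centos.org: it assumes the standard `/computer/<name>/api/json` endpoint is readable with your access rights and that it reports the usual `offline`, `temporarilyOffline` and `offlineCauseReason` fields. The helper names `describe_node` and `fetch_node` are made up for illustration.

```python
import json
from urllib.request import urlopen

# Base URL taken from the message above.
JENKINS_URL = "https://ci.centos.org"


def describe_node(info):
    """Summarize the status fields Jenkins reports for one node."""
    name = info.get("displayName", "<unknown>")
    if not info.get("offline", False):
        return f"{name}: online"
    kind = "temporarily offline" if info.get("temporarilyOffline") else "offline"
    # Assumption: Jenkins puts a human-readable reason here when it has one.
    reason = info.get("offlineCauseReason") or "no reason reported"
    return f"{name}: {kind} ({reason})"


def fetch_node(name, base_url=JENKINS_URL):
    """Fetch a node's status from the Jenkins remote API (needs network access)."""
    with urlopen(f"{base_url}/computer/{name}/api/json", timeout=10) as resp:
        return json.load(resp)
```

Calling `describe_node(fetch_node("userspace-containerization-ci-slave06"))` would then print a one-line summary. A node that accepts ssh can still be offline from the master's point of view, for example when the agent process is not connected or the node was taken offline by hand.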
Hi Folks,
A couple of security fixes to Jenkins and its plugins are
outstanding.
I'll be restarting the ci.centos.org master as soon as we can find a
lull in the queue.
Jobs should pick up where they left off, and get queued back when the
master returns.
I'll send a note here when we're finished so we can keep track. If
you have any questions, let us know here or in #centos-devel on freenode.
Cheers!
--
Brian Stinson
CentOS CI Infrastructure Team
I am looking for a way to use python3 on the Jenkins slaves. Can someone give me more details on how this can be done? The simplest way seems to be installing an rh-python* package and enabling it via scl.
--
Siteshwar Vashisht
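The scl approach mentioned above could be wrapped in a job script roughly as follows. This is only a sketch under assumptions: the collection name `rh-python36` is a placeholder (the message only says rh-python*), the collection is assumed to be installed on the slave already, and `scl_argv` and `run_in_collection` are hypothetical helper names.

```python
import shlex
import subprocess

# Hypothetical collection name; the message only says "rh-python*".
COLLECTION = "rh-python36"


def scl_argv(command, collection=COLLECTION):
    """Build the argv for: scl enable <collection> '<command>'."""
    # scl accepts the command to run as a single quoted string.
    return ["scl", "enable", collection, shlex.join(command)]


def run_in_collection(command, collection=COLLECTION):
    """Run the command with the collection enabled and return its exit status."""
    return subprocess.run(scl_argv(command, collection), check=False).returncode
```

For example, `run_in_collection(["python3", "--version"])` would run the interpreter from the enabled collection, provided `scl` and the collection exist on the machine.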
Summary:
A large subset of the application nodes in apps.ci.centos.org were
placed in an unschedulable state around 13h00 UTC on September 27th.
Nodes were rebooted and service was partially restored, but new behavior
was exhibited overnight. Pods were able to schedule on the nodes but DNS
was not functional. DNS service was restored at around 15h00 UTC on
September 28th.
Timeline:
27-Sept-2018 13h00 UTC - 28-Sept-2018 15h00 UTC
Root Cause:
A previously applied update (applied around 17-August) to selinux-policy
caused some files to be relabeled. We did not, at the time, schedule a
reboot, but routine restarts of the docker service caused the nodes to
enter a degraded state.
Further file relabels caused the node boot process to complete, but also
in a degraded state.
Recovery:
Completed the rest of the pending updates, and rebooted the nodes to
clear the node-schedulable degradation. (27-Sept)
Triggered a full autorelabel and rebooted the nodes to clear the
node-boot degradation. (28-Sept)
Preventative Measures:
- Consider rebooting the nodes more often, perhaps on a regular
  schedule to catch OS upgrade problems
- Complete the openshift-monitoring EPIC in the CI backlog, which will
  add better checks for DNS.
Thank you very much for your patience during this outage.
--
Brian Stinson
CentOS CI Infrastructure Team
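The preventative measures above mention better checks for DNS. As an illustration only, and not the check planned in the openshift-monitoring work, a very small probe run from inside a pod could look like the sketch below. The function name `check_dns` and the names you pass to it are assumptions; the resolver is a parameter so the logic can be exercised without network access.

```python
import socket


def check_dns(names, resolve=socket.getaddrinfo):
    """Return a dict mapping each name to True if it resolves, else False."""
    results = {}
    for name in names:
        try:
            resolve(name, None)
            results[name] = True
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            results[name] = False
    return results
```

A monitoring job could call `check_dns([...])` with one cluster-internal and one external name and alert when any value is False, which would have flagged the state described in the summary, where pods scheduled but DNS was not functional.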