[Ci-users] devtools-ci-slave04 is offline

Tue Oct 6 07:52:31 UTC 2020
Katerina Foniok <kkanova at redhat.com>

Ah, ok, thank you very much for clarifying!

On Tue, Oct 6, 2020 at 9:42 AM Vipul Siddharth <vipul at redhat.com> wrote:

> On Tue, Oct 6, 2020 at 12:50 PM Katerina Foniok <kkanova at redhat.com> wrote:
> >
> > So, I can see that access to Vault was disabled on purpose, so it
> > probably isn't related to the outage. Sorry for the false alarm.
> >
> > We also can see this error message in our jobs:
> >>
> >> "msg": "Exceeded maximum allowed fail nodes limit, please release other machines to continue"
> >
> > An example of such a job is here.
> So when you mark a node as failed (usually when the job fails), the node
> stays around for 12 hours in case someone wants to check manually
> what went wrong.
> Keeping too many nodes in the fail state becomes a bottleneck for the Duffy
> pool, because those nodes can't be reprovisioned for the next round
> of jobs (for 12 hours).
> That is why we have a limit on how many nodes can be in the fail state.
> This is expected, and you would have seen it when calling the node/fail API,
> which should ideally be called when the job has failed. So the error could be
> something else.
>
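The fail-state limit described above can be illustrated with a small toy model. This is only a sketch and not Duffy's actual implementation or API: the class, its method names, and the limit value are hypothetical, and only the 12-hour retention window and the error text come from this thread.

```python
from datetime import datetime, timedelta

# The 12-hour retention window is stated in the thread.
FAIL_RETENTION = timedelta(hours=12)
# Hypothetical value: the real limit is not given in the thread.
MAX_FAILED_NODES = 5


class FailQuotaExceeded(Exception):
    """Raised when one more failed node would exceed the limit."""


class FailedNodeTracker:
    """Toy model of a per-tenant quota on nodes held in the fail state."""

    def __init__(self, max_failed=MAX_FAILED_NODES, retention=FAIL_RETENTION):
        self.max_failed = max_failed
        self.retention = retention
        self._failed = {}  # node name -> time it was marked failed

    def _expire(self, now):
        # Nodes held for the full retention window go back to the pool.
        self._failed = {
            node: marked
            for node, marked in self._failed.items()
            if now - marked < self.retention
        }

    def mark_failed(self, node, now):
        self._expire(now)
        if node not in self._failed and len(self._failed) >= self.max_failed:
            raise FailQuotaExceeded(
                "Exceeded maximum allowed fail nodes limit, "
                "please release other machines to continue"
            )
        self._failed.setdefault(node, now)

    def release(self, node):
        # Releasing a node early frees a slot before the window ends.
        self._failed.pop(node, None)

    def failed_count(self, now):
        self._expire(now)
        return len(self._failed)
```

In this model, a tenant that hits the limit can either release held nodes it no longer needs to inspect or wait for the retention window to pass.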
> > Thank you for taking a look,
> > Katka
> >
> > On Tue, Oct 6, 2020 at 9:04 AM Katerina Foniok <kkanova at redhat.com> wrote:
> >>
> >> Thank you, `devtools-ci-slave04` is running again, but it seems that
> >> our jobs cannot get credentials from the vault now. Could this be related to
> >> the outage?
> >>
> >> On Tue, Oct 6, 2020 at 8:43 AM Vipul Siddharth <vipul at redhat.com> wrote:
> >>>
> >>> On Tue, Oct 6, 2020 at 11:40 AM Katerina Foniok <kkanova at redhat.com> wrote:
> >>> >
> >>> > Hello guys,
> >>> >
> >>> > our jobs on ci.centos.org are pending because
> >>> > devtools-ci-slave04 is offline. Can someone take a look, please?
> >>> fixed
> >>> > One of the affected jobs is here.
> >>> > Thank you!
> >>> >
> >>> > Have a great day,
> >>> > Katka
> >>> > _______________________________________________
> >>> > CI-users mailing list
> >>> > CI-users at centos.org
> >>> > https://lists.centos.org/mailman/listinfo/ci-users
> >>>
> >>>
> >>>
> >>> --
> >>> Vipul Siddharth
> >>> He/His/Him
> >>> Fedora | CentOS CI Infrastructure Team
> >>>
> >>>
>
>
>
> --
> Vipul Siddharth
> He/His/Him
> Fedora | CentOS CI Infrastructure Team
>