[Ci-users] Dirty nodes

Fri Jan 24 11:15:29 UTC 2020
Tibor Dancs <tdancs at redhat.com>

Hello.
Here is the list of jobs on which we have seen the failures, along with
specific build IDs.

Jobs:
https://ci.centos.org/job/devtools-rh-che-periodic-prod-2/
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1a/
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1b/
https://ci.centos.org/job/devtools-rh-che-periodic-prod-preview-2a/

Builds that failed with various dirty-node-related errors:
cluster production-2:
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2/1861/console
cluster production-2a:
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/1962/console
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/1949/console
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/1945/console
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/1944/console
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/1943/console
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/1935/console
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd/1638/console
cluster production-1a:
cluster production-1b:
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1b/1943/console
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1b/1929/console
cluster prod-preview-2a:
https://ci.centos.org/job/devtools-rh-che-periodic-prod-preview-2a/2147/console
https://ci.centos.org/job/devtools-rh-che-periodic-prod-preview-2a/2142/console

These are just examples. I hope they will be useful to you.

On Fri, Jan 24, 2020 at 10:55 AM Vipul Siddharth <vipul at redhat.com> wrote:

>
>
> On Fri, 24 Jan, 2020, 10:50 am Tibor Dancs, <tdancs at redhat.com> wrote:
>
>> Hello.
>>
>> Do you want specific job builds as well or only the job IDs?
>>
> Anything would do.
> The former would save me some time, but it's not a problem at all.
>
>>
>> On Thu, Jan 23, 2020 at 9:14 AM Vipul Siddharth <vipul at redhat.com> wrote:
>>
>>> On Thu, Jan 23, 2020 at 12:23 PM Katerina Foniok <kkanova at redhat.com> wrote:
>>> >
>>> > Hello,
>>> > we are still facing this problem on devtools-ci-slave04. Our job is
>>> > failing on conflicts during the installation of the dependencies of our
>>> > project.
>>>
>>> Yes. Since I have been travelling for DevConf, I couldn't keep up with
>>> marking affected nodes in the admin down state.
>>>
>>> @Katerina Foniok, when you get time, could you please send me the IDs
>>> of the jobs failing because of this issue?
>>> It would help me track down the nodes (and confirm whether the issue
>>> affects other chassis as well).
>>> >
>>> > Thank you for taking care of that.
>>> > Have a nice day!
>>> >
>>> > Katka
>>> >
>>> > On Mon, Jan 20, 2020 at 10:28 PM František Šumšal <frantisek at sumsal.cz> wrote:
>>> >>
>>> >>
>>> >> On 1/20/20 10:12 PM, Vipul Siddharth wrote:
>>> >> > So the issue is another one of the reinstallation problems, where it
>>> >> > rolls back to the previous version.
>>> >> > In +Katerina Foniok's case, it contained old data,
>>> >> > and in +František Šumšal's case, it was not actually a C8 node but a
>>> >> > C7 node marked wrongly.
>>> >> > The solution is to confirm twice (after installing the OS) in duffy.
>>> >> > The way we think would be very easy to implement is to check the
>>> >> > number of SSH keys in authorized_keys. If it contains more than
>>> >> > expected in the ready state, that means it still has old data, and
>>> >> > instead of ready, we put it in some other state which I can take a
>>> >> > look at later on.
>>> >> > This would be a quick fix, but until then I will drain the pufty
>>> >> > chassis (this is the one with problems)
>>> >> > and reset the chassis (this has fixed the problem in the past).
>>> >> >
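
The authorized_keys check described in the quoted message above could look
roughly like the sketch below. This is only an illustration under stated
assumptions, not duffy's actual code: the function names, the
"needs-inspection" state name, and the expected key count of 1 are all
hypothetical. The thread only says that a node holding more keys than it
should is put in some state other than ready.

```python
from typing import Iterable


def count_authorized_keys(lines: Iterable[str]) -> int:
    """Count key entries, skipping blank lines and '#' comment lines."""
    return sum(
        1
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    )


def next_state(authorized_keys_text: str, expected_keys: int = 1) -> str:
    """Decide which state a freshly reinstalled node should get.

    expected_keys and the state names are assumptions for illustration.
    More keys than expected suggests leftover data from an earlier
    session, so the node is held back for manual inspection.
    """
    found = count_authorized_keys(authorized_keys_text.splitlines())
    if found > expected_keys:
        return "needs-inspection"  # hypothetical holding state
    return "ready"


if __name__ == "__main__":
    clean = "ssh-rsa AAAA... provisioner\n"
    dirty = clean + "ssh-rsa BBBB... previous-tenant\n"
    print(next_state(clean))
    print(next_state(dirty))
```

In practice the text would come from the node's authorized_keys file, read
after the OS install and before the node is handed out.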
>>> >>
>>> >> Thank you very much for your analysis and prompt solution!
>>> >>
>>> >> --
>>> >> PGP Key ID: 0xFB738CE27B634E4B
>>> >>
>>>
>>> --
>>> Vipul Siddharth
>>> He/His/Him
>>> Fedora | CentOS CI Infrastructure Team
>>> Red Hat
>>> w: vipul.dev
>>>
>>> _______________________________________________
>>> CI-users mailing list
>>> CI-users at centos.org
>>> https://lists.centos.org/mailman/listinfo/ci-users
>>>