Hello guys,
it seems that data is not being cleaned from the nodes correctly, and some of our tests are failing because of it. E.g. in https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd... the folder with the logs already exists. This job was run on devtools-ci-slave04.
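To illustrate the failure mode, this is roughly the kind of guard that trips; a minimal sketch only, with an illustrative path rather than our actual job script:

    # Sketch (illustrative path): on a freshly cleaned node this
    # directory must not exist yet; on a recycled node it already does.
    LOG_DIR=/root/logs
    if [ -d "$LOG_DIR" ]; then
        echo "ERROR: $LOG_DIR already exists -- node was not cleaned" >&2
        exit 1
    fi
    mkdir "$LOG_DIR"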
Thank you for your time. Have a nice day!
Katka Foniok
On Mon, 20 Jan, 2020, 5:18 pm Katerina Foniok, kkanova@redhat.com wrote:
[...]
Hi, I am out for ~an hour and will see to fixing this as soon as I get home.
Hello,
Can confirm this issue as well. Also, some systemd jobs die with:
bash: dnf: command not found
on CentOS 8. Not sure if it's related.
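For what it's worth, a cheap guard at the top of the job can tell a mislabeled node apart from a broken image; a sketch, assuming the node ships /etc/os-release:

    # Sketch: confirm the node really is the release we asked for before
    # using dnf ("dnf: command not found" on a supposed CentOS 8 node may
    # simply mean the node is running an older release that only has yum).
    . /etc/os-release
    if [ "${VERSION_ID%%.*}" -lt 8 ]; then
        echo "ERROR: expected CentOS 8, got $PRETTY_NAME" >&2
        exit 1
    fi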
On 1/20/20 12:47 PM, Katerina Foniok wrote:
[...]
So the issue is another one of the reinstallation problems where a node rolls back to its previous version. In +Katerina Foniok's case the node contained old data, and in +František Šumšal's case it was not actually a C8 node but a C7 one marked wrongly. The solution is to confirm twice (after installing the OS) in Duffy, and the way we think this would be very easy to implement is to check the number of ssh keys in authorized_keys. If a node in the ready state contains more than one, that means it still has old data, and instead of ready we put it in some other state that I can take a look at later on. This would be a quick fix, but till then I will drain the pufty chassis (this is the one with problems) and reset it (this has fixed the problem in the past so far).
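For the record, the node-side part of that check could be as small as the sketch below; the path and the action taken are illustrative, since Duffy's real state handling lives elsewhere:

    # Sketch of the proposed sanity check: a node that was really
    # reinstalled should carry exactly one provisioning key; more than
    # one means the reinstall rolled back and old data survived.
    KEY_COUNT=$(grep -c . /root/.ssh/authorized_keys)  # non-empty lines
    if [ "$KEY_COUNT" -gt 1 ]; then
        echo "stale node: $KEY_COUNT keys in authorized_keys" >&2
        exit 1  # park the node for inspection instead of marking it ready
    fi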
On Tue, Jan 21, 2020 at 12:00 AM František Šumšal frantisek@sumsal.cz wrote:
[...]
On 1/20/20 10:12 PM, Vipul Siddharth wrote:
[...]
Thank you very much for your analysis and prompt solution!
Hello, we are still facing this problem on devtools-ci-slave04. Our job is failing due to conflicts while installing our project's dependencies.
Thank you for taking care of that. Have a nice day!
Katka
On Mon, Jan 20, 2020 at 10:28 PM František Šumšal frantisek@sumsal.cz wrote:
[...]
On Thu, Jan 23, 2020 at 12:23 PM Katerina Foniok kkanova@redhat.com wrote:
[...]
Yes; since I have been travelling for DevConf, I couldn't keep a watch on marking the affected nodes as admin-down.
@Katerina Foniok, when you get time, could you please send me the IDs of the jobs failing because of this issue? It would help me track down the nodes (and also confirm whether the issue affects other chassis as well).
--
Vipul Siddharth
He/His/Him
Fedora | CentOS CI Infrastructure Team
Red Hat
w: vipul.dev
Hello.
Do you want specific job builds as well or only the job IDs?
On Thu, Jan 23, 2020 at 9:14 AM Vipul Siddharth vipul@redhat.com wrote:
[...]
On Fri, 24 Jan, 2020, 10:50 am Tibor Dancs, tdancs@redhat.com wrote:
[...]
Anything would do. The former would save me some time, but it's not a problem at all.
Hello. I'm sending you the list of the jobs we've seen the failures on, along with specific build IDs.
Jobs:
https://ci.centos.org/job/devtools-rh-che-periodic-prod-2/
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1a/
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1b/
https://ci.centos.org/job/devtools-rh-che-periodic-prod-preview-2a/
Builds that failed with various dirty-node-related errors:

cluster production-2:
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2/1861...

cluster production-2a:
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-2aProd...

cluster production-1a:

cluster production-1b:
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1b/194...
https://ci.centos.org/view/Devtools/job/devtools-rh-che-periodic-prod-1b/192...

cluster prod-preview-2a:
https://ci.centos.org/job/devtools-rh-che-periodic-prod-preview-2a/2147/cons...
https://ci.centos.org/job/devtools-rh-che-periodic-prod-preview-2a/2142/cons...
These are just examples; I hope they will be useful for you.
On Fri, Jan 24, 2020 at 10:55 AM Vipul Siddharth vipul@redhat.com wrote:
[...]
CI-users mailing list CI-users@centos.org https://lists.centos.org/mailman/listinfo/ci-users