[Ci-users] Changes to CentOS CI: reminder of Phase 1 and 2

Mon Aug 22 12:19:56 UTC 2022
Camila Granella <cgranell at redhat.com>

Hello,

Thank you for clarifying.
We will soon increase the number of metal machines available for
provisioning on Duffy. As a kind reminder, please use them wisely (as you
all already do), since they represent a significant cost increase. Also,
whenever possible, please default to using VMs.

Have a nice week,

On Mon, Aug 22, 2022 at 8:59 AM František Šumšal <frantisek at sumsal.cz>
wrote:

>
> On 8/22/22 13:28, Fabian Arrotin wrote:
> > On 19/08/2022 15:31, František Šumšal wrote:
> >> Hey,
> >>
> >> On 8/19/22 14:23, Camila Granella wrote:
> >>> Hello!
> >>>
> >>>     I understand that the metal machines are expensive, and I'm not
> sure how many other projects are eventually going to migrate over to them,
> but I guess in the future some balance will need to be found between
> the cost and the number of available metal nodes. Is this even up for
> discussion, or is the size of the metal pools fixed and can't/won't be
> adjusted?
> >>>
> >>>
> >>> We're looking to optimize resource usage with the recent changes to
> CentOS CI. From our side, the goal is to find a balance between adjusting
> to tenants' needs (there are adaptations we could do to have more nodes
> available with an increase in resource consumption) and adjusting projects'
> workflows to use EC2.
> >>>
> >>> I'd appreciate your suggestions on how to make workflows
> more adaptable to EC2.
> >>
> >> The main blocker for many projects is that EC2 VMs don't support nested
> virtualization, which is really unfortunate, since using the EC2 metal
> machines is indeed a "bit" overkill in many scenarios (ours included). I
> spent a week playing with various approaches to avoid this requirement, but
> failed (in our case it would be running the VMs with TCG instead of KVM,
> but that makes the tests flaky/unreliable in many cases, and some of them
> run for several hours with this change).
> >>
> >> Going through many online resources just confirms this - EC2 VMs don't
> support nested virt[0], which is sad, since, for example, Microsoft's Azure
> apparently supports it[1][2] (and Google's Compute Engine apparently
> supports it as well from a quick lookup).
> >>
> >> I'm not really sure if there's an easy solution for this (if any). I'm
> at least trying to spread the workload on the machine "to the limits" to
> utilize as much of the metal resources as possible, which shortens the
> runtime of each job quite considerably, but even that's not ideal
> (resource-wise).
> >>
> >> As I mentioned on IRC, maybe having Duffy change the pool size
> dynamically based on the demand over the past hour or so would help with the
> overall balance (to avoid wasting resources in "quiet periods"), but that's
> just an idea off the top of my head; I'm not sure how feasible it is or if it
> even makes sense.
> >>
> >
> > Yes, it was always communicated that default EC2 instances don't
> support nested virt, as one requests a cloud VM, not a hypervisor :)
> > It was only just before migrating to EC2 that we saw it was possible to
> deploy bare-metal options on the AWS side, but at a higher cost (obviously)
> than traditional EC2 instances (VMs).
> >
> > Can you explain why you'd need a hypervisor instead of VMs? I
> guess that troubleshooting comes to mind (`virsh console` to the rescue,
> while that's not even possible with an EC2 instance as a VM)?
>
> The systemd integration test suite builds an image for each test and then
> runs it with both systemd-nspawn and directly with qemu/qemu-kvm, since
> running systemd tests straight on the host is in many cases dangerous (and
> in some cases it wouldn't be feasible at all, since we need to test stuff
> that happens during (early) boot). Running only the systemd-nspawn part
> would be an option, but this way we'd lose a significant part of coverage
> (as with nspawn you can't test the full boot process, and some tests don't
> run in nspawn at all, like the systemd-udevd tests and other
> storage-related stuff).
>
> >
> >
> >
> > _______________________________________________
> > CI-users mailing list
> > CI-users at centos.org
> > https://lists.centos.org/mailman/listinfo/ci-users
>
> --
> PGP Key ID: 0xFB738CE27B634E4B


-- 

Camila Granella

Associate Manager, Software Engineering

Red Hat <https://www.redhat.com/>