[CentOS-devel] [EXT] CentOS SIGs and lookaside cache

Tue Feb 22 14:17:00 UTC 2022
Peter Georg <peter.georg at physik.uni-regensburg.de>

On 21/02/2022 16.51, Pierre-Yves Chibon wrote:
> Good Morning Everyone,
> 
> There are currently two lookaside caches in use around the CentOS project:
> * One used by CentOS-Stream: https://sources.stream.centos.org/sources (not
>    browsable), which uses the structure:
>    `baseurl/pkgname/tarball/hashtype/hash/tarball`. Example:
>    https://sources.stream.centos.org/sources/rpms/kernel/linux-5.14.0-62.el9.tar.xz/sha512/f7aeac0fe5bf594933cd35b7ecc94ea8ddcbfedc04fa769c4da298e7bf105df116375d44711d944c748c85f61f96f6149be34c76eb37f28aa1f16359a9122abf/linux-5.14.0-62.el9.tar.xz
> * One used by CentOS-Linux, CentOS-Stream 8 and the SIGs:
>    https://git.centos.org/sources/ this one is browsable and as you can see uses
>    the structure: `baseurl/pkgname/branch/hash`. Example:
>    https://git.centos.org/sources/kernel/c8s/0c4e10577cfd4b4f8e3d83c0406da8ab05eb775f
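
To make the difference between the two layouts concrete, here is a minimal Python sketch that builds a URL for each. The function and parameter names are hypothetical, not taken from any existing CentOS tooling. Note that the CentOS-Stream example above carries an `rpms/` namespace in front of the package name, so the sketch simply treats `rpms/kernel` as the package name.

```python
from urllib.parse import quote


def branch_layout_url(baseurl: str, pkgname: str, branch: str, digest: str) -> str:
    """Build baseurl/pkgname/branch/hash (the git.centos.org layout)."""
    return "/".join([baseurl.rstrip("/"), quote(pkgname), quote(branch), digest])


def hash_layout_url(baseurl: str, pkgname: str, tarball: str,
                    hashtype: str, digest: str) -> str:
    """Build baseurl/pkgname/tarball/hashtype/hash/tarball (the CentOS-Stream layout)."""
    name = quote(tarball)
    return "/".join([baseurl.rstrip("/"), quote(pkgname), name, hashtype, digest, name])


if __name__ == "__main__":
    print(branch_layout_url("https://git.centos.org/sources", "kernel", "c8s",
                            "0c4e10577cfd4b4f8e3d83c0406da8ab05eb775f"))
```

Only the first function needs a branch name, which is why dropping the branch structure affects the git.centos.org layout and not the CentOS-Stream one.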
> 
> The rest of this email focuses on this second one. SIGs upload to it using the
> route: https://git.centos.org/sources/upload.cgi
> 
> An email last week [1] proposed how SIGs that wish to do so could use the
> centos namespace in gitlab.
> 
> One of the benefits of using gitlab would be increased flexibility for SIGs. A
> clear example of this would be the ability to drop the branch structures
> currently imposed on the git repositories. That structure is imposed because the
> git repositories are shared between CentOS-Linux, CentOS-Stream and (potentially
> multiple) SIGs, so that structure ensures groups are not stepping on each
> other's toes. By moving the SIGs out of these shared repositories, imposing that
> structure is no longer needed.
> 
> However, since the lookaside cache relies on branch name, lifting that structure
> would break the lookaside cache.
> 
> I have already brought this idea to a few folks to see if the idea was sane. The
> consensus that emerged is:
> * Introduce a new upload endpoint next to the existing one, something like:
>    https://git.centos.org/sources/sig_upload.cgi
> * That new endpoint would upload the given sources using the same structure as
>    the one used for CentOS-Stream, while ensuring that the person uploading is
>    a member of at least one SIG.
> 
> The reason for using `sig_upload.cgi` instead of just replacing `upload.cgi` is
> the assumption that we want to preserve the current structure used for
> CentOS-Linux and CentOS-Stream, which makes it easier to find which sources are
> used where and does not impact the process Red Hat uses to push its releases.
> 
> Since the structures used by the two upload scripts are different, they will not
> conflict.
> What we will end up seeing is something like:
> 
> sources
> ├── pkg1
> │   ├── c7
> │   │   ├── hash1
> │   │   └── hash2
> │   ├── c8
> │   │   ├── hash3
> │   │   └── hash4
> │   ├── tarball1
> │   │    └── sha name
> │   │         └── hash5
> │   │              └── tarball1
> │   └── tarball2
> │       └── sha name
> │            └── hash6
> │                 └── tarball2
> └── pkg2
>      ├── c8
>      │   ├── hash7
>      │   └── hash8
>      ├── c8s
>      │   ├── hash9
>      │   └── hash10
>      ├── tarball3
>      │    └── sha name
>      │         └── hash11
>      │              └── tarball3
>      └── tarball4
>          └── sha name
>               └── hash12
>                    └── tarball4
> and so on
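
One way to see why the two structures in the tree above cannot collide is that they differ in depth. The following is a rough Python sketch, assuming paths are given relative to `sources/` and that the package name is a single path component as in the tree; `classify` is a hypothetical helper, not part of any existing script.

```python
def classify(relpath: str) -> str:
    """Classify a path relative to sources/ by which layout it matches."""
    parts = [p for p in relpath.split("/") if p]
    # Old layout: pkgname/branch/hash
    if len(parts) == 3:
        return "branch"
    # New layout: pkgname/tarball/hashtype/hash/tarball
    if len(parts) == 5 and parts[1] == parts[4]:
        return "hash"
    return "unknown"


if __name__ == "__main__":
    for path in ("pkg1/c7/hash1", "pkg1/tarball1/sha512/hash5/tarball1"):
        print(path, "->", classify(path))
```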

As this already requires some changes, does it make sense to have SIG 
sources in a different directory than RHEL sources?
The proposed structure does not allow sharing sources between RHEL and 
SIGs so what is the benefit of having both in the same directory? It 
might even lead to confusion due to the two different structures used.

That is, put SIG sources in git.centos.org/SIGs/sources, or whatever is 
possible/preferred, and have an upload.cgi there.
I personally prefer a clean separation between content provided by RH 
and SIGs.

> On CBS, the script that downloads the sources will then need to be adjusted to
> try the old structure first before trying the new one. This may slow down the
> builds a little, but in most cases by at most a single HTTP request.
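
The fallback order described here could look roughly like the Python sketch below. This is not the actual CBS download script: all names are hypothetical, and using the same digest for both layouts is an assumption made only to keep the example short.

```python
import urllib.error
import urllib.request
from typing import Callable, List, Optional, Tuple


def http_fetch(url: str) -> Optional[bytes]:
    """Return the response body, or None if the server answers 404."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def candidate_urls(baseurl: str, pkgname: str, branch: str, tarball: str,
                   hashtype: str, digest: str) -> List[str]:
    """List the URLs to try: old branch-based layout first, then the new one."""
    base = baseurl.rstrip("/")
    return [
        f"{base}/{pkgname}/{branch}/{digest}",
        f"{base}/{pkgname}/{tarball}/{hashtype}/{digest}/{tarball}",
    ]


def download_first(urls: List[str],
                   fetch: Callable[[str], Optional[bytes]] = http_fetch
                   ) -> Tuple[str, bytes]:
    """Try each URL in order and return the first one that yields content."""
    for url in urls:
        body = fetch(url)
        if body is not None:
            return url, body
    raise FileNotFoundError("source not found in any known layout")
```

With two candidates, a source stored in the new structure costs exactly one extra request (the failed lookup in the old structure), and a source in the old structure costs none.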
> 
> In this email I'm calling for feedback: do you like the idea?
> 
> I'm happy to work on making it happen if there is consensus on this :)
> 
> 
> Looking forward to your thoughts,
> Pierre
> 
> [1] https://lists.centos.org/pipermail/centos-devel/2022-February/120216.html