On Mon, Feb 21, 2022 at 10:51 AM Pierre-Yves Chibon pingou@pingoured.fr wrote:
Good Morning Everyone,
There are currently two lookaside caches in use around the CentOS project:
- One used by CentOS-Stream: https://sources.stream.centos.org/sources (not browsable), which uses the structure `baseurl/pkgname/tarball/hashtype/hash/tarball`. Example: https://sources.stream.centos.org/sources/rpms/kernel/linux-5.14.0-62.el9.ta...
- One used by CentOS-Linux, CentOS-Stream 8 and the SIGs: https://git.centos.org/sources/ (browsable), which, as you can see, uses the structure `baseurl/pkgname/branch/hash`. Example: https://git.centos.org/sources/kernel/c8s/0c4e10577cfd4b4f8e3d83c0406da8ab05...
The rest of this email focuses on this second one. SIGs upload to it using the route: https://git.centos.org/sources/upload.cgi
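To make the difference between the two layouts listed above concrete, here is a minimal Python sketch that builds a URL for each of them. The function names and the values in the usage note are hypothetical and only illustrate the path patterns quoted in this email; they are not taken from any existing tooling.

```python
def stream_layout_url(baseurl, pkgname, tarball, hashtype, checksum):
    """Build a URL in the CentOS-Stream layout:
    baseurl/pkgname/tarball/hashtype/hash/tarball
    """
    base = baseurl.rstrip("/")
    return f"{base}/{pkgname}/{tarball}/{hashtype}/{checksum}/{tarball}"


def branch_layout_url(baseurl, pkgname, branch, checksum):
    """Build a URL in the git.centos.org layout:
    baseurl/pkgname/branch/hash
    """
    base = baseurl.rstrip("/")
    return f"{base}/{pkgname}/{branch}/{checksum}"
```

For instance, `branch_layout_url("https://git.centos.org/sources/", "kernel", "c8s", "<hash>")` yields a path of the same shape as the second example above, with the branch name (`c8s`) embedded in it. That embedded branch name is what ties this layout to the branch structure.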
An email last week [1] proposed an idea for how SIGs that wish to could leverage the centos namespace in gitlab.
One of the benefits of using gitlab would be increased flexibility for SIGs and a clear example for this would be the ability to drop the branch structures currently imposed on the git repositories. That structure is imposed because the git repositories are shared between CentOS-Linux, CentOS-Stream and (potentially multiple) SIGs, so that structure ensures groups are not stepping on each other's toes. By moving the SIGs out of these shared repositories, imposing that structure is no longer needed.
However, since the lookaside cache relies on branch name, lifting that structure would break the lookaside cache.
I have already brought this idea to a few folks to see if the idea was sane. The consensus that emerged is:
- Introduce a new upload endpoint next to the existing one, something like: https://git.centos.org/sources/sig_upload.cgi
- That new endpoint would upload the given sources using the same structure as the one used for CentOS-Stream, while ensuring that the person uploading is a member of at least one SIG.
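The membership rule for the proposed endpoint could look roughly like the Python sketch below. It assumes that SIG groups can be recognised by a `sig-` name prefix and that the endpoint already knows the uploader's group list. Both points are assumptions made for illustration and are not specified in this email.

```python
# Assumption: SIG groups are identifiable by this name prefix.
SIG_GROUP_PREFIX = "sig-"


def may_upload_as_sig(user_groups):
    """Return True if the user belongs to at least one SIG group."""
    return any(group.startswith(SIG_GROUP_PREFIX) for group in user_groups)
```

With this rule, a user whose groups include a hypothetical `sig-example` would be allowed to upload, while a user with no SIG group would be rejected.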
The reason for adding `sig_upload.cgi` instead of just replacing `upload.cgi` is the assumption that we want to preserve the current structure used for CentOS-Linux and CentOS-Stream. That makes it easier to find which sources are used where, and it does not impact the process Red Hat uses to push its releases.
Since the structures used by the two upload scripts are different, they will not conflict. What we will end up seeing is something like:
    sources
    │
    ├── pkg1
    │   ├── c7
    │   │   ├── hash1
    │   │   └── hash2
    │   ├── c8
    │   │   ├── hash3
    │   │   └── hash4
    │   ├── tarball1
    │   │   └── sha name
    │   │       └── hash5
    │   │           └── tarball1
    │   └── tarball2
    │       └── sha name
    │           └── hash6
    │               └── tarball2
    │
    └── pkg2
        ├── c8
        │   ├── hash7
        │   └── hash8
        ├── c8s
        │   ├── hash9
        │   └── hash10
        ├── tarball3
        │   └── sha name
        │       └── hash11
        │           └── tarball3
        └── tarball4
            └── sha name
                └── hash12
                    └── tarball4

and so on
On CBS, the script that downloads the sources will then need to be adjusted to try the old structure first, before trying the new one. This may slow the builds down a little, but in most cases by at most a single HTTP request.
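The fallback described above could be sketched as follows in Python. This is not the actual CBS download script: the function names, the parameters, and the choice to treat only an HTTP 404 as "try the next layout" are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def http_fetch(url):
    """Download a URL and return its body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_source(baseurl, pkgname, branch, tarball, hashtype, checksum,
                 fetch=http_fetch):
    """Try the old (branch-based) layout first, then the new one.

    Returns a (url, data) tuple for the first layout that has the file.
    """
    base = baseurl.rstrip("/")
    candidates = [
        # Old layout: baseurl/pkgname/branch/hash
        f"{base}/{pkgname}/{branch}/{checksum}",
        # New layout: baseurl/pkgname/tarball/hashtype/hash/tarball
        f"{base}/{pkgname}/{tarball}/{hashtype}/{checksum}/{tarball}",
    ]
    last_error = None
    for url in candidates:
        try:
            return url, fetch(url)
        except urllib.error.HTTPError as err:
            if err.code != 404:
                raise  # Only a missing file triggers the fallback.
            last_error = err
    raise FileNotFoundError(
        f"{tarball} not found in either layout"
    ) from last_error
```

Sources that still live in the old layout are found on the first request. Only sources uploaded through the new endpoint cost one extra request, which matches the "at most a single HTTP request" estimate.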
In this email I'm calling for feedback: do you like the idea?
I'm happy to work on making it happen if there is consensus on this :)
Looking forward to your thoughts,
Pierre
The git server and git structure are orthogonal to the lookaside problem. Fundamentally, the issue is that the same upload endpoint is used for both Red Hat compliance and SIG work. We already have authentication/authorization on branches at the Pagure level, so we just lacked a way to handle this for the lookaside upload. By splitting the endpoint, it should be possible to solve that, since you can deny everyone else access to the Red Hat endpoint.
However, I'd make a small suggestion: instead of changing the endpoint URL for SIGs, change the endpoint URL for Red Hat. RCM uses that endpoint through automation (I assume), so changing the endpoint for the one service is considerably simpler than dealing with everyone's own scripts to adjust for SIGs.
As an example, I've written automation to deal with Hyperscale work because doing it by hand is a lot of grunt work. While I can probably tweak my stuff easily enough, I don't know if *everyone* can.
And again, the lookaside thing is completely orthogonal to the git structure. I should be able to use it just fine from git.centos.org in the current branched package structure.