On Wed, Jun 19, 2019, at 14:23, James Cassell wrote:
On Wed, Jun 19, 2019, at 12:09 PM, Brian Stinson wrote:
Hi Folks,
While we cycle through some of the remaining builds I'd like to start a discussion about what the CentOS 8 repo structure might look like. We need to think about what the repos look like on-disk, and how this might impact the mirrors.
Currently the thinking is this:
3 "core" repos:
- BaseOS (contains a small packageset of the base distribution)
- AppStream ("where the modules go")
- Devel ("-devel packages and other tools")
These descriptions are very much an oversimplification, but it's an ok model to work with.
We plan to compose all of those repositories, and deliver updates in the same stream. The x86_64 tree for the BaseOS repository will look something like this:
x86_64
├── debug  # Note: we will likely snip this out and move debugs to debuginfo.centos.org
│   └── tree
│       ├── Packages
│       └── repodata
├── iso
└── os
    ├── EFI
    │   └── BOOT
    │       └── fonts
    ├── images
    │   └── pxeboot
    ├── isolinux
    ├── Packages
    └── repodata
I think this would be fine provided there is a frozen-in-time set of repodata that matches the current 'base' repo.
Going further, I'd propose something akin to https://snapshot.debian.org:
x86_64
├── iso
├── appstream
│   ├── Packages
│   └── repodata  # replaced daily
├── baseos
│   ├── Packages
│   └── repodata  # replaced daily
├── devel
│   ├── Packages
│   └── repodata  # replaced daily
├── snapshot
│   └── 20190619T145736Z
│       ├── appstream
│       │   ├── Packages -> ../../../appstream/Packages
│       │   └── repodata  # frozen in time
│       ├── baseos
│       │   ├── Packages -> ../../../baseos/Packages
│       │   └── repodata  # frozen in time
│       └── devel
│           ├── Packages -> ../../../devel/Packages
│           └── repodata  # frozen in time
└── kickstart
    ├── EFI
    │   └── BOOT
    │       └── fonts
    ├── images
    │   └── pxeboot
    ├── isolinux
    ├── appstream
    │   ├── repodata  # frozen in time
    │   └── Packages -> ../../appstream/Packages
    └── baseos
        ├── repodata  # frozen in time
        └── Packages -> ../../baseos/Packages
Something similar could be done with hardlinks of individual RPMs instead of symlinks to a Packages directory that has all RPMs, but the idea is that there is only one copy of each RPM on disk. The only additional space taken by the snapshot/ directory would be the daily repo metadata. These snapshots could be moved to archive.centos.org with each new point release, same as happens today with the rest of the tree.
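For illustration, the symlink variant could be sketched roughly as below. This is only a sketch of the layout in the tree above, not existing CentOS tooling: the function name `make_snapshot` and its parameters are hypothetical, and it assumes each repo's Packages directory keeps every RPM that any frozen repodata still refers to.

```python
import os
import shutil
from datetime import datetime, timezone
from pathlib import Path


def make_snapshot(arch_root, repos=("appstream", "baseos", "devel"), stamp=None):
    """Freeze the current repodata of each repo under snapshot/<stamp>/.

    Hypothetical helper: arch_root is a directory such as x86_64/ that
    contains <repo>/Packages and <repo>/repodata for each repo.
    """
    arch_root = Path(arch_root)
    if stamp is None:
        # Same timestamp shape as the example tree, e.g. 20190619T145736Z.
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    snap_root = arch_root / "snapshot" / stamp
    for repo in repos:
        dest = snap_root / repo
        dest.mkdir(parents=True)
        # Copy the metadata so the daily replacement does not change the snapshot.
        shutil.copytree(arch_root / repo / "repodata", dest / "repodata")
        # Relative symlink: no RPM is duplicated, and the link stays valid
        # wherever a mirror places the tree.
        os.symlink(os.path.join("..", "..", "..", repo, "Packages"), dest / "Packages")
    return snap_root
```

Calling `make_snapshot("x86_64", stamp="20190619T145736Z")` on a tree shaped like the one above would add only the copied repodata directories plus three symlinks.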
(A latest-only tree might also be useful: a smaller item to rsync that carries only the latest version of each package. That would have to be done with hardlinks rather than symlinks.)
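A rough sketch of the hardlink side of a latest-only tree follows. The name `link_latest` is hypothetical, and the list of "latest" file names is assumed to come from elsewhere (working out which build of a package is newest needs RPM version comparison, which is not attempted here). Hardlinks only work when both directories are on the same filesystem.

```python
import os
from pathlib import Path


def link_latest(packages_dir, latest_dir, latest_names):
    """Hardlink the named RPMs from packages_dir into latest_dir.

    Hypothetical helper: latest_names is a caller-supplied list of RPM
    file names already known to be the newest build of each package.
    """
    packages_dir, latest_dir = Path(packages_dir), Path(latest_dir)
    latest_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for name in latest_names:
        target = latest_dir / name
        if not target.exists():
            # A hardlink is a second name for the same file: no extra space
            # on disk, and the file is real content rather than a pointer
            # into the full Packages directory.
            os.link(packages_dir / name, target)
        linked.append(target)
    return linked
```

Since each entry in the latest-only directory is a regular file, that directory can be synced on its own without the full Packages directory being present.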
Would something like the above snapshot idea have any legs?
V/r, James Cassell
What's the use-case for the snapshot structure like this from a consumer's perspective?
We could (and probably should) post the firehose from our tooling for the release composes somewhere that is not the mirrors (we don't want to mess with mirroring logs). To give you sort of an idea of what that output looks like, take a peek at how Fedora does theirs: https://kojipkgs.fedoraproject.org/compose/30/
Our output would have different repositories and a lot fewer artifacts, but the structure is similar.