Hi guys,
With a bit of time opening up, I've gone back to looking at the yum-security issue and how we can address it (i.e. at least get the basics working).
I plan on doing the work in multiple stages:
step 1: get a basic set of metadata online so that people can detect whether an update is tagged security, bugfix or enhancement
step 2: add more metadata, including bug IDs and CVEs
step 3: get everything rolled out by default on all CentOS installs, and look at how external projects like Spacewalk and Pulp consume this metadata
At the moment, I'm only thinking about things and trying to scope out the work. However, there is one issue that might be a spanner in the works, based on how we have mirror.centos.org set up.
What we have right now only provides the <version>.<release> rpm set and its updates, and only in relation to that specific tree (e.g. /6.3/ is what /6/ maps to at the moment). Yum-security metadata in this repo would then only be relevant to the rpms contained in /6.3/, whatever repo they might be in. However, someone running 6.0 or 6.1 who checks for updates is likely to miss interim updates that were tagged security at some earlier release level.
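To make the gap concrete, here is a minimal Python sketch. The package history, version strings and advisory types are all invented for illustration; nothing here reflects real CentOS metadata or how yum works internally.

```python
# Hypothetical release history of one package across CentOS 6 point releases.
# Versions and advisory types are invented for illustration only.
HISTORY = [
    ("1.0-1", None),           # shipped in the 6.0 base OS
    ("1.0-2", "security"),     # update released during 6.1
    ("1.0-3", "bugfix"),       # update released during 6.2
    ("1.0-4", "enhancement"),  # latest build, the only one described in /6.3/
]


def types_from_current_tree(history):
    """Advisory types visible when metadata only covers the newest build."""
    return sorted({history[-1][1]} - {None})


def types_from_full_history(history, installed):
    """Advisory types of every build newer than the installed one."""
    versions = [version for version, _ in history]
    newer = history[versions.index(installed) + 1:]
    return sorted({kind for _, kind in newer if kind is not None})


print(types_from_current_tree(HISTORY))
print(types_from_full_history(HISTORY, "1.0-1"))
```

A machine still on 1.0-1 sees only an enhancement when it looks at the current tree, while the full history shows that a security update sits in between.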
One way to work around this would be to have yum consider all interim package metadata between installed.rpm and latest-in-repo.rpm (which would mean the yum-security metadata needs to contain info for everything ever released; that isn't a problem as such).
Or we set up a repo that has everything ever released. This in turn has some serious caveats, storage on every mirror being a good problem to start with. However, in limited tests it looks like yum will work with redirects, so while we would need the metadata to contain all packages, the physical packages can still be handed out from vault.centos.org. That redirect foo needs some level of smartness on the mirror end, though: trivial to implement where we control the mirror.centos.org network, but a very large part of the mirror service is offloaded to external mirrors, hundreds of them. It's super tricky getting smartness onto each and every one of their machines.
Thoughts, concerns, ideas? No work has been done on the problem at this point, so we can take pretty much any course of action that seems sane.
I feel it's important that, if we are going to provide a mechanism that people will rely on to get patch requirements for their machines, we make sure we have 100% coverage.
On 01.08.2012 18:49, Karanbir Singh wrote:
> Hi guys,
> With a bit of time opening up, I've gone back to looking at the yum-security issue and how we can address it (i.e. at least get the basics working).
Thank you so much for starting this effort!
> I plan on doing the work in multiple stages:
> step 1: get a basic set of metadata online so that people can detect whether an update is tagged security, bugfix or enhancement
That's probably what 90% of people will be happy with.
> What we have right now only provides the <version>.<release> rpm set and its updates, and only in relation to that specific tree (e.g. /6.3/ is what /6/ maps to at the moment). Yum-security metadata in this repo would then only be relevant to the rpms contained in /6.3/, whatever repo they might be in. However, someone running 6.0 or 6.1 who checks for updates is likely to miss interim updates that were tagged security at some earlier release level.
Probably. But that's an issue with the mirror and repository layout. The solution is to have one continuous updates repository per distro version instead of one per point release.
So we would have:
/6/updates/...
/6/extras/...
/6/centosplus/...
to contain all update rpms for all point releases and the rpms released at point releases. Additionally, you would place symlinks for isos and os pointing to the last point-release directory.
And the point-release directories would boil down to:
/6.0/isos/...
/6.0/os/...
to contain the installation media, and maybe a set of symlinks pointing to the repos in /6/.
If we split like that, it is quite simple: you add updateinfo.xml to all repositories below /6/ and make the listing of all contained rpms in updateinfo.xml a requirement. For the point releases you don't want or need any updateinfo.xml, as anaconda doesn't support updateinfo.xml and the repository will only be used for installs and not for updates.
> One way to work around this would be to have yum consider all interim package metadata between installed.rpm and latest-in-repo.rpm (which would mean the yum-security metadata needs to contain info for everything ever released; that isn't a problem as such).
I'd just go down that road. I wouldn't even distinguish between architectures, and you can argue whether it is worth the effort to differentiate between major releases.
In fact you can probably put the complete updateinfo into every base and updates repository without messing anything up. Without touching the overall mirror and repository layout, this is probably the easiest way to handle it.
If we kept the updateinfo in a database or something else that is searchable and machine-readable in an efficient manner, one could also generate an accurate updateinfo.xml for every repository. This would also make testing much easier: if you want to test stuff, you can simply generate your own updateinfo.xml for your own local mirror by querying the database with a custom query.
> I feel it's important that, if we are going to provide a mechanism that people will rely on to get patch requirements for their machines, we make sure we have 100% coverage.
Coverage is important, but it is rather simple to test: just make sure every package in your repo is mentioned in updateinfo.xml. Maybe throw in a blacklist with the packages from the x.0 release. What is much harder to achieve is reliability and consistency of the information provided.
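As a rough illustration of that coverage test, here is a small Python sketch. It assumes the usual updateinfo.xml layout, where each package entry carries a filename element; the advisory, the rpm names and the helper names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Tiny, invented updateinfo.xml; a real file carries many more fields.
UPDATEINFO = """<updates>
  <update type="security">
    <id>CESA-2012:123</id>
    <pkglist>
      <collection>
        <package name="foo" version="1.0" release="2.el6" arch="x86_64">
          <filename>foo-1.0-2.el6.x86_64.rpm</filename>
        </package>
      </collection>
    </pkglist>
  </update>
</updates>"""


def covered_filenames(updateinfo_xml):
    """Return the set of rpm filenames mentioned anywhere in updateinfo."""
    root = ET.fromstring(updateinfo_xml)
    return {el.text.strip() for el in root.iter("filename") if el.text}


def uncovered(repo_rpms, updateinfo_xml, blacklist=()):
    """Rpms in the repo that no advisory mentions, minus the blacklist."""
    covered = covered_filenames(updateinfo_xml)
    return sorted(set(repo_rpms) - covered - set(blacklist))


# Hypothetical repo contents; baz stands in for a package from the x.0 release.
repo = [
    "foo-1.0-2.el6.x86_64.rpm",
    "bar-2.1-1.el6.noarch.rpm",
    "baz-0.9-1.el6.x86_64.rpm",
]
x0_release = ["baz-0.9-1.el6.x86_64.rpm"]

missing = uncovered(repo, UPDATEINFO, x0_release)
print(missing)
```

An empty result would mean full coverage; anything listed is an rpm that still needs an advisory entry.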
Upstream has a release number on their errata and we should probably introduce one as well. Additionally, we will want a separate CentOS-own numbering scheme for updates that do not duplicate upstream errata (i.e. stuff that goes to extras or centosplus).
Scenario 1: Sometimes you will have to change errata that were already released. Say you have CESA-2012:123, which fixes a flaw in a package contained in CentOS 5, CentOS 6 and the upcoming CentOS 7. As CentOS 7 is not yet done, we push out the erratum for CentOS 5 and 6. This is "CESA-2012:123-1". Once CentOS 7 is done, the update would be released for CentOS 7, too. This requires a change to the already released erratum, so its version should be increased to "CESA-2012:123-2".
Scenario 2: You build the packages for CEEA-2012:456 and push them to the mirrors. By doing that, CEEA-2012:456-1 is released. However, you realize that you really broke stuff and have to push fixed packages immediately. As you changed a released erratum, you have to change its version, making it CEEA-2012:456-2.
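The release suffix from the two scenarios could be handled along these lines. This is only a sketch of the proposed numbering: the ErrataId class, its method names and the rule that a missing suffix means release 1 are assumptions made for the example, not an existing CentOS tool.

```python
import re
from dataclasses import dataclass, replace

# Matches e.g. "CESA-2012:123" or "CEEA-2012:456-1"; the suffix is optional.
_ERRATA_RE = re.compile(r"^(CE[SBE]A)-(\d{4}):(\d+)(?:-(\d+))?$")


@dataclass(frozen=True)
class ErrataId:
    kind: str       # CESA, CEBA or CEEA
    year: str
    number: str     # kept as text so leading zeros survive
    release: int = 1

    @classmethod
    def parse(cls, text):
        match = _ERRATA_RE.match(text)
        if match is None:
            raise ValueError(f"not an errata id: {text!r}")
        kind, year, number, release = match.groups()
        # Assumption: an id without a suffix is treated as release 1.
        return cls(kind, year, number, int(release or 1))

    def revised(self):
        """Return the id to use after changing an already released erratum."""
        return replace(self, release=self.release + 1)

    def __str__(self):
        return f"{self.kind}-{self.year}:{self.number}-{self.release}"


print(ErrataId.parse("CESA-2012:123").revised())
print(ErrataId.parse("CEEA-2012:456-1").revised())
```

Both scenarios end up at release 2, whether the change adds CentOS 7 packages or replaces broken ones.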
Regards, Andreas
Hi Andreas,
On 08/02/2012 12:54 PM, Andreas Rogge wrote:
> > step 1: get a basic set of metadata online so that people can detect whether an update is tagged security, bugfix or enhancement
> That's probably what 90% of people will be happy with.
Interesting. Are you saying that most people are not interested in tracking specific CVEs etc.?
> > What we have right now only provides the <version>.<release> rpm set and its updates, and only in relation to that specific tree (e.g. /6.3/ is what /6/ maps to at the moment). Yum-security metadata in this repo would then only be relevant to the rpms contained in /6.3/, whatever repo they might be in. However, someone running 6.0 or 6.1 who checks for updates is likely to miss interim updates that were tagged security at some earlier release level.
> Probably. But that's an issue with the mirror and repository layout. The solution is to have one continuous updates repository per distro version instead of one per point release.
The problem with the new layout is that every mirror will now need ~1 TiB of storage (compared with the ~80 GB needed now); with aggressive hardlinking we could bring that down to about 600 GB or so, but it's still way too much.
Let's only consider doing that if there is realistically no way to fix the problem without making that big a change to mirroring requirements. The other option we have is for yum-security to rely on centos-release-extras, which in turn can ship a CentOS-Vault repo with the relevant metadata included from vault. The only change needed would be that vault.centos.org contains not only older packages, but also the ones from the present tree.
There is no way to really tell, but I feel uptake for yum-security is going to be a small number of people across all CentOS users, and we might be able to satisfy demand at a reasonable performance level by adding a few more machines to vault.centos.org rather than changing mirror.centos.org and all the external mirrors.
> I'd just go down that road. I wouldn't even distinguish between architectures, and you can argue whether it is worth the effort to differentiate between major releases.
Right, that's the point: to make yum-security worthwhile, it needs to not consider point releases; the whole distro version needs to be considered as one stack.
> Coverage is important, but it is rather simple to test: just make sure every package in your repo is mentioned in updateinfo.xml. Maybe throw in a blacklist with the packages from the x.0 release. What is much harder to achieve is reliability and consistency of the information provided.
The plan is to have the generation code hard-wired into the script which pushes rpms to the updates/ repo. That should mostly solve the problem going forward; for retrospective updates and for rpms released into the base OS at point-release time, I think it will have to be a manual effort for now. Maybe we can try to get some level of smartness sorted for 6.4 and beyond, but for now it will need to be a manual effort (which I am trying to reduce as much as possible).
> Upstream has a release number on their errata and we should probably introduce one as well.
As far as I can tell, that -<release> for errata is there to handle internal RH oopses rather than something that is exported publicly. Once something is marked as public, they don't increment the release number, which is good. The only exception I have seen is when they update a release to change grammar or text within a release announcement. A real fix or change to the packages would have to be via a new errata notice.
On 08.08.2012 12:08, Karanbir Singh wrote:
> Hi Andreas,
> On 08/02/2012 12:54 PM, Andreas Rogge wrote:
> > > step 1: get a basic set of metadata online so that people can detect whether an update is tagged security, bugfix or enhancement
> > That's probably what 90% of people will be happy with.
> Interesting. Are you saying that most people are not interested in tracking specific CVEs etc.?
In fact most people I know are only interested in whether the patches/updates they're missing fix a security issue or not. More information is interesting, but not crucial. The question usually is: should I install these right now or can I wait another day? In fact some of my customers only install bugfixes in one batch with security fixes.
> The problem with the new layout is that every mirror will now need ~1 TiB of storage (compared with the ~80 GB needed now); with aggressive hardlinking we could bring that down to about 600 GB or so, but it's still way too much.
Point taken. But then we can just take another approach to placing the updateinfo.
> Let's only consider doing that if there is realistically no way to fix the problem without making that big a change to mirroring requirements. [...]
I'd strongly discourage the use of vault.centos.org for stuff like that and there's a rather simple way to get around changing anything on the mirrors.
> The plan is to have the generation code hard-wired into the script which pushes rpms to the updates/ repo. That should mostly solve the problem going forward; for retrospective updates and for rpms released into the base OS at point-release time, I think it will have to be a manual effort for now. Maybe we can try to get some level of smartness sorted for 6.4 and beyond, but for now it will need to be a manual effort (which I am trying to reduce as much as possible).
We could have some kind of repository containing all the updateinfo snippets (Wiki, Database, git repo, you-name-it). When generating the repository metadata you just pull all snippets that mention any of the rpms in your repo and assemble your updateinfo.xml. While doing that you'll immediately find all rpms that lack updateinfo.
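A minimal Python sketch of that assembly step, assuming each snippet is a single update element in the usual updateinfo.xml layout. The snippets, the rpm names and the assemble helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet store: one <update> element per advisory.
SNIPPETS = [
    """<update type="security">
         <id>CESA-2012:123</id>
         <pkglist><collection>
           <package name="foo" version="1.0" release="2.el6" arch="x86_64">
             <filename>foo-1.0-2.el6.x86_64.rpm</filename>
           </package>
         </collection></pkglist>
       </update>""",
    """<update type="bugfix">
         <id>CEBA-2012:789</id>
         <pkglist><collection>
           <package name="qux" version="3.2" release="1.el5" arch="i386">
             <filename>qux-3.2-1.el5.i386.rpm</filename>
           </package>
         </collection></pkglist>
       </update>""",
]


def assemble(snippets, repo_rpms):
    """Build updateinfo for one repo and list the rpms no snippet mentions."""
    repo = set(repo_rpms)
    root = ET.Element("updates")
    mentioned = set()
    for snippet in snippets:
        update = ET.fromstring(snippet)
        names = {f.text.strip() for f in update.iter("filename") if f.text}
        if names & repo:            # keep only advisories touching this repo
            root.append(update)
            mentioned |= names & repo
    return ET.tostring(root, encoding="unicode"), sorted(repo - mentioned)


repo = ["foo-1.0-2.el6.x86_64.rpm", "bar-2.1-1.el6.noarch.rpm"]
xml_text, lacking = assemble(SNIPPETS, repo)
print(lacking)
```

The second return value is the list of rpms that lack updateinfo, which falls out of the same pass that builds the file.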
> As far as I can tell, that -<release> for errata is there to handle internal RH oopses rather than something that is exported publicly. Once something is marked as public, they don't increment the release number, which is good. The only exception I have seen is when they update a release to change grammar or text within a release announcement. A real fix or change to the packages would have to be via a new errata notice.
I know how it works for RH. The problem is: we don't have anything like that. The first time you ship something broken that needs to be fixed, the CentOS numbering scheme will break. Also, there's currently no way to name errata for the centosplus and/or extras repositories.
Regards, Andreas
On Wed, Aug 8, 2012 at 5:08 AM, Karanbir Singh <mail-lists@karan.org> wrote:
> > That's probably what 90% of people will be happy with.
> Interesting. Are you saying that most people are not interested in tracking specific CVEs etc.?
I think I missed the basic premise here. The specifics only matter when you don't have a known fix installed. Separating things isn't the point so much as just getting them in the update stream so normal updates install them. Is this for the special case where normal updates are backed up from build issues at a point/version release - or to help where people don't want updates to fix bugs unless they are security-related?
On 08/16/2012 10:04 AM, Les Mikesell wrote:
> On Wed, Aug 8, 2012 at 5:08 AM, Karanbir Singh <mail-lists@karan.org> wrote:
> > > That's probably what 90% of people will be happy with.
> > Interesting. Are you saying that most people are not interested in tracking specific CVEs etc.?
> I think I missed the basic premise here. The specifics only matter when you don't have a known fix installed. Separating things isn't the point so much as just getting them in the update stream so normal updates install them. Is this for the special case where normal updates are backed up from build issues at a point/version release - or to help where people don't want updates to fix bugs unless they are security-related?
One point is that, for already installed packages, you can print out the CVEs or the index number of the update (as one example). This means you can fairly easily generate reports to show compliance with some standard (PCI, etc.).
You can also say to only install Security and not BugFix or Enhancement updates, etc.
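As a rough idea of what such a report could be built from, here is a Python sketch that reads CVE references out of updateinfo-style metadata. It assumes the usual layout with reference elements of type "cve"; the advisories, CVE numbers and rpm names are invented, and the exact filename match is a simplification (a newer installed build would also contain the fix).

```python
import xml.etree.ElementTree as ET

# Invented metadata with two advisories; the ids are placeholders.
UPDATEINFO = """<updates>
  <update type="security">
    <id>CESA-2012:123</id>
    <references>
      <reference type="cve" id="CVE-2012-0002"/>
      <reference type="cve" id="CVE-2012-0001"/>
      <reference type="bugzilla" id="100001"/>
    </references>
    <pkglist><collection>
      <package name="foo" version="1.0" release="2.el6" arch="x86_64">
        <filename>foo-1.0-2.el6.x86_64.rpm</filename>
      </package>
    </collection></pkglist>
  </update>
  <update type="bugfix">
    <id>CEBA-2012:789</id>
    <pkglist><collection>
      <package name="bar" version="2.1" release="1.el6" arch="noarch">
        <filename>bar-2.1-1.el6.noarch.rpm</filename>
      </package>
    </collection></pkglist>
  </update>
</updates>"""


def cve_report(updateinfo_xml, installed_rpms):
    """Map each security advisory shipped in an installed rpm to its CVEs."""
    installed = set(installed_rpms)
    report = {}
    for update in ET.fromstring(updateinfo_xml).iter("update"):
        if update.get("type") != "security":
            continue                # skip bugfix and enhancement advisories
        files = {f.text.strip() for f in update.iter("filename") if f.text}
        if files & installed:
            report[update.findtext("id")] = sorted(
                ref.get("id")
                for ref in update.iter("reference")
                if ref.get("type") == "cve"
            )
    return report


installed = ["foo-1.0-2.el6.x86_64.rpm", "bar-2.1-1.el6.noarch.rpm"]
print(cve_report(UPDATEINFO, installed))
```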
See this page for the capabilities that yum-security can give:
Hi,
Thank you for planning to work on this. If you need help implementing it, maybe I could help (for example on the Spacewalk integration part).
> With a bit of time opening up, I've gone back to looking at the yum-security issue and how we can address it (i.e. at least get the basics working).
> I plan on doing the work in multiple stages:
> step 1: get a basic set of metadata online so that people can detect whether an update is tagged security, bugfix or enhancement
That's a good starting point.
> step 2: add more metadata, including bug IDs and CVEs
I think this will be very useful (from my point of view) for tracking clearly identified bugs or CVEs that affect packages installed on our servers.
> step 3: get everything rolled out by default on all CentOS installs, and look at how external projects like Spacewalk and Pulp consume this metadata
And this will be a "nice to have". We use Spacewalk to manage our CentOS servers. Some time ago we used the centos-errata.py script (http://www.bioss.ac.uk/people/davidn/spacewalk-stuff/), but since the CentOS project no longer provides md5sums for packages (and, as far as I know, Spacewalk uses them in its package storage tree), it doesn't work any more.
Have a nice day.
Regards.
Baptiste.
Hi Baptiste,
On 09/25/2012 03:47 PM, Baptiste AGASSE wrote:
> Thank you for planning to work on this. If you need help implementing it, maybe I could help (for example on the Spacewalk integration part).
Yes, I do need a bit of help with this; the Spacewalk part would be cool to have working.
> > step 1: get a basic set of metadata online so that people can detect whether an update is tagged security, bugfix or enhancement
> That's a good starting point.
This is sort-of working at this point; however, to be really useful the metadata needs to include all RPMs released in the entire life of the distro. But we don't store all RPMs on the mirror.centos.org network; we only retain the latest point release (the amount of disk space needed for every rpm of a long-running release like CentOS-5 is prohibitive).
Keeping that in mind, it's worth noting that we do have all RPMs released in the distro on vault.centos.org (but not including the latest release). So what I am going to look at is making vault also contain all the packages presently on mirror.centos.org, and then building a super-repo (for lack of a better name) which contains metadata for every rpm released, and then adding in the updateinfo files needed for yum-security.
Btw, if we have a working yum-security layer, would that not be all that is needed for Spacewalk? Or would anything else need to be added? Would that be the same for, say, the Pulp project as well?
Regards
Hi Karanbir,
> Btw, if we have a working yum-security layer, would that not be all that is needed for Spacewalk? Or would anything else need to be added? Would that be the same for, say, the Pulp project as well?
Errata support in the current Spacewalk version (1.7) seems to be broken (the only repository I use that provides errata is EPEL, and I use filters on it to synchronize only the packages I want).
Regards.
Baptiste.
Hi,
On 10/03/2012 11:17 AM, Baptiste AGASSE wrote:
> Errata support in the current Spacewalk version (1.7) seems to be broken (the only repository I use that provides errata is EPEL, and I use filters on it to synchronize only the packages I want).
I'm a bit confused by that: are you saying that Spacewalk is broken, or that the errata published by the repo are broken?
I've been testing the yum-security stuff at this end and still have a few issues to work out (mostly this involves reading AUPs and T&Cs from various places to make sure the metadata being consumed does not violate anything).
> > Errata support in the current Spacewalk version (1.7) seems to be broken (the only repository I use that provides errata is EPEL, and I use filters on it to synchronize only the packages I want).
> I'm a bit confused by that: are you saying that Spacewalk is broken, or that the errata published by the repo are broken?
I just said that spacewalk-reposync doesn't import EPEL errata in v1.7 for me. I seem to have read something related to this on the Spacewalk mailing list; I'll try to find it.