I have seen messages posted on the Fedora-oriented forums implying that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
Todd
On 09/05/2005 04:58 PM, Todd Cary wrote:
I have seen messages posted on the Fedora-oriented forums implying that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
First of all: You shall not believe everything that you read. :-)
Yum is one of the youngest players in the rpm update world; it was (re-)written by Seth Vidal.
Project page: http://linux.duke.edu/projects/yum/
There's nothing more to say if you read the project page. Yum is still under active development by Seth and friends, so "antiquated" is not correct!
What is antiquated in some way is up2date. It's still used by RH and there are good reasons for 'em to use it...
What is also antiquated in some way is apt. But my recommendation against apt comes from a technical point of view: apt wasn't meant to play with rpm, even if it seems to work fine.
However...
yum is currently the best tool for Fedora, CentOS, AlphaLinux, etc. because it's easy to use, it's easy to build your own repositories, and there are MANY yum repositories out there already... And yum knows how to play well with rpm...
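For example, a repository is little more than a directory of RPMs plus generated metadata, and a client needs only a few lines of configuration. A rough sketch (the names, paths, and URL here are just examples):

# on the server: drop your RPMs in a directory and generate the metadata
createrepo /var/www/html/myrepo    # older yum releases use 'yum-arch' instead

# on each client: a file like /etc/yum.repos.d/myrepo.repo
[myrepo]
name=My local repository
baseurl=http://server.example.com/myrepo
enabled=1
gpgcheck=0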
Best, Oliver
Oliver -
Thank you for your detailed explanation...
Todd
Oliver Falk wrote:
On 09/05/2005 04:58 PM, Todd Cary wrote:
I have seen messages posted on the Fedora oriented forums that imply that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
First of all: You shall not believe everything that you read. :-)
Yum is one of the youngest players in the rpm update world; it was (re-)written by Seth Vidal.
Project page: http://linux.duke.edu/projects/yum/
There's nothing more to say if you read the project page. Yum is still under active development by Seth and friends, so "antiquated" is not correct!
What is antiquated in some way is up2date. It's still used by RH and there are good reasons for 'em to use it...
What is also antiquated in some way is apt. But my recommendation against apt comes from a technical point of view: apt wasn't meant to play with rpm, even if it seems to work fine.
However...
yum is currently the best tool for Fedora, CentOS, AlphaLinux, etc. because it's easy to use, it's easy to build your own repositories, and there are MANY yum repositories out there already... And yum knows how to play well with rpm...
Best, Oliver
On Mon, 2005-09-05 at 07:58 -0700, Todd Cary wrote:
I have seen messages posted on the Fedora oriented forums that imply that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
Some of the same people, among others, will also say APT is "not native" to RPM, and only works well for DPKG. In reality, it's really more about the repositories than the tools.
SmartPM, though, looks like it will finally remove a lot of those issues.
I think the biggest gripe about YUM is the lack of a standard GUI, and YumEx has had compatibility issues in the past with newer YUM versions. SmartPM focuses on solving cross-repository issues and comes with a GUI as standard.
All in all, use the tool that is supported by the distro. That is YUM. No, there is no officially supported GUI for it, hence some of the complaints. But I'm keeping my eye on SmartPM for the future.
On Mon, 5 Sep 2005 at 1:06pm, Bryan J. Smith wrote
I think the biggest gripe about YUM is the lack of a standard GUI, and YumEx has had compatibility issues in the past with newer YUM versions. SmartPM focuses on solving cross-repository issues and comes with a GUI as standard.
All in all, use the tool that is supported by the distro. That is YUM. No, there is no officially supported GUI for it, hence some of the complaints. But I'm keeping my eye on SmartPM for the future.
I have yet to see any advantage to a GUI package manager. But, then again, that's just me.
On Mon, 5 Sep 2005 at 1:06pm, Bryan J. Smith wrote
I think the biggest gripe about YUM is the lack of a standard GUI, and YumEx has had compatibility issues in the past with newer YUM versions. SmartPM focuses on solving cross-repository issues and comes with a GUI as standard.
All in all, use the tool that is supported by the distro. That is YUM. No, there is no officially supported GUI for it, hence some of the complaints. But I'm keeping my eye on SmartPM for the future.
I have yet to see any advantage to a GUI package manager. But, then again, that's just me.
Copy that. Although, I must say, Linux-on-the-desktop users usually _need_ some GUI...
Best, Oliver
On Mon, 5 Sep 2005, Joshua Baker-LePain wrote:
On Mon, 5 Sep 2005 at 1:06pm, Bryan J. Smith wrote
All in all, use the tool that is supported by the distro. That is YUM. No, there is no officially supported GUI for it, hence some of the complaints. But I'm keeping my eye on SmartPM for the future.
I have yet to see any advantage to a GUI package manager. But, then again, that's just me.
Smart is not a GUI per se. It is a command line tool and people are working on a curses-based front-end. A KDE panel applet exists as well.
There are benefits to having an integrated command line tool and GUI from a maintenance perspective. Most of the code can be reused.
I would love to have RHN support and finally get rid of up2date :)
The biggest disadvantage for both Yum and Smart is that both require a recent version of Python, which is a no-go for older distributions.
BTW, the developer was previously in charge of apt-rpm and worked on synaptic too. Smart is available for most of the popular distributions.
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
On Mon, 2005-09-05 at 21:38 +0200, Dag Wieers wrote:
On Mon, 5 Sep 2005, Joshua Baker-LePain wrote:
On Mon, 5 Sep 2005 at 1:06pm, Bryan J. Smith wrote
All in all, use the tool that is supported by the distro. That is YUM. No, there is no officially supported GUI for it, hence some of the complaints. But I'm keeping my eye on SmartPM for the future.
I have yet to see any advantage to a GUI package manager. But, then again, that's just me.
Smart is not a GUI per se. It is a command line tool and people are working on a curses-based front-end. A KDE panel applet exists as well.
There are benefits to having an integrated command line tool and GUI from a maintenance perspective. Most of the code can be reused.
I would love to have RHN support and finally get rid of up2date :)
The biggest disadvantage for both Yum and Smart is that both require a recent version of Python, which is a no-go for older distributions.
very true
BTW, the developer was previously in charge of apt-rpm and worked on synaptic too. Smart is available for most of the popular distributions.
Smart looks nice and is moving along OK, and having a GUI is good. Smart can use the repomd (Repo MetaData) used by yum and the metadata used by apt. It also has a couple of features that I like ... one of them is assigning a priority to a repo (i.e., you can make packages in the base repo have a higher score than an add-on repo. This would only update absolute requirements from the add-on repo.)
yum is also making progress with a sqlite backend. This speeds up yum. A new createrepo (creates the repomd data for yum) also now caches the md5sums for packages, so it runs faster too.
I personally use yum ... though I have one test machine that is using smart. Even when I use smart, I use the CLI and not the GUI.
CentOS-4.2 will have an upgrade to yum 2.4.x and createrepo-0.4.3, which requires a couple of packages to be added to the distribution. The new packages will be:
createrepo-0.4.3-1.noarch.rpm (upgrade from 0.4.2)
python-elementtree-1.2.6-4.i386.rpm
python-sqlite-1.1.6-1.i386.rpm
python-urlgrabber-2.9.6-2.noarch.rpm
sqlite-3.2.2-1.i386.rpm
sqlite-devel-3.2.2-1.i386.rpm
yum-2.4.0-1.centos4.noarch.rpm (upgrade from 2.2.1)
One very good thing for CentOS is that Seth Vidal is one of the CentOS Developers, so we will have extended yum support for the duration :)
On Mon, 2005-09-05 at 15:45 -0500, Johnny Hughes wrote:
Smart looks nice and is moving along OK, and having a GUI is good. Smart can use the repomd (Repo MetaData) used by yum and the metadata used by apt. It also has a couple of features that I like ... one of them is assigning a priority to a repo (i.e., you can make packages in the base repo have a higher score than an add-on repo. This would only update absolute requirements from the add-on repo.)
It's much better than APT's pinning IMHO.
I personally use yum ... though I have one test machine that is using smart. Even when I use smart, I use the CLI and not the GUI.
As I recommended, sticking with the distro's default is most ideal. But I'm keeping my eye on SmartPM.
One very good thing for CentOS is that Seth Vidal is one of the CentOS Developers, so we will have extended yum support for the duration :)
That's great news (I didn't know Seth was involved with CentOS).
On Mon, 2005-09-05 at 14:11 -0400, Joshua Baker-LePain wrote:
I think the biggest gripe about YUM is the lack of a standard GUI, and YumEx has had compatibility issues in the past with newer YUM versions. SmartPM focuses on solving cross-repository issues and comes with a GUI as standard.
All in all, use the tool that is supported by the distro. That is YUM. No, there is no officially supported GUI for it, hence some of the complaints. But I'm keeping my eye on SmartPM for the future.
I have yet to see any advantage to a GUI package manager. But, then again, that's just me.
I was reading this thread and the same thing kept coming to my mind. For the task it does, yum doesn't need a GUI. There's a time and place and task when a GUI is right. But the world seems to be forgetting that there is also a time and place and task for the command line.
I look after three versions of CentOS: 2, 3, and 4. Each has its own different version of yum. Different versions have different command line parameters, different header formats, different config file layouts, etc.
Yum headers are also not very robust. You can't safely use yum while a) updating your mirror or b) running yum-arch (with -c, which takes a long time, especially on openoffice). This is a PITA when you are patching a lot of machines and want to obtain new software at the same time.
I also think that yum needs a way to track certain packages only from a specific repository, rather than the entire repo (i.e., I want one package from Dag, not everything). (I don't know if new versions can do this...)
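From what I've read, newer yum releases do have a per-repository 'includepkgs' option that would cover this, though I haven't tested it myself. A sketch, using Dag's usual baseurl and backuppc as a stand-in package name:

[dag]
name=Dag RPM Repository
baseurl=http://apt.sw.be/fedora/$releasever/en/$basearch/dag
includepkgs=backuppc
enabled=1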
I also think yum is too slow.
All those issues aside, every other solution seems to have similar problems. On CentOS-2 I normally use arrghpm, a tool I wrote to do what I want. It does not rely on headers at all, but it is not designed to solve dependencies (because rpm already does that).
(OT side note: mirroring updates for CentOS 3 & 4 is also a PITA because I need to have multiple directories, one for each point release. Is it just me?)
Todd Cary wrote:
I have seen messages posted on the Fedora oriented forums that imply that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
Todd
John Newbigin wrote:
I also think yum is too slow.
I'm glad I'm not the only one who thinks that. After using urpmi at home on Mandrake with more than 11000 packages in the configured repos, I can say that yum is definitely far slower with only ~4000 packages in configured repos. The whole "Setting up Repos" and "Reading repository metadata in from local files" is a real drag.
On Tue, 6 Sep 2005, Tim Edwards wrote:
John Newbigin wrote:
I also think yum is too slow.
I'm glad I'm not the only one who thinks that. After using urpmi at home on Mandrake with more than 11000 packages in the configured repos, I can say that yum is definitely far slower with only ~4000 packages in configured repos. The whole "Setting up Repos" and "Reading repository metadata in from local files" is a real drag.
Check if you have enough memory in the system. In my experience, Yum needs more than 192MB of RAM.
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
On 9/6/05, Dag Wieers dag@wieers.com wrote:
On Tue, 6 Sep 2005, Tim Edwards wrote:
John Newbigin wrote:
I also think yum is too slow.
I'm glad I'm not the only one who thinks that. After using urpmi at home on Mandrake with more than 11000 packages in the configured repos, I can say that yum is definitely far slower with only ~4000 packages in configured repos. The whole "Setting up Repos" and "Reading repository metadata in from local files" is a real drag.
Check if you have enough memory in the system. In my experience, Yum needs more than 192MB of RAM.
Yeah, but that depends on the number of packages you are updating/installing. If your package list and dependency tree have fewer than 2 or 3 packages, it needs very little RAM. I've used yum 2.2.x on systems with 64MB of RAM without problem by simply listing out all of the packages that need to be updated with "yum list updates" and then installing them in groups of 2 or 3.
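In other words, something like this (the package names are placeholders):

yum list updates                  # see the full list of pending updates
yum -y update pkg1 pkg2           # then apply them two or three at a time
yum -y update pkg3 pkg4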
Greg
I'd like to comment on this list to say that despite earlier reports saying that I had fixed my YUM problems, it still freezes my computer every now and again. I've been told this might be because I have too little free space on my hard drive, so I cleared over 9 gigs, and it still freezes sometimes. I was told my RPM database was screwed, so I rebuilt the database and it still freezes.
I'm not sure what the problem is now, but I do know that on the one hand YUM seems like a great idea, and when it does work, I think it's really cool. But something is not right with it, and I don't even have error messages which let me know what went wrong, so I can't say that I would recommend YUM.
Dave
On 9/6/05, Dave Gutteridge dave@tokyocomedy.com wrote:
I'd like to comment on this list to say that despite earlier reports saying that I had fixed my YUM problems, it still freezes my computer every now and again. I've been told this might be because I have too little free space on my hard drive, so I cleared over 9 gigs, and it still freezes sometimes.
Dave,
Whoever told you that it might be an overfilled hard drive either has a bad mental map of how computers work or thought that your problem was caused by an unlikely corner case. Run from his advice.
I was told my RPM database was screwed, so I rebuilt the database and it still freezes.
Perhaps, but a corrupt RPM database typically gives more specific errors that you can search on, and an rpm remove/rebuild/repeat cycle will usually solve the problem -- not the evil no-error freeze that you seem to be experiencing.
I'm not sure what the problem is now, but I do know that on the one hand YUM seems like a great idea, and when it does work, I think it's really cool. But something is not right with it, and I don't even have error messages which let me know what went wrong, so I can't say that I would recommend YUM.
This is really a pretty harsh criticism without much indication of what your exact hardware/software specs are and what your problem is. Can you please post RAM/CPU, specific version of operating system, and the result of "yum list updates" if it does complete?
I feel certain that, given those bits of information, we will be far down the path of fixing whatever your problem is, or at least able to point at antiquated/failing hardware as the problem. And note my message that yum can work fine with a Pentium 133/64MB RAM system, so "antiquated" really needs to be old for it to not work with yum.
Regards, Greg
I'd suggest running memtest86 and mprime on the system for a few dozen hours, maybe it's CPU/RAM issues?
Cheers, MaZe.
On Tue, 6 Sep 2005, Greg Knaddison wrote:
On 9/6/05, Dave Gutteridge dave@tokyocomedy.com wrote:
I'd like to comment on this list to say that despite earlier reports saying that I had fixed my YUM problems, it still freezes my computer every now and again. I've been told this might be because I have too little free space on my hard drive, so I cleared over 9 gigs, and it still freezes sometimes.
Dave,
Whoever told you that it might be an overfilled hard drive either has a bad mental map of how computers work or thought that your problem was caused by an unlikely corner case. Run from his advice.
I was told my RPM database was screwed, so I rebuilt the database and it still freezes.
Perhaps, but a corrupt RPM database typically gives more specific errors that you can search on, and an rpm remove/rebuild/repeat cycle will usually solve the problem -- not the evil no-error freeze that you seem to be experiencing.
I'm not sure what the problem is now, but I do know that on the one hand YUM seems like a great idea, and when it does work, I think it's really cool. But something is not right with it, and I don't even have error messages which let me know what went wrong, so I can't say that I would recommend YUM.
This is really a pretty harsh criticism without much indication of what your exact hardware/software specs are and what your problem is. Can you please post RAM/CPU, specific version of operating system, and the result of "yum list updates" if it does complete?
I feel certain that, given those bits of information, we will be far down the path of fixing whatever your problem is, or at least able to point at antiquated/failing hardware as the problem. And note my message that yum can work fine with a Pentium 133/64MB RAM system, so "antiquated" really needs to be old for it to not work with yum.
Regards, Greg
This is really a pretty harsh criticism without much indication of what your exact hardware/software specs are and what your problem is.
Oh, sorry. I didn't mean to totally complain about YUM. I just haven't had an easy time of it, and the original poster was looking for reasons why maybe some people didn't like yum.
Can you please post RAM/CPU, specific version of operating system, and the result of "yum list updates" if it does complete?
512MB RAM, Pentium III, 500MHz.
[root@localhost ~]# yum list updates
Setting up Repos
addons                    100% |=========================|  951 B    00:00
kbs-CentOS-Extras         100% |=========================|  951 B    00:00
kbs-CentOS-Misc           100% |=========================|  951 B    00:00
update                    100% |=========================|  951 B    00:00
dag                       100% |=========================| 1.1 kB    00:00
base                      100% |=========================| 1.1 kB    00:00
freshrpms                 100% |=========================|  951 B    00:00
extras                    100% |=========================| 1.1 kB    00:00
Reading repository metadata in from local files
kbs-CentOS: ################################################## 1572/1572
kbs-CentOS: ################################################## 61/61
primary.xml.gz            100% |=========================|   49 kB   00:00
MD Read   : ################################################## 124/124
update    : ################################################## 124/124
dag       : ################################################## 2458/2458
base      : ################################################## 1406/1406
primary.xml.gz            100% |=========================|   98 kB   00:01
MD Read   : ################################################## 326/326
freshrpms : ################################################## 326/326
extras    : ################################################## 32/32
Excluding Packages in global exclude list
Finished
"antiquated" really needs to be old for it to not work with yum.
That's good to know. I don't think my computer is "antiquated", although it is getting long in the tooth. It would be nice to clear this detail up.

Two more details: Yum usually works fine for the first little while after I start the computer. When it freezes, it usually occurs after the computer has been on for a while, or if I've run YUM a couple of times, say, searching for a particular package.

I also keep getting this annoying error message:
/sbin/ldconfig: File /usr/lib/libk3bdevice.so.2.0.0.#prelink#.X8kEMh is empty, not checked.
I tried turning "prelinking" off by editing a file... though I can't remember off hand which file that was right now. It was weeks ago, and it didn't help anyway. The error still happens now and again.
Dave
On 9/7/05, Dave Gutteridge dave@tokyocomedy.com wrote:
Can you please post RAM/CPU, specific version of operating system, and the result of "yum list updates" if it does complete?
512MB RAM, Pentium III, 500MHz.
Ok, this should be fine.
[root@localhost ~]# yum list updates
Setting up Repos
addons                    100% |=========================|  951 B    00:00
kbs-CentOS-Extras         100% |=========================|  951 B    00:00
kbs-CentOS-Misc           100% |=========================|  951 B    00:00
update                    100% |=========================|  951 B    00:00
dag                       100% |=========================| 1.1 kB    00:00
base                      100% |=========================| 1.1 kB    00:00
freshrpms                 100% |=========================|  951 B    00:00
extras                    100% |=========================| 1.1 kB    00:00
Reading repository metadata in from local files
kbs-CentOS: ################################################## 1572/1572
kbs-CentOS: ################################################## 61/61
primary.xml.gz            100% |=========================|   49 kB   00:00
MD Read   : ################################################## 124/124
update    : ################################################## 124/124
dag       : ################################################## 2458/2458
base      : ################################################## 1406/1406
primary.xml.gz            100% |=========================|   98 kB   00:01
MD Read   : ################################################## 326/326
freshrpms : ################################################## 326/326
extras    : ################################################## 32/32
Excluding Packages in global exclude list
Finished
So it looks like you have nothing to update, right? Yum must work well enough that you can update your system :)
"antiquated" really needs to be old for it to not work with yum.
That's good to know. I don't think my computer is "antiquated", although it is getting long in the tooth. It would be nice to clear this detail up.
Your machine seems like it should be fine - I agree it's only a little old, but it should be fine in terms of speed if none of the hardware is flat-out failing.
Two more details: Yum usually works fine for the first little while after I start the computer. When it freezes, it usually occurs after the computer has been on for a while, or if I've run YUM a couple of times, say, searching for a particular package. I also keep getting this annoying error message:
/sbin/ldconfig: File /usr/lib/libk3bdevice.so.2.0.0.#prelink#.X8kEMh is empty, not checked.
I tried turning "prelinking" off by editing a file... though I can't remember off hand which file that was right now. It was weeks ago, and it didn't help anyway. The error still happens now and again.
Generally speaking, running yum several times a day shouldn't cause problems on its own. However, if you are frequently searching, it can be handy to do a "yum list installed > YumInstalled.txt" and "yum list available > YumAvailable.txt" and then grep the resulting text files rather than starting/re-running yum each time. Also, if you feel like upgrading to yum 2.4, then yum should be faster and you can use the yum shell to do your searching in a more efficient manner.
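Roughly like so (the grep pattern is just an example):

yum list installed > YumInstalled.txt
yum list available > YumAvailable.txt
grep -i k3b YumInstalled.txt YumAvailable.txt   # search the cached lists instead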
Maciej says to delete your prelink file - I don't know enough to know if that will help, but as far as I can see you don't have a specific problem per se, just that yum is slow which has been addressed in 2.4 and can also be worked around.
Greg
but as far as I can see you don't have a specific problem per se, just that yum is slow which has been addressed
Um... I guess I didn't convey the problem. Yum isn't just slow, it freezes my machine. It halts the system to the point where the mouse and keyboard are unresponsive, and the only way I can regain control of my system is to do a hard reset. Yum is the only program on my system which causes this to happen.
If I run Yum immediately after a reboot, it's usually safe. This is how I've managed to keep up to date.
But, if I have been using my computer for a while, and then I run Yum, it is just as likely as not to freeze my computer. Or, if I run Yum more than once, it may freeze my computer on the second, or third time.
If Yum was just slow, I wouldn't mind so much. But having to hard reset is a pretty big issue.
Dave
On Thu, 2005-09-08 at 02:36 +0900, Dave Gutteridge wrote:
but as far as I can see you don't have a specific problem per se, just that yum is slow which has been addressed
Um... I guess I didn't convey the problem. Yum isn't just slow, it freezes my machine. It halts the system to the point where the mouse and keyboard are unresponsive, and the only way I can regain control of my system is to do a hard reset. Yum is the only program on my system which causes this to happen.
If I run Yum immediately after a reboot, it's usually safe. This is how I've managed to keep up to date.
But, if I have been using my computer for a while, and then I run Yum, it is just as likely as not to freeze my computer. Or, if I run Yum more than once, it may freeze my computer on the second, or third time.
If Yum was just slow, I wouldn't mind so much. But having to hard reset is a pretty big issue.
---- perhaps there is a problem with your /var partition...
try running 'shutdown now -Fr' # reboot and run fsck on /var
you could try 'yum clean all' to clear out all the downloaded rpms, headers & cache after that.
Craig
--- Dave Gutteridge dave@tokyocomedy.com wrote:
perhaps there is a problem with your /var partition...
What do you mean by a /var "partition"? I have a /var directory. Is it supposed to be on a separate partition?
Dave
Dave,
That is what he is talking about. Some people have /var set up as a separate partition, or they call it a partition instead of a directory.
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."
On Thu, 2005-09-08 at 02:53 +0900, Dave Gutteridge wrote:
perhaps there is a problem with your /var partition...
What do you mean by a /var "partition"? I have a /var directory. Is it supposed to be on a separate partition?
---- on server systems I find it to be a good idea. If you set it up as one LVM volume, it's not going to be on its own partition.
at any rate - sometimes journaled filesystems are damaged and the only sure way to know is to fsck them - which the previous message would do.
Craig
Craig White craigwhite@azapple.com wrote:
on server systems I find it to be a good idea. If you set it up as one LVM volume, it's not going to be on its own partition.
Aw now, don't get anal. ;-> Besides, that's all subjective.
E.g., even a "Logical Partition" isn't a partition. It's a "partition in a partition" since the legacy PC BIOS/DOS disk label can only have 4 partitions.
That's why I like to use the UNIX terminology of disk labels and disk slices. It removes the whole legacy PC BIOS/DOS nonsense, even making it easier to explain NT5+ (2000+) Logical Disk Manager (LDM) as well.
at any rate - sometimes journaled filesystems are damaged and the only sure way to know is to fsck them - which the previous message would do.
Exactomundo. Journaling filesystems are not the savior of disk corruption, only the reducer of boot times -- unless you have something like full data journaling with an NVRAM.
On Wed, 2005-09-07 at 12:53 -0700, Bryan J. Smith wrote:
Craig White craigwhite@azapple.com wrote:
on server systems I find it to be a good idea. If you set it up as one LVM volume, it's not going to be on its own partition.
Aw now, don't get anal. ;-> Besides, that's all subjective.
E.g., even a "Logical Partition" isn't a partition. It's a "partition in a partition" since the legacy PC BIOS/DOS disk label can only have 4 partitions.
---- if you had seen true anal-retentive behavior, you would not willy-nilly call me anal for the above - rather, your inclination to put exactness to the statement I made, in your own terms, is much closer to anal-retentive behavior.
Granted, what I stated didn't precisely cover all the various possibilities of partitioning with or without LVM, and therefore lacked the precision that you hold so dear.
Thankfully, I have a sense of humor, but not everyone enjoys having their posts picked apart for what appear to be only petty corrections to their imprecision... you should keep that in mind.
Craig
Craig White craigwhite@azapple.com wrote:
Thankfully, I have a sense of humor, but not everyone enjoys having their posts picked apart for what appear to be only petty corrections to their imprecision... you should keep that in mind.
Dude?! Didn't you realize ...
1. My smiley-wink?!?!?!
2. You were "correcting" someone else, with far less "humor" than I.
If you're going to be "anal" about things, be sure to be correct. And even then, show some humility -- I do, all the time.
On Wed, 2005-09-07 at 16:03 -0700, Bryan J. Smith wrote:
Craig White craigwhite@azapple.com wrote:
Thankfully, I have a sense of humor, but not everyone enjoys having their posts picked apart for what appear to be only petty corrections to their imprecision... you should keep that in mind.
Dude?! Didn't you realize ...
- My smiley-wink?!?!?!
--- sorry - I missed it ---
- You were "correcting" someone else, with far less "humor" than I.
---- I wasn't correcting - I was answering his question ----
If you're going to be "anal" about things, be sure to be correct. And even then, show some humility -- I do, all the time.
---- indeed - thanks
Craig
On Wed, 2005-09-07 at 14:53, Bryan J. Smith wrote:
at any rate - sometimes journaled filesystems are damaged and the only sure way to know is to fsck them - which the previous message would do.
Exactomundo. Journaling filesystems are not the savior of disk corruption, only the reducer of boot times -- unless you have something like full data journaling with an NVRAM.
I've had a couple of boxes with ReiserFS crash recently from UPS problems, and multiple times they have refused to mount without a fsck with --rebuild-tree, which takes a full day or so. I thought the point of journaling was to avoid needing that... Is there any reason to expect better from XFS? These are running backuppc and need better-than-ext3 performance at creating/removing files.
Les Mikesell lesmikesell@gmail.com wrote:
I've had a couple of boxes with ReiserFS crash recently from UPS problems, and multiple times they have refused to mount without a fsck with --rebuild-tree, which takes a full day or so. I thought the point of journaling was to avoid needing that...
Avoid? Yes. Eliminate? Impossible.
Journaling helps _avoid_ full filesystem integrity checks. But the only way to absolutely check if a filesystem is completely consistent is with a full filesystem integrity check.
In fact, the "better" journaling filesystems know when _not_ to trust their own journals. ReiserFS is actually very _good_ from that standpoint, it would rather not reply a journal it does not believe that is good.
But that's only half the issue. A journaling filesystem must _also_ have good off-line recovery tools -- i.e., the off-line fsck utility itself.
Is there any reason to expect better from xfs?
XFS' structure hasn't changed since the mid-'90s -- over a decade. So while its on-line journaling reliability might be debated (among other journaling filesystems), its off-line "xfs_repair" is trusted as much as e2fsck.
Both JFS and ReiserFS have had significant structural changes in the last few years. ReiserFS itself is actually designed to be very fluid. While that's fine for its on-line functionality, including the journaling, when it goes off-line, I don't trust the fsck.reiserfs tool to be "in sync" with the latest on-line developments.
These are running backuppc and need better-than-ext3 performance at creating/removing files.
If performance is your bag, then ReiserFS pleases in many areas -- including deletion. XFS absolutely stinks when your filesystem is a lot of small, constantly changing files -- and excels when there are large files as well as small (including extents and delayed writes for fighting fragmentation).
On Wed, 2005-09-07 at 18:14, Bryan J. Smith wrote:
These are running backuppc and need better-than-ext3 performance at creating/removing files.
If performance is your bag, then ReiserFS pleases in many areas -- including deletion. XFS absolutely stinks when your filesystem is a lot of small, constantly changing files -- and excels when there are large files as well as small (including extents and delayed writes for fighting fragmentation).
The reiserfsck runs seemed to work OK, so my only complaint about that part is the oddball syntax needed to actually make it fix anything. I'm just wondering why it is so likely to need the fsck at all (maybe 50% of my crashes when busy) and whether xfs would be better about that. I thought it was supposed to know what was committed to disk and never leave it in an inconsistent state.
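For the record, the incantations in question go roughly like this (the device name is just an example):

reiserfsck --check /dev/hdb1          # read-only check; tells you what to run next
reiserfsck --fix-fixable /dev/hdb1    # repairs minor corruption only
reiserfsck --rebuild-tree /dev/hdb1   # the day-long rebuild; back up first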
On Wed, 2005-09-07 at 21:30 -0500, Les Mikesell wrote:
The reiserfsck runs seemed to work OK, so my only complaint about that part is the oddball syntax needed to actually make it fix anything.
Well, you've had good luck then.
I'm just wondering why it is so likely to need the fsck at all (maybe 50% of my crashes when busy)
On all filesystems? Or just a 50% chance that one filesystem will need a fsck?
[ SIDE NOTE from a previous thread: Another reason to segment your filesystems is not only to "localize" any fsck; segmentation actually _reduces_ the risk of needing a fsck because commits are more localized/contained (especially to /tmp, /var, etc...). ]
and if xfs would be better about that.
Hmmm, not sure "better" is a good word. As much as I love XFS, and as _rarely_ as I've had to run "xfs_repair", the fact that it does do "on-line" journal replays doesn't necessarily mean it's "better."
In fact, I still do _not_ trust anything but SGI's specific XFS build. I do not trust the kernel 2.4 backport, and I'm still testing the kernel 2.6 integration.
I thought it was supposed to know what was committed to disk and never leave it in an inconsistent state.
Nope. You thought incorrectly.
The _only_ purpose of journaling is to _reduce_ the time it takes to make the filesystem consistent. That assumes the journaling is good and/or the journal replay/unplay works.
There is absolutely _no_ way to guarantee a commit, although full data journaling with an NVRAM board comes close.
On Thu, 2005-09-08 at 07:32, Bryan J. Smith wrote:
The reiserfsck runs seemed to work OK, so my only complaint about that part is the oddball syntax needed to actually make it fix anything.
Well, you've had good luck then.
Except that it's a full day of downtime for the service using the drive...
I'm just wondering why it is so likely to need the fsck at all (maybe 50% of my crashes when busy)
On all filesystems? Or just a 50% chance that one filesystem will need a fsck?
All the busy ones. I don't think it is a problem with idle filesystems.
[ SIDE NOTE from a previous thread: Another reason to segment your filesystems is not only to "localize" any fsck; segmentation actually _reduces_ the risk of needing a fsck because commits are more localized/contained (especially to /tmp, /var, etc...). ]
Backuppc conserves space dramatically by hard-linking all duplicates it finds (with a fairly fast hashing scheme to find them). Thus the whole archive has to be on one filesystem. I currently use a 250 gig drive and have a little over 100 gigs used (holding what would be around 900 gigs of raw data before compression and linking).
I thought it was supposed to know what was committed to disk and never leave it in an inconsistent state.
Nope. You thought incorrectly.
The _only_ purpose of journaling is to _reduce_ the time it takes to make the filesystem consistent. That assumes the journaling is good and/or the journal replay/unplay works.
There is absolutely _no_ way to guarantee a commit, although full data journaling with an NVRAM board comes close.
I expect to lose data in a crash - I don't expect the system to lose track of the free/used portion of the disk with a journaling filesystem. I suppose the best solution is a better UPS. I also try to run the backuppc filesystem in software RAID1 between an internal IDE drive and an external firewire drive, but the 2.6 kernel crashes consistently in a day or less running like that. It does work well enough that I can usually add the external drive to the raid, let it sync up, then fail and remove it. It is next to impossible to copy the huge number of hardlinks any other way in a reasonable amount of time.
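The add/sync/fail/remove cycle is roughly this (device names here are just examples):

mdadm /dev/md0 --add /dev/sdb1       # attach the external drive to the mirror
cat /proc/mdstat                     # watch until the resync finishes
mdadm /dev/md0 --fail /dev/sdb1      # then mark it failed...
mdadm /dev/md0 --remove /dev/sdb1    # ...and detach it again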
-- Les Mikesell lesmikesell@gmail.com
Ok, that _CANNOT_ be yum's problem; some other part of your system is flaky. Either hard disk, or HDD drivers, or RAM, or CPU, or kernel drivers for one of the above. But it is _NOT_ yum. It can't be. Yum may be the most likely to trigger it, but that doesn't make it its problem. It just isn't possible (no program can freeze the machine under Linux; if a program does freeze the machine, then either there's a bug in the kernel, a hardware error, or the program is running with ioport privileges (which yum doesn't use)). Again, try running memtest86 and mprime on it - verify that the system's CPU and RAM are stable, then try running some disk intensive tasks, then try running them concurrent to mprime. But it's 99.99% guaranteed that you either have flaky hardware or flaky kernel drivers.
Cheers, MaZe.
On Thu, 8 Sep 2005, Dave Gutteridge wrote:
but as far as I can see you don't have a specific problem per se, just that yum is slow which has been addressed
Um... I guess I didn't convey the problem. Yum isn't just slow, it freezes my machine. It halts the system to the point where the mouse and keyboard are unresponsive, and the only way I can regain control of my system is to do a hard reset. Yum is the only program on my system which causes this to happen.
If I run Yum immediately after a reboot, it's usually safe. This is how I've managed to keep up to date.
But, if I have been using my computer for a while, and then I run Yum, it is just as likely as not to freeze my computer. Or, if I run Yum more than once, it may freeze my computer on the second, or third time.
If Yum was just slow, I wouldn't mind so much. But having to hard reset is a pretty big issue.
Dave
On Wed, 2005-09-07 at 21:43 +0200, Maciej Żenczykowski wrote:
Ok, that _CANNOT_ be yum's problem; some other part of your system is flaky. Either hard disk, or HDD drivers, or RAM, or CPU, or kernel drivers for one of the above. But it is _NOT_ yum. It can't be. Yum may be the most likely to trigger it, but that doesn't make it its problem.
Yum does some very CPU- and disk-intensive operations ... which can make things like bad memory, overloaded power supplies, bad cooling fans, etc. have an effect.
I agree with Maciej .. I don't see how yum could, by itself, cause a hard freeze.
It just isn't possible (no program can freeze the machine under Linux; if a program does freeze the machine, then either there's a bug in the kernel, a hardware error, or the program is running with ioport privileges (which yum doesn't use)). Again, try running memtest86 and mprime on it - verify that the system's CPU and RAM are stable, then try running some disk intensive tasks, then try running them concurrent to mprime. But it's 99.99% guaranteed that you either have flaky hardware or flaky kernel drivers.
On Wed, 7 Sep 2005, Johnny Hughes wrote:
On Wed, 2005-09-07 at 21:43 +0200, Maciej Żenczykowski wrote:
Ok, that _CANNOT_ be yum's problem; some other part of your system is flaky. Either hard disk, or HDD drivers, or RAM, or CPU, or kernel drivers for one of the above. But it is _NOT_ yum. It can't be. Yum may be the most likely to trigger it, but that doesn't make it its problem.
Yum does some very CPU- and disk-intensive operations ... which can make things like bad memory, overloaded power supplies, bad cooling fans, etc. have an effect.
I agree with Maciej .. I don't see how yum could, by itself, cause a hard freeze.
Resource starvation can present itself as a hard freeze of the system, where even the mouse seems unwilling to react :)
Not sure if it's the case here, but it's normal for people to think on this occasion that the system simply died, even though it may just be thrashing until eternity.
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
Dave Gutteridge wrote:
[root@localhost ~]# yum list updates
Setting up Repos
addons                    100% |=========================|  951 B    00:00
kbs-CentOS-Extras         100% |=========================|  951 B    00:00
kbs-CentOS-Misc           100% |=========================|  951 B    00:00
update                    100% |=========================|  951 B    00:00
dag                       100% |=========================| 1.1 kB    00:00
base                      100% |=========================| 1.1 kB    00:00
freshrpms                 100% |=========================|  951 B    00:00
extras                    100% |=========================| 1.1 kB    00:00
Reading repository metadata in from local files
As a matter of interest, what baseurl are you using for freshrpms?
- K
On Thu, 2005-09-08 at 09:44 +0900, Dave Gutteridge wrote:
As a matter of interest, what baseurl are you using for freshrpms?
I don't know how to answer this. What do you mean by "baseurl"?
----
# grep http /etc/yum.repos.d/fedora.repo
#baseurl=http://download.fedora.redhat.com/pub/fedora/linux/core/$releasever/$basearc...
mirrorlist=http://fedora.redhat.com/download/mirrors/fedora-core-$releasever
Craig
As for finding out what my "baseurl" is, I'm not sure I'm doing the right thing. I tried the command suggested, and I got the following result:

[root@localhost ~]# grep http /etc/yum.repos.d/fedora.repo
grep: /etc/yum.repos.d/fedora.repo: No such file or directory
So I thought I'd look in my repos directory, but none of the files there are for fedora or freshrpms.

[root@localhost ~]# cd /etc/yum.repos.d
[root@localhost yum.repos.d]# ls
CentOS-Base.repo  kbsingh-CentOS-Extras.repo  output.txt
dag.repo          kbsingh-CentOS-Misc.repo
Also, based on what was suggested to me in this thread, I tried running 'shutdown now -Fr', and on reboot the system did some kind of check, but returned no errors or anything. It kind of flickered to my GUI login screen before I could see anything come of it.
Then I tried running fsck on /var, as was also suggested. But that produced some kind of error:

[root@localhost ~]# fsck /var
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
fsck.ext2: Is a directory while trying to open /var
The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
And all this reminded me, I have a little extra space on my disk which is not partitioned:

Disk /dev/hdb: 30.7 GB, 30738677760 bytes
255 heads, 63 sectors/track, 3737 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdb1   *           1        3644    29270398+  83  Linux
/dev/hdb2            3645        3737      747022+   f  W95 Ext'd (LBA)
Should I turn that into a Linux swap, and might that help with this YUM issue? A guy on this list gave me some instructions for doing that, but I got this error from fdisk:

[root@localhost yum.repos.d]# fdisk /dev/hdb

The number of cylinders for this disk is set to 3737.
There is nothing wrong with that, but this is larger than 1024, and could
in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): t
Partition number (1-5): 2
Hex code (type L to list codes): 82
You cannot change a partition into an extended one or vice versa
Delete it first.
Dave
On Thu, 2005-09-08 at 11:03 +0900, Dave Gutteridge wrote:
As for finding out what my "baseurl" is, I'm not sure I'm doing the right thing. I tried the command suggested, and I got the following result:
[root@localhost ~]# grep http /etc/yum.repos.d/fedora.repo
grep: /etc/yum.repos.d/fedora.repo: No such file or directory
So I thought I'd look in my repos directory, but none of the files there are for fedora or freshrpms.
[root@localhost ~]# cd /etc/yum.repos.d
[root@localhost yum.repos.d]# ls
CentOS-Base.repo  kbsingh-CentOS-Extras.repo  output.txt
dag.repo          kbsingh-CentOS-Misc.repo
---- sorry - I should have been more explicit - I guess I thought you would figure it out from my hint, since I don't have freshrpms in my repos (remember, I don't use yum but smartpm instead)...
# grep http /etc/yum.repos.d/*
/etc/yum.repos.d/dag.repo:baseurl=http://apt.sw.be/fedora/$releasever/en/$basearch/dag
/etc/yum.repos.d/fedora-devel.repo:#baseurl=http://download.fedora.redhat.com/pub/fedora/linux/core/development/$basearc...
/etc/yum.repos.d/fedora-devel.repo:mirrorlist=http://fedora.redhat.com/download/mirrors/fedora-core-rawhide
/etc/yum.repos.d/fedora.repo:#baseurl=http://download.fedora.redhat.com/pub/fedora/linux/core/$releasever/$basearc...
/etc/yum.repos.d/fedora.repo:mirrorlist=http://fedora.redhat.com/download/mirrors/fedora-core-$releasever
/etc/yum.repos.d/fedora-updates.repo:#baseurl=http://download.fedora.redhat.com/pub/fedora/linux/core/updates/$releasever/...
/etc/yum.repos.d/fedora-updates.repo:mirrorlist=http://fedora.redhat.com/download/mirrors/updates-released-fc$releasever
/etc/yum.repos.d/fedora-updates-testing.repo:#baseurl=http://download.fedora.redhat.com/pub/fedora/linux/core/updates/testing/$rel...
/etc/yum.repos.d/fedora-updates-testing.repo:mirrorlist=http://fedora.redhat.com/download/mirrors/updates-testing-fc$releasever
which spits them all out - or do a more specific grep for freshrpms out of this list...
# grep http /etc/yum.repos.d/* | grep freshrpms
which of course means that I don't have freshrpms configured.
and somewhere I missed the part of the thread that led whoever it was to ask you which baseurl you are using for freshrpms. ----
Also, based on what was suggested to me in this thread, I tried running 'shutdown now -Fr', and on reboot the system did some kind of check, but returned no errors or anything. It kind of flickered to my GUI login screen before I could see anything come of it.
---- well - it hurt nothing to check ----
Then I tried running fsck on /var, as was also suggested. But that produced some kind of error:
[root@localhost ~]# fsck /var
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
fsck.ext2: Is a directory while trying to open /var
The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
---- fsck should only check NON-MOUNTED partitions anyway - since you said /var isn't on its own partition, you really have no opportunity to fsck /var except to 'shutdown now -Fr' ----
And all this reminded me, I have a little extra space on my disk which is not partitioned:
Disk /dev/hdb: 30.7 GB, 30738677760 bytes
255 heads, 63 sectors/track, 3737 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdb1   *           1        3644    29270398+  83  Linux
/dev/hdb2            3645        3737      747022+   f  W95 Ext'd (LBA)
Should I turn that into a Linux swap, and might that help with this YUM issue? A guy on this list gave me some instructions for doing that, but I got this error from fdisk:
[root@localhost yum.repos.d]# fdisk /dev/hdb
The number of cylinders for this disk is set to 3737.
There is nothing wrong with that, but this is larger than 1024, and could in certain setups cause problems with:
- software that runs at boot time (e.g., old versions of LILO)
- booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): t
Partition number (1-5): 2
Hex code (type L to list codes): 82
You cannot change a partition into an extended one or vice versa
---- if you are sure that hdb2 has nothing of value on it, all you need to do is delete it and then create the new swap partition. fdisk is complaining that it can't convert an extended dos partition; delete it and it's gone. How much RAM do you have in that computer? I would have created a separate /boot (approx 100 megabytes) and a separate swap partition (probably 2 * the amount of RAM, depending upon how much RAM). You might simply be running out of memory - swap is sort of needed for systems with less than 512MB of real RAM - especially one running desktop apps.
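the whole sequence would be roughly this (a sketch - triple-check the partition number before writing anything):

fdisk /dev/hdb     # d, 2 to delete hdb2; n to recreate it as a primary
                   # partition; t, 82 to set its type to Linux swap; w to write
mkswap /dev/hdb2   # format it as swap
swapon /dev/hdb2   # enable it immediately

# and a line like this in /etc/fstab so it comes back after a reboot:
/dev/hdb2   swap   swap   defaults   0 0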
Craig
Dave Gutteridge wrote:
As a matter of interest, what baseurl are you using for freshrpms?
I don't know how to answer this. What do you mean by "baseurl"?
Every repository that you use has a 'baseurl' that points to the 'base' of that repository. yum uses this information to get the list of packages, updates, etc. from that repository.
e.g., the baseurl for kbs-CentOS-Extras is:
baseurl=http://centos.karan.org/el$releasever/extras/stable/$basearch/RPMS/
There will be at least one such entry for every repository you have configured on the system. On CentOS 4, repositories can be configured either in /etc/yum.conf or in /etc/yum.repos.d/ (where the file names end with .repo); each repository's information begins with a [name of repo]
So if you look through both places ( /etc/yum.conf and the files in /etc/yum.repos.d/ ), you should be able to find what baseurl is being used for freshrpms.
The reason I ask is that freshrpms doesn't host packages that are usable on any CentOS version. If you are pulling in packages from there....
- K
The reason I ask is that freshrpms doesn't host packages that are usable on any CentOS version. If you are pulling in packages from there....
I thought CentOS 4 was equal to Red Hat Enterprise 4, and freshrpms seems to have RPMS for Enterprise 4.
Actually, this leads me to a question I wanted to ask anyway. I've been advised here before that I should always try to install things through Yum or RPMs. I've been told that installing from tar files and building from source is a bad idea.
But what does one do when there is software that is readily available for other Linux distros, and is theoretically buildable for CentOS, but not available from the standard Yum repositories like Dag and CentOS's own? Is one supposed to contact the repository owner and request the software be included?
Dave
On Thu, 8 Sep 2005, Dave Gutteridge wrote:
Actually, this leads me to a question I wanted to ask anyway. I've been advised here before that I should always try to install things through Yum or RPMs. I've been told that installing from tar files and building from source is a bad idea.
I'd feel more comfortable if you also mentioned why it is a bad idea.
If you install files from a tarball or build from source on a system, you have no log of what files have been placed where. With RPM you can at any time query the rpmdb to see if a file has been changed, what package it came from, who needs that package, etc...
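For example:

rpm -qf /usr/bin/perl           # which package owns this file?
rpm -V perl                     # has anything in this package been modified?
rpm -q --whatrequires perl      # which packages depend on it?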
If you build or install RPM packages, the install process _relies_ on this information and requires it to be correct. But if you installed something from a tarball (and e.g. replaced a file that came from an RPM), you introduce an inconsistency which is untraceable.
A good example which appears from time to time is when you use CPAN to install perl modules, and then install an RPM package that requires these modules. RPM doesn't know about the CPAN-installed modules and complains that the required perl modules are not installed (via RPM), even though you installed them by hand.
Another big reason NOT to install stuff by hand is that you lose a big advantage: your system will no longer be similar to anyone else's system. If you e.g. start replacing libraries and suddenly something doesn't work, most people will not help you since they cannot reproduce the problem. Your system is uniquely crippled :)
The same is true when you use packages that were not built or designed for your distribution/release. Or when you --force or --nodeps a package.
But what does one do when there is software that is readily available for other Linux distros, and is theoretically buildable for CentOS, but not available from the standard Yum repositories like Dag and CentOS's own? Is one supposed to contact the repository owner and request the software be included?
Package it yourself or find someone to package it for you. Sometimes people will give you a much better alternative that is already packaged. You can always contribute a self-made package to RPMforge or request something to be packaged at: suggest@lists.rpmforge.net
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
On Thu, 2005-09-08 at 22:46 +0900, Dave Gutteridge wrote:
Actually, this leads me to a question I wanted to ask anyway. I've been advised here before that I should always try to install things through Yum or RPMs. I've been told that installing from tar files and building from source is a bad idea.
But what does one do when there is software that is readily available for other Linux distros, and is theoretically buildable for CentOS, but not available from the standard Yum repositories like Dag and CentOS's own? Is one supposed to contact the repository owner and request the software be included?
I use the checkinstall utility. Make your own RPMs out of the tar source and then keep them updated and managed through yum or whatever.
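The usual dance, assuming a standard autotools-style tarball ('foo' is a stand-in):

tar xzf foo-1.0.tar.gz && cd foo-1.0
./configure && make
checkinstall -R    # instead of 'make install'; builds and installs a real RPM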
On Wed, 2005-09-07 at 12:00 +0900, Dave Gutteridge wrote:
I'd like to comment on this list to say that despite earlier reports saying that I had fixed my YUM problems, it still freezes my computer every now and again. I've been told this might be because I have too little free space on my hard drive, so I cleared over 9 gigs, and it still freezes sometimes. I was told my RPM database was screwed, so I rebuilt the database and it still freezes.
I'm not sure what the problem is now, but I do know that on the one hand YUM seems like a great idea, and when it does work, I think it's really cool. But something is not right with it, and I don't even have error messages which let me know what went wrong, so I can't say that I would recommend YUM.
Overall it sounds like this is a problem specific to your system. As such it is not likely to be a yum problem.
I'm not defending yum. It's just that the assumptions you make to get to your conclusion of not being able to recommend yum are flawed and unlikely to apply to anyone except yourself.
Dag Wieers wrote:
On Tue, 6 Sep 2005, Tim Edwards wrote:
John Newbigin wrote:
I also think yum is too slow.
I'm glad I'm not the only one who thinks that. After using urpmi at home on Mandrake with more than 11000 packages in the configured repos, I can say that yum is definitely far slower with only ~4000 packages in configured repos. The whole "Setting up Repos" and "Reading repository metadata in from local files" is a real drag.
Check if you have enough memory in the system. In my experience, Yum needs more than 192MB of RAM.
1GB of RAM. It's not a major problem, it's just a delay that other package management systems don't seem to have.
On 9/5/05, Tim Edwards tim@registriesltd.com.au wrote:
John Newbigin wrote:
I also think yum is too slow.
I'm glad I'm not the only one who thinks that. After using urpmi at home on Mandrake with more than 11000 packages in the configured repos, I can say that yum is definitely far slower with only ~4000 packages in configured repos. The whole "Setting up Repos" and "Reading repository metadata in from local files" is a real drag.
That is true, but in yum 2.4 (soon to be available for CentOS 4, or you can build/install it yourself) this is greatly improved. Also, yum 2.4 includes the "shell", where yum reads in the information, gets itself ready, and then you can ask it to do a whole bunch of stuff without having to ask for it all in advance on the command line. It's pretty neat.
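From memory, a shell session goes something like this (foo and bar are placeholders):

yum shell
> list updates      # the metadata is read once, up front
> install foo       # queue up as many operations as you like
> update bar
> run               # then execute them all in one transaction
> quit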
Regards, Greg
On Wed, 2005-09-07 at 11:53, Greg Knaddison wrote:
That is true, but in yum 2.4 (soon to be available for CentOS4, or you can build/install it yourself) this is greatly improved. Also, yum 2.4 includes the "shell", where yum reads in the information, gets itself ready, and then you can ask it to do a whole bunch of stuff without having to ask for it all in advance on the command line. It's pretty neat.
It's always painful when utilities add the 'pretty neat' stuff _after_ the scripts that use them have been written. Is there a way to tell yum that you don't care about anything that changed since the last time you ran it (a few seconds ago) and it doesn't have to do all that work again? Or better yet, a way to tell it that you don't _want_ it to consider anything that changed since you did an update on a different machine and you want it now to apply exactly the same changes on an important machine that you tested elsewhere (preferably pulling from exactly the same repository mirror or using some transaction checkpoint to ensure an identical operation).
On 9/7/05, Les Mikesell lesmikesell@gmail.com wrote:
On Wed, 2005-09-07 at 11:53, Greg Knaddison wrote:
That is true, but in yum 2.4 (soon to be available for CentOS4, or you can build/install it yourself) this is greatly improved. Also, yum 2.4 includes the "shell", where yum reads in the information, gets itself ready, and then you can ask it to do a whole bunch of stuff without having to ask for it all in advance on the command line. It's pretty neat.
It's always painful when utilities add the 'pretty neat' stuff _after_ the scripts that use them have been written.
I'm not sure what you mean here.
Is there a way to tell yum that you don't care about anything that changed since the last time you ran it (a few seconds ago) and it doesn't have to do all that work again?
No.
Or better yet, a way to tell it that you don't _want_ it to consider anything that changed since you did an update on a different machine and you want it now to apply exactly the same changes on an important machine that you tested elsewhere (preferably pulling from exactly the same repository mirror or using some transaction checkpoint to ensure an identical operation).
As long as you use specific instructions to yum like "yum install foo-1.3-3.i386" and you have a clean and simple set of conf/repos files, then yum will do a very specific thing. If you have multiple repositories in your configuration and you just say "yum update" then it might not behave exactly as you desire.
Greg
On Wed, 2005-09-07 at 22:39, Greg Knaddison wrote:
Or better yet, a way to tell it that you don't _want_ it to consider anything that changed since you did an update on a different machine and you want it now to apply exactly the same changes on an important machine that you tested elsewhere (preferably pulling from exactly the same repository mirror or using some transaction checkpoint to ensure an identical operation).
As long as you use specific instructions to yum like "yum install foo-1.3-3.i386" and you have a clean and simple set of conf/repos files, then yum will do a very specific thing. If you have multiple repositories in your configuration and you just say "yum update" then it might not behave exactly as you desire.
If you managed a set of servers running homegrown code that may or may not be sensitive to library and utility program versions, what steps would you use to keep a test server up to date, then after performing any needed application testing, to roll out the same changes to the production servers in various different locations? The object is to install exactly the updates you just tested in spite of any subsequent repository changes or out-of-sync mirrors.
On Wed, 2005-09-07 at 23:21 -0500, Les Mikesell wrote:
On Wed, 2005-09-07 at 22:39, Greg Knaddison wrote:
Or better yet, a way to tell it that .... you want it now to apply exactly the same changes on an important machine that you tested elsewhere
As long as you use specific instructions to yum like "yum install foo-1.3-3.i386" and you have a clean and simple set of conf/repos files, then yum will do a very specific thing. If you have multiple repositories in your configuration and you just say "yum update" then it might not behave exactly as you desire.
If you managed a set of servers running homegrown code that may or may not be sensitive to library and utility program versions, what steps would you use to keep a test server up to date, then after performing any needed application testing, to roll out the same changes to the production servers in various different locations?
I would use yum to keep the testing system up to date, and use rsync to copy specific RPM packages to the production server. I would not install yum on the production server at all if you need to control things tightly.
yum keeps the RPMs in /var/cache/yum/*/packages/.
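So the copy step could be as simple as this rough sketch (the hostname and paths are illustrative, and it assumes you haven't run "yum clean packages" since the update):

# on the test box, after a successful update
rsync -av /var/cache/yum/*/packages/*.rpm prod1:/var/tmp/approved/
# on the production box, apply exactly those packages
ssh prod1 'rpm -Fvh /var/tmp/approved/*.rpm'

rpm -F (freshen) only touches packages that are already installed, which is usually what you want for updates.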
Also remember that you need to refresh your testing platform regularly. By this, I mean that you should boot your test system from a CD, completely clear its disk, and rsync an exact image of your production server over. Otherwise, you risk having the two drift apart over time.
When I worked for a large financial institution, changes were tested locally by the developers, rolled to a "testing" server where they were tested by others, then rolled to a "staging" server where they were subjected to a large battery of automated tests, and then finally rolled to production. Rolls always took place on Friday night. Once a roll was declared a success, the testing and staging servers were erased and replaced with near-perfect images of the production servers (the only difference was that the data sets were trimmed so that only the first gig of each table was moved). The testing software was kept in /usr/local, and production servers were not permitted to have anything in /usr/local. We used to complain that it would take a month to fix a comma, but it did keep us out of trouble. ;-)
-David
On Wed, 2005-09-07 at 23:56, David Johnston wrote:
If you managed a set of servers running homegrown code that may or may not be sensitive to library and utility program versions, what steps would you use to keep a test server up to date, then after performing any needed application testing, to roll out the same changes to the production servers in various different locations?
I would use yum to keep the testing system up to date, and use rsync to copy specific RPM packages to the production server.
That's pretty painful when the production servers have better connectivity to the internet than to the test location.
I would not install yum on the production server at all if you need to control things tightly.
Hmmm... that's not very complimentary of yum's capability. I'd really rather not need to know every single version number myself - just that they are all the same as had been tested.
On Wed, 2005-09-07 at 23:21 -0500, Les Mikesell wrote:
On Wed, 2005-09-07 at 22:39, Greg Knaddison wrote:
Or better yet, a way to tell it that you don't _want_ it to consider anything that changed since you did an update on a different machine and you want it now to apply exactly the same changes on an important machine that you tested elsewhere (preferably pulling from exactly the same repository mirror or using some transaction checkpoint to ensure an identical operation).
As long as you use specific instructions to yum like "yum install foo-1.3-3.i386" and you have a clean and simple set of conf/repos files, then yum will do a very specific thing. If you have multiple repositories in your configuration and you just say "yum update" then it might not behave exactly as you desire.
If you managed a set of servers running homegrown code that may or may not be sensitive to library and utility program versions, what steps would you use to keep a test server up to date, then after performing any needed application testing, to roll out the same changes to the production servers in various different locations? The object is to install exactly the updates you just tested in spite of any subsequent repository changes or out-of-sync mirrors.
You would run a local mirror that only had the updates you tested on it :)
On Thu, 2005-09-08 at 07:11, Johnny Hughes wrote:
If you managed a set of servers running homegrown code that may or may not be sensitive to library and utility program versions, what steps would you use to keep a test server up to date, then after performing any needed application testing, to roll out the same changes to the production servers in various different locations? The object is to install exactly the updates you just tested in spite of any subsequent repository changes or out-of-sync mirrors.
You would run a local mirror that only had the updates you tested on it :)
Local to what? The production boxes are distributed but have good internet connectivity. The test box only has so-so internet connectivity. Isn't having to do that an admission that yum doesn't really do a good job of managing the packages you want on a box?
Actually I think some invocation of rpm -q will give a list of installed packages that you can feed to yum to install on another machine, but it is not at all intuitive. Don't the people writing package management tools actually manage any machines or understand that keeping them identical is desirable?
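For what it's worth, the invocation is probably something like this (an untested sketch; a very long list may need splitting across several yum commands):

# capture the tested machine's package set, one entry per line
rpm -qa --qf '%{name}-%{version}-%{release}.%{arch}\n' | sort > manifest.txt
# on the machine that should match, feed the list to yum
yum -y install `cat manifest.txt`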
-- Les Mikesell lesmikesell@gmail.com
On Thu, 2005-09-08 at 07:31 -0500, Les Mikesell wrote:
On Thu, 2005-09-08 at 07:11, Johnny Hughes wrote:
If you managed a set of servers running homegrown code that may or may not be sensitive to library and utility program versions, what steps would you use to keep a test server up to date, then after performing any needed application testing, to roll out the same changes to the production servers in various different locations? The object is to install exactly the updates you just tested in spite of any subsequent repository changes or out-of-sync mirrors.
You would run a local mirror that only had the updates you tested on it :)
Local to what?
Set up your own mirror and access it via ftp, http, nfs or whatever...
The production boxes are distributed but have good internet connectivity. The test box only has so-so internet connectivity. Isn't having to do that an admission that yum doesn't really do a good job of managing the packages you want on a box?
No, this is clearly an admission on your part that you don't know how yum works, or how to set up your own repo.
Don't the people writing package management tools actually manage any machines or understand that keeping them identical is desirable?
--jesse
On Thu, 2005-09-08 at 08:17, Jesse wrote:
No, this is clearly an admission on your part that you don't know how yum works, or how to set up your own repo.
It is more of a mental block regarding the inefficiency of having to store a snapshot of a massive repository just to be able to copy a few of the individual files correctly.
How do I tell if my snapshot copy is consistent or if incomplete changes were being made as I copied it?
Les Mikesell wrote:
On Thu, 2005-09-08 at 07:11, Johnny Hughes wrote:
If you managed a set of servers running homegrown code that may or may not be sensitive to library and utility program
You would run a local mirror that only had the updates you tested on it :)
Local to what? The production boxes are distributed but have good internet connectivity. The test box only has so-so internet
Local to your organisation. If you do have good connectivity, forking out for an extra role machine should not be an issue. I fail to see how bandwidth has anything to do with yum. No matter what package manager you use, you are going to need to pull down the same packages....
Any system admin who needs such solid control over the package trees will host his/her own repository of packages, sometimes even a bunch (eg. based on intended system role) and run an automated delivery mechanism.
Also, are you saying that you admin a large number of machines, and don't actually test the packages before they are rolled out?
connectivity. Isn't having to do that an admission that yum doesn't really do a good job of managing the packages you want on a box?
There is yumlib, now available. Feel free to hack away.... A lot of the 'home grown' scripts that I have seen out there are screen-scrapers... which, by default, are locked into the specific version/setup of yum anyway. It's stupid to think that those sorts of scripts are ever going to maintain functionality across versions.
BTW, you could always go 'basic' and use a bunch of includepkgs and excludepkgs directives in your yum.conf :) and have that rsync over to all the various machines on a scheduled basis.
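For example, a repo section along these lines in yum.conf (the repo id, URL and package list are only examples):

[updates]
name=CentOS-4 updates, limited to blessed packages
baseurl=http://mirror.example.lan/centos/4/updates/i386/
includepkgs=httpd* openssh* kernel*

Anything not matching includepkgs is simply invisible to that repo's clients.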
Actually I think some invocation of rpm -q will give a list of installed packages that you can feed to yum to install on
you mean something like
rpm -qp *.rpm --qf "%{name}.%{arch} "
which should give you a list of packages... easy to feed that to a yum install process...
another machine, but it is not at all intuitive. Don't the people writing package management tools actually manage any machines or understand that keeping them identical is desirable?
If a couple of hundred machines counts as 'machines', then yes - the people I know - working on these package management tools do indeed manage machines.
perhaps, this conversation needs to move to the yum-devel list, where you can then recommend ways to make yum more user-friendly.
- K
On Thu, 2005-09-08 at 09:05, Karanbir Singh wrote:
Local to what? The production boxes are distributed but have good internet connectivity. The test box only has so-so internet
Local to your organisation. If you do have good connectivity, forking out for an extra role machine should not be an issue. I fail to see how bandwidth has anything to do with yum. No matter what package manager you use, you are going to need to pull down the same packages....
Any system admin who needs such solid control over the package trees will host his/her own repository of packages, sometimes even a bunch (eg. based on intended system role) and run an automated delivery mechanism.
But you wouldn't need that if the package manager could actually manage packages.
Also, are you saying that you admin a large number of machines, and don't actually test the packages before they are rolled out?
I provide the QA people with a machine with the latest updates and when they say everything works I try to duplicate those updates onto the production boxes. As you imply, this is something nearly everyone has to do, so I find it surprising that the package management tools don't give you a simple way to do it. Also note that there are as many risks in waiting for testing and QA approval as not, and you have to balance them. For example, I'd probably roll out any available update to openssh to internet-exposed boxes as fast as possible, since that is extremely unlikely to affect our services and not doing it invites a compromise. Besides, as the steady stream of security and bugfix updates to a supposedly stable OS distribution demonstrates, there will always be issues that you don't find until you have something in real-world production. So, I consider the 'real' test to be the first small set of production boxes that are updated after QA's blessing, and watch for problems before rolling out to the rest.
connectivity. Isn't having to do that an admission that yum doesn't really do a good job of managing the packages you want on a box?
There is yumlib, now available. Feel free to hack away.... A lot of the 'home grown' scripts that I have seen out there are screen-scrapers... which, by default, are locked into the specific version/setup of yum anyway. It's stupid to think that those sorts of scripts are ever going to maintain functionality across versions.
There is a basic concept missing to provide what you get by making a snapshot of the repository. Consider the repository as a database, and then consider whether making a snapshot of the entire database at every operation is the best way to get repeatable operations. What it needs is for the repository index (only) to be snapshotted after changes are complete, so that no additions are considered until the set is complete and you know all dependencies are available. By keeping prior indexes, and being able to specify something that identifies them in the update command, you could precisely repeat the set of changes that happened on a previous date. This presents a small problem for mirror operations, since you have to ensure that all other files are present before the index is updated.
Actually I think some invocation of rpm -q will give a list of installed packages that you can feed to yum to install on
you mean something like
rpm -qp *.rpm --qf "%{name}.%{arch} "
which should give you a list of packages... easy to feed that to a yum install process...
I think that might be a reasonable starting point. But it doesn't solve the problem of attempting updates as a repository is being modified. You have to work around that even if you do your own snapshot copies.
another machine, but it is not at all intuitive. Don't the people writing package management tools actually manage any machines or understand that keeping them identical is desirable?
If a couple of hundred machines counts as 'machines', then yes - the people I know - working on these package management tools do indeed manage machines.
Do they do it by working around the package management tool's problems with huge repository snapshots?
perhaps, this conversation needs to move to the yum-devel list, where you can then recommend ways to make yum more user-friendly.
I've mentioned it on the RPM list and a yum developer responded, but I don't think I got across how desirable it would be to have yum operations be repeatable and reliable in spite of the inconsistencies during repository updates. I think I need a better way to explain the similarity to a CVS repository and the equivalent need to be able to extract any consistent set of revisions.
Les Mikesell wrote: [snip]
I think that might be a reasonable starting point. But it doesn't solve the problem of attempting updates as a repository is being modified. You have to work around that even if you do your own snapshot copies.
If you run the repo, you can control it so machines aren't trying to update during the (very short time) that you are running createrepo.
If you don't run your own repo, then all bets are off. Of course, if you don't run your own repo, how are you going to be sure the older version of any given package is there when you want to install it?
On Thu, 2005-09-08 at 11:09, William Hooper wrote:
[snip]
I think that might be a reasonable starting point. But it doesn't solve the problem of attempting updates as a repository is being modified. You have to work around that even if you do your own snapshot copies.
If you run the repo, you can control it so machines aren't trying to update during the (very short time) that you are running createrepo.
If you don't run your own repo, then all bets are off. Of course, if you don't run your own repo, how are you going to be sure the older version of any given package is there when you want to install it?
Does anyone delete released packages from repositories? Regardless, they have to come from somewhere. How do you know that the 'somewhere' that you use isn't inconsistent at the time you take your copy of the contents? Is there a test to make sure that all possible dependencies can be resolved within a repository - and is that enough to know that the snapshot you took is actually what the OS developers intended to be used together? I'm looking for something like a tag that can be applied to a CVS repository that would be applied by someone who knows the state is consistent and can be used by anyone else to retrieve exactly that state regardless of ongoing changes.
Les Mikesell wrote:
On Thu, 2005-09-08 at 11:09, William Hooper wrote:
[snip]
I think that might be a reasonable starting point. But it doesn't solve the problem of attempting updates as a repository is being modified. You have to work around that even if you do your own snapshot copies.
If you run the repo, you can control it so machines aren't trying to update during the (very short time) that you are running createrepo.
If you don't run your own repo, then all bets are off. Of course, if you don't run your own repo, how are you going to be sure the older version of any given package is there when you want to install it?
Does anyone delete released packages from repositories?
I know there have been more than two errata versions of httpd for RHEL 3, but I only see two in the CentOS updates repo.
Regardless, they have to come from somewhere. How do you know that the 'somewhere' that you use isn't inconsistent at the time you take your copy of the contents? Is there a test to make sure that all possible dependencies can be resolved within a repository
Yum-utils has repoclosure.
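For example (assuming the repository in question has the id "qa" in your yum configuration):

# report any package whose dependencies can't be resolved within the repo
repoclosure -r qa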
- and is that enough to know that the snapshot you took is actually what the OS developers intended to be used together?
I would assume that any issues of that nature would come out in QA.
I'm looking for something like a tag that can be applied to a CVS repository that would be applied by someone who knows the state is consistent and can be used by anyone else to retrieve exactly that state regardless of ongoing changes.
Again, if you run the repository, you get to decide on the changes. One approach would be to create multiple repos at different stages (using hardlinks as needed to reduce disk space). OTOH, I personally can't see a need for more than "testing" and "stable" repos. If a machine falls out of these two categories for any reason, it probably needs to be handled more hands-on anyway, and would get the needed exclusions in the config files.
On Thu, 2005-09-08 at 11:53, William Hooper wrote:
Does anyone delete released packages from repositories?
I know there have been more than two errata versions of httpd for RHEL 3, but I only see two in the CentOS updates repo.
I think CentOS is a special case where the repositories shift for the point releases and only contain the updates past the versions on the rebuilt isos. I've had a few surprises from this, but they would not have been reduced by having local snapshots.
I'm looking for something like a tag that can be applied to a CVS repository that would be applied by someone who knows the state is consistent and can be used by anyone else to retrieve exactly that state regardless of ongoing changes.
Again, if you run the repository, you get to decide on the changes.
But for that to help, you would have to maintain all of your own spec files and build all the rpms so you control the dependencies. The point of using a distribution that has official updates is to let someone else do that. I just want my update tool to be able to know when they have completed a consistent set of those changes - or that the mirror I'm pulling from has the consistent set available.
One approach would be to create multiple repos at different stages (using hardlinks as needed to reduce disk space). OTOH, I personally can't see a need for more than "testing" and "stable" repos. If a machine falls out of these two categories for any reason, it probably needs to be handled more hands-on anyway, and would get the needed exclusions in the config files.
Personally, I think 'stable' is exactly the wrong term for the one that is missing known bug and security fixes. 'Broken' and 'fixed' might be better descriptions if they are in different states. The point is that it is almost always better to have the updates than not have them. The people going to the trouble of making them generally know what they are doing. I just want to avoid surprises like you get in the middle of a repository update or when mirrors aren't in sync.
On Thu, 2005-09-08 at 12:35 -0500, Les Mikesell wrote:
On Thu, 2005-09-08 at 11:53, William Hooper wrote:
Does anyone delete released packages from repositories?
No ... but, we only maintain the latest released ISO (centos point release) and updates to it in the main-line tree.
This is the one that we push all over the world. It is about 50GB.
There is also a http://vault.centos.org/ where all the old stuff is located. That one does not get mirrored everywhere, but has old (archived) software and ISOs. (This one is about 150GB and growing ... and we would have problems having it mirrored everywhere).
So, all the old packages (anything released by CentOS) are available via vault.centos.org.
I know there have been more than two errata versions of httpd for RHEL 3, but I only see two in the CentOS updates repo.
I think CentOS is a special case where the repositories shift for the point releases and only contain the updates past the versions on the rebuilt isos. I've had a few surprises from this, but they would not have been reduced by having local snapshots.
I'm looking for something like a tag that can be applied to a CVS repository that would be applied by someone who knows the state is consistent and can be used by anyone else to retrieve exactly that state regardless of ongoing changes.
Again, if you run the repository, you get to decide on the changes.
But for that to help, you would have to maintain all of your own spec files and build all the rpms so you control the dependencies. The point of using a distribution that has official updates is to let someone else do that. I just want my update tool to be able to know when they have completed a consistent set of those changes - or that the mirror I'm pulling from has the consistent set available.
Yum is quite capable of doing what you want to do.
You just can't use the update or upgrade features; instead, you can pass in a version number when doing installs.
yum install kernel-2.6.9-5.0.5.EL
will install that kernel version.
You will have issues if we move to a new point release, because you might not have access to that kernel any longer, except via vault.centos.org
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there. (You do not need to build anything from SRPMS). You make that accessible from the web and run createrepo on it.
You point your yum to it from all your machines.
You only put authorized RPMS in there, and you rerun createrepo every time you put a new RPM in there.
You run auto YUM updates on your machines, pointing to your repo, where you only put the RPMS that you are happy with.
You don't have to build anything, nor do you have any problems with RPMS not being available.
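A bare-bones sketch of that setup (paths and hostname are whatever suits your site):

# one time: create the directory and let Apache serve it
mkdir -p /var/www/html/approved
# every time you approve packages:
cp /var/tmp/tested/*.rpm /var/www/html/approved/
createrepo /var/www/html/approved

and in each client's yum configuration:

[approved]
name=Locally approved updates
baseurl=http://updates.example.lan/approved/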
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there. (You do not need to build anything from SRPMS). You make that accessible from the web and run createrepo on it.
OK, but I basically want to include all official updates here; I just want to delay/control rolling them out to make sure there are no surprises. That means I need to copy that whole repository (of a size you said was such a problem to mirror that you had to break it at the point releases) and repeat the copy for every state where I might want repeatable updates, or I have to track every change. I do realize that both of these options are possible, I just don't see why anyone considers them desirable. Compare it to how you get a set of consistent updates from a cvs repository where someone has tagged the 'known good' states as the changes were added.
You only put authorized RPMS in there, and you rerun createrepo every time you put a new RPM in there.
Normally I'll want to mirror the official repository to get the set for testing. How do I know when you are finished doing your updates so that I don't get an rpm with a dependency that you haven't copied in yet? Or if I'm mirroring some other mirror, how do I know their full set is consistent? I hit problems like that using yum directly - what will be different if I make a snapshot at the wrong time?
You run auto YUM updates on your machines, pointing to your repo, where you only put the RPMS that you are happy with.
That's the 2nd step. I don't know I'm happy with them until I've applied them, so this copy has to co-exist with other copies and have separate versions for x86/x86_64, etc.
On Thu, Sep 08, 2005 at 02:12:15PM -0500, Les Mikesell wrote:
every change. I do realize that both of these options are possible, I just don't see why anyone considers them desirable. Compare it to how you get a set of consistent updates from a cvs repository where someone has tagged the 'known good' states as the changes were added.
You might be able to do something similar with LVM snapshots. Not sure how long those are supposed to be kept around for before they cause problems, though.
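Roughly like this, if the repository lives on an LVM volume (the volume and mount names are made up):

# freeze a point-in-time view of the repo volume
lvcreate --snapshot --size 2G --name repo-snap /dev/vg0/repo
mount /dev/vg0/repo-snap /mnt/repo-snap
# ... serve /mnt/repo-snap to the machines being updated ...
umount /mnt/repo-snap && lvremove -f /dev/vg0/repo-snap

The snapshot only has to hold the blocks that change underneath it, so it can be much smaller than the volume itself.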
Les Mikesell lesmikesell@gmail.com wrote:
That means I need to copy that whole repository (of a size you said was such a problem to mirror that you had to break it at the point releases) and repeat the copy for every state where I might want repeatable updates
There is a difference between "mirroring" (like between servers, disks, etc...) and "copying" (like on the same, single server ;-).
Symlinks (same server/mounts) and hard links (same filesystem) are excellent options for APT/YUM repositories. ;->
I just don't see why anyone considers them desirable.
I have to agree with someone else's post ... simply put (in my own words) ...
What is it that you don't understand about the "costs" of configuration management?
If there is one place _no_ OS can save on, it's configuration management. _No_ distro can solve the configuration management requirement. But at least with UNIX and UNIX-like systems, especially Linux, it is "piecemeal" and "well packaged" enough that it really makes it much, much easier to deploy.
Compare it to how you get a set of consistent updates from a cvs repository where someone has tagged the 'known good' states as the changes were added.
Who says you can_not_ just use RCS/CVS/Subversion to track changes to your test system's RPM database and feed those into YUM ... HMMMMMM?!?!?! (hint, hint, hint, big-@$$ hint ;-).
That's _exactly_ what I do with CVS.
I have a "complete" repository. I then check out select packages/updates to a "test" system. When I have the "test" system to my liking something to my liking, I then do a listing of all RPMs on that system into a file.
I commit that file of RPM package-ver into CVS with a tag.
I then check that file out in another repository location. Then I use that file to create hard links in that new repository with 1 command that I write on the command line -- an ultra-simple for loop: for pkg in `cat rpm.list`; do ln ${YUMMY}/${pkg} .; done
Where YUMMY (YUM, MY) = My complete YUM repository
Bam! I have an exact configuration ready for installs, updates, etc... I don't have to worry about any inconsistencies, I already did an update on a test system and it worked, and that's _exactly_ the packages I'm making available in that "production" YUM repository.
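Spelled out, one cycle of that looks roughly like this (the tag name and paths are only examples; YUMMY is the complete repository as above):

# on the blessed test system: capture the package set
rpm -qa --qf '%{name}-%{version}-%{release}.%{arch}.rpm\n' | sort > rpm.list
cvs commit -m "blessed package set" rpm.list
cvs tag PROD-2005-09-08 rpm.list
# build the matching production repo out of hard links
mkdir /repo/prod-2005-09-08 && cd /repo/prod-2005-09-08
for pkg in `cat rpm.list`; do ln ${YUMMY}/${pkg} .; done
# regenerate the metadata (createrepo, or yum-arch for older yum)
createrepo .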
Normally I'll want to mirror the official repository to get the set for testing. How do I know when you are finished doing your updates so that I don't get an rpm with a dependency that you haven't copied in yet? Or if I'm mirroring some other mirror, how do I know their full set is consistent? I hit problems like that using yum directly - what will be different if I make a snapshot at the wrong time?
Don't know, but you _can_ mitigate the risk by maintaining a full repository internally, and linking only select file listings that have been tested/resolved on a test system.
That's the 2nd step. I don't know I'm happy with them until I've applied them, so this copy has to co-exist with other copies and have separate versions for x86/x86_64, etc.
Separate x86_64 and ix86 is not an issue at all.
Now knowing what ix86 packages to use in an x86_64 system is a different story. If I were in charge of the Fedora Project, it would be my #1 priority to get this addressed in Fedora Core because they are making some very poor choices IMHO (largely on browser/multimedia).
On Thu, 2005-09-08 at 14:30, Bryan J. Smith wrote:
I have to agree with someone else's post ... simply put (in my own words) ...
What is it that you don't understand about the "costs" of configuration management?
The part I don't understand is why the tool built for the purpose doesn't do what everyone needs it to do. Is that simple enough? Yes, I know I can build my own system. I know there are workarounds. I'd rather not.
Compare it to how you get a set of consistent updates from a cvs repository where someone has tagged the 'known good' states as the changes were added.
Who says you can_not_ just use RCS/CVS/Subversion to track changes to your test system's RPM database and feed those into YUM ... HMMMMMM?!?!?! (hint, hint, hint, big-@$$ hint
I could keep rolling my own tarballs like I used to also. The question is why everyone who is responding thinks it is a good thing or at least expected that a new system designed to manage packages does not do the simple and needed thing that cvs has always done to make it possible to make updates consistent and repeatable out of a changing repository.
Les Mikesell lesmikesell@gmail.com wrote:
The part I don't understand is why the tool built for the purpose doesn't do what everyone needs it to do.
Like? I'm still trying to find your explanation on this. So far, you talked about breaking out SPEC files. (WTF?)
Is that simple enough?
No. Specifics would be very nice.
Yes, I know I can build my own system.
There is _no_ system to build. I use RPM and YUM directly to do _all_ I need. I don't know why you even discussed SPEC files.
Now if you want a way to build a tree of RPM dependencies, there are several tools for that. But again, a test system and its RPM database is all I need to get a package listing.
I know there are workarounds. I'd rather not.
Like? If you'd explain them, I could understand you better.
I could keep rolling my own tarballs like I used to also.
??? For what ???
The question is why everyone who is responding thinks it is a good thing or at least expected that a new system designed to manage packages does not do the simple and needed thing that cvs has always done to make it possible to make updates consistent and repeatable out of a changing repository.
First off, you're talking about atomicity. That has *0* to do with the tools on the client, and 100% to do with the server. I think you're completely on the _wrong_track_ there, hence your confusion.
Secondly, you _do_ have atomicity _if_ you generate the YUM repository meta-data _after_ you upload the files. How the site updates the YUM listing is more of an issue with their server procedures, _not_ the tools.
YUM repositories are just a package list and related meta-data which point to files. The atomicity happens at the server, on how that package list is updated.
On Thu, 2005-09-08 at 16:05 -0500, Les Mikesell wrote:
On Thu, 2005-09-08 at 14:30, Bryan J. Smith wrote:
I have to agree with someone else's post ... simply put (in my own words) ...
What is it that you don't understand about the "costs" of configuration management?
The part I don't understand is why the tool built for the purpose doesn't do what everyone needs it to do. Is that simple enough? Yes, I know I can build my own system. I know there are workarounds. I'd rather not.
Yum is not designed for configuration management ... unless you want to update to the latest releases in the repo. In that case, it works perfectly.
If you want to artificially place limits on the repo, (ie ... this is stable, this is not) ... then you have to create your own repo.
When we release an update, it is considered stable by Red Hat. If you run up2date from Red Hat, it will update you to the exact same versions that you get from CentOS.
I am not understanding what your issue with the repos or the tool is.
Yum update works exactly like up2date -u from RHEL ... you update something and it is updated.
Compare it to how you get a set of consistent updates from a cvs repository where someone has tagged the 'known good' states as the changes were added.
Who says you can_not_ just use RCS/CVS/Subversion to track changes to your test system's RPM database and feed those into YUM ... HMMMMMM?!?!?! (hint, hint, hint, big-@$$ hint
I could keep rolling my own tarballs like I used to also. The question is why everyone who is responding thinks it is a good thing or at least expected that a new system designed to manage packages does not do the simple and needed thing that cvs has always done to make it possible to make updates consistent and repeatable out of a changing repository.
On Thu, 2005-09-08 at 16:24, Johnny Hughes wrote:
What is it that you don't understand about the "costs" of configuration management?
The part I don't understand is why the tool built for the purpose doesn't do what everyone needs it to do. Is that simple enough? Yes, I know I can build my own system. I know there are workarounds. I'd rather not.
Yum is not designed for configuration management ... unless you want to update to the latest releases in the repo. In that case, it works perfectly.
What I want is to be able to update more than one machine and expect them to have the same versions installed. If that isn't a very common requirement I'd be very surprised.
If you want to artificially place limits on the repo, (ie ... this is stable, this is not) ... then you have to create your own repo.
When we release an update, it is considered stable by Red Hat. If you run up2date from Red Hat, it will update you to the exact same versions that you get from CentOS.
I am not understanding what your issue with the repos or the tool is.
This isn't CentOS-specific - I just rambled on from some other mention of it and apologize for dwelling on it here.
There are 2 separate issues: One is that yum doesn't know if a repository or mirror is consistent or in the middle of an update with only part of a set of RPMs that really need to be installed together.
The other is that if you update one machine and everything works, you have no reason to expect the same results on the next machine a few minutes later.
Both issues would be solved if there were some kind of tag mechanism that could be applied by the repository updater after all files are present and updates could be tied to earlier tags even if the repository is continuously updated.
I realize that yum doesn't do what I want - but lots of people must be having the same issues and either going to a lot of trouble to deal with them or just taking their chances.
-- Les Mikesell lesmikesell@gmail.com
Les Mikesell lesmikesell@gmail.com wrote:
What I want is to be able to update more than one machine and expect them to have the same versions installed. If that isn't a very common requirement I'd be very surprised.
So what you want is to check out the repository from a specific tag and/or date. So you want:
1. The repository to have every single package -- be it packages as whole, or some binary delta'ing between RPMs (if possible)
2. The repository meta-data to have all history so it can backtrack to any tag/date.
In other words, you want a repository to maintain storage and use CPU-I/O power to resolve tens of GBs of inter-related data and corresponding versioning meta-data.
BTW, Your comparison to CVS is extremely poor, so _stop_. ;-> I'm going to show you how in a moment.
APT, YUM and countless other package repositories store packages whole, with a "current state" meta-data list, and the packages and that meta-data are served via HTTP and the _client_ resolves what it wants to do.
What you want is a more "real-time" resolution logic "like CVS." That either requires:
A) A massive amount of data transfer if done at the client, or
B) A massive amount of CPU-I/O overhead if done at the server
Getting to your piss-poor and inapplicable analogy to CVS, "A" is typically done either on _local_ disk or over an NFS mount, possibly a streamed RSH/SSH. In any case, "A" is almost always done locally -- at least when it comes to multiple GBs of files. ;->
"B" is what happens when you run in pserver/kserver mode, and you now limit your transaction size. I.e., try checking in a 500MB file to a CVS pserver, and see how _slow_ it is.
In other words, what you want is rather impractical for a remote server _regardless_ if the server or client does it. Remember, we're talking GBs of files!
I see 2 evolutionary approaches to the problem.
1. Maintain multiple YUM repositories, even if all but the original are links to the original. The problem is this is who defines what the "original" is? That's why you should maintain your _own_, so it's what _you_ expect it to be.
2. Modify the YUM repository meta-data files so they store revisions, whereby each time createrepo is run, the meta-data is a continuing list.
#1 is direct and practical. #2 adds a _lot_ to the initial query YUM does, and could push it from seconds to minutes or even _hours_ at the client (not to mention the increase in traffic). That's the problem.
This isn't CentOS-specific - I just rambled on from some other mention of it and apologize for dwelling on it here. There are 2 separate issues: One is that yum doesn't know if a repository or mirror is consistent or in the middle of an update with only part of a set of RPMs that really need to be installed together.
Not true. The checks that createrepo does can prevent an update if there are missing dependencies. The problem is that most "automated" repos bypass those checks.
So, again, we're talking "repository management" issues and _not_ the tool itself.
The other is that if you update one machine and everything works, you have no reason to expect the same results on the next machine a few minutes later.
Because there is no tagging/date facility. But to add that, you'd have to add either (again): 1. A _lot_ of traffic (client-based) 2. A _lot_ of CPU-I/O overhead (server-based)
Again, using your poor analogy to CVS, have you ever done a checkout of 500MB over the Internet -- using ssh or, God help you, pserver/kserver?
Both issues would be solved if there were some kind of tag mechanism that could be applied by the repository updater after all files are present and updates could be tied to earlier tags even if the repository is continuously updated.
So, in other words, you want the client to get repository info in 15-30 minutes, instead of 15-30 seconds. ;->
Either that, or you want the server of the repository to deal with all that overhead, taking "intelligent requests" from clients, instead of merely serving via HTTP.
I realize that yum doesn't do what I want - but lots of people must be having the same issues and either going to a lot of trouble to deal with them or just taking their chances.
Or we do what we've _always_ done. We maintain _internal_ configuration management.
We maintain the "complete" repository, and then individual "tag/date" repositories of links.
Understand we are _not_ talking a few MB of source code that you resolve via CVS. We're talking GBs of binary packages.
You _could_ come up with a server repository solution using XDelta and a running journal for the meta-data. And after a few hits, the repository server would tank.
The alternative is for the server repository to just keep complete copies of all packages (which some do), but then keeping a running journal for the meta-data. But that would still require the client to either download/resolve a lot (taking 15-30 minutes, instead of 15-30 seconds), _or_ put that resolution back on the server.
_This_ is the point you keep missing. It's the load that is required to do what you want. Not just a few hundred developers moving around a few MBs of files, but _tens_ of _thousands_ of users accessing _GBs_ of binaries.
That's why you rsync the repository down, and you do that locally. There is no way to avoid that. Even Red Hat Network (RHN) and other solutions do that -- they have you mirror things locally, with resolution going on locally.
In other words, local configuration management. It's very easy to do with RPM and YUM. You can't "pass the buck" to the Internet repository. Red Hat doesn't even let its Enterprise customers do it, and they wouldn't want to either. They have a _local_ repository.
"Bryan J. Smith" b.j.smith@ieee.org wrote:
1. Maintain multiple YUM repositories, even if all but the original are links to the original. The problem is this is who defines what the "original" is? That's why you should maintain your _own_, so it's what _you_ expect it to be.
Er, change that 2nd statement: 'The problem is this is who defines what the "multiple" repositories are of?'
If you want it to "revision" a new repository and links every time someone uploads _any_ new package update, that's going to be a PITA on the server end.
It's much better for _you_ to download the complete repository, and rsync regularly, and then _you_ decide what "snapshots" you want of the repository.
Versioned or even host-specific repositories are not that hard (so to speak).
You need 3 or 4 or 5 things. 1) A big fat ugly collect-everything-under-the-sun repo... the BFR for short. 2) Some modicum of control over the machines to be updated... say type1, type2, type3. 3) A group of test machines that are representative of type[1-3]. 4) Some scripting and database skills. 5) Lots of political clout.
So pick your favorite method (rsync, curl, wget, carrier pigeon) for populating the BFR and let 'er rip.
As part of defining your types, create a complete list of the names of the rpms (including the versions) that are installed or that are ok to install for this type. Place this list under version control (cvs, subversion or maybe git...). Create a script to read the list, and create a link farm pointing into the BFR. Run createrepo on this link farm and give it a meaningful name (type1-versiony maybe).
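That script can be as small as something like this sketch (the paths and naming convention are invented):

#!/bin/sh
# usage: mkrepo.sh <type> <version>
# reads lists/<type>-<version>.list, one rpm filename per line
BFR=/var/spool/bfr
OUT=/var/spool/repos/$1-$2
mkdir -p $OUT
while read pkg; do
    ln $BFR/$pkg $OUT/ || echo "missing from BFR: $pkg" >&2
done < lists/$1-$2.list
createrepo $OUT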
As part of the install process create a record (database, ldap, flat file, stone hieroglyphics) containing the hostname, the type and the version of the rpm list (actually the repo version).
Wait a few days for new stuff to trickle or gush into the BFR and for your customers to find out that their stuff is out of date, or included in a CSIRT, or that version latest is absolutely required to be able to implement company saving project z on time and bombard your manage..never mind.
Take one of your test machines, and build the type(s) that could be affected at their original revision level. Create a new repo list with the new version of the rpms. Test with all of your automated test harness (covered elsewhere :-) ). Pronounce it good (preferably after a multiple of 6 time periods) and commit it to your version control system and create a record saying that typex is now at version y.
Perform the requisite sacrifices and rituals to get the change controls in place for some, all or one of the typex machines. While waiting for the change control approval, repeat the test and creation phase at least twice, or as often as time allows. (Having something to do will keep you out of trouble.)
Now use your remote management tools (cfengine, Tivoli, ssh, whatever) to invoke yum with something like yum -c 'http://updateserver.lan/yum.php?host=fred&type=type1' -y update (quote the URL so the shell doesn't treat the & as a background operator).
Oh I forgot to mention that you need to create a php/perl/CGI script to look up the host name and or the type and return an appropriate yum.conf. This script can do many interesting things, like forcing all typex's to be at rev k, or only hosts that match regex j and type d, or even allowing certain hosts to have direct access to the entire BFR, etc, etc, etc....
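A crude shell-CGI take on that lookup script (every path, file and parameter here is invented for illustration, and input validation is left out entirely):

#!/bin/sh
echo "Content-type: text/plain"
echo ""
TYPE=`echo "$QUERY_STRING" | sed -n 's/.*type=\([^&]*\).*/\1/p'`
# /etc/repo-revisions maps a type to its approved revision, e.g. "type1 17"
REV=`awk -v t="$TYPE" '$1 == t { print $2 }' /etc/repo-revisions`
cat <<EOF
[main]
cachedir=/var/cache/yum

[$TYPE]
name=$TYPE packages, revision $REV
baseurl=http://updateserver.lan/repos/$TYPE-$REV/
EOF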
there, that didn't hurt much.....
Of course you can go as deep as you want here. You could probably tie the change control system together with the management tools and the testing tools and a Kerberos ticket server if you really wanted to control stuff. And you could wrap the yum invocation with a check of what is installed against what is supposed to be installed, backup config files, etc, etc, etc. But I digress.
------------------------------------------------------------------------ Jim Wildman, CISSP, RHCE jim@rossberry.com http://www.rossberry.com "Society in every state is a blessing, but Government, even in its best state, is a necessary evil; in its worst state, an intolerable one." Thomas Paine
On Thu, 2005-09-08 at 17:37, Bryan J. Smith wrote:
What I want is to be able to update more than one machine and expect them to have the same versions installed. If that isn't a very common requirement I'd be very surprised.
So what you want is to check out the repository from a specific tag and/or date. So you want:
1. The repository to have every single package -- be it packages as whole, or some binary delta'ing between RPMs (if possible)
It just needs to keep every package that it has ever had - at least as long as it might be useful for someone to install them. That seems to be the case now. You need this anyway unless you are sure that no files that remain have specific dependencies on anything removed.
2. The repository meta-data to have all history so it can backtrack to any tag/date.
If by history, you mean a timestamp of when a file was added, yes - and that already seems to be there. That would be sufficient to make updates repeatable. I'd like to add one more thing to make it more or less atomic: some indication of the latest timestamp that should be usable - that is, anything newer is part of an inconsistent, partial update. When the repository maintainer has all the files in place this file would be modified - and some special consideration should be applied to make sure it shows up last during mirror updates. This extra part could be avoided if the 'latest timestamp' were published somewhere and you could manually pass it to yum during the update.
In other words, you want a repository to maintain storage and use CPU-I/O power to resolve tens of GBs of inter-related data and corresponding versioning meta-data.
No, I want to be able to tell yum not to consider files newer than a certain date corresponding to the time I did the update on the baseline/test machine even if newer ones happen to be sitting in the repository. And I'd like yum to always ignore changes that are transitory and incomplete.
BTW, Your comparison to CVS is extremely poor, so _stop_.
CVS would give the result I want. How it gets done is not particularly relevant.
APT, YUM and countless other package repositories store packages whole, with a "current state" meta-data list, and the packages and that meta-data are served via HTTP and the _client_ resolves what it wants to do.
CVS can run with only file-level access to the repository and no particular intelligence on the server. However, I agree that it isn't exactly the service we need here.
What you want is a more "real-time" resolution logic "like CVS." That either requires:
A) A massive amount of data transfer if done at the client, or
Yum only needs the headers, which involve a massive amount of data transfer already. Using them slightly more intelligently would not take much more, even if a timestamp/tagname field had to be added to the header.
B) A massive amount of CPU-I/O overhead if done at the server
No it doesn't. All it needs is for yum to observe the timestamps on files and ignore any past the point you specify even if they are available. Or move this info to the headers if you don't trust timestamps to be maintained.
I see 2 evolutionary approaches to the problem.
1. Maintain multiple YUM repositories, even if all but the original are links to the original. The problem is this is who defines what the "original" is? That's why you should maintain your _own_, so it's what _you_ expect it to be.
The CentOS repository is the only one I've seen that doesn't keep every file that has ever been added forever. And they do have that available. I'm really not asking for eons of history here. I just want repeatable updates for some small testing window.
2. Modify the YUM repository meta-data files so they store revisions, whereby each time createrepo is run, the meta-data is a continuing list.
#1 is direct and practical. #2 adds a _lot_ to the initial query YUM does, and could push it from seconds to minutes or even _hours_ at the client (not to mention the increase in traffic). That's the problem.
The only extra piece of data really needed is the latest timestamp of a consistent update. The rest could be figured out, but you'd need a way to find what that value was at the time you did one update so you could re-use it for repeatable results even if it had subsequently changed in the repository. If I were doing the #2 approach, as much as I like an arbitrary number of arbitrarily named tags, I'd probably go with an incrementing 'repository update version' tag that would be bumped on new sets of files, so you don't ever have to change old ones and you can compute which ones are past what you specify and should be ignored. Some of those header files are 100k now - how much more overhead could an update version entry add?
This isn't CentOS-specific - I just rambled on from some other mention of it and apologize for dwelling on it here. There are 2 separate issues: One is that yum doesn't know if a repository or mirror is consistent or in the middle of an update with only part of a set of RPMs that really need to be installed together.
Not true. The checks that createrepo does can prevent an update if there are missing dependencies. The problem is that most "automated" repos bypass those checks.
Does createrepo do its magic atomically? What do yum attempts running concurrently see as it succeeds/fails?
So, again, we're talking "repository management" issues and _not_ the tool itself.
No, I want the repository to be able to be inconsistent and the tool to be able to perform an update based on a prior known-good state.
The other is that if you update one machine and everything works, you have no reason to expect the same results on the next machine a few minutes later.
Because there is no tagging/date facility. But to add that, you'd have to add either (again):
1. A _lot_ of traffic (client-based)
2. A _lot_ of CPU-I/O overhead (server-based)
Or a sensible approach.
Both issues would be solved if there were some kind of tag mechanism that could be applied by the repository updater after all files are present and updates could be tied to earlier tags even if the repository is continuously updated.
So, in other words, you want the client to get repository info in 15-30 minutes, instead of 15-30 seconds. ;->
No, I want it to get more or less what it already does but ignore inconsistent changes in progress and have the option to ignore things newer than a time you did an earlier update which you'd like to repeat.
_This_ is the point you keep missing. It's the load that is required to do what you want. Not just a few hundred developers moving around a few MBs of files, but _tens_ of _thousands_ of users accessing _GBs_ of binaries.
That's why you rsync the repository down, and you do that locally.
Sorry, I just don't buy the concept that rsync'ing a whole repository is an efficient way to keep track of the timestamps on a few updates so you can repeat them later. Rsync imposes precisely that big load on the server side that you wanted to avoid having everyone do.
Les, I think this discussion has gotten lost in the weeds. What you want to do is easy to accomplish using Yum, but you have to use Yum the way it was intended to be used. First, you must have your own repository server, which will host as many repositories as your change-control policies require (eg, untested, test, qa, production). Using the CentOS repositories directly means that you're rolling out on CentOS' change-control schedule, not yours. Set up your own repository and you can control it.
The rest of this message is just a summary of things that others on this list have already said.
1. You need your own repository server. This machine needs good Internet connectivity, Apache, and about 100GB of disk space.
2. On your repository server, you need to host three repositories. #1 is a copy of the CentOS repositories you use, call it "untested" #2 is a subset of #1, call it "qa" #3 is a subset of #2, call it "production"
3. #1 gets updated nightly with a simple yum update. In other words, it just copies the binary packages from the CentOS repositories.
4. Keep in mind that #2 and #3 won't take up much disk space because the files in these repositories are actually just links to the real files in #1.
5. "createrepo" will create the Yum headers for all of the packages in a given directory.
6. If you want to promote a set of packages from #1 to #2, you first add links (eg, "ln -s /var/spool/repo/untested/somepkg.rpm /var/spool/repo/qa/"). As you do this, any machine attempting to run "yum update" against the qa repository will see no change.
7. Here's where the repo gets its atomicity: once all of the packages are linked, you re-run createrepo. Until createrepo finishes, the repo will appear unchanged. (See the sketch after this list.)
8. All of your QA machines use your "qa" repository, and run yum update nightly.
9. All of your production machines use your "production" repository and run yum update nightly.
10. YOU control when createrepo is run on your QA and Production repositories, thus controlling when things roll out.
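If it helps, steps 6 and 7 reduce to a couple of commands. This is only a sketch - the package name and paths are invented to match the layout above:

  # Link the tested package(s) into the qa tree; clients see no change yet.
  ln -s /var/spool/repo/untested/somepkg-1.2-3.i386.rpm /var/spool/repo/qa/
  # Rebuild the metadata; only when this finishes do clients see the update.
  createrepo /var/spool/repo/qa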
On Fri, 2005-09-09 at 00:30 -0400, David Johnston wrote:
[snip]
I agree with most of this ... except that I would (and do) use rsync to do updates of the repo from the CentOS trees.
You can make rsync only pull down the directories you need (ie, nothing but centos-3 i386 and x86_64 if that is what you care about, etc.)
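For example - the mirror hostname and module path below are only placeholders, so substitute a real rsync mirror and the tree you actually care about:

  # Pull only the centos-3 i386 updates tree into the local repo copy.
  rsync -avH --delete \
      rsync://mirror.example.org/centos/3/updates/i386/ \
      /var/spool/repo/untested/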
Understand that CentOS already has more "repos" than just about anyone else out there (contrib, extras, centosplus, addons, testing, updates, os). We do this to give the users almost unprecedented control over what they want to take.
The bottom line is ... if you want to QA control your own updates, and not rely on CentOS (or the upstream provider) to maintain an up2date and secure set of packages, then you need to do it yourself.
This is not a unique requirement ... I am a CentOS developer, and I personally QA almost every single package that gets released from the main CentOS-4 repos for i386 and x86_64. I still maintain a separate, local repo at my work that has tested and QA'ed packages required for my production machines to maintain configuration management. Not because I don't trust the updates ... but because I need a specific set of packages for specific requirements.
Having a configuration date/time feature in yum ... whereby anything after a specific point in time would not be considered in the resolution process might be a good thing (from the standpoint of configuration management). But that would not really do anything to verify that certain packages were stable, nor would it give you the flexibility to take certain packages newer than that date which you want while testing others.
You can use tools (like RHN) to register machines and pick certain updates to push across those machines ... but to be honest, I create local yum repos even for upstream supported machines too. I then point to those repos and use up2date to control my supported machines updates.
There is a project called "current" that is an open-source implementation of an up2date server. It might do some of the things you want. I haven't really looked at this in depth, but here is a link: http://current.tigris.org/
On Fri, 2005-09-09 at 04:19, Johnny Hughes wrote:
Having a configuration date/time feature in yum ... whereby anything after a specific point in time would not be considered in the resolution process might be a good thing (from the standpoint of configuration management). But that would not really do anything to verify that certain packages were stable,
You have to pull the newest at some point before you can decide that. The missing piece is the ability to repeat an update without pulling newer untested changes. The 'repository stability' issue would never be a problem when limiting the run to a time when a prior run did what you want.
nor would it give you the flexibility to take certain packages newer than that date which you want while testing others.
The ability to specify packages is already there, and I'd expect the timestamp limit to be specified per run - so you could still get whatever you want.
How often do you remove or modify existing files in the repositories? My premise is based on having all changes be the addition of new files.
This is growing tiresome. I hope this e-mail ends it.
On Thu, 2005-09-08 at 22:16 -0500, Les Mikesell wrote:
When the repository maintainer has all the files in place, this file would be modified - and some special consideration should be applied to make sure it shows up last during mirror updates. This extra part could be avoided if the 'latest timestamp' is published somewhere and you could manually pass it to yum during the update.
And I have told you this is _more_involved_ than you think! If you knew how even CVS actually works, you would understand this.
Here, do this comparison:
1. Set up CVS
Check in thousands of large binary files totaling GBs of data at different times, revisioning several. You can use either pserver mode or ssh/local (including an NFS mount).
2. Set up Apache
Create 2 or 3 trees of different sets of files that total a few GBs. Now share them out via an Apache directive (or under the Apache root).
3. Run the test ...
Do #1: check out those large binary files using different dates.
Do #2: wget portions of each Apache tree.
When you do this, run network traffic analysis, vmstat, iostat and make comparisons.
What do you think is going to be the difference between CVS and just an Apache tree? ;->
On Fri, 2005-09-09 at 01:21, Bryan J. Smith wrote:
This is growing tiresome. I hope this e-mail ends it.
No, it just shows that you don't understand what I said yet.
When the repository maintainer has all the files in place, this file would be modified - and some special consideration should be applied to make sure it shows up last during mirror updates. This extra part could be avoided if the 'latest timestamp' is published somewhere and you could manually pass it to yum during the update.
And I have told you this is _more_involved_ than you think! If you knew how even CVS actually works, you would understand this.
Everything I want could be done if yum simply had an option to ignore files newer than a specified timestamp, provided repository changes only involve additions (and the latter is already the case). No extra server operations or network traffic would be necessary.
All I want is to be able to pretend that additions more recent than a prior run weren't there.
On Fri, 2005-09-09 at 07:39 -0500, Les Mikesell wrote:
No, it just shows that you don't understand what I said yet.
Actually, I don't understand what you're thinking. Then again, I don't think you understand how CVS works either.
But I _do_ know _exactly_ what you want. Unfortunately, how you think it can be done is full of omissions. Date/timestamps won't solve your "resolve from date X" want. That's what I can't seem to explain.
I even tried to use your CVS analogy and why it becomes infeasible. Apparently you're not familiar with how CVS works on the repository either.
Everything I want could be done if yum simply had an option to ignore files newer than a specified timestamp, provided repository changes only involve additions (and the latter is already the case).
And you have absolutely no familiarity with what is involved with this on the _repository_ end. You have seemingly only used the YUM client, not actually created a YUM repo. The YUM client relies on a meta-data file because the YUM repo is little more than a "web site." It yanks down those meta-data files to do resolution.
If you want arbitrary fetches at an arbitrary tag/date, then you are going to _massively_bloat_ those meta-data files, not to mention the time it takes to resolve.
No extra server operations or network traffic is necessary.
Absolutely _not_ true. You are oblivious to the fact that the YUM client _relies_ on a set of meta-data files on the YUM repository. That's where the YUM client gets its information and many other details, _not_ by directly "browsing through" the tree.
All I want is to be able to pretend that additions more recent than a prior run weren't there.
Until you run your _own_ YUM repository and use createrepo a few times, you will be oblivious to what a YUM repository is.
Bryan J. Smith wrote:
On Fri, 2005-09-09 at 07:39 -0500, Les Mikesell wrote:
No, it just shows that you don't understand what I said yet.
Actually, I don't understand what you're thinking. Then again, I don't think you understand how CVS works either.
I think this is a key point in the discussion. I have refrained from commenting on this thread so far, but it is becoming clear that this fellow has never actually done the work necessary to manage a released package with hundreds of files and forked ancestry trees.
Les, what you seem to want is to put a tag on the files so one can create a snapshot version. A simple date will not do what you want. I know that you think it will. But I did configuration management for years, and you just do not have any concept of how difficult the problem is. You need to think of yum as being a transport agent, because that's really all it is. Yes, it knows about some dependencies, but it can't know about everything there is to know.
What you want to happen requires co-ordination between the various developers, the repository maintainers, the transport/install (yum), and the users. You are concentrating on the transport/install tool, and hoping it can do configuration management for you.
I'm here to tell you that I did software development and participated in configuration management at a large corporation, and we had *teams* of over 100 engineers and testers working to do what you want, and we *still* had some holes in the process.
CVS is a pretty weak tool by comparison with what we used. And we found that the tools we used were still too weak. We had a lot of add-ons built on top of the standard tools just to allow us to re-create a given load on demand. We could do it, but it took teams of engineers, many of whom were dedicated simply to creating mutually compatible package releases. We had a multitude of integration testers just to verify that loads worked together.
You need to think of yum as just being a fancy version of wget. You can ask wget to get a whole web page and its pieces, and it will. And it'll even resolve some dependencies to fix up the hyperlinks between the files to fit the structure it builds on your machine. But that's it. It's a transport mechanism. So is yum. It's a transport.
[snip]
All I want is to be able to pretend that additions more recent than a prior run weren't there.
No, that is not what you want. What you expressly requested was to be able to recreate a yum install, and that the results be consistent.
That requires co-operation all along the whole software development chain, starting with source, and having a dedicated QA team to create the certified packages. It requires a certified source library of all the pieces, and a method for the development tools (compilers, assemblers, linkers, etc.) to pass along information about what version went into the build, and the test team to be able to certify what versions were tested together during integration.
This is inconsistent with "open software development on the web".
It requires a management structure.
Until you run your _own_ YUM repository and use createrepo a few times, you will be oblivious to what a YUM repository is.
I believe you have struck the nail upon the head with perfect orthogonality.
In fact, he seems to have no concept of what configuration management is.
Mike
On Fri, 2005-09-09 at 12:05, Mike McCarty wrote:
Les, what you seem to want is to put a tag on the files so one can create a snapshot version. A simple date will not do what you want. I know that you think it will. But I did configuration management for years, and you just do not have any concept of how difficult the problem is. You need to think of yum as being a transport agent, because that's really all it is. Yes, it knows about some dependencies, but it can't know about everything there is to know.
Please explain what would go wrong if yum simply ignored the presence of files newer than a specified date.
I'm here to tell you that I did software development and participated in configuration management at a large corporation, and we had *teams* of over 100 engineers and testers working to do what you want, and we *still* had some holes in the process.
I'm not asking to deal with arbitrary revisions to same-named files here. I'm asking for repeatable actions with all the same stuff available plus possibly some new things I'd prefer to ignore. Everything still exists in the repository and would be available if you extracted the version numbers from the machine that took the prior run with rpm -q and fed them explicitly to yum to install. If there is a reason yum would not make exactly the same decisions again by itself when it is merely told to ignore the subsequently-added files in the repository, then I am missing something.
You need to think of yum as just being a fancy version of wget. You can ask wget to get a whole web page and its pieces, and it will. And it'll even resolve some dependencies to fix up the hyperlinks between the files to fit the structure it builds on your machine. But that's it. It's a transport mechanism. So is yum. It's a transport.
The only new thing I'm asking is for it to pretend some new files weren't there.
All I want is to be able to pretend that additions more recent than a prior run weren't there.
No, that is not what you want. What you expressly requested was to be able to recreate a yum install, and that the results be consistent.
Then I'm missing why yum would make different decisions than it did when the files actually weren't there.
That requires co-operation all along the whole software development chain, starting with source, and having a dedicated QA team to create the certified packages. It requires a certified source library of all the pieces, and a method for the development tools (compilers, assemblers, linkers, etc.) to pass along information about what version went into the build, and the test team to be able to certify what versions were tested together during integration.
This is inconsistent with "open software development on the web".
It requires a management structure.
That's all there, and all been done. That's the beauty of the current system and the value of using yum in the first place. And that's why I've pressed this conversation this far. Everything in the repository has already been version-numbered, dependency checked and well tested. It's insane to think that I can do a better job of this than has already been done. The only extra step is that I need to check for any surprises in interaction with home-grown apps and scripts (and in a distribution like Centos this would most likely mean the app was broken if it ever happened).
Until you run your _own_ YUM repository and use createrepo a few times, you will be oblivious to what a YUM repository is.
I believe you have struck the nail upon the head with perfect orthogonality.
In fact, he seems to have no concept of what configuration management is.
The configuration management is already done. All I'm asking for is a way to access what was done last week (which is still there) and ignore available additions. Yes it can be done with a complete snapshot of every repository state that I might want to repeat an update from. It could also be done by making a snapshot of a current repository and physically removing the files I don't want yet. It could also be done if the yum program just ignored the files I don't want yet. The last option just seems nicer.
-- Les Mikesell lesmikesell@gmail.com
On Fri, 2005-09-09 at 12:49 -0500, Les Mikesell wrote:
[snip]
---- have you checked out smartpm yet? http://smartpm.org
no - I don't want in on the thread
;-)
Craig
Craig White craigwhite@azapple.com wrote:
have you checked out smartpm yet? http://smartpm.org
no - I don't want in on the thread
Has absolutely nothing to do with what we are discussing.
SmartPM is an end-user tool that can use YUM repositories, among others; it does not address the larger issues of how repositories work (and how the clients access them) or of configuration management.
Bryan J. Smith wrote:
[snip]
From what I've seen of SmartPM's documentation, it makes the issue worse.
Package versions that get installed don't depend only on what the repo has, but also on what versions of their dependencies are installed on the machine.
On Fri, 2005-09-09 at 11:45 -0700, Bryan J. Smith wrote:
[snip]
---- so therefore - I am not allowed to ask the question?
I didn't realize that your vastly superior knowledge of all things also permitted you to dictate topicality. So sorry. Do you ever pause before you click send and ask yourself whether sending it is necessary?
I said I wasn't entering the thread.
Craig
Craig White craigwhite@azapple.com wrote:
so therefore - I am not allowed to ask the question?
But what question are you asking and why? You are mentioning that someone should try SmartPM, which will do _nothing_ to solve the problem we are discussing.
I didn't realize that your vastly superior knowledge
Give me a break. I am a _major_advocate_ of SmartPM. But this isn't about the end-user tool. It's about the repository and/or service. SmartPM has nothing to do with that. SmartPM uses YUM.
And as someone else pointed out, it could actually make things worse with what we're discussing. But SmartPM accesses a YUM repository, and the YUM repository is where this capability could _only_ be addressed.
of all things also permitted you to dictate topicality.
Dude, I do _not_ dictate what topics are discussed.
But one thing I really can't stand is when many people are trying to figure out a solution, and different people keep interjecting things because they don't understand the root problem. And why don't they? They don't take the time to understand.
I merely pointed out that SmartPM, APT or any other tool is not the issue here. I have repeatedly stated that the "client tool" has _nothing_ to do with _anything_ we are talking about. In fact, 99% of the confusion is because people are only used to using the "client tool" and not actually how the repository works.
The "yum client" program has *0* to do with this discussion, its how the "yum repository" is organized over simplistic HTTP. Switching to SmartPM does not solve it at all, because it's still using a YUM repository.
So sorry. Do you ever pause before you click send and ask yourself is sending this necessary?
Do you ever stop to read what people are discussing before just saying "use product X" to "fix problem Y" which you clearly didn't even stop to read?
I said I wasn't entering the thread.
You were entering into the thread by suggesting someone "use product X" to "fix problem Y" where X is wholly inapplicable.
It's like saying fix your Apache issue by using Microsoft Windows with Internet Explorer as your client instead of Linux with Firefox as your client. The client is _not_ the problem, it's the server, and changing clients does _nothing_. ;->
At this point, you're reminding me of Dilbert's pointy-haired boss. You're suggesting something that is wholly inapplicable to the discussion. I merely pointed that out, and if you want to take offense to it, then I can't help you.
On Fri, 2005-09-09 at 12:51 -0700, Bryan J. Smith wrote:
[snip]
---- you assume way too much.
I never offered smartpm as any solution to any problem referenced in this discussion. I merely asked if he had used smartpm. The assumption that I asked that in order to solve his issues was yours and yours alone. You need to keep in mind that it is actually ok for topics of conversation to move tangentially and that shouldn't require the blessing of Bryan.
I never really attempted to dissect the discussion that you and Les were having because having had discussions with both of you already, I am well aware that neither of you would ever concede an inch - even if you pondered each other's points of view long enough to try and make them work for you, neither of you would allow it just out of principle (that's my opinion). I believe this because I have had these types of discussions with Les and you before. I had to be a moron to enter the discussion - even peripherally as I did (and got compared to Dilbert's boss as a reward).
You would be well served to invite others to participate in discussions rather than bite them off as you do if for no other reason than to give the impression of geniality. If you added geniality and humility to your arguments, they would be more persuasive. This is another way of saying that persuasion is an art - not a test of wills nor a referendum of one's knowledge.
Craig
Craig White craigwhite@azapple.com wrote:
I never offered smartpm as any solution to any problem referenced in this discussion. I merely asked if he had used smartpm. The assumption that I asked that in order to solve his issues was yours and yours alone.
As I said, it was like we were in a middle of a discussion on an Apache server issue and you asked if someone had tried the -- and let me make this a little more "friendly" -- the Opera browser on Linux instead of Firefox. It had nothing to do with the entire thread.
Now I _never_ said you couldn't make your comment. But I _did_ point out it was wholly inapplicable to our thread before someone else thought the SmartPM client could solve a problem that the YUM client supposedly had. That's what I wanted to avoid. It took a while just to get to the point of understanding that the YUM client doesn't access RPMs directly for resolution from a YUM repository.
Given the timing and content of your post, it was a safe assumption to make. Again, if you're offended, I can't do much about that.
You need to keep in mind that it is actually ok for topics of conversation to move tangentially and that shouldn't require the blessing of Bryan.
You can introduce _all_ the posts you want. If it doesn't apply to the current thread, I'll point that out before someone assumes it is a possible solution. If you want to take offense to that, then by all means, consider the relevance and the possible responses you might get.
In other words -- right now -- I think you're "backtracking" and trying to cover for the fact that you weren't following the thread. Frankly, I couldn't care less. But it's clear you now have an "agenda" that I'm out to control every thread.
Perhaps it just might be that I'm honestly interested in coming up with solutions (like the possible createrepo hack), or at least explaining "best practices" (in dealing with a lot of issues, such as FRAID support).
I never really attempted to dissect the discussion that you and Les were having because, having had discussions with both of you already, I am well aware that neither of you would ever concede an inch - even if you pondered each other's points of view long enough to try and make them work for you, neither of you would allow it just out of principle (that's my opinion). I believe this because I have had these types of discussions with Les and you before.
I had to be a moron to enter the discussion - even peripherally as I did (and got compared to Dilbert's boss as a reward).
Yes, after you made comments just like the above! If you psychoanalyze me, don't get upset if I do the same in reverse. Re-read the lineage: _you_ analyzed me, so I just analyzed you right back.
Which is my #1 pet peeve: hypocrisy. Now you're just dragging Les' name down in the above rant as well. Don't take it out on him if you have a problem with me; you might as well stay fixated on me, as you clearly have.
You would be well served to invite others to participate in discussions rather than bite them off as you do if for no other reason than to give the impression of geniality.
And you might be well served to not be a hypocrite. Don't analyze me if you can't stand if I analyze you right back.
If you added geniality and humility to your arguments, they would be more persuasive.
I actually _do_ when I have less experience in an area. I'm very humble and I clearly concede to those more experienced in an area. Take DAG for instance.
But in the areas where I'm very strong, and someone is discussing something from second or third-hand (*0* actual experience), I tend to be "very confident" and I try to avoid being "cocky."
Yes, comparing you to Dilbert's boss was "cocky," I'll admit. But if you re-read your own analysis of myself -- let alone Les who didn't even respond to you -- you might consider how hypocritical your post looks.
This is another way of saying that persuasion is an art - not a test of wills
Apparently you like the "do as I say not do as I do" attitude?
nor a referendum of one's knowledge.
I'm not quoting credentials or anything else. But I am "very confident" on select technical matters. You'll see me _steer_clear_ of things I don't have a hefty level of first-hand experience with.
If you read my posts again and again, all I'm doing is relaying first-hand experience. That's all I can do. Everything else would be posturing and self-defeating.
On Fri, 2005-09-09 at 19:49, Craig White wrote:
I never offered smartpm as any solution to any problem referenced in this discussion. I merely asked if he had used smartpm.
No I haven't. I've had version-related problems in the past with python code that isn't part of the distribution, so I try to avoid using it. But as someone else mentioned, it would probably make things worse by offering more options than just installing the latest it finds in the repository, which I'd like to change into 'the latest that is not later than' some timestamp.
OK ...
There have been some good suggestions on this thread ...
1. It might be good if you could pass a date as a command line option to yum ... and have yum not consider anything after that date as being in the repo.
That is a good suggestion for the yum mailing list: https://lists.linux.duke.edu/mailman/listinfo/yum
2. It might be good to develop a way to distribute update RPMs with only the changed (delta) information and not all the information, thus saving time and bandwidth and storage space. This is also a good suggestion, but not really for CentOS ... we use up2date and/or yum ... but if Red Hat were to change their method of distributing updates, then CentOS might too. This suggestion also really belongs upstream (if it is going to be acted upon).
3. It might be good to have a method for deploying different sets of packages to different machines and control them individually or as a group. The program "current" is working on doing that: http://current.tigris.org/
------------------- Please ... let's offer constructive and non-attacking comments to this and all other threads on the mailing lists.
Johnny Hughes mailing-lists@hughesjr.com wrote:
OK ... There have been some good suggestions on this thread ...
- It might be good if you could pass a date as a command line option to yum ... and have yum not consider anything after that date as being in the repo.
I don't think anyone disagrees with that. The problem is that the current format of meta-data in the YUM repository makes this difficult -- let alone the question of how you "define" the date.
I suggested a simple "hack" on the repository end that merely retains previous createrepo runs. That way you can go back to any previous "state" of the meta-data that someone else might have pulled from earlier.
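Roughly, the hack would look like this (only a sketch; the directory names and dates are invented for the example, and it assumes a flat directory of RPMs with dated repodata copies kept from earlier runs):

  # After each createrepo run, keep a dated copy of the meta-data.
  cd /var/spool/repo/untested
  createrepo .
  cp -a repodata repodata-20050909
  # To serve last week's view, build a parallel tree of symlinks to the
  # same RPMs and drop the old meta-data in as its repodata/.
  mkdir /var/spool/repo/untested-20050902
  ln -s /var/spool/repo/untested/*.rpm /var/spool/repo/untested-20050902/
  cp -a repodata-20050902 /var/spool/repo/untested-20050902/repodata

A client that points its baseurl at the dated tree then resolves against the repository exactly as it stood on that date.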
That is a good suggestion for the yum mailing list: https://lists.linux.duke.edu/mailman/listinfo/yum
I will join and offer my simple "hack" as a suggestion for handling the repo end until a more formal concept can be devised.
- It might be good to develop a way to distribute update RPMs with only the changed (delta) information and not all the information, thus saving time and bandwidth and storage space.
For mirrors, it _might_ be a good way, I don't disagree. As long as only "a few" mirrors are yanking files. That will minimize the load of the ripple delta processing.
But for connecting dozens of clients, especially over the Internet, I think the space savings are going to be offset by the massive overhead.
E.g., in most configuration management systems where TBs of binary data are involved, I've found the systems become intolerable after just a few clients. In fact, what I typically do is have only the delta process run on the local disk, and then share out the resulting "assembly" via NFS.
So it would work as long as only a few clients connect -- such as limiting to mirrors. But when it comes to client operations, the load and temporary space used will not only be self-defeating, but far worse than just a flat/whole repository.
This is also a good suggestion, but not really for CentOS ... we use up2date and/or yum ... but if Red Hat were to change their method of distributing updates, then CentOS might too. This suggestion also really belongs upstream (if it is going to be acted upon).
Agreed.
- It might be good to have a method for deploying different sets of packages to different machines and control them individually or as a group. The program "current" is working on doing that: http://current.tigris.org/
Yep. There's a lot of capability that Subversion and other version control systems are offering.
But there are -- how can I say this -- "real world deployment" and "server v. client v. bandwidth load/transfer" issues. For each solution offered that fixes one problem, several others seem to be introduced.
I don't think anyone is arguing these things are not wanted -- God knows I'd _love_ to have these capabilities. But their feasibility is entirely another issue, and it requires some first-hand experience with how YUM repositories work, as well as binary delta'ing loads and buffering/temporary space when links are slow.
Running an rsync or two between a few systems is somewhat of a load. But doing a compounding set of rsyncs (which is basically what delta'ing is) to numerous systems is going to introduce a load on the service that is far removed from just HTTP file services. ;->
Please ... let's offer constructive and non-attacking comments to this and all other threads on the mailing lists.
I think that's all I'm trying to do. But I will admit I did get a bit "frustrated" when people assumed how something worked, and it didn't.
On Sat, 2005-09-10 at 07:42, Johnny Hughes wrote:
OK ...
There have been some good suggestions on this thread ...
- It might be good if you could pass a date as a command line option to yum ... and have yum not consider anything after that date as being in the repo.
That is a good suggestion for the yum mailing list: https://lists.linux.duke.edu/mailman/listinfo/yum
Which fails if the repo you are pointing to has to restore the repo files and the datetime stamp changes....
IMHO a completely different application should be used. This application is mostly a database that tracks a list of rpms. If you want to build a copy of a system you select the particular snapshot (the list of rpm versions you decided was the image) and the new utility proceeds to pull those rpms from the repo and install them on the target system. This new application would allow you to create multiple snapshots and select which one you wanted to use.
A long time ago I used to use something like that with HP systems. I think they used something called a kickstart file or something similar. Been a very long time since I used that. But every system built with the same kickstart file had the same software load along with configuration options applied.
Trying to cram this into yum is IMHO going to make yum overly complex and more difficult to use.
On Sat, 2005-09-10 at 11:23, Scot L. Harris wrote:
- It might be good if you could pass a date as a command line option to yum ... and have yum not consider anything after that date as being in the repo.
That is a good suggestion for the yum mailing list: https://lists.linux.duke.edu/mailman/listinfo/yum
Which fails if the repo you are pointing to has to restore the repo files and the datetime stamp changes....
If you don't restore in a way that maintains the timestamp, every mirror is going to have to suck a fresh copy of the whole repository. I'd expect the maintainers to already be careful about that.
IMHO a completely different application should be used. This application is mostly a database that tracks a list of rpms. If you want to build a copy of a system you select the particular snapshot (the list of rpm versions you decided was the image) and the new utility proceeds to pull those rpms from the repo and install them on the target system. This new application would allow you to create multiple snapshots and select which one you wanted to use.
That would be better in the sense that it could detect errors like files being removed from the repository. If the repository only has additions, the timestamp is all you need to recreate the list of rpms that were present at any time. If you are going to the trouble of doing something more complicated, it should involve tying repository update 'sets' of rpms together so that a client could tell if all needed files were present at a mirror site instead of just failing dependencies when a partial update requires a missing file.
Trying to cram this into yum is IMHO going to make yum overly complex and more difficult to use.
Repeatable operations are more than just a nice idea in the computer world... And making everyone who wants a repeatable yum update store a whole repository snapshot for every point they need just doesn't seem like an efficient way to get that.
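To sketch what I mean on a local mirror (assuming all repository changes are additions, a flat directory of RPMs, and everything on one filesystem so hardlinks work; the paths and date are invented):

  # Mark the cutoff: the time of the earlier, known-good update.
  touch -t 200509010400 /tmp/cutoff
  # Snapshot, via hardlinks, only the RPMs that were present back then.
  mkdir /var/spool/repo/snap
  find /var/spool/repo/untested -name '*.rpm' ! -newer /tmp/cutoff \
      -exec ln {} /var/spool/repo/snap/ \;
  createrepo /var/spool/repo/snap
  # Point yum at file:///var/spool/repo/snap and repeat the update.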
On Sat, 2005-09-10 at 13:46, Les Mikesell wrote:
[snip]
I think you missed the point. The new application stores a list of the rpms that make up your snapshot, not the actual rpms. You can have many snapshots in the database. The repo is just that: a repo of packages. The new app pulls the packages that are part of your snapshot. Of course, if you were doing this for a large enterprise you would be running your own repo anyway, with packages moved from a testing repo to the staging and then production repos as you verify each package does what is expected. That is how you get your repeatable operations.
But then I think this has been discussed previously several times in this thread.
Nothing of value will come from any further discussion of this topic in this thread.
KILL THIS THREAD. KILL THIS THREAD. KILL THIS THREAD.
On 9/10/05, Les Mikesell lesmikesell@gmail.com wrote:
[snip]
Why?
On Sat, 2005-09-10 at 17:41 -0400, Jim Perrin wrote:
[snip]
Les Mikesell lesmikesell@gmail.com wrote:
Please explain what would go wrong if yum simply ignored the presence of files newer than a specified date.
No offense, but build a YUM repository and you'll know _exactly_ why we think you are _clueless_ in this thread.
The YUM client does _not_ access the RPMs, it accesses the repodata which is a meta-data listing of what is in the repository. That meta-data listing is set to a _specific_ set of RPMs at a _specific_ time.
To get what you want, you have to modify the _repository_ end. The YUM client does _not_ and can_not_ arbitrarily pick'n choose different RPMs that are physically on the server. It has to rely on the meta-data listing that is on the server -- generated by another process on the server.
I have described a "hack" that would do what you wish, by maintaining multiple repodata -- what I call repodelta -- sets. That would be a "nice option" but it can only regenerate the repodata for the state of the repository when it is run.
To do anything more advanced, you'd need to delta at least the headers of the packages in the repository (which would be the _only_ sane operation IMHO). And when we say "delta" -- we are talking about "CVS-like" operations.
I'm not asking to deal with arbitrary revisions to same-named files here. I'm asking for repeatable actions with all the same stuff available plus possibly some new things I'd prefer to ignore.
Then you need the YUM repository to maintain _multiple_ copies of the repodata from _different_ times. I described one way this might be done with my "repodelta" conceptual hack.
Everything still exists in the repository and would be available if you extracted the version numbers from the machine that took the prior run with rpm -q and fed them explicitly to yum to install.
Then that's what you need to do. You _can_ feed a filelist into the YUM client. It's that simple.
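For instance - and this is only a sketch, which assumes those exact versions are still listed in the repository's current meta-data:

  # On the machine that took the known-good update:
  rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}\n' > /tmp/pkglist
  # On the machine you want to match (list must fit the command line):
  yum -y install $(cat /tmp/pkglist)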
What you're talking about requires changes at the repository end. That is far more involved.
If there is a reason yum would not make exactly the same decisions again by itself when it is merely told to ignore the subsequently-added files in the repository, then I am missing something.
You seem to continually be missing the point that the YUM client does _not_ directly access the RPMs for resolution, but a meta-data list that is _pre-generated_ at the repository.
Then I'm missing why yum would make different decisions than it did when the files actually weren't there.
EXACTLY! Because you don't understand how YUM repositories work. Just like you don't understand how delta'ing works.
Hence why this thread never dies: you keep responding to different people, and some of us (at least me) are too stupid to just not respond (although I think I'm getting smarter; I think this could be my last post).
That's all there, and all been done. That's the beauty of the current system and the value of using yum in the first place.
No, the value of what _you_think_ YUM is.
This is very, very typical of someone who has only used a client piece of a system, but not actually managed the service end. In YUM, there is no "YUM service." It is a tree of files, shared out via HTTP. The YUM client figures out what is available in the repository by reading the meta-data that is pre-generated on the server -- *NOT* by reading the entire YUM repository of RPMs, or otherwise "interacting" with some "YUM service."
And that's why I've pressed this conversation this far.
Yes, we know.
Everything in the repository has already been version-numbered, dependency checked and well tested.
At a _single_, _discrete_ point-in-time when the meta-data was built. It is _neither_ a revisioning back-end _nor_ a service.
It's insane to think that I can do a better job of this than has already been done.
The problem is that you are assuming what "has already been done" in YUM, and your assumptions are _dead_wrong_.
The configuration management is already done.
No it's not! That's the problem.
YUM repositories are a "list of files" with a meta-data listing "done at date X." Your YUM client does _not_ resolve by looking through all the RPMs, it resolves by downloading that meta-data file which was pre-generated on the server.
That meta-data file was created at date X and can offer _no_other_information_ on the state of the repository at any other time. The only way around that is to build a _true_ delta'ing backend (version control like CVS) or maintain multiple copies of the meta-data from different times (which my conceptual hack suggested).
All I'm asking for is a way to access what was done last week (which is still there)
*WRONG*! The RPMs are still there, yes. But the meta-data listing is _not_, it's _gone_.
and ignore available additions. Yes it can be done with a complete snapshot of every repository state that I might want to repeat an update from. It could also be done by making a snapshot of a current repository and physically removing the files I don't want yet. It could also be done if the yum program just ignored the files I don't want yet. The last option just seems nicer.
And that last option _could_only_exist_ in your mind. ;->
You keep thinking the YUM client can automatically and arbitrarily pick and choose among the RPMs that exist in the repository. The thing is that the YUM client _only_ picks and chooses what the meta-data listing says.
And there is _no_ information in that meta-data to say "give me the state of the repository on date X" if the meta-data was generated on date Y. There is _no_ "YUM service" on the server that can dynamically generate this. And the pre-built meta-data only describes now, because to maintain deltas on just the repository itself is far more involved (and would only add to the initial delay).
Which is why I suggested my conceptual "hack." You have to provide the capability at the repository. Modification of the YUM client can do _nothing_ with the current way YUM repositories work. Build a YUM repository and you'll see _exactly_ what I mean (among others here).
I've explained this enough times. I'll shut up now.
Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
Please explain what would go wrong if yum simply ignored the presence of files newer than a specified date.
No offense, but build a YUM repository and you'll know _exactly_ why we think you are _clueless_ in this thread.
[snip]
People who have never had kids make statements which begin "All you have to do is...", while the real parents are hanging on by their fingernails just trying to survive.
Maybe something a little bit similar is happening here?
Mike
On Fri, 2005-09-09 at 15:29, Mike McCarty wrote:
Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
Please explain what would go wrong if yum simply ignored the presence of files newer than a specified date.
No offense, but build a YUM repository and you'll know _exactly_ why we think you are _clueless_ in this thread.
[snip]
People who have never had kids make statements which begin "All you have to do is...", while the real parents are hanging on by their fingernails just trying to survive.
Maybe something a little bit similar is happening here?
Mike
If it was "easy" then anyone would be able to do it. :)
The answer has been given several times, build and maintain your own repos. The production build repo will only have those packages you have tested and released. You can then build or update existing systems to that level by simply pointing to the production build repo. You would also have a test repo (possibly more than one) which would be used for testing new updates. Once you have a set that have passed QA you sync your production repo with the certified test repo and update your production systems.
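A sketch of that promotion step, assuming both repos live under one web root (paths made up):

    # Sync the QA-certified test repo over the production repo and
    # rebuild the metadata the clients read:
    rsync -a --delete /var/www/html/repo-test/ /var/www/html/repo-prod/
    createrepo /var/www/html/repo-prod
    # Production machines point only at repo-prod and just run:
    yum update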
This is something you have to manage yourself. I don't know why you would trust an external repo to update your production systems in the first place. This would be a carefully controlled process that would be run from internal servers. You would only pull the latest release into a test environment from external repos. This eliminates potential issues with repos having problems, network issues, etc. And gives you the control over what is installed on your production systems.
This is not an easy thing. It takes organization and work to set up, operate, and manage a large number of servers to strict standards.
I withdraw now to watch the rest of this continuing saga unfold. :)
On Fri, 2005-09-09 at 14:04, Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
Please explain what would go wrong if yum simply ignored the presence of files newer than a specified date.
No offense, but build a YUM repository and you'll know _exactly_ why we think you are _clueless_ in this thread.
The YUM client does _not_ access the RPMs, it accesses the repodata which is a meta-data listing of what is in the repository. That meta-data listing is set to a _specific_ set of RPMs at a _specific_ time.
I realize I'm asking for something yum doesn't already do.
How hard is it for you to find the timestamp of the rpm in question, or of the yum header file for it? You are right that I don't know what is already stored in that up-to-100k metadata header file that yum gathers for every available rpm, but if a timestamp isn't already in there it doesn't seem like that much extra to add one. Or to just pick up the timestamp from the RPM file itself via http/ftp or any of the ways that mirroring techniques use.
All I'm asking for is a way to access what was done last week (which is still there)
*WRONG*! The RPMs are still there, yes. But the meta-data listing is _not_, it's _gone_.
Wait... Are you saying that if I tell yum to install program-1.2.3 after program-1.2.4 appears in the repository it will no longer correctly resolve the dependencies? I thought the metadata in the hdr files was derived from the rpm dependency listings and would thus be unchanged by subsequent additions.
You keep thinking the YUM client can automatically and arbitrarily pick and choose among the RPMs that exist in the repository. The thing is that the YUM client _only_ picks and chooses what the meta-data listing says.
I thought the metadata consisted of the available hdr files.
And there is _no_ information in that meta-data to say "give me the state of the repository on date X" if the meta-data was generated on date Y. There is _no_ "YUM service" on the server that can dynamically generate this.
If the subsequently added hdr files were ignored, you should have the state of the repository at that earlier time. Otherwise what was the point of using separate files? The 'delta' that you keep talking about is exactly these additional hdr files. Remove them or pretend they aren't there and the yum client sees the prior state.
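Les's proposal, approximated crudely on the client side (conceptual only: paths are assumed, and the local cache's mtimes reflect download time, not when the repository gained the file):

    # Mark the cutoff, e.g. the moment of last week's tested update:
    touch -t 200509020000 /tmp/cutoff
    # Discard header files that arrived after the cutoff, then resolve
    # from the remaining cache without refreshing it:
    find /var/cache/yum/base/headers -name '*.hdr' -newer /tmp/cutoff \
        -exec rm -f {} \;
    yum -C update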
Les Mikesell wrote:
On Fri, 2005-09-09 at 12:05, Mike McCarty wrote:
Les, what you seem to want is to put a tag on the files so one can create a snapshot version. A simple date will not do what you want. I know that you think it will. But I did configuration management for years, and you just do not have any concept of how difficult the problem is. You need to think of yum as being a transport agent, because that's really all it is. Yes, it knows about some dependencies, but it can't know about everything there is to know.
Please explain what would go wrong if yum simply ignored the presence of files newer than a specified date.
Please explain to me how the date of a file describes its contents.
Until you can tell me how, merely by looking at the date of a file, one can know what its contents are, then you haven't shown how by using dates one can get yum to do consistent downloads.
I'm here to tell you that I did software development and participated in configuration management at a large corporation, and we had *teams* of over 100 engineers and testers working to do what you want, and we *still* had some holes in the process.
I'm not asking to deal with arbitrary revisions to same-named files here. I'm asking for repeatable actions with all the same stuff available plus possibly some new things I'd prefer to ignore. Everything still exists in the repository and would be available if you extracted the version numbers from the machine with the prior run, using rpm -q, and fed them explicitly to yum to install. If there is a reason that yum would not make exactly the same decision again by itself, once it could be told to ignore the subsequently-added files in the repository, then I am missing something.
If that's what you want, then the easiest way is simply to create your own repository. Use anything to create the repository, starting empty. Then draw only from that repository. This requires no cooperation between you, the developers, the repository maintainers, or anyone else. You will be in complete control. No changes to yum, or rsync, or any other tool is required. You can write a script which does the creation. Start the script and walk away. Have it e-mail you when it completes.
One day you decide you want to update some machines in a repeatable (though not necessarily consistent) manner. Start with an empty repository, and use any method you like to fill it. Then do all updates from that repository.
But, IIRC, you said yourself you wanted *consistency*. And that cannot be done simply.
You need to think of yum as just being a fancy version of wget. You can ask wget to get a whole web page and its pieces, and it will. And it'll even resolve some dependencies to fix up the hyperlinks between the files to fit the structure it builds on your machine. But that's it. It's a transport mechanism. So is yum. It's a transport.
The only new thing I'm asking is for it to pretend some new files weren't there.
The easiest way to do that is not to have those files exist. And using a date is not a reliable way to guarantee content.
All I want is to be able to pretend that additions more recent than a prior run weren't there.
No, that is not what you want. What you expressly requested was to be able to recreate a yum install, and that the results be consistent.
Then I'm missing why yum would make different decisions than it did when the files actually weren't there.
Perhaps you are using the word "consistent" in a way I don't understand.
Do you mean "repeatable" or do you mean "consistent"?
Either way, a file timestamp does not guarantee content.
[snip]
Until you run your _own_ YUM repository and use createrepo a few times, you will be oblivious to what a YUM repository is.
I believe you have struck the nail upon the head with perfect orthogonality.
In fact, he seems to have no concept of what configuration management is.
The configuration management is already done. All I'm asking for is
Not by using a timestamp it isn't.
a way to access what was done last week (which is still there) and ignore available additions. Yes it can be done with a complete snapshot of every repository state that I might want to repeat an update from. It could also be done by making a snapshot of
A snapshot is just that. What you want is named snapshots. And that is configuration management.
a current repository and physically removing the files I don't want yet. It could also be done if the yum program just ignored the files I don't want yet. The last option just seems nicer.
How can yum know what files you want to ignore, unless it has version tags associated, and a named version for the entire build?
Mike
On Fri, 2005-09-09 at 14:22, Mike McCarty wrote:
Les, what you seem to want is to put a tag on the files so one can create a snapshot version. A simple date will not do what you want. I know that you think it will. But I did configuration management for years, and you just do not have any concept of how difficult the problem is. You need to think of yum as being a transport agent, because that's really all it is. Yes, it knows about some dependencies, but it can't know about everything there is to know.
Please explain what would go wrong if yum simply ignored the presence of files newer than a specified date.
Please explain to me how the date of a file describes its contents.
Until you can tell me how, merely by looking at the date of a file, one can know what its contents are, then you haven't shown how by using dates one can get yum to do consistent downloads.
What I want is for yum to see the same set of .hdr files that it did in the prior run on a different machine even though some more have been added. I don't care about the contents of those new ones - I want to pretend they weren't added.
But, IIRC, you said yourself you wanted *consistency*. And that cannot be done simply.
I'm willing to trust the Centos repository maintainers not to remove existing RPM revs or make arbitrary changes to the same-named files within the window I need.
The only new thing I'm asking is for it to pretend some new files weren't there.
The easiest way to do that is not to have those files exist. And using a date is not a reliable way to guarantee content.
Yes, I'd prefer a tag, but time seems to move reliably in a single direction.
Then I'm missing why yum would make different decisions than it did when the files actually weren't there.
Perhaps you are using the word "consistent" in a way I don't understand.
Do you mean "repeatable" or do you mean "consistent"?
I don't understand the difference. I want yum to make the same decisions regardless of the possible presence of some new files in the repository.
Either way, a file timestamp does not guarantee content.
If the timestamps are correct I don't see how it can fail.
How can yum know what files you want to ignore, unless it has version tags associated, and a named version for the entire build?
I want to tell it which ones to ignore by telling it the time of a prior update that I would like it to repeat on a different machine. If it simply discarded the hdr files newer than that it would be looking at exactly the same input as that prior run had and should thus make the same decisions. Note that a subsequent update does not re-get all the old hdr files, it just pulls any newly available ones. I want it to pretend those weren't there.
On Friday 09 September 2005 13:49, Les Mikesell wrote:
files here. I'm asking for repeatable actions with all the same stuff available plus possibly some new things I'd prefer to ignore.
Ok, Les, try this: rsync the header cache (found in /var/cache/yum/$repository/header) from the test box to production (they are small files), then run 'yum -C update' on the production box that was the target of the rsync. Assuming the test box repository is populated from the internet (you said internet connectivity from production was better than to the test box), the update on production should pull in the right files (assuming they still exist on that repository, which they might or might not).
And see if that helps. The 'yum -C' keeps yum from updating the cache; if you're good, you can edit these headers and remove things you don't want.
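Spelled out (host name assumed):

    # On the test box, after a successful, tested update:
    rsync -av /var/cache/yum/ prod1:/var/cache/yum/
    # On prod1, resolve against the copied cache without refreshing it:
    yum -C update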
You seem to need a staging repo box out with the production boxes to help with your testbox -> production bandwidth bottleneck.
On Fri, 2005-09-09 at 14:26, Lamar Owen wrote:
On Friday 09 September 2005 13:49, Les Mikesell wrote:
files here. I'm asking for repeatable actions with all the same stuff available plus possibly some new things I'd prefer to ignore.
Ok, Les, try this: rsync the header cache (found in /var/cache/yum/$repository/header) from the test box to production (they are small files), then run 'yum -C update' on the production box that was the target of the rsync. Assuming the test box repository is populated from the internet (you said internet connectivity from production was better than to the test box), the update on production should pull in the right files (assuming they still exist on that repository, which they might or might not).
And see if that helps. The 'yum -C' keeps yum from updating the cache; if you're good, you can edit these headers and remove things you don't want.
You seem to need a staging repo box out with the production boxes to help with your testbox -> production bandwidth bottleneck.
The testbox(es) are at a location with developers/QA people and so-so bandwidth. Production servers are at an assortment of places with excellent internet connectivity but so-so private line connectivity back to the location of the test boxes. It would be realistic to rsync the whole /var/cache/yum, packages and all, to one of the production boxes at each location, then use it as a staging relay from there to all the others. I can use the --bwlimit feature of rsync to throttle as needed. But poking around I see some xml gunk that I didn't expect (I hadn't looked closely since it was just headers, packages, and header.info). I wonder if the installed RPM data gets cached now too.
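That relay arrangement might look like this (hosts and throttle made up):

    # From the test site to one relay per production location, throttled:
    rsync -av --bwlimit=100 /var/cache/yum/ relay1.example.com:/var/cache/yum/
    # Then, run on the relay, fan out over the fast local links:
    for h in web1 web2 db1; do
        rsync -av /var/cache/yum/ $h:/var/cache/yum/
    done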
On Fri, 2005-09-09 at 16:58 -0500, Les Mikesell wrote:
But poking around I see some xml
gunk that I didn't expect (hadn't looked closely since it was just headers, packages, and header.info). I wonder if the installed RPM data gets cached now too.
Yeah, that xml gunk isn't anything... it only gives all the info that yum uses to update something... not too important.
Maybe you need to look inside? Ya think?
On Fri, 2005-09-09 at 21:28, Roy wrote:
On Fri, 2005-09-09 at 16:58 -0500, Les Mikesell wrote:
But poking around I see some xml
gunk that I didn't expect (hadn't looked closely since it was just headers, packages, and header.info). I wonder if the installed RPM data gets cached now too.
Yeah, that xml gunk isn't anything... it only gives all the info that yum uses to update something... not too important.
Maybe you need to look inside? Ya think?
Maybe. There was a time when it didn't exist and yum did all the same things without it. I thought there was some philosophical reason for all of those individual .hdr files but maybe that has changed now.
Les Mikesell wrote:
On Thu, 2005-09-08 at 17:37, Bryan J. Smith wrote:
[snip]
- The repository to have every single package -- be it packages as a whole, or some binary delta'ing between RPMs (if possible)
It just needs to keep every package that it has ever had - at least as long as it might be useful for someone to install them. That seems to be the case now.
I don't know of any repo that keeps all packages ever released. CentOS moves them out of the repo so everyone doesn't have to mirror them. Fedora Core completely removes old packages (I have the rsync deletes to prove it). I only see the newest versions in Dag's repo.
And again, I ask, who decides if it is "useful for someone to install them"? If the CentOS maintainers didn't feel the new packages were ready, they wouldn't release them.
Sorry, I just don't buy the concept that rsync'ing a whole repository is an efficient way to keep track of the timestamps on a few updates so you can repeat them later. Rsync imposes precisely that big load on the server side that you wanted to avoid having everyone do.
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
As I've said before, if you have your test machines pointed at the official mirrors, you have the RPMs you need and can just copy them into your repo and run createrepo, no rsync needed.
On Fri, 2005-09-09 at 08:54 -0400, William Hooper wrote:
Les Mikesell wrote:
On Thu, 2005-09-08 at 17:37, Bryan J. Smith wrote:
[snip]
- The repository to have every single package -- be it packages as a whole, or some binary delta'ing between RPMs (if possible)
It just needs to keep every package that it has ever had - at least as long as it might be useful for someone to install them. That seems to be the case now.
I don't know of any repo that keeps all packages ever released. CentOS moves them out of the repo so everyone doesn't have to mirror them. Fedora Core completely removes old packages (I have the rsync deletes to prove it). I only see the newest versions in Dag's repo.
What we keep in the repo is just the latest point release (ie CentOS-4.1 ... which corresponds to el4 update1, CentOS-3.5, etc.) and updates to that. If you wanted anything older than that, you would need to get it from http://vault.centos.org/ and not our normal repos.
Even RH doesn't maintain every RPM for the release available via RHN ... just the most current.
And again, I ask, who decides if it is "useful for someone to install them"? If the CentOS maintainers didn't feel the new packages were ready, they wouldn't release them.
correct .. we have a separate beta.centos.org for beta releases and a testing repo for released versions.
If we release it, we think it is production ready.
Sorry, I just don't buy the concept that rsync'ing a whole repository is an efficient way to keep track of the timestamps on a few updates so you can repeat them later. Rsync imposes precisely that big load on the server side that you wanted to avoid having everyone do.
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
The rsync load would be on our servers ... not yours ... you are running the client to download files ... we are running the servers :)
It would be much more load, cost, bandwidth to leave all the files in all the repos for all the arches and mirror that to 100,000 other places at more than 100GB extra per server.
As I've said before, if you have your test machines pointed at the official mirrors, you have the RPMs you need and can just copy them into your repo and run createrepo, no rsync needed.
That is true too ... it is just not a tree that could do full network installs, etc. And for just a little more space than the RPMS, you can do that too.
On Fri, 2005-09-09 at 08:54 -0400, William Hooper wrote:
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
Exactly! He seems to also fail to understand that there is a significant "cost savings" for _all_ parties to rsync the YUM repository.
On Fri, 2005-09-09 at 08:32, Bryan J. Smith wrote:
On Fri, 2005-09-09 at 08:54 -0400, William Hooper wrote:
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
Exactly! He seems to also fail to understand that there is a significant "cost savings" for _all_ parties to rsync the YUM repository.
The only reason there is even a possible savings is that yum circumvents standard http/ftp caching practices by randomizing the source locations. Even then, you'd have to update a vast number of server-type machines to make up for the fact that rsync'ing the repository is going to pull copies of updates for a gazillion programs that no machine has installed.
On Fri, 2005-09-09 at 10:21 -0500, Les Mikesell wrote:
On Fri, 2005-09-09 at 08:32, Bryan J. Smith wrote:
On Fri, 2005-09-09 at 08:54 -0400, William Hooper wrote:
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
Exactly! He seems to also fail to understand that there is a significant "cost savings" for _all_ parties to rsync the YUM repository.
The only reason there is even a possible savings is that yum circumvents standard http/ftp caching practices by randomizing the source locations. Even then, you'd have to update a vast number of server-type machines to make up for the fact that rsync'ing the repository is going to pull copies of updates for a gazillion programs that no machine has installed.
Yum doesn't do that at all ... we at CentOS do it on purpose.
We can't possibly provide access by one server to all the CentOS users who want to do updates. We transmit more than 18 TB of data per month for updates and rsyncs ... so we use something called rrdns (round robin DNS) to create mirror.centos.org (or us-mirror and eu-mirror) for yum, and msync.centos.org (or us-msync, eu-msync) for rsync. Those names all have multiple machines that respond in a round robin way to requests.
That way, we can utilize many different machines to provide CentOS yum and rsync service.
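The round-robin set is visible with an ordinary lookup; repeated queries should return the same A records in rotating order:

    dig +short A mirror.centos.org
    host mirror.centos.org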
On Fri, 2005-09-09 at 10:40 -0500, Johnny Hughes wrote:
On Fri, 2005-09-09 at 10:21 -0500, Les Mikesell wrote:
On Fri, 2005-09-09 at 08:32, Bryan J. Smith wrote:
On Fri, 2005-09-09 at 08:54 -0400, William Hooper wrote:
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
Exactly! He seems to also fail to understand that there is a significant "cost savings" for _all_ parties to rsync the YUM repository.
The only reason there is even a possible savings is that yum circumvents standard http/ftp caching practices by randomizing the source locations. Even then, you'd have to update a vast number of server-type machines to make up for the fact that rsync'ing the repository is going to pull copies of updates for a gazillion programs that no machine has installed.
Yum doesn't do that at all ... we at CentOS do it on purpose.
We can't possibly provide access by one server to all the CentOS users who want to do updates. We transmit more than 18 TB of data per month for updates and rsyncs ... so we use something called rrdns (round robin DNS) to create mirror.centos.org (or us-mirror and eu-mirror) for yum, and msync.centos.org (or us-msync, eu-msync) for rsync. Those names all have multiple machines that respond in a round robin way to requests.
One thing I wanted to point out though, since one name is used (ie, mirror.centos.org)... most caching proxy servers would cache the results.
That way, we can utilize many different machines to provide CentOS yum and rsync service.
On Fri, 2005-09-09 at 10:51, Johnny Hughes wrote:
Yum doesn't do that at all ... we at CentOS do it on purpose.
We can't possibly provide access by one server to all the CentOS users who want to do updates. We transmit more than 18 TB of data per month for updates and rsyncs ... so we use something called rrdns (round robin DNS) to create mirror.centos.org (or us-mirror and eu-mirror) for yum, and msync.centos.org (or us-msync, eu-msync) for rsync. Those names all have multiple machines that respond in a round robin way to requests.
One thing I wanted to point out though, since one name is used (ie, mirror.centos.org)... most caching proxy servers would cache the results.
Yes, rrdns does work fine with standard caching techniques. It is the mirror-list scheme, where yum pulls a file containing a list of mirror URLs and picks among them, that gets new copies all the time. Maybe it is the fedora boxes doing that. If so, it's one of those things I wish I didn't need to know about.
Les Mikesell wrote:
On Fri, 2005-09-09 at 08:32, Bryan J. Smith wrote:
On Fri, 2005-09-09 at 08:54 -0400, William Hooper wrote:
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
Exactly! He seems to also fail to understand that there is a significant "cost savings" for _all_ parties to rsync the YUM repository.
The only reason there is even a possible savings is that yum circumvents standard http/ftp caching practices by randomizing the source locations.
We've been through this before. Yum only changes servers if you use the mirror list option. By default CentOS (at least 4) doesn't, so what is the problem?
Even then, you'd have to update a vast number of server-type machines to make up for the fact that rsync'ing the repository is going to pull copies of updates for a gazillion programs that no machine has installed.
So don't rsync it. Pull the RPMs from your test machine's yum cache and make your own repo.
On Fri, 2005-09-09 at 10:44, William Hooper wrote:
The only reason there is even a possible savings is that yum circumvents standard http/ftp caching practices by randomizing the source locations.
We've been through this before. Yum only changes servers if you use the mirror list option. By default CentOS (at least 4) doesn't, so what is the problem?
Most of my machines are running 3.5.
Even then, you'd have to update a vast number of server-type machines to make up for the fact that rsync'ing the repository is going to pull copies of updates for a gazillion programs that no machine has installed.
So don't rsync it. Pull the RPMs from your test machine's yum cache and make your own repo.
That's actually the most sensible suggestion so far. Is there a generic automation for this? Yum over ssh or something that doesn't take additional setup/infrastructure for every variation of Linux distributions or architecture I might like to use?
William Hooper wrote:
So don't rsync it. Pull the RPMs from your test machine's yum cache and make your own repo.
Les Mikesell wrote:
That's actually the most sensible suggestion so far.
@-PPP
Is there a generic automation for this?
@-PPP
Yum over ssh or something that doesn't take additional setup/infrastructure for every variation of Linux distributions or architecture I might like to use?
@-PPP
I'm at a loss for words now.
I think that's basically what everyone was saying earlier. Take one system, and use the RPMs in its YUM cache. It's straightforward and easy to do.
On Fri, 2005-09-09 at 11:11, Bryan J. Smith wrote:
I'm at a loss for words now.
I think that's basically what everyone was saying earlier. Take one system, and use the RPMs in its YUM cache. It's straightforward and easy to do.
Please quantify easy. Will you do it for me every time it needs to be done? Today I'd have a use for at least 6 variations, although I guess you'd double that with the suggested overlap of testing/staging instances. With a little thought about the process, yum updates could be made to be repeatable without extra work, network traffic or any other overhead. I just don't see why this is not considered desirable.
-- Les Mikesell lesmikesell@gmail.com
Les Mikesell wrote:
On Fri, 2005-09-09 at 11:11, Bryan J. Smith wrote:
I'm at a loss for words now.
I think that's basically what everyone was saying earlier. Take one system, and use the RPMs in its YUM cache. It's straightforward and easy to do.
Please quantify easy.
It is very scriptable. A simple copy (rsync, cp, scp, etc.) and a single command to build the metadata (yum-arch in the CentOS 3 case) for each repo.
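Roughly (paths assumed):

    # Gather the RPMs yum already downloaded on the test machine...
    cp /var/cache/yum/*/packages/*.rpm /var/www/html/myrepo/
    # ...and build the metadata the clients will read:
    yum-arch /var/www/html/myrepo       # CentOS 3 / yum 2.0
    # createrepo /var/www/html/myrepo   # CentOS 4 equivalent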
[snip]
Today I'd have a use for at least 6 variations, although I guess you'd double that with the suggested overlap of testing/staging instances.
Only if you choose to not use upstream for your testing.
With a little thought about the process, yum updates could be made to be repeatable without extra work, network traffic or any other overhead.
All three are required, you are just suggesting pushing them all to the mirror servers rather than on your local repo.
Les Mikesell lesmikesell@gmail.com wrote:
Please quantify easy. Will you do it for me every time it needs to be done? Today I'd have a use for at least 6 variations, although I guess you'd double that with the suggested overlap of testing/staging instances.
So, in other words, you want a service that provides custom tagging, revisioning and/or date-based retrieval. That includes your wish for a dynamic delta'ing repository and real-time RPM generation.
Again, I don't think you understand what delta'ing systems like CVS do compared to a plain "HTTP accessible" repository tree. A world of difference!
With a little thought about the process,
Again, I don't think you understand how delta'ing systems like CVS work, and the massive mindshift in the "back-end" that is required. It's no "little thought about the process."
yum updates could be made to be repeatable without extra work, network traffic or any other overhead.
That's utter BS! Delta'ing is the _worst_ overhead! It works fine for textual data of a few MBs, but when you start rebuilding tens of MBs, you're going to kill your server after just a few clients!
I just don't see why this is not considered desirable.
I _never_ said it wasn't desirable. I just said it is _not_ feasible.
I have maintained version control systems for _large_ engineering components in my time -- everything from models to IC schematics. No matter how much you "break down" the files into smaller files, you still put a _lot_ of data around.
And that means I either have a massive Sun (and now Opteron) box that does major I/O with a massive amount of memory for "server-side" delta'ing assembly/disassembly, or have the same for NFS performance for "client-side" delta'ing assembly/disassembly.
And that's _before_ we even get to the point of the "added delay" that users will see using the "YUM" or whatever client-side tool. It will take significantly longer to resolve things -- regardless of who does it.
Although client-side resolution will be a crapload slower over the Internet than server-side. Which means these "servers" will need to be "intelligent" and take on a crapload more load just for the resolution (even before we get to the actual package delta'ing/services) than just "dumbly" serving files up via HTTP.
Sorry, I've just built too many TB-sized engineering revision control repositories to even listen to this thread any longer. Revision control exponentially increases the load over just serving files whole. That's tolerable on small textual files, but intolerable on larger binaries (no matter how small you break down things) -- especially when tens of clients are hitting the server.
On Fri, 2005-09-09 at 13:28, Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
Please quantify easy. Will you do it for me every time it needs to be done? Today I'd have a use for at least 6 variations, although I guess you'd double that with the suggested overlap of testing/staging instances.
So, in other words, you want a service that provides custom tagging, revisioning and/or date-based retrieval. That includes your wish for a dynamic delta'ing repository and real-time RPM generation.
Is that what you were saying was easy?
Again, I don't think you understand what delta'ing systems like CVS do compared to a plain "HTTP accessible" repository tree. A world of difference!
I've never said I wanted deltas. I've said I wanted yum to not consider additions to a repository past a certain timestamp so it will make the same update decision it did a week or so ago even if the repository has additions.
You said the alternative was easy. This question was just to find out what you mean by "easy". Put a price tag on what you mean when you say easy.
How much would it cost for you to do it your easy way for me - to have perhaps a dozen repository states saved and 10 clients configured to use each of them?
There is a program called yam on Dag's site. There are several others on the net. They build repos. Build one the way you want it built. Tell us your results. Show us how to do it your way.
Roy
Roy wrote:
There is a program called yam on Dag's site. There are several others on the net. They build repos. Build one the way you want it built. Tell us your results. Show us how to do it your way.
Roy
I like your suggestion, Roy!
As he said, "All you have to do..." and "All I want is..." and "It's simple...".
So, show us!
Alcatel sure could have used him all those years. And to think we had a whole staff just to do all that easy stuff.
Mike
Les Mikesell wrote:
On Fri, 2005-09-09 at 13:28, Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
[snip]
Again, I don't think you understand what delta'ing systems like CVS do compared to a plain "HTTP accessible" repository tree. A world of difference!
I've never said I wanted deltas. I've said I wanted yum to not consider additions to a repository past a certain timestamp so it will make the same update decision it did a week or so ago even if the repository has additions.
Ok, you don't want to store deltas, you want to store multiple *copies*. That's even worse!
You said the alternative was easy. This question was just to find out what you mean by "easy". Put a price tag on what you mean when you say easy.
How much would it cost for you to do it your easy way for me - to have perhaps a dozen repository states saved and 10 clients configured to use each of them?
[snip]
Well, let's see, you want to have everyone in the world except you to have to maintain a dozen repository "states" (rather ill-defined word) so you can have the luxury of not doing so yourself. And, never having maintained one (let alone a dozen) repository "states" (whatever they are), you presume that it is easy.
Here's what I don't get about this thread: We have a fellow who has never done configuration management saying "All you have to do is..." when experts still don't have a complete solution to configuration management which works in all circumstances. And when people who *have* done configuration management try to point out that the task he wants performed is more complex than he imagines, he argues back.
Everything always looks simple from the outside, I guess.
Mike
On Fri, 2005-09-09 at 14:45, Mike McCarty wrote:
Well, let's see, you want to have everyone in the world except you to have to maintain a dozen repository "states" (rather ill-defined word) so you can have the luxury of not doing so yourself. And, never having maintained one (let alone a dozen) repository "states" (whatever they are), you presume that it is easy.
No, that's what I want to avoid. Everyone else responding *is* maintaining snapshot copies of repositories and I don't think anyone should have to. Yum's view of a repository consists of all the hdr files it has downloaded from it. I want it to pretend that files added after a certain time weren't there, thus creating a view of the state of the repository at a prior time. Given only that, nothing anyone has said yet has convinced me that yum would not make the same decisions about update versions again.
Les Mikesell wrote:
On Fri, 2005-09-09 at 14:45, Mike McCarty wrote:
Well, let's see, you want to have everyone in the world except you to have to maintain a dozen repository "states" (rather ill-defined word) so you can have the luxury of not doing so yourself. And, never having maintained one (let alone a dozen) repository "states" (whatever they are), you presume that it is easy.
No, that's what I want to avoid. Everyone else responding *is* maintaining snapshot copies of repositories and I don't think anyone should have to. Yum's view of a repository consists of all the hdr files it has downloaded from it. I want it to pretend that files added after a certain time weren't there, thus creating a view of the state of the repository at a prior time. Given only that, nothing anyone has said yet has convinced me that yum would not make the same decisions about update versions again.
I thought I mentioned something about file timestamps not guaranteeing file content. The date of a file is simply the moment in time when it got placed onto the web server. I'm not sure yum can even get that information, but if it could, it would be useless. A repository is really just a web page, and yum is just a wget with some control files which tell it what to pull. So I don't think that yum's view of a repository is quite what you think. The yum program is very nice for what it does, and it does it very well. It's very nice to use. I like it. It does some very clever things. But due to the nature of what a repository is, I don't see how yum could be used to accomplish the end you have in mind. Even if it had access to file dates. And making a repository contain the kind of information it would have to have in order to accomplish your goal would take quite a degree of co-operation among developers.
Mike
On Fri, 2005-09-09 at 16:16, Mike McCarty wrote:
I want it to pretend that files added after a certain time weren't there, thus creating a view of the state of the repository at a prior time. Given only that, nothing anyone has said yet has convinced me that yum would not make the same decisions about update versions again.
I thought I mentioned something about file timestamps not guaranteeing file content. The date of a file is simply the moment in time when it got placed onto the web server. I'm not sure yum can even get that information, but if it could, it would be useless.
How would ftp-based mirroring or http caching work if you couldn't tell whether a file was newer than a certain time? The moment in time a file was placed in the repository is all yum would need to know to not consider it if it appeared after the specified timestamp.
A repository is really just a web page, and yum is just a wget with some control files which tell it what to pull. So I don't think that yum's view of a repository is quite what you think.
It used to be. I'm not sure about the new xml gunk and what all it caches.
Les Mikesell wrote:
No, that's what I want to avoid. Everyone else responding *is* maintaining snapshot copies of repositories and I don't think anyone should have to. Yum's view of a repository consists of all the hdr files it has downloaded from it. I want it to pretend that files added after a certain time weren't there, thus creating a view of the state of the repository at a prior time.
At this moment in time, to do what you want, you have to maintain your own repository.
If you want to create a custom repository and have control of it yourself, or make a snapshot of a yum repository, which is a good idea to save bandwidth if you are going to update multiple computers in one location, there is a very good program to do this easily. It is called Repo-janitor and the author's website is here: http://www.bioxray.dk/~mok/repo-janitor.php
I have written a howto and have rpms here for Centos 4.1: http://smeserver.sourceforge.net/howto/RepoJanitor
Hope this helps,
Greg Swallow
Les Mikesell lesmikesell@gmail.com wrote:
No, that's what I want to avoid. Everyone else responding *is* maintaining snapshot copies of repositories and I don't think anyone should have to. Yum's view of a repository consists of all the hdr files it has downloaded from it.
Yes, for a _specific_ set of RPMs. That is pre-generated, and available statically for the current RPM set.
YUM repositories cannot say ... "oh, what if this RPM wasn't uploaded?" and that type of arbitrary meta-data. To do so requires multiple copies of the meta-data.
I want it to pretend that files added after a certain time weren't there, thus creating a view of the state of the repository at a prior time.
Then you need a _different_ meta-data snapshot from that _earlier_ time. Either that, or a delta of _all_ headers, which significantly _bloats_ the meta-data.
Which means the client either has to download _all_ that meta-data history, and do its own delta-assembly/resolution, which means a _lot_ of traffic. *OR* the server has to have a dynamic service that does this for a client.
In _either_ case, the initial delay when yum is run and the meta-data is fetched/resolved is massively increased. You go from seconds to minutes.
Given only that, nothing anyone has said yet has convinced me that that yum would not make the same decisions about update versions again.
Understand YUM is making update decisions based on the meta-data provided by the repository, and that meta-data is for the _current_ state of the repository. There is absolutely _no_way_ that meta-data can provide "here was the state of the repository 5 minutes ago" or "10 hours ago" or "when another system ran 5 days ago."
A YUM repository is a web site with RPMs. It has a meta-data listing for YUM clients. YUM clients do _not_ read the RPMs/headers. They only read the meta-data.
That meta-data is pre-built, against the repository. It cannot tell the YUM clients anything about the state of the repository at an earlier date.
To do so would either require a massive bloat of the size of the meta-data, or a dynamic service at the repository (instead of simple HTTP serving) to generate that information.
An alternative is to use my proposed "hack" that lets YUM clients retrieve any prior, pre-built meta-data list. That way anytime the meta-data is regenerated, old meta-data lists are not lost and still available.
But it requires a modification on the repo side. The meta-data does not and cannot (without _major_ changes) support any "history." That's what you're not seeing. And that comes from your using YUM only as a client, with no experience of creating a YUM repository. It's a web site with an index (the meta-data) -- nothing more!
In fact, it's like the difference between an Apache server, and Subversion using Apache+WebDAV+DeltaV (WebDAV adds file management, DeltaV adds basic versioning).
Les Mikesell lesmikesell@gmail.com wrote:
I've never said I wanted deltas.
Okay, let's scratch RPM deltas then. That was someone else.
But you _still_ need at least a "delta" on the meta-data. Whether you delta the meta-data files, or you keep multiple copies, you _must_ do so.
YUM repositories do _not_ keep a running journal of changes in their meta-data. And YUM clients _only_ access that meta-data. YUM clients do _not_ inspect the RPM tree for resolution, only the meta-data.
My "conceptual hack" is a way to maintain multiple copies of the repo's meta-data, so you can look up older versions of the repository. It's not as good as a delta of the repository, and you "can't regenerate an old repository," but it does let you reference previous meta-data sets that were previously generated.
Which does most of what you want. You want to be able to pass a date that worked on another system, to any subsequent system. So you can access the repository's meta-data from that time by passing the date. But, _again_, that requires the maintenance of _multiple_ sets of the repo meta-data.
The repo meta-data, which is what the YUM client uses, does _not_ contain a history -- only the current set of files, and there is no "date" information (nor can there be without deltas).
I've said I wanted yum to not consider additions to a repository past a certain timestamp so it will make the same update decision it did a week or so ago even if the repository has additions.
And I said that is _impossible_!
The YUM client _never_ accesses date information during resolution.
The YUM repository is a "static web site" and cannot provide that information.
The YUM repository contains meta-data matched to the _discrete_date_ on which "createrepo" (or whatever tool) created the meta-data list.
And that's what the YUM client accesses.
To add in date information and all the meta-data associated would massively bloat the meta-data files. Which is why I suggested that conceptual hack, to serve up the meta-data from specific times in the past.
Do you understand this now?
Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
[snip]
I've said I wanted yum to not consider additions to a repository past a certain timestamp so it will make the same update decision it did a week or so ago even if the repository has additions.
And I said that is _impossible_!
The YUM client _never_ accesses date information during resolution.
The YUM repository is a "static web site" and cannot provide that information.
Until he realizes that yum is just wget-on-steroids, you aren't going to make any progress.
[snip]
Do you understand this now?
Not until he understands that yum is just wget-on-steroids.
Mike
Les Mikesell wrote:
On Fri, 2005-09-09 at 10:44, William Hooper wrote:
The only reason there is even a possible savings is that yum circumvents standard http/ftp caching practices by randomizing the source locations.
We've been through this before. Yum only changes servers if you use the mirror list option. By default CentOS (at least 4) doesn't, so what is the problem?
Most of my machines are running 3.5.
I just verified CentOS 3.5 is configured the same as 4.
Even then, you'd have to update a vast number of server-type machines to make up for the fact that rsync'ing the repository is going to pull copies of updates for a gazillion programs that no machine has installed.
So don't rsync it. Pull the RPMs from your test machine's yum cache and make your own repo.
That's actually the most sensible suggestion so far.
I've said it three or four times now.
Is there a generic automation for this?
Copy the files to an FTP/HTTP server and run yum-arch on them (since you are using CentOS 3.5).
Yum over ssh or something that doesn't take additional setup/infrastructure for every variation of Linux distributions or architecture I might like to use?
All can be served from a single FTP/HTTP server, just like the CentOS repos are now.
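For example, one tree per distribution/architecture under a single web root, and one yum.conf stanza per client (names and paths made up):

    /var/www/html/repos/
        centos-3.5/i386/      # metadata built with yum-arch
        centos-4.1/i386/      # metadata built with createrepo
        centos-4.1/x86_64/

    # In each client's yum.conf:
    [local-base]
    name=Locally approved updates
    baseurl=http://repo.example.com/repos/centos-3.5/i386/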
On Fri, 2005-09-09 at 07:54, William Hooper wrote:
I don't know of any repo that keeps all packages ever released. CentOS moves them out of the repo so everyone doesn't have to mirror them. Fedora Core completely removes old packages (I have the rsync deletes to prove it). I only see the newest versions in Dag's repo.
And again, I ask, who decides if it is "useful for someone to install them"? If the CentOS maintainers didn't feel the new packages were ready, they wouldn't release them.
For the scope of what I'm asking, the "useful" period is the amount of time you need to run a test machine and would want to duplicate that tested update. Maybe a few weeks at most... I agree that updates are always desirable and that CentOS updates almost never break things so there is no reason to be far behind.
Sorry, I just don't buy the concept that rsync'ing a whole repository is an efficient way to keep track of the timestamps on a few updates so you can repeat them later. Rsync imposes precisely that big load on the server side that you wanted to avoid having everyone do.
Rsync only imposes that load the once or twice a month you sync, not every time a machine does a "yum update".
Caching network content without having to make a special effort for every different source is a problem that was solved eons ago by squid and similar caching proxies. My other complaint about yum is that it goes out of its way to prevent this from being useful in the standard way. No machine behind a caching proxy should have to pull a new copy of an unchanged file.
-- Les Mikesell lesmikesell@gmail.com
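For what it's worth, yum's HTTP fetches can be routed through a shared squid in the usual way (proxy host assumed; yum's urllib-based fetcher should honor the standard proxy variables):

    export http_proxy=http://squid.example.com:3128/
    export ftp_proxy=http://squid.example.com:3128/
    yum update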
Les Mikesell lesmikesell@gmail.com wrote:
Caching network content without having to make a special effort for every different source is a problem that was solved eons ago by squid and similar caching proxies. My other complaint about yum is that it goes out of its way to prevent this from being useful in the standard way. No machine behind a caching proxy should have to pull a new copy of an unchanged file.
YUM _uses_ HTTP. But yes, any [good] HTTP proxy _will_ check for changes to the files retrieved.
I think you're failing to realize the problem is _not_ the tool, but the social/load aspects of something doing what you want.
I've posted my concept for a little "hack," but it's far from "ideal." "Ideal" is _only_ by maintaining your own, internal repository.
It is what Microsoft does. It is what Red Hat does. It is what Sun does. Etc...
You mirror their updates, and deploy internally. How you manage that varies, but it's _not_ a tool issue. It's a 100% social/load issue.
On 9/8/05, Les Mikesell lesmikesell@gmail.com wrote:
On Thu, 2005-09-08 at 16:24, Johnny Hughes wrote:
What is it that you don't understand about the "costs" of configuration management?
The part I don't understand is why the tool built for the purpose doesn't do what everyone needs it to do. Is that simple enough? Yes, I know I can build my own system. I know there are workarounds. I'd rather not.
Yum is not designed for configuration management ... unless you want to update to the latest releases in the repo. In that case, it works perfectly.
What I want is to be able to update more than one machine and expect them to have the same versions installed. If that isn't a very common requirement I'd be very surprised.
To be very clear - yum is an updating tool and is meant only to keep your system up to date. It is not meant in any way to do configuration management. People have found ways to make it do configuration management, and a variety of methods have been explained here, but no, your request is absolutely not a common feature request for yum, because people understand that it is meant simply to update systems, not to do "configuration management" on them.
There are 2 separate issues: One is that yum doesn't know if a repository or mirror is consistent or in the middle of an update with only part of a set of RPM's that really need to be installed together.
Yeah, and I'm pretty sure that this is identified as an unlikely corner case worth fixing at some point, but I may be wrong.
The other is that if you update one machine and everything works, you have no reason to expect the same results on the next machine a few minutes later.
If you want a repository to be consistent, you will need to pay for it or manage it yourself. The latter is not difficult, so why is it such a problem for you aside from a poor network setup for your machines?
Both issues would be solved if there were some kind of tag mechanism that could be applied by the repository updater after all files are present and updates could be tied to earlier tags even if the repository is continuously updated.
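One cheap form of such a tag, assuming the repository maintainer cooperates: publish each complete, tested batch under a dated directory and repoint a symlink only once the batch is whole (hypothetical layout; not how the public repos work today):

    BATCH=/repo/snapshots/2005-09-09
    createrepo "$BATCH"                         # metadata for the complete set
    ln -sfn snapshots/2005-09-09 /repo/current  # repoint the "latest" tag
    # Clients wanting "the state as of Sept 9" use baseurl .../snapshots/2005-09-09/
    # Clients wanting "latest complete" use baseurl .../current/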
I realize that yum doesn't do what I want - but lots of people must be having the same issues and either going to a lot of trouble to deal with them or just taking their chances.
Clearly people are not having this problem. They have made their own repositories and gotten on with life. I don't remember any discussion of it on the yum list over the last ~1.5 years I've been on the yum and yum-devel lists.
Greg
Les Mikesell wrote:
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there. (You do not need to build anything from SRPMS). You make that accessible from the web and run createrepo on it.
OK, but I basically want to include all official updates here but I just want to delay/control rolling them out to make sure there are no surprises.
Then you: point your test machines to the official repos, after testing, move the tested RPMs to your locally managed repo and run createrepo. Done.
That means I need to copy that whole repository (of a size you said was such a problem mirroring that you had to break it at the point releases)
I believe Mr. Hughes was referring to the size of the whole mirror (all releases and all updates). Using the method above you only have to download the packages you need.
[snip]
where someone has tagged the 'known good' states as the changes were added.
Someone who? When updates are released they are believed to be in 'known good' states, but yet you (and a good number of people) still test them in your environment. Having anyone else besides yourself tagging things doesn't work, so you will be keeping your own repos anyway.
You only put authorized RPMS in there, and you rerun createrepo every time you put a new RPM in there.
Normally I'll want to mirror the official repository to get the set for testing. How do I know when you are finished doing your updates so that I don't get an rpm with a dependency that you haven't copied in yet?
You see the errors when you do the yum update on your test machine (see above, test machines are the only ones looking at the official repos). Or you look at the announce list and verify that the newer packages are all there.
On Thursday 08 September 2005 15:12, Les Mikesell wrote:
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there. (You do not need to build anything from SRPMS). You make that accessible from the web and run createrepo on it.
OK, but I basically want to include all official updates here but I just want to delay/control rolling them out to make sure there are no surprises. That means I need to copy that whole repository (of a size you said was such a problem mirroring that you had to break it at the point releases) and repeat the copy for every state where I might want repeatable updates or I have to track every change. I do realize that both of these options are possible, I just don't see why anyone considers them desirable. Compare it to how you get a set of consistent updates from a cvs repository where someone has tagged the 'known good' states as the changes were added.
One of the key reasons that CVS works so well for source is that, once the initial import is done, everything is done via diffs and patches. This makes the repository smaller, and the things CVS does well (multiple versions, consistent repository states) follow automatically. While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository. When the mirroring of the repo takes so long, the result is that the repositories on the mirrors could, during heavy updates, be in an inconsistent state more often than not. As the percentage of inconsistent states versus consistent states rises, the usefulness of the repository falls exponentially.
This seems a good application for differential or 'patch' RPM's and a CVS-like RPM repository. The mirroring requirements alone make, in my mind at least, a good case for patch RPM's that take up far less space (and take far less time to mirror, and can be mirrored in a transaction so that the repository, to the yum end user, is ALWAYS consistent), and then a CVS-like RPM repository that stores the initial import and all patches from that import, and builds the desired RPM on the fly. Really, this problem has been dealt with before in the various revision control systems. To mirror the whole repository, something like CVSup that gets a consistent copy could be built. A portion of this infrastructure is available now as yam, but the underlying repository is far from ACID compliant (and now we're talking databases).
The mirroring size difference alone might make it worthwhile. But, it is likely to require more CPU; one of those trade-offs: CPU versus disk and bandwidth.
Lamar Owen lowen@pari.edu wrote:
This seems a good application for differential or 'patch' RPM's and a CVS-like RPM repository. The mirroring requirements alone make, in my mind at least, a good case for patch RPM's that take up far less space (and take far less time to mirror, and can be mirrored in a transaction so that the repository, to the yum enduser, is ALWAYS consistent), and then a CVS-like RPM repository that stores the initial import and all patches from that import, and builds the desired RPM on the fly.
But now you're talking GBs of binary data, and "real-time" resolution by 1,000s of clients!
Yes, it's technically possible to leverage XDelta and other support to do this. And you're going to burden your server with a massive amount of overhead that you don't have when you simply share stuff out via HTTP.
Think about it. ;->
Really, this problem has been dealt with before in the various revision control systems. To mirror the whole repository, something like CVSup that gets a consistent copy could be built. A portion of this infrastructure is available now as yam, but the underlying repository is far from ACID compliant (and now we're talking databases).
The mirroring size difference alone might make it worthwhile. But, it is likely to require more CPU; one of those trade-offs: CPU versus disk and bandwidth.
CPU, memory, disk, etc... will be _exponentially_ increased.
As I said, check in GBs of different binary revisions to CVS and share the same out in multiple trees via HTTP. Now monitor the difference in load when you have 1, 2, 4, 8 ... 1024 clients connect!
Your CVS server is brought to a _crawl_ by just a half-dozen or so clients. Your Apache server is handling over 100 without much issue, depending on your I/O.
It is _not_ feasible, period.
On Friday 09 September 2005 11:43, Bryan J. Smith wrote:
Lamar Owen lowen@pari.edu wrote:
This seems a good application for differential or 'patch' RPM's and a CVS-like RPM repository. The mirroring
But now you're talking GBs of binary data, and "real-time" resolution by 1,000s of clients!
Yes, it's technically possible to leverage XDelta and other support to do this. And you're going to burden your server with a massive amount of overhead that you don't have when you simply share stuff out via HTTP.
It depends on the implementation. You in your other delta message spell out essentially the same idea.
Think about it. ;->
I have, repeatedly. If the RPMs in question are stored with the payload unpacked, and binary deltas against each file (similar to the CVS repository ,v file) stored, then what is happening is not quite as CPU-intensive as you make it out to be. Most patches are a few bytes here and there in an up to a few megabyte executable, with most package patches touching one or a few files, but typically not touching every binary in the package. You store the patch (applied with xdelta or similar) and build the payload on the fly (simple CPIO here). You send an RPM out that was packed by the server, which is I/O bound, not CPU bound. With forethought to those things that can be prebuilt versus those things that have to be generated realtime, the amount of realtime generation can be minimized, I think.
CPU, memory, disk, etc... will be _expoentially_ increased.
Prove exponential CPU usage increase. If designed intelligently, it might be no more intensive than rsync, which is doing much of what is required already. Would need information on the loading of rsync on a server.
As I said, check in GBs of different binarie revisions to CVS and share the same out in multiple trees via HTTP. Now monitor the difference in load when you have 1, 2, 4, 8 ... 1024 clients connect!
That's because CVS as it stands is inefficient with binaries.
It is _not_ feasible, period.
Think outside the CVS box, Bryan. I did not say 'Use CVS for this'; I said 'Use a CVS-like system for this' meaning simply the guts of the mechanism. CVS per se would be horribly inefficient for this purpose.
Store the unpacked RPMs and binary deltas for each file. Store prebuilt headers if needed. Trust the server to sign on the fly rather than at build time (I/O bound). Pack the payload on the fly with CPIO (I/O bound). Send the RPM out (I/O bound) when needed. Mirrors rsync the whole unpacked repository (I/O bound).
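A rough sketch of the mechanics I have in mind (xdelta 1.x command syntax assumed; the file layout is invented, and a real RPM payload is a gzip'd cpio archive plus a signed header, so this is simplified):

  #!/bin/sh
  # Rebuild one changed file from the stored base copy plus its per-file delta
  # (xdelta 1.x: "xdelta patch <patchfile> <fromfile> <tofile>").
  xdelta patch deltas/usr/bin/foo.delta base/usr/bin/foo work/usr/bin/foo

  # Pack the payload on the fly -- mostly I/O, very little CPU.
  (cd work && find . | cpio -o -H newc) | gzip > foo.payload.cpio.gz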
Are there issues with this? Of course there are. But the tradeoff is mirroring many GB of RPM's (rsync has to take some CPU for mirroring this large of a collection) versus mirroring fewer GB of unpacked RPM's plus binary deltas, and signing the on-the-fly RPM. Yes, it will take more CPU, but I think linearly more CPU and not exponentially. Of course, it would have to be tried. The many GB of mirror has got to have many GB of redundancy in it.
The size of the updates is getting out of control; for those with limited bandwidth it becomes very difficult to stay up to date.
Lamar Owen lowen@pari.edu wrote:
It depends on the implementation. You in your other delta message spell out essentially the same idea.
No, my other message was a _completely_different_ idea. It is a hack to the HTTP-serviced repository that just keeps multiple sets of repodata directories.
A major difference between a true delta back-end and that hack is that while you can re-generate the meta-data at any point for the former, you can_not_ for the latter. In other words, the "repodelta" hack I described can _only_ generate repodata for the state of the repository then and there, and at _no_ other time. I.e., you cannot "go back in time" to re-generate it.
There is no "database" or "interwoven history" of the repository in the repodelta hack. It is just a simple hack to keep multiple copies of the repodata meta-data, that's it.
I have, repeatedly. If the RPMs in question are stored with the payload unpacked, and binary deltas against each file (similar to the CVS repository ,v file) stored,
I don't think you're realizing what you're suggesting. Who is going to handle the load of the delta assembly?
It's one thing to do an off-line disassembly and "check-in" the files, that only happens once -- when you upload the file.
But the on-line, real-time, end-user assembly during "check-out" is going to turn even a high-end server into a big-@$$ door-stop (because it's not able to do much else) with just a few users checking things out! Do you understand this?
BTW/FYI: I know how deltas work -- not only text, but the larger issue of delta'ing binary files. And I have personally deployed XDelta as a binary delta'ing application over the last 5 years, since CVS can only store binaries whole. I haven't looked into how Subversion stores binaries (same algorithm as XDelta?).
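(For anyone following along, basic xdelta 1.x usage looks roughly like this; the file names are just examples:)

  # Create a binary delta between two revisions of a file:
  xdelta delta foo.rev1 foo.rev2 foo.rev1-rev2.xd
  # Reconstruct rev2 from rev1 plus the delta:
  xdelta patch foo.rev1-rev2.xd foo.rev1 foo.rev2.rebuilt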
then what is happening is not quite as CPU-intensive as you make it out to be.
Not true! Not true at all! You're talking GBs of transactions _per_user_.
You're going to introduce:
- Massive overhead
- Greatly increased "resolution time" (even before considering the server responsiveness)
- Many other issues that will make it "unusable" from the standpoint of end-users
You can_not_ do this on an Internet server. At most, you can do it locally with NFS with GbE connections so the clients themselves off-load a lot of the overhead. That's not feasible over the Internet, so that falls back on the Internet server.
As I mentioned before, not my Internet server! ;->
Most patches are a few bytes here and there in an up to a few megabyte executable, with most package patches touching one or a few files, but typically not touching every binary in the package. You store the patch (applied with xdelta or similar) and build the payload on the fly (simple CPIO here). You send an RPM out that was packed by the server, which is I/O bound, not CPU bound.
Either you have to:
- Do full xdelta revisions on the entire RPM (ustar/cpio)
- Break up the RPM and use your own approach
In any case, it's a crapload more overhead than merely serving out files via HTTP. You're going to reduce your ability to service users by an order of magnitude, if not 2!
With forethought to those things that can be prebuilt versus those things that have to be generated realtime, the amount of realtime generation can be minimized, I think.
That's the key right there -- you think.
Again, keep in mind that repositories merely serve out files via HTTP today. Now you're adding in 10-100x the overhead. You're sending data back and forth, back and forth, back and forth, between the I/O, memory, CPU, etc... Just 1 single operation is going to choke most servers that can service 10-100 HTTP users.
Prove exponential CPU usage increase. If designed intelligently, it might be no more intensive than rsync, which is doing much of what is required already. Would need information on the loading of rsync on a server.
No, you're talking about facilities that go beyond what rsync does. You're not just doing simple file differences between one system and another. You're talking about _multiple_ steps through _multiple_ deltas and lineage.
There's a huge difference between traversing extensive delta files and just an rsync delta between existing copies. ;->
That's because CVS as it stands is inefficient with binaries.
I only referenced CVS because someone else made the analogy. So yes, I know CVS stores binaries whole. That aside, the XDelta is _still_ going to cause a sizeable amount of overhead. Far more than Rsync.
Think outside the CVS box, Bryan.
I am. I _only_ used CVS because it was used prior for analogy. Now I'm talking about XDelta, which I _did_ have in mind previously when I wrote my prior e-mails.
I did not say 'Use CVS for this'; I said 'Use a CVS-like system for this' meaning simply the guts of the mechanism.
I know. I was already thinking ahead, but since the original poster doesn't even understand how delta'ing works, I didn't want to burden him with further understanding.
CVS per se would be horribly inefficient for this purpose.
Delta'ing _period_ is horribly inefficient for this purpose. In fact, storing the revisions whole would actually be _faster_ than reverse deltas of _huge_ binary files.
I don't care how you "break it up" -- it's going to _kill_ your server compared to just an HTTP stream.
Store the unpacked RPMs and binary deltas for each file.
You're talking about cpio operations _en_masse_ on a server! Have you ever done just a few smbtar operations from a server before? Do you _know_ what happens to your I/O?
_That's_ what I'm talking about.
Store prebuilt headers if needed.
As far as I'm concerned, that's the _only_ thing you should _ever_ delta. I don't relish the idea of a repository of delta'd cpio archives. It's just ludicrous to me -- and even more so over the Internet.
Because on the Internet, now you have to start "buffering" or "temporarily storing" packages. When you have tens of systems getting updates, you're duplicating a lot. Case-in-point: You'd be better off just storing the RPMs whole on the filesystem itself.
Only revision headers, period.
Trust the server to sign on the fly rather than at build time (I/O bound).
No, sorry. I sign _off-line_ for a reason.
Pack the payload on the fly with CPIO (I/O bound).
But the problem is you have duplicate I/O streams -- back and forth. That's a PITA when you've got tens of operations going on.
Again, have you _ever_ run smbtar from your server to just a few Windows clients for backup? Same problem.
Send the RPM out (I/O bound) when needed.
And buffer it, temporarily store it, etc... for 10+ connections.
Mirrors rsync the whole unpacked repository (I/O bound).
But it does a delta against 2 existing files -- not an entire lineage of deltas. I really don't think you've thought this through.
Are there issues with this? Of course there are. But the tradeoff is mirroring many GB of RPM's (rsync has to take some CPU for mirroring this large of a collection) versus mirroring fewer GB of unpacked RPM's plus binary deltas,
I think you're minimizing the binary delta operation, big time. I don't think you're going to save any size in the end for mirrors either.
and signing the on-the-fly RPM.
Again, for security reasons, I very much consider this to be a "disadvantage." I like to sign _off-line_ for a reason -- still automated -- but from an _internal_ system.
Yes, it will take more CPU, but I think linearly more CPU and not exponentially.
Here's a "real world" test for you.
Write an Apache script or even a C program that takes XDelta version files, makes them into a cpio archive, and serves them up.
Now just serve up the cpio archive without all the processing.
How many clients can you serve for each?
Of course, it would have to be tried. The many GB of mirror has got to have many GB of redundancy in it. The size of the updates is getting out of control; for those with limited bandwidth it becomes very difficult to stay up to date.
I think you've underestimated the resources required to XDelta -- not "two points" like in rsync, but _multiple_. The cpio operation actually pales in comparison.
On Fri, 2005-09-09 at 13:18, Bryan J. Smith wrote:
I have, repeatedly. If the RPMs in question are stored with the payload unpacked, and binary deltas against each file (similar to the CVS repository ,v file) stored,
I don't think you're realizing what you're suggesting. Who is going to handle the load of the delta assembly?
I don't particularly want to promote the delta idea, but keep in mind that the RPMS are already stored as increasing version-numbered revs; if a binary delta between each version were also available, without changing anything else, it would be trivial for a client to decide whether it is more efficient to apply a delta to an existing cached or locally available version or pull the latest. It would take more storage, but wouldn't break anything already working and could reduce network traffic considerably.
BTW/FYI: I know how deltas work -- not only text, but the larger issue of delta'ing binary files. And I have personally deployed XDelta as a binary delta'ing application over the last 5 years, since CVS can only store binaries whole. I haven't looked into how Subversion stores binaries (same algorithm as XDelta?).
Likewise, how does the style used in rdiff-backup compare? It claims to be similar to rsync, which has proven very efficient at transmitting the differences between two files. With rdiff the server-side work only has to be done once.
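(For reference, the librsync rdiff three-step goes roughly like this; the file names are illustrative:)

  rdiff signature old.rpm old.sig         # receiver summarizes what it already has
  rdiff delta old.sig new.rpm new.delta   # done once on the producing side
  rdiff patch old.rpm new.delta new.rpm   # receiver rebuilds the new file locally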
Not true! Not true at all! You're talking GBs of transactions _per_user_.
You really only need to create a delta once per RPM version update creation.
You're going to introduce:
Then you need to store the delta in addition to the full versions, so there is more disk storage needed for this approach.
That's the key right there -- you think.
Again, keep in mind that repositories merely serve out files via HTTP today.
They could still do that. The only overhead added would be for the storage of the deltas and the traffic of the client checking the sizes. You would trade that off against the network traffic saved when the client chooses the smaller delta. But, for this to work you need an on-line local cache of the base rpms. Yum caches downloaded updates for a while, but I doubt enough people would set up a local cache of the base files to make this approach work unless that step is automated during the OS install.
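A sketch of that client-side size check, assuming the repository is a plain web server that answers HEAD requests (the mirror URL, package names, and delta naming scheme are all invented):

  #!/bin/sh
  BASE=http://mirror.example.com/updates
  size() { curl -sI "$1" | tr -d '\r' | awk '/^Content-Length:/ {print $2}'; }

  FULL=$(size "$BASE/foo-1.2-4.i386.rpm")
  DELTA=$(size "$BASE/deltas/foo-1.2-3_to_1.2-4.delta")

  # Pick the cheaper download, falling back to the full RPM.
  if [ -n "$DELTA" ] && [ "$DELTA" -lt "$FULL" ]; then
      echo "fetch the delta and patch the cached 1.2-3 copy"
  else
      echo "fetch the full RPM"
  fi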
Les Mikesell lesmikesell@gmail.com wrote:
it would be trivial for a client to decide whether it is more efficient to apply a delta to an existing cached or locally available version or pull the latest.
But how does the "appropriate delta" get built? At the server! That's more server overhead, let alone the "service" that the client uses to query.
Now let's say you just have the client download the _entire_ delta for the RPM to avoid all that extra server overhead. Now you're actually _increasing_ the amount you download. ;->
There's no way to do it _remotely_, over the Internet, without adding a lot of load -- either in the total amount of transfer, or in the amount of overhead in the service. You're no longer offering a simple HTTP-serviced tree, but an intelligent service on the server that requires a lot more overhead.
It would take more storage, but wouldn't break anything already working
??? You really don't know what a YUM repository is, do you ???
It's a web site. Not only that, the YUM client only accesses the meta-data on the web site for resolution, _not_ the actual RPMs. That's the problem.
To do otherwise requires far more bandwidth. Or requires the server to have an "intelligent" service, and not just a "dumb" web site.
Likewise, how does the style used in rdiff-backup compare?
rdiff-backup does 1 delta, against 2 files.
A delta versions system requires you to _ripple_ through every delta, _rebuilding_ every change each time.
That either requires a beefy server to do the rippling, or a massive amount of bandwidth so the client can download all the deltas and do the work itself.
It claims to be similar to rsync which has proven very efficient in being able to transmit the differences between two files.
rsync does *1* delta. Furthermore, it's not as efficient as a straight HTTP stream when it comes to the server.
With rdiff the server side work only has to be done once.
Not for rippling through multiple deltas, which is what versioned files are. Several of you seem to forget that aspect.
Let alone the repository "service" will need to temporarily house what you need, or handle multiple accesses for each "rippled" delta the client then re-assembles.
You're talking a lot more overhead than just a HTTP access.
You really only need to create a delta once per RPM version update creation.
Yes, per RPM version update creation. You're talking about the "disassembly" of the "check-in." That's cake! You're right!
But what do clients need? Re-assembly in the check-out! So if there have been 5 version updates since, then _all_5_ deltas will need to be "rippled through."
Again, at what point do people realize YUM is just HTTP access, and now you're doing a lot more. As someone who has maintained engineering part files, semiconductor layouts, etc... in various revision control systems -- I can tell you this is a _lot_ of overhead versus just file access like via HTTP, NFS, etc...
You can't service more than a few clients locally, much less the nightmare of an Internet server where operations are much slower, temporary files must be created, etc...
Then you need to store the delta in addition to the full versions, so there is more disk storage needed for this approach.
So why not just store the _whole_ versions? That's my point.
Maybe I can make a better analogy ... backup servers.
It's one thing to have a full backup and a few incrementals and rebuild them for a restore. It's a completely different thing to have tens of clients wanting the same!
They could still do that. The only overhead added would be for the storage of the deltas and the traffic of the client checking the sizes.
I give up.
You would trade that off against the network traffic saved when the client chooses the smaller delta. But, for this to work you need an on-line local cache of the base rpms.
Hence why you should have a local repository! Isn't that what you were arguing against?!?!?!
Yum caches downloaded updates for a while, but I doubt enough people would set up a local cache of the base files to make this approach work unless that step is automated during the OS install.
Wait, you're actually making sense now! No way!
On Fri, 2005-09-09 at 14:41, Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
it would be trivial for a client to decide whether it is more efficient to apply a delta to an existing cached or locally available version or pull the latest.
But how does the "appropriate delta" get built? At the server! That's more server overhead, let alone the "service" that the client uses to query.
It would be built by the person who maintains the master repository - only once per new RPM rev.
Now let's say you just have the client download the _entire_ delta for the RPM to avoid all that extra server overhead. Now you're actually _increasing_ the amount you download.
Huh? You don't build the delta on demand, you offer a choice of full revs and deltas and their sizes. The client can easily compute which to download based on what it already has.
Likewise, how does the style used in rdiff-backup compare?
rdiff-backup does 1 delta, against 2 files.
Yes, that would be the way to do it. As a final step of creating an RPM update, build that delta. Offer it to the millions of clients who already have the original file. It would be a nice proof-of-concept to feed the updates at the end of a fedora cycle to it to see what the savings would be to use deltas vs full downloads.
rsync does *1* delta. Furthermore, it's not as efficient as a straight HTTP stream when it comes to the server.
With rdiff the server side work only has to be done once.
Not for rippling through multiple deltas, which is what versioned files are. Several of you seem to forget that aspect.
The delta between any version and its next update will never change. Make it once, store it, mirror it, whatever. Do it again separately for the next version (delta from its immediately prior version only). Leave it up to the client to figure out what it has to start with and whether it is cheaper to apply a series of deltas or pull the full copy of the version it wants.
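Concretely, that one-time step at release could be as small as this (xdelta 1.x syntax assumed, names invented; a real implementation would delta the uncompressed payload, since gzip'd payloads delta poorly):

  # Run once, by the packager, when foo-1.2-5 is released:
  xdelta delta foo-1.2-4.i386.rpm foo-1.2-5.i386.rpm foo-1.2-4_to_1.2-5.delta
  # Publish the .delta next to the full RPM; it never changes afterwards.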
You're talking a lot more overhead than just a HTTP access.
No, just one extra, probably automated step in adding a new update version that only gets done once.
But what do clients need? Re-assembly in the check-out! So if there have been 5 version updates since, then _all_5_ deltas will need to be "rippled through."
Done on the client side - and only after computing that the operation is cheaper than jumping directly to the desired version.
So why not just store the _whole_ versions? That's my point.
You should. Just give the client the choice.
You would trade that off against the network traffic saved when the client chooses the smaller delta. But, for this to work you need an on-line local cache of the base rpms.
Hence why you should have a local repository! Isn't that what you were arguing against?!?!?!
I'm not against having local repositories. I'm against anything that takes human time/intervention. Put a price tag on it and you'll see why. I have no problem tossing a few hundred gigs of disk space at it if no one has to manually edit anything anywhere to use it. It would be fine with me if my install procedure started by copying the base rpms somewhere, or if the NFS-shared directory where the downloaded isos live (as specified during the install) were saved.
On Friday 09 September 2005 14:18, Bryan J. Smith wrote:
I don't think you're realizing what you're suggesting.
Yes, I do. I've suggested something like this before, and there has been some work on it (see Fedora lists archives from nearly a year or more ago).
Who is going to handle the load of the delta assembly?
The update generation process. Instead of building just an RPM, the buildsystem builds the delta package to push to the package server.
But the on-line, real-time, end-user assembly during "check-out" is going to turn even a high-end server into a big-@$$ door-stop (because it's not able to do much else) with just a few users checking things out!
Do benchmarks on a working system of this type, then come back to me about the unbearable server load.
Do you understand this?
Do you understand how annoyingly arrogant you sound? I am not a child, Bryan.
Not true! Not true at all! You're talking GBs of transactions _per_user_.
I fail to see how a small update of a few files (none of which approach 1GB in size!) can produce multiple GBs of transactions per user. You seem not to understand how simple this system could be, nor do you seem willing to even try to understand it past your own preconceived notions.
You're going to introduce:
- Massive overhead
In your opinion.
- Greatly increased "resolution time" (even before considering the server responsiveness)
- Many other issues that will make it "unusable" from the standpoint of end-users
All in your opinion.
You can_not_ do this on an Internet server. At most, you can do it locally with NFS with GbE connections so the clients themselves off-load a lot of the overhead. That's not feasible over the Internet, so that falls back on the Internet server.
How in the world would sending an RPM down the 'net built from a delta use more bandwidth than sending that same file as is sent now? Being that HTTP is probably the transport for EITHER.
As I mentioned before, not my Internet server! ;->
That is your choice, and your opinion.
In any case, it's a crapload more overhead than merely serving out files via HTTP. You're going to reduce your ability to service users by an order of magnitude, if not 2!
Have you even bothered to analyze this in an orderly fashion, instead of flying off the handle like Chicken Little? Calm down, Bryan.
With forethought to those things that can be prebuilt versus those things that have to be generated realtime, the amount of realtime generation can be minimized, I think.
That's the key right there -- you think.
Against your opinion, because neither of us has empirical data on this.
Again, keep in mind that repositories merely serve out files via HTTP today. Now you're adding in 10-100x the overhead. You're sending data back and forth, back and forth, back and forth, between the I/O, memory, CPU, etc... Just 1 single operation is going to choke most servers that can service 10-100 HTTP users.
And this is balanced to the existing rsync-driven mirroring that is doing multiple gigabytes worth of traffic. If the size of the files being rsync'd is reduced by a sufficient percentage, wouldn't that lighten that portion of the load? Have you worked the numbers for a balance? I know that if I were contracting with you on any of my upcoming multi-terabyte-per-day radio astronomy research projects, and you started talking to me this way, you'd be looking for another client.
No, you're talking about facilities that go beyond what rsync does. You're not just doing simple file differences between one system and another. You're talking about _multiple_ steps through _multiple_ deltas and lineage.
If you have, say, ten updates, you apply the ten update deltas in sequence and send the result down the pike. Is applying a delta to a binary file that is a few kilobytes in length that stressful? What single binary in a typical CentOS installation is over a few megs?
There's a huge difference between traversing extensive delta files and just an rsync delta between existing copies. ;->
Yes, there is. The rsync delta is bidirectional traffic.
I only referenced CVS because someone else made the analogy.
You're not even paying enough attention to know who said what; why should I listen to a rant about something you have no empirical data to back?
I made an analogy to CVS, and I really think things could be made more bandwidth and storage efficient for mirrors, master repositories, and endusers without imposing an undue CPU load at the mirror. Feel free to disagree with me, but at least keep it civil, and without insulting my intelligence.
So yes, I know CVS stores binaries whole. That aside, the XDelta is _still_ going to cause a sizeable amount of overhead.
How much? Why not try it (read the Fedora lists archives for some folks who have indeed tried it).
Far more than Rsync.
You think.
I know. I was already thinking ahead, but since the original poster doesn't even understand how delta'ing works, I didn't want to burden him with further understanding.
Oh, just get off the arrogance here, please. You are not the only genius out here, and you don't have anything to prove with me. I am not impressed with resumes, or even with an IEEE e-mail address. Good attitude beats brilliance any day of the week.
CVS per se would be horribly inefficient for this purpose.
Delta'ing _period_ is horribly inefficient for this purpose. In fact, storing the revisions whole would actually be _faster_ than reverse deltas of _huge_ binary files.
But then there's still the many GB for the mirror. There are only two reasons to do deltas, in my opinion: 1.) Reduce mirror storage space. 2.) Reduce bandwidth required to mirror, and/or reduce bandwidth to the enduser (which I didn't address in this, but could be addressed, even though it is far more complicated to send deltas straight to the user).
You're talking about cpio operations _en_masse_ on a server! Have you ever done just a few smbtar operations from a server before? Do you _know_ what happens to your I/O?
_That's_ what I'm talking about.
It once again depends on the process used. A streaming process could be used that would not impact I/O as badly as you state (although you first said it would kill my CPU, not my I/O). But tests and development would have to be done.
Again, the tradeoff is between the storage and bandwidth required at the mirrors to processing. Of course, if the mirror server is only going to serve http, doing it in the client isn't good.
Store prebuilt headers if needed.
As far as I'm concerned, that's the _only_ thing you should _ever_ delta. I don't relish the idea of a repository of delta'd cpio archives. It's just ludicrous to me -- and even more so over the Internet.
So you think I'm stupid for suggesting it. (That's how it comes across). Ok, I can deal with that.
*PLONK*
Lamar Owen wrote:
On Friday 09 September 2005 14:18, Bryan J. Smith wrote:
I don't think you're realizing what you're suggesting.
Yes, I do. I've suggested something like this before, and there has been some work on it (see Fedora lists archives from nearly a year or more ago).
Who is going to handle the load of the delta assembly?
The update generation process. Instead of building just an RPM, the buildsystem builds the delta package to push to the package server.
I recall you mentioning something which caused me to think you meant this, and I pointed out to you that this requires a degree of co-operation between developers which is difficult to maintain in a single organization, like a corporation, let alone between developers who have nothing to do with each other except that they happen to produce software which runs on the same platform.
But the on-line, real-time, end-user assembly during "check-out" is going to turn even a high-end server into a big-@$$ door-stop (because it's not able to do much else) with just a few users checking things out!
Do benchmarks on a working system of this type, then come back to me about the unbearable server load.
Yes, if things were managed as a total package, it would be fast and easy to check out from it. But they aren't, and doing so in the present circumstances is infeasible, for the reasons I pointed out before.
Do you understand this?
Do you understand how annoyingly arrogant you sound? I am not a child, Bryan.
You aren't a child, but you are naive. You seem intelligent, but, I hope you won't get offended, as I mean no offence, you are very ignorant. Remember, ignorance, like epoxy, can be cured. There is no cure for stupidity. And I don't think that you are stupid.
When I first got involved with the kinds of things you want to try, I also thought that the problems would be trivial. I found out, and very quickly, that they were not trivial.
Not true! Not true at all! You're talking GBs of transactions _per_user_.
I fail to see how a small update of a few files (none of which approach 1GB in size!) can produce multiple GBs of transactions per user. You seem not to understand how simple this system could be, nor do you seem willing to even try to understand it past your own preconceived notions.
They wouldn't, at checkout, if everything were managed as a single, released package.
But as I pointed out, that is difficult with independent open-source programs developed over the web.
You're going to introduce:
- Massive overhead
In your opinion.
- Greatly increased "resolution time" (even before considering the server responsiveness)
- Many other issues that will make it "unusable" from the standpoint of end-users
All in your opinion.
Not just in his opinion. That work must be done in order to accomplish what you want. It can be done at check-out time (as Bryan is suggesting) or it can be done at build time (as you are suggesting). You fail to see that it is infeasible to do at build time, because there is no such thing as a single object with an associated state; there is just a web site and a glorified wget to pull it. And, as Bryan points out, it is infeasible to do at check-out time because enormous file diffs would have to be done OVER THE WEB.
In order to do it at build time (as you suggest) one would need a QA team managing release points of a whole package which contained everything which was being released to Linux. And this isn't going to happen when everyone in the world is a potential developer of variants of open-source programs.
[snip]
Have you even bothered to analyze this in an orderly fashion, instead of flying off the handle like Chicken Little? Calm down, Bryan.
I agree that Bryan is a little agitated. Frankly, I'm finding it a little bit difficult to retain complete equanimity, myself.
It seems that you haven't maintained a repository, but you insist that you know more than those who have. Then complain that others are being arrogant.
Isn't that getting a little close to the pot and the kettle?
[snip]
Oh, just get off the arrogance here, please. You are not the only genius out here, and you don't have anything to prove with me. I am not impressed with resumes, or even with an IEEE e-mail address. Good attitude beats brilliance any day of the week.
I'd rather have an arrogant, competent bastard running my repositories, than a nice well-mannered incompetent any day. I speak from experience, having been in both circumstances.
[snip]
Store prebuilt headers if needed.
As far as I'm concerned, that's the _only_ thing you should _ever_ delta. I don't relish the idea of a repository of delta'd cpio archives. It's just ludicrous to me -- and even more so over the Internet.
So you think I'm stupid for suggesting it. (That's how it comes across). Ok, I can deal with that.
*PLONK*
Ah, so you kill-filed one of your best hopes of actually coming to grips with the subject you're discussing.
I really, really suggest that you build your own repository, and use it to pull from. That way, you'll have repeatable (perhaps not consistent) updates using yum. And you'll also learn a little bit of why what you want to do is not as easy as it seems to you today.
Using yum to manage staged releases makes as much sense as using Internet Explorer to do so. Unfortunately, it's difficult to explain to one who has no experience with it why that is so. It just isn't the tool for the job. It's just a transport and install mechanism, with some control files to tell it what to pull. It's nice and all, and no criticism of the developers of yum. It just isn't designed, intended, or capable of doing what you want. It's a wonderful tool for what it *is* designed to do, which is manage installs.
Mike
Lamar Owen wrote:
Do you understand how annoyingly arrogant you sound? I am not a child, Bryan.
There is a fine line between confidence and cockiness. I try hard to avoid crossing it, but my experience is all I have to go on.
I've used revision control systems for binary file management -- again, in the CAM world as well as EDA. The problem is the sheer data involved. It's much, much easier to serve things whole than to "ripple" deltas when dynamically "re-assembling" a binary -- even if it's just part of it.
Mike McCarty mike.mccarty@sbcglobal.net wrote:
You aren't a child, but you are naive. You seem intelligent, but, I hope you won't get offended, as I mean no offence, you are very ignorant. Remember, ignorance, like epoxy, can be cured. There is no cure for stupidity. And I don't think that you are stupid.
Well, I'm trying to avoid the "ignorant" word, although I have used it in the past. It's the one word that makes it too easy to transition to "cockiness" when you're just trying to "confidently" share experience.
In the end, I really just need to avoid these threads. If I have a series of points to make, I'll do it on my blog. After all, 9 times out of 10, I'm trying to share my first-hand experience against second or third-hand.
I'd rather have an arrogant, competent bastard running my repositories, than a nice well-mannered incompetent any day.
Dude, you wouldn't know how many times I get work because of these types of threads on lists! ;->
I've literally been in a "debate" where I'm the "sole minority" and I've had someone call me on the phone and say ... "Thank God! Someone on the list who knows what works best and, more importantly, why! Are you available?"
Especially when it comes to LAN file/database servers, as well as configuration management. Especially for binary CAM/EDA files and other high-overhead, large file operations.
I speak from experience, having been in both circumstances.
Ditto.
On Fri, 2005-09-09 at 16:56, Bryan J. Smith wrote:
I've used revision control systems for binary file management -- again, in the CAM world as well as EDA. The problem is the sheer data involved.
There are lots of ways to do things wrong. The fact that you have experience with some of them does not mean that it is impossible to get it right.
It's much, much easier to serve things whole than to "ripple" deltas when dynamically "re-assembling" a binary -- even if it's just part of it.
That is obviously an overgeneralization and incorrect in any number of cases. Change a couple of bytes in an iso image on the other end of a dialup line. Would you rather let rsync find and send the difference or wait for the whole thing? A pre-built rdiff delta could give you that even faster.
Les Mikesell lesmikesell@gmail.com wrote:
That is obviously an overgeneralization and incorrect in any number of cases. Change a couple of bytes in an iso image on the other end of a dialup line. Would you rather let rsync find and send the difference or wait for the whole thing? A pre-built rdiff delta could give you that even faster.
Let me say that word again ... R-I-P-P-L-E
What that means is that you have not 1, but *N* diffs. Maybe I can explain it in another way.
If you do an incremental backup, is there a difference from doing a backup from the last full backup and the previous incremental? Of course!
If you do an incremental from the last full, you only need the last full backup and that incremental. But if you do an incremental from the previous incremental, then you need not only the full and the latest incremental, but every incremental in between.
Diffs are merely the difference between two files. Deltas are the differences between _multiple_ files.
And that's before we even consider the sheer size of these files. Text is a crapload easier to delta (just like compression) than binaries, especially when you're talking very, very large files that are starting to take significant chunks of your server's memory.
On Friday 09 September 2005 17:03, Mike McCarty wrote:
Lamar Owen wrote:
Do you understand how annoyingly arrogant you sound? I am not a child, Bryan.
You aren't a child, but you are naive. You seem intelligent, but, I hope you won't get offended, as I mean no offence, you are very ignorant. Remember, ignorance, like epoxy, can be cured. There is no cure for stupidity. And I don't think that you are stupid.
Well, let me just put it this way. I maintained the PostgreSQL RPMs, doing releases and such, for over five years. This is an issue I have thought long and hard about, since a full set of PostgreSQL RPMs is over 10MB. There were many times a bugfix would come from upstream, and it would be maybe a hundred bytes of compiled object code, touching maybe half a dozen object files. But then, due to the interdependencies, because of a HUNDRED BYTE CHANGE users had to pull down nearly TEN MEGABYTES of code. That, my friend, is what is ludicrous. There has got to be a better way.
The existing 'glorified wget' as you put it isn't really the greatest solution on earth; up2date and the RHN backend (with the XMLRPC stuff) is probably a better one for delivery of this kind of thing. An RHN-workalike backend in the form of current is available.
But the space and bandwidth issues will only get worse; when a full update to KDE comes down, it may be sparked by a change of a few kilobytes up to a megabyte of actual binary change, but then it requires hundreds of megabytes downloaded (and MIRRORED!) to fix. Or tens of bytes change in OpenOffice.org; now you're downloading hundreds of megabytes for a kilobyte binary change. That is ridiculous.
No, I am not ignorant of the issues, and I know far better than to think the issues are trivial; they are not. But they are not 'unfeasible' as Bryan put it, and just saying 'that's ludicrous, don't even bother!' is even more ridiculous.
I am not a novice; I would like to think I am not ignorant. I may be guilty of being optimistic in that I think the open source community can solve this problem. Are you going to tell the community that this is an unsolvable problem?
Lamar Owen wrote:
[snip]
I am not a novice; I would like to think I am not ignorant. I may be guilty of being optimistic in that I think the open source community can solve this problem. Are you going to tell the community that this is an unsolvable problem?
No, I am not. Already, several people, myself included, have given a way for you to accomplish what you seem to want.
Mike
On Friday 09 September 2005 20:19, Mike McCarty wrote:
Lamar Owen wrote:
solve this problem. Are you going to tell the community that this is an unsolvable problem?
No, I am not. Already, several people, myself included, have given a way for you to accomplish what you seem to want.
Well, first, Les and I are not the same person, and what Les wants and what I'd like to see are two different but related things. I believe that incremental updates (rpm-deltas) are desirable from a bandwidth and storage point of view, and highly desirable from a user point of view. They do present issues for repository operators and packagers, this is true.
But then Johnny mentions that the mirroring load is 50GB for the tree. This is a lot of data to move around, really.
Now, bandwidth doesn't scare me; we have one research project here that will be collecting 12TB of data per day (if it captures a full day at a time; currently not possible, but desirable). (The project involves a phased array, with the raw data being stored and rephased after collection; this is like being able to repoint a dish to an observation in the past for conventional radio telescopes.) This would require 2/5ths of an OC-48 to mirror; doable, yes, but not desirable or affordable. Drive space doesn't scare me, except cost; I got a quote on a petabyte-class storage array (it was 1.4PB and cost upwards of $3 million). CPU horsepower doesn't scare me, either, as I'm getting a MAPstation as part of a different research project (this box has an interesting interconnect called SNAP that scales to 64 MAP DEL processors and 32 host P4's on a crossbar type switch; you can do the research on google too). The MAPstation runs on Linux, FWIW. For the application (cross-correlation of interferometry data, 2 frequencies, 2 polarizations, and 2 antennas) a MAP processor will have the equivalent power of an 800GHz P4, but be clocked at only 200MHz, due to the massively parallel pipelining available with this kind of direct-execution-logic (non-Von Neumann) processor. But all of that is irrelevant.
What is relevant is that I have seen the end user's response to having to download multiple megabytes for a hundred byte or less change. While it doesn't bother me, it did bother my users (speaking of the PostgreSQL users I built and released packages for).
So the enduser potentially could reap the best benefit of a rpmdelta system. SuSE is or has been doing rpmdeltas for a year now, and I seem to recall that the results were pretty good.
Les wanted something similar to CVS functionality, where you can tag a repository as consistent at a certain branch (not necessarily by date, as you mentioned), and be able to consistently grab a set of packages.
I mentioned CVS worked on a diff principle, and that that might be an interesting way of doing it (all the while thinking about my PostgreSQL users). Maybe I confused the two issues; possible.
The dumb-client, glorified-webserver type system will be very difficult to make work this way, this is true. But who says we have to stick to a glorified wget? The key question is, cost-benefit-analysis-wise, is it worth the effort (both development and execution)? Maybe it is, maybe it isn't. But I do believe it is worth a try, if only to help the enduser (which could be a small server at a parish library, for instance.... :-)).
Lamar Owen wrote:
On Friday 09 September 2005 20:19, Mike McCarty wrote:
Lamar Owen wrote:
solve this problem. Are you going to tell the community that this is an unsolvable problem?
No, I am not. Already, several people, myself included, have given a way for you to accomplish what you seem to want.
Well, first, Les and I are not the same person, and what Les wants and what I'd like to see are two different but related things. I believe that
Oops. Sorry. I should have checked the attributions more carefully.
[snip]
What is relevant is that I have seen the end user's response to having to download multiple megabytes for a hundred byte or less change. While it doesn't bother me, it did bother my users (speaking of the PostgreSQL users I built and released packages for).
So the enduser potentially could reap the best benefit of a rpmdelta system. SuSE is or has been doing rpmdeltas for a year now, and I seem to recall that the results were pretty good.
Les wanted something similar to CVS functionality, where you can tag a repository as consistent at a certain branch (not necessarily by date, as you mentioned), and be able to consistently grab a set of packages.
Yes, that is what Les seems to want to happen, though he argues that he does not.
I mentioned CVS worked on a diff principle, and that that might be an interesting way of doing it (all the while thinking about my PostgreSQL users). Maybe I confused the two issues; possible.
The dumb-client, glorified-webserver type system will be very difficult to make work this way, this is true. But who says we have to stick to a glorified wget? The key question is, cost-benefit-analysis-wise, is it worth the effort (both development and execution)? Maybe it is, maybe it isn't. But I do believe it is worth a try, if only to help the enduser (which could be a small server at a parish library, for instance.... :-)).
AFAIK, no one has made the claim that yum is capable of doing the job of making named (some would say tagged) versions of release, nor has anyone suggested that we must remain with yum.
I have pointed out that, in my experience, the degree of cooperation needed to create a consistent unified release package is difficult to achieve in a single corporation, let alone given the way Linux is handled.
I agree that what you suggest would be worthwhile. That's why there are people who make money doing it. An example is the LynxOs corporation. Other names come to mind.
Mike
On Tue, 2005-09-13 at 16:25, Mike McCarty wrote:
Les wanted something similar to CVS functionality, where you can tag a repository as consistent at a certain branch (not necessarily by date, as you mentioned), and be able to consistently grab a set of packages.
Yes, that is what Les seems to want to happen, though he argues that he does not.
I want that functionality, but I was arguing that all it would take to get it is sequentially increasing timestamps on files being added to the repository and knowledge of the final timestamp of each consistent update set - and letting the yum client have that information to figure out the rest.
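Something close to that can be done with standard tools alone; here is a sketch (the cutoff date and paths are invented; GNU touch and find assumed):

  #!/bin/sh
  # Snapshot the repo as of a chosen 'known good' cutoff.
  touch -d "2005-09-01 00:00" /tmp/cutoff
  SNAP=/var/www/html/snap-20050901
  mkdir -p "$SNAP"
  cd /mirror/centos/4/updates/i386/RPMS || exit 1
  # Take only the packages that existed at the cutoff...
  find . -name '*.rpm' ! -newer /tmp/cutoff -exec cp {} "$SNAP"/ \;
  # ...and rebuild consistent meta-data for exactly that set.
  createrepo "$SNAP"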
I mentioned CVS worked on a diff principle, and that that might be an interesting way of doing it (all the while thinking about my PostgreSQL users). Maybe I confused the two issues; possible.
The dumb client glorified webserver type system will be very difficult to make work this we, this is true. But who says we have to stick to a glorified wget?
One thing that no one mentioned about CVS is that it always stores the full ready-to-go copy of the latest version and builds the diffs backwards to earlier versions on the assumption that you are most likely to want the most recent version. In a yum-ish adaptation of this you would want the diffs between each version to be available for the likely possibility that the client has the previous version and wants to go to the latest.
I have pointed out that, in my experience, the degree of cooperation needed to create a consistent unified release package is difficult to achieve in a single corporation, let alone given the way Linux is handled.
No one in this entire thread has had any complaints about the Centos repository management at the points when updates are completed. The only issues are when mirrors are only partly in sync and my wish for the ability to repeat an update regardless of subsequent additions to the repository.
Updating via binary diffs might be a good idea too, but it would need to be very different from CVS because the goal would be to minimize the traffic and make the client side do all the work.
PREFACE: I hope people have noted I've been staying out of this thread for a while. I'll really try to limit my comment to one here and leave it at that.
Les Mikesell lesmikesell@gmail.com wrote:
I want that functionality, but I was arguing that all it would take to get it is sequentially increasing timestamps on files being added to the repository and knowledge of the final timestamp of each consistent update set - and letting the yum client have that information to figure out the rest.
But the meta-data and its dependency tree _change_ for each point in time. What was the dependency tree after one createrepo run changes on the next run. That's the problem.
The only way to fix it currently is to have the YUM client access RPMs directly, instead of relying on the YUM repository's meta-data. Otherwise, there have to be some major changes at the repository level. I offered my suggestion, a simple "hack", in the meantime.
One thing that no one mentioned about CVS is that it always stores the full ready-to-go copy of the latest version and builds the diffs backwards to earlier versions on the assumption that you are most likely to want the most recent version.
Reverse deltas. Instead of taking the original revision and rippling deltas forward, you take the latest and ripple deltas backward.
Xdelta does this for binaries as well.
In a yum-ish adaptation of this you would want the diffs between each version to be available for the likely possibility that the client has the previous version and wants to go to the latest.
Again, that's what deltas are! The difference between each revision. Forward deltas start with the original. Reverse deltas start with the latest.
Forward deltas are like doing a full backup, and then doing incrementals upon incrementals. Each successive incremental requires all the previous ones to work. That's a PITA.
Reverse deltas don't solve the "ripple differences" problem, but they do minimize it. They typically cut the number of deltas required if people are pulling the last few revisions. That is typically the case in software.
If you're at revision 1.4 and you want version 1.7, the version control service of a forward delta must build all the way from 1.1 to 1.7 -- and ripple through 6 differences. In the reverse delta, it would only need to ripple 3 times -- from 1.7 back to 1.4.
*UNLESS* you aren't talking about deltas ... but *PATCHES*
Patches are _not_ Deltas. Patches are like doing a full backup and an incremental since the last full backup. So if you need to restore, you only need the latest incremental and last full. There is no "ripple." So you only need *1* file for an update.
So what's the catch? Space!
Instead of a set of deltas (be they forward or reverse) in a single file, well minimized, you now maintain _separate_ patch files. In the case above, 1.1 to 1.7, you'll need to maintain _all_ permutations. That's 6 + 5 + 4 + 3 + 2 + 1 = 21 patches!
So while you drastically reduce the ripple load on the server, you increase the storage. Catch-22.
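The arithmetic generalizes: with N revisions, covering every older-to-newer jump takes N*(N-1)/2 patch files. A trivial sketch:

  #!/bin/sh
  N=7                           # revisions 1.1 through 1.7
  echo $(( N * (N - 1) / 2 ))   # prints 21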
Updating via binary diffs might be a good idea too, but it would need to be very different from CVS because the goal would be to minimize the traffic and make the client side do all the work.
You can_not_ do deltas without the _original_ delta files. So you would have to transfer the _entire_ delta file to the client, which is _larger_ than just the RPM. ;->
That's the impossibility I'm talking about! ;->
The only way is by maintaining patches on the server. That removes the overhead of run-time generation of differences via a "ripple delta" because the patches are only generated once. But that then _bloats_ the server storage.
Again, I don't think you understand how deltas work. ;->
"Bryan J. Smith" b.j.smith@ieee.org wrote:
*UNLESS* you aren't talking about deltas ... but *PATCHES* Patches are _not_ Deltas ... So what's the catch? Space! So while you drastically reduce the ripple load on the server, you increase the storage. Catch-22. ... You can_not_ do deltas without the _original_ delta files. So you would have to transfer the _entire_ delta file to the client, which is _larger_ than just the RPM. ;-> That's the impossibility I'm talking about! ;-> The only way is by maintaining patches on the server. That removes the overhead of run-time generation of differences via a "ripple delta" because the patches are only generated once.
Here is how an "ideal" 3-tier Delta-Patch-Client approach could work.
- Master Repos: Maintain full deltas of all packages
- Server Repos: Also maintain full deltas of all packages, and generate patches of all package permutations
- Clients: Download patches
The Master Repos only push delta changes to Server Repos. The Clients only pull patches from Server Repos.
The Master Repos _only_ need to ripple through deltas when Server Repos request updates. If server repos do this frequently enough, this should only be 1 delta. In fact, the Master Repo should "cache" the "last patch" (HEAD - 1 rev) for the Server Repos.
The Server Repos actually serve the clients. They generate all necessary revision permutations. I.e., if 1.7 has just been downloaded at the server repo, it needs to generate 6 patches (1.1 -> 1.7, 1.2 -> 1.7 ... 1.6 -> 1.7) -- but it _only_ does that once. It then keeps the patches for the clients to use.
This is the _most_efficient_ way to do both Master Repo to Server Repo transfers and Server Repo to Client transfers. But because the Master Repos are not serving clients, the "ripple delta" overhead is virtually eliminated for the Master Repos (and each caches the last delta, as most Server Repos will typically be near HEAD). It also _exponentially_ reduces the number of "ripple deltas" a Server Repo has to do -- as it only builds one "patch set", one time, for the clients.
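A sketch of that one-time patch-set generation at a Server Repo (xdelta 1.x syntax assumed; the file layout is invented; afterwards the patches are just static files served over HTTP):

  #!/bin/sh
  # Revision 1.7 of 'foo' just arrived; patch every older stored revision to it.
  NEW=foo-1.7.rpm
  for OLD in foo-1.1.rpm foo-1.2.rpm foo-1.3.rpm foo-1.4.rpm foo-1.5.rpm foo-1.6.rpm
  do
      xdelta delta "$OLD" "$NEW" "patches/${OLD%.rpm}_to_1.7.patch"
  done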
BUT WHAT THIS DOES _NOT_ DO IS ADDRESS THE "DEPENDENCY" META-DATA ISSUE.
Because the only way to address the "dependency" meta-data issue is to maintain _all_ delta changes. That means -- yet again -- a version control repository at the client itself. So you're back to mirroring repositories (although the delta approach _does_ reduce the amount necessary to mirror).
So we _still_ need a "bloated" meta-data format at the Server Repo so the clients can figure out dependencies without first having to download the patch. It's no different than full RPMs, except the patches are smaller than full RPMs. But you still don't want clients downloading one patch, checking it only to discover they need another patch for another package, etc...
So, again, while it solves the traffic issue, it does _not_ solve the larger issue of "give me all changes through date X" when the current date is Y. That's why you can_not_ avoid having to maintain your own, _internal_ repository. Only with the _full_ repository can you do this _internally_.
On Tue, 2005-09-13 at 18:03, Bryan J. Smith wrote:
I want that functionality, but I was arguing that all it would take to get it is sequentially increasing timestamps on files being added to the repository and knowledge of the final timestamp of each consistent update set - and letting the yum client have that information to figure out the rest.
But the meta-data and its dependency tree _changes_ for each point in time. What was the dependency tree after one createrepo run changes on the next run. That's the problem.
Yes, there is no argument that yum would have to change. However it could be a small change.
The only way to fix it currently is to have the YUM client access RPMs directly, instead of relying on the YUM repository's meta-data.
Yum does its dependency computations on the client side based on the contents of the .hdr files (otherwise it wouldn't work when combining the contents of different repositories). It needs the .hdr files, not the RPMS. There is some magic in the repo metadata that makes the client only download the latest .hdr files but if you update often you end up with them all anyway and use only the latest. The needed change is that if you specify a point-in-time the client should toss/ignore .hdr files past that and get downrevs if available. Note that you could do this yourself with nothing but an ftp view of the repository and you'll see the client could do it directly, although I agree that repository support could make it easier.
Otherwise, there has to be some major changes at the repository-level.
I'd call it a minor change to expose an option to get backrev .hdr files when wanted.
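For what it's worth, the client-side half of that option could look something like this sketch -- the (name, timestamp, path) tuple layout is invented for illustration; it is not yum's actual data structure:

def headers_as_of(hdr_files, cutoff):
    """Keep, per package, the newest .hdr file whose timestamp is not
    past the requested point in time; toss everything newer."""
    latest = {}
    for name, stamp, path in hdr_files:
        if stamp > cutoff:
            continue                 # past the cutoff: ignore this .hdr
        if name not in latest or stamp > latest[name][0]:
            latest[name] = (stamp, path)
    return dict((name, sp[1]) for name, sp in latest.items())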
One thing that no one mentioned about CVS is that it always stores the full ready-to-go copy of the latest version and builds the diffs backwards to earlier versions on the assumption that you are most likely to want the most recent version.
Reverse deltas. Instead of taking the original revision and rippling deltas forward, you take the latest and ripple deltas backward.
Yes, but what you really want to do is give the client the least it needs to make what it has into what it wants. You are always going to be going forward, and clients that update regularly will always need only the diff between the current and last prior RPM.
Reverse deltas don't solve the "ripple differences" problem, but they do minimize it. They typically cut the number of deltas required if people are pulling the last few revisions. That is typically the case in software.
If you're at revision 1.4 and you want version 1.7, a forward-delta version control system must build all the way from 1.1 to 1.7 -- and ripple through 6 differences. With reverse deltas, it would only need to ripple 3 times -- from 1.7 back to 1.4.
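The arithmetic of that example, as a quick sketch (assuming exactly one delta per revision step):

revs = ["1.1", "1.2", "1.3", "1.4", "1.5", "1.6", "1.7"]

def forward_ripples(revs, want):
    # forward deltas rebuild from the first revision up to the target
    return revs.index(want)

def reverse_ripples(revs, have):
    # reverse deltas walk back from HEAD to what the client already has
    return len(revs) - 1 - revs.index(have)

print(forward_ripples(revs, "1.7"))   # 6 ripples: 1.1 forward to 1.7
print(reverse_ripples(revs, "1.4"))   # 3 ripples: 1.7 back to 1.4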
*UNLESS* you aren't talking about deltas ... but *PATCHES*
If you work only 2 revs at a time there is no difference.
Patches are _not_ Deltas. Patches are like doing a full backup and an incremental since the last full backup. So if you need to restore, you only need the latest incremental and last full. There is no "ripple." So you only need *1* file for an update.
Yes, one file for the difference between any two revs which is almost always what you want - or you should be updating more often. If you need to repeat the process with multiple steps, the client can easily calculate whether it is better to collect multiple deltas and apply them or just grab the complete version it wants.
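That client-side calculation is trivial -- a sketch, assuming the repository published the sizes (which, to be clear, nothing does today):

def cheapest_fetch(delta_sizes, full_size):
    """delta_sizes: bytes of each delta in the chain from the installed
    rev to the wanted rev; full_size: bytes of the complete package."""
    chain = sum(delta_sizes)
    if chain < full_size:
        return ("deltas", chain)
    return ("full", full_size)

print(cheapest_fetch([40, 55, 60], 300))   # ('deltas', 155)
print(cheapest_fetch([150, 180], 300))     # ('full', 300)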
So what's the catch? Space!
So be sensible about what you keep around and make the client fall back to existing procedure if the delta it might use isn't there.
The only way is by maintaining patches on the server. That removes the overhead of run-time generation of differences via a "ripple delta" because the patches are only generated once. But that then _bloats_ the server storage.
Keep only 1 or 2 delta/patch files for the latest revs where the traffic will actually be happening and thus reduced. In the unlikely event you want something else, use the existing procedure.
Again, I don't think you understand how deltas work. ;->
I didn't realize that you wouldn't call them deltas unless you cram more than one in the same file. Do you call the first one a patch, then change the name when you append the next run? The piece everyone will want is current-1 -> current, so the most benefit would come from keeping that in its own file.
On Tue, 2005-09-13 at 21:21 -0500, Les Mikesell wrote:
Yes, there is no argument that yum would have to change. However it could be a small change.
Ah ... no, it's _more_ than a "small change." Simply timestamping and tracking timestamps will _not_ do what you want.
Yes, but what you really want to do is give the client the least it needs to make what it has into what it wants. You are always going to be going forward, and clients that update regularly will always need only the diff between the current and last prior RPM.
But that adds to the start-up time. The more meta-data you generate to do what you want, the longer this stuff will take on the client side.
If you work only 2 revs at a time there is no difference.
Can you guarantee this?
Yes, one file for the difference between any two revs which is almost always what you want - or you should be updating more often. If you need to repeat the process with multiple steps, the client can easily calculate whether it is better to collect multiple deltas and apply them or just grab the complete version it wants.
Again, this is not a "small change" like you think it is.
So be sensible about what you keep around and make the client fall back to existing procedure if the delta it might use isn't there.
Again, this is not a "small change." And you're going to start introducing loads at the server if you try to keep transfer sizes down.
Keep only 1 or 2 delta/patch files for the latest revs where the traffic will actually be happening and thus reduced. In the unlikely event you want something else, use the existing procedure.
More subjective approaches. What happens if the repositories don't do what you want? Better yet, how can you ensure all repositories are so synchronized?
I mean, I saw complaints about repositories not offering the same packages. It will only get worse when you start timestamping, using deltas, etc...
I didn't realize that you wouldn't call them deltas unless you cram more than one in the same file. Do you call the first one a patch, then change the name when you append the next run? The piece everyone will want is currrent-1->current so the most benefit would come from keeping that in it's own file.
Are you so sure? Sometimes there is more than one update. In fact, this still does not solve the problem that repositories change every few days, nor does it give you a way to timestamp and resolve all those changes.
I think people are asking for the "holy grail" here and calling it a "small change" without thinking through the real issues. A lot of what I see above is _not_ "tied down" to a methodical approach; it is more "subjective," and it makes the chances of inconsistency even worse.
--- "Bryan J. Smith" b.j.smith@ieee.org wrote:
[snip]
sir_funzone pulls out the shotgun and aims, calls the word pull, seeing the thread Re: [CentOS] Why is yum not liked by some? in his gunsight. Pulls the trigger and hears the loud pop of the shotgun going off, hoping the slug will hit its target. Whamo, target sought and hopefully destroyed with exploding force so that it is never seen again..... waiting to see if that little rascal has survived, sir_funzone reloads his shotgun and points it toward the direction of the thread Re: [CentOS] Why is yum not liked by some? hoping not to see it come alive again...
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."
[snip]
So discussions about how to manage a farm of centos boxes should be done elsewhere?
Feizhou feizhou@graffiti.net wrote:
So discussions about how to manage a farm of centos boxes should be done elsewhere?
I am just sick of being the "coincidental person" who is quoted. People can argue it is volume -- but I _ignore_ the thread for 3 days, and then make 3 posts, and I'm quoted right afterwards!
Sometimes I think it's the technical detail that I put in my posts that pisses people off more than the volume. This was a perfect example. I do my _damnedest_ to try to detail _real_ technical _solutions_ and not a "wish list rant" for 5+ days.
If other people are ranting about what they want, and I come in and list specific, detailed technical information, possible solutions, how they will and won't work, why oh why am I held up as an example? I'm sorry, the "it's because you think you're smarter than anyone else" argument does _not_ hold.
I have tried to provide possible solutions to people, not to sport my "knowledge," but to try to get them to be more productive. I'm sorry I even re-entered the thread. But it was just as "noisy" for 3 days when I let it be.
On Wed, 2005-09-14 at 06:00 -0700, Bryan J. Smith wrote:
Sometimes I think it's the technical detail that I put in my posts that pisses people off more than the volume. This was a perfect example. I do my _damnedest_ to try to detail _real_ technical _solutions_ and not a "wish list rant" for 5+ days.
No, I don't think so. Technical detail is just that. Technical detail. It's the paragraphs of annotation explaining why your technical detail is correct that get people thinking you're flaunting your knowledge.
If other people are ranting about what they want, and I come in and list specific, detailed technical information, possible solutions, how they will and won't work, why oh why am I held up as an example? I'm sorry, the "it's because you think you're smarter than anyone else" argument does _not_ hold.
Of course not. You provide accurate and helpful information. It's just that the way you present it, it's hard not to draw the conclusion that you have secondary, albeit strange, motives for being so thorough in explaining why it is you know all that you know.
I have tried to provide possible solutions to people, not to sport my "knowledge," but to try to get them to be more productive. I'm sorry I even re-entered the thread. But it was just as "noisy" for 3 days when I let it be.
Of course it was. But this isn't the first time the topic of you "sporting your knowledge" has come up. And I would ask: after a certain period of time, wouldn't you see this and realize that it's not some grand conspiracy or a coincidence? There is something real happening. And it's not EVERY PERSON on this list that is wrong. Something to think about, at least. If I walked into a room and annoyed almost every person in the room, I wouldn't think they were jerks out to get me; I would wonder what I said or how I said it.
Just some helpful advice.
Preston
Preston Crawford me@prestoncrawford.com wrote:
Of course it was. But this isn't the first time the topic of you "sporting your knowledge" has come up.
*WHEN* did I "sport my knowledge" *1* time in this *ENTIRE* thread! God, I think the only things I mentioned were:
1. I always run my own, internal repositories for clients
2. You have to have run your own repositories to understand the limits of YUM, or understand how deltas work to know they don't solve the problem for clients (only mirrors)
And it's not EVERY PERSON on this list that is wrong.
And I have _explicitly_ said "Select People" over and over. I have explicitly said I don't care if it's "subconscious" or "deliberate," I want it to _stop_ from "Select People." Assumptions that things are due to one person, even if subconscious and not intentional, have a nasty habit of becoming habit. ;->
Furthermore, several _other_ people affirmed #1 and #2 above in addition to myself. So I'm not the only one who said basically the same thing (even if I might have been the first).
Something to think about, at least. If I walked into a room and annoyed almost every person in the room
Now _you_ are saying and asserting "annoyed almost every person!" Stop stating what you want me to have said, not what I actually said! God, are you purposely trying to _drive_ the so-called "paranoia"?
I wouldn't think they were jerks out to get me; I would wonder what I said or how I said it.
Actually, I'm _not_ the one calling you guys "jerks" or otherwise. I said I'm just tired of what I _see_ "select people" do on-list that is _only_ directed towards me and _no_one_else_! That's the reality, what you "select people" do!
The hypocrisy is in the fact that you, among a handful of others, are psycho-analyzing me, and contributing to the "indirect paranoia" in the first place. Stop and it will end. Continue and you wonder why I want to prevent people from constantly attributing _everything_ to myself and myself only.
I've had other, very illegal comments/deeds attributed to myself on other lists because there quickly becomes the default of "oh, that was Bryan Smith, he's always the troublemaker and capable of anything."
Just some helpful advice.
Bull, and you know it. The hypocrisy of your agenda in this message and the prior message has _you_ putting things in more "absolutes" than anyone else!
Again, re-read your comments on: - "sporting your knowledge" - "If I walked into a room and annoyed almost every person in the room"
You are _purposely_ trying to feed it right there! And all I'm saying is ... "Select people ... dudes! Please stop!" ;->
After that, I'm _still_ saying _only_ select people. And I'm merely saying I'm "tired of the coincidences, even if they are not deliberate but subconscious."
But here's the deal. It's clear that you do not like the fact that I only "temporarily" stop posting. So I'll now make it permanent, for whatever benefit all you "select people" think it is for.
On Wed, 2005-09-14 at 09:38 -0700, Bryan J. Smith wrote:
Preston Crawford me@prestoncrawford.com wrote:
Of course it was. But this isn't the first time the topic of you "sporting your knowledge" has come up.
*WHEN* did I "sport my knowledge" *1* time in this *ENTIRE* thread! God, I think the only things I mentioned were:
I always run my own, internal repositories for clients
You have to have run your own repositories to understand
the limits of YUM, or understand how deltas work to know they don't solve the problem for clients (only mirrors)
I didn't say you did, Bryan. To be honest, most of the time I ignore your posts, because they grate on me. I simply said this isn't the first time this topic has come up. I was trying to point out that if multiple people think you're saying things in the same manner, then is EVERYONE wrong? That's all I was saying. That this issue has come up before.
And it's not EVERY PERSON on this list that is wrong.
And I have _explicitly_ said "Select People" over and over. I have explicitly said I don't care if it's "subconscious" or "deliberate," I want it to _stop_ from "Select People." Assumptions that things are due to one person, even if subconscious and not intentional, have a nasty habit of becoming habit. ;->
?
Something to think about, at least. If I walked into a room and annoyed almost every person in the room
Now _you_ are saying and asserting "annoyed almost every person!" Stop stating what you want me to have said, not what I actually said! God, are you purposely trying to _drive_ the so-called "paranoia"?
See, there you go again. It's someone else's fault. You're being "driven" to paranoia. Here's a clue for you. You're already in paranoia-land, my friend. Step 1 is realizing you're there. Step 2 is walking away, taking a deep breath and not taking this so seriously. I was obviously overreaching in my example. It was an example. A fictitious example to point out that if there were a large number of people who said almost the exact same thing about me, at a certain point I would realize it's me and that it's real. It's just common sense.
I wouldn't think they were jerks out to get me; I would wonder what I said or how I said it.
Actually, I'm _not_ the one calling you guys "jerks" or otherwise. I said I'm just tired of what I _see_ "select people" do on-list that is _only_ directed towards me and _no_one_else_! That's the reality, what you "select people" do!
I never said you used the exact word "jerks". I was, once again, positing a hypothetical in which an individual annoys a bunch of people and then thinks they're the problem.
The hypocrisy is in the fact that you, among a handful of others, are psycho-analyzing me, and contributing to the "indirect paranoia" in the first place. Stop and it will
Yes. We're forcing you to be paranoid, Bryan. I'm one of them now? I'm to blame? Great.
end. Continue and you wonder why I want to prevent people from constantly attributing _everything_ to myself and myself only.
No. It will never end. Because eventually you will say something that will grate on people and they will respond and then you will say that we're ganging up on you and then eventually that we're making you paranoid and then... rinse, repeat. It will never end until you chill out.
I've had other, very illegal comments/deeds attributed to myself on other lists because there quickly becomes the default of "oh, that was Bryan Smith, he's always the troublemaker and capable of anything."
I have no idea what you're talking about, but once again, if it's happened before, I might consider that it's not a coincidence.
Just some helpful advice.
Bull, and you know it. The hypocrisy of your agenda in this message and the prior message has _you_ putting things in more "absolutes" than anyone else!
No. I am trying to be helpful. I want to see you not piss off what you call "the select few" on this list. And I want this list to be conducted in a professional manner. I am trying to be helpful.
Again, re-read your comments on:
- "sporting your knowledge"
That was me quoting you, Bryan.
- "If I walked into a room and annoyed almost every person in
the room"
That was a hypothetical. Trying to illustrate how I might react if people reacted to me as they seemingly often react to you.
You are _purposely_ trying to feed it right there! And all I'm saying is ... "Select people ... dudes! Please stop!" ;->
No, I'm not purposely trying to feed it. You're looking for things to pick at.
After that, I'm _still_ saying _only_ select people. And I'm merely saying I'm "tired of the coincidences, even if they are not deliberate but subconscious."
But here's the deal. It's clear that you do not like the fact that I only "temporarily" stop posting. So I'll now make it permanent, for whatever benefit all you "select people" think it is for.
I don't care if you post or don't post. All I care is that this list not be clogged by arguing about whether or not there is a cabal of "select people" trying to "make you paranoid". It's ridiculous. Can't you see this?
Preston
can this pissing contest be taken offlist please?
Preston Crawford wrote:
[snip]
--- Feizhou feizhou@graffiti.net wrote:
[snip]
So discussions about how to manage a farm of centos boxes should be done elsewhere?
no! but when a subject has died and then resurrected and died again and then comes back....don't you think it has been beaten to death, especially when the same stuff has been said all three times... that is what i am saying... it's like the nails on the chalkboard: it's fine and funny the first time or maybe the second, but when it comes down to repeating it over and over for 10 to 20 hours straight, it gets a little annoying. Don't you think? hmmm!!!
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."
no! but when a subject has died and then resurrected and died again and then comes back....don't you think it has been beaten to death, especially when the same stuff has been said all three times... that is what i am saying... it's like the nails on the chalkboard: it's fine and funny the first time or maybe the second, but when it comes down to repeating it over and over for 10 to 20 hours straight, it gets a little annoying. Don't you think? hmmm!!!
I didn't know the thread had died. Especially since one of the principal contributors, Les, is not finished yet.
Feizhou wrote:
[snip]
I didn't know the thread had died. Especially since one of the principal contributors, Les, is not finished yet.
He said that the *subject* has died, then resurrected.
The thread itself is, seemingly, immortal.
Mike
Mike McCarty wrote:
Feizhou wrote:
[snip]
I didn't know the thread had died. Especially since one of the principal contributors, Les, is not finished yet.
He said that the *subject* has died, then resurrected.
nitpick...he aimed his shotgun at the *thread* and then he comes back and says subject?
--- Feizhou feizhou@graffiti.net wrote:
[snip]
nitpick...he aimed his shotgun at the *thread* and then he comes back and says subject?
hmmm. i thought you can say subject=thread and thread=subject, where the two can walk hand in hand down the street and basically mean the same. Unless you are just nitpicking to start another flame war on the list. if that is your motive i am not biting, sorry to burst your bubble!
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."
doh, u did [delete]
Steven Vishoot wrote:
[snip]
Steven Vishoot quoted Bryan J. Smith today, instead of _anyone_else_ over the last 3 days (when Mr. Smith didn't bother responding):
[snip]
Thank you for quoting _my_ post even though I had _not_ responded to this thread for 3 days prior! And I _only_ responded 3 times yesterday -- 2 of them with an _explicit_ "real world" solution.
In other words, even if I _ignore_ a thread, these things _still_ continue. And most of them are meta-discussions among people who actually don't know how these things work.
But go ahead, make me the scapegoat. It just shows how unobjective people are; they like to focus on me -- maybe it's because I challenge people too much, because ... oh, I don't know ... I actually OFFER REAL-WORLD SOLUTIONS instead of pondering and hoping.
--- "Bryan J. Smith" b.j.smith@ieee.org wrote:
[snip]
hmmmmmmmm...... Did i mention Bryan in the post....if i read it again it says thread! unless i am going blind, deaf and dumb...please inform me if i am....
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."
On Tue, 2005-09-13 at 21:21 -0500, Les Mikesell wrote: [snip]
Yum does its dependency computations on the client side based on the contents of the .hdr files (otherwise it wouldn't work when combining the contents of different repositories). It needs the .hdr files, not the RPMS. There is some magic in the repo metadata that makes the client only download the latest .hdr files but if you update often you end up with them all anyway and use only the latest. The needed change is that if you specify a point-in-time the client should toss/ignore .hdr files past that and get downrevs if available. Note that you could do this yourself with nothing but an ftp view of the repository and you'll see the client could do it directly, although I agree that repository support could make it easier.
Yum doesn't do it the same way anymore.
The old way (a bunch of header files ... one for each package in the repo {yum up to 2.0.x}) has been replaced by a new way in yum > 2.1.x. This new way is with files called repomd.xml and primary.xml.gz (yum 2.1.x and 2.2.x) or primary.xml.gz.sqlite (yum 2.3.x and 2.4.x). Old header files are not really available.
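If you want to poke at the new format, repomd.xml is just a small index of the other metadata files. A rough sketch of reading it -- the namespace URI matches what createrepo emitted at the time, but treat it as an assumption if your version differs:

import xml.etree.ElementTree as ET

NS = "{http://linux.duke.edu/metadata/repo}"

def list_metadata(repomd_path):
    """Print each metadata file the repo advertises: its type, its
    location, and the timestamp of the createrepo run that wrote it."""
    root = ET.parse(repomd_path).getroot()
    for data in root.findall(NS + "data"):
        loc = data.find(NS + "location")
        stamp = data.find(NS + "timestamp")
        print(data.get("type"),
              loc.get("href") if loc is not None else "?",
              stamp.text if stamp is not None else "?")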
Using the date added to the mirror is not good. A copy with the wrong switches ... signing with a different key, etc. changes that (when the package is actually the same). Not to mention that we maintain several repos that get rebuilt at different times.
I'd call it a minor change to expose an option to get backrev .hdr files when wanted.
It is a major change ... the entire repo is looked at as a whole at rebuild time for the metadata, not as 10,000 packages but as one entity. Because of this fact (as Bryan has pointed out), you would need to keep older entire repo snapshots of the metadata to use to resolve your dependencies separately.
The more I look at this problem, the more I see that a local repo maintained by the local user is the right answer. It works right now, requires no changes, and lets you control EXACTLY what you want in your repo (including files from other places in a single repo).
You can freeze package xxxxx and its dependencies as you see fit, and add only tested packages to the repo. It is just the right way to do version control if you don't want to just use the version control that is published by the repo maintainer.
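A bare-bones sketch of that workflow, with made-up paths and a made-up package name (createrepo is invoked as the external command it is):

import os, shutil, subprocess

MIRROR = "/srv/mirror/centos/4/updates"    # hypothetical mirror path
LOCAL = "/srv/repos/internal"              # hypothetical local repo
APPROVED = ["bash-3.0-19.2.i386.rpm"]      # only packages you have tested

for rpm in APPROVED:
    shutil.copy(os.path.join(MIRROR, rpm), LOCAL)

# rebuild the metadata so clients resolve against exactly this package set
subprocess.call(["createrepo", LOCAL])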
Anyone know of a good IRCD and a text based IRC client (cuz I gotta login through SSH)?
I'm on centos.
thanks
centos-bounces@centos.org scribbled on Wednesday, September 14, 2005 7:22 AM:
Anyone know of a good IRCD and a text based IRC client (cuz I gotta login through SSH)?
I'm on centos.
thanks
It seems like BitchX was the console based client of choice back when irc was cool :) ircd was also the main daemon back then too. http://www.funet.fi/~irc/server/
Mike
Hi
centos-bounces@centos.org scribbled on Wednesday, September 14, 2005 7:22 AM:
Anyone know of a good IRCD and a text based IRC client (cuz I gotta login through SSH)?
Most people doing text based IRC use irssi in screen these days (as far as I can tell...):
It is in FC Extras (why it's not the default IRC text client I don't know) and DAG also packages it for RHEL4:
http://dag.wieers.com/packages/irssi/
Chris
Abilash Praveen M wrote:
Anyone know of a good IRCD and a text based IRC client (cuz I gotta login through SSH)?
ircd and irssi. And *do not* steal threads (by posting your question as a reply to an existing mail). Especially not *THIS* thread.
Mail your questions as "New Mail" to centos@centos.org.
Ralph
centos-bounces@centos.org wrote on 14.09.2005 14:22:24:
Anyone know of a good IRCD and a text based IRC client (cuz I gotta login through SSH)?
For IRCD I'd recommend unrealircd. Been running it on several platforms including CentOS for quite some time.
Regards, Harald
Text based client I would recommend IRSSI
On Wed, 14 Sep 2005, Abilash Praveen M wrote:
Anyone know of a good IRCD and a text based IRC client (cuz I gotta login through SSH)?
I'm on centos.
thanks
Johnny Hughes mailing-lists@hughesjr.com wrote:
Because of this fact (as Bryan has pointed out),
I think I'm just going to talk to people off-list and get people to answer for me. Apparently when it comes out of my mouth, instead of being a bunch of technical facts and possible solutions around the pros/cons, it comes out as "Hey everyone, look how smart I am?!?!?! Come worship me!"
And in the famous words of Captain James T. Kirk, "We're busy."
I appreciate your contributions, Bryan; there will always be dickheads who don't understand, but please ignore them. Your input is valuable to me at least, and I have learned much from you.
regards tom
Bryan J. Smith wrote:
Johnny Hughes mailing-lists@hughesjr.com wrote:
Because of this fact (as Bryan has pointed out),
I think I'm just going to talk to people off-list and get people to answer for me. Apparently when it comes out of my mouth, instead of being a bunch of technical facts and possible solutions around the pros/cons, it comes out as "Hey everyone, look how smart I am?!?!?! Come worship me!"
And in the famous words of Captain James T. Kirk, "We're busy."
Bryan, you are Captain Kirk, and you and Preston are driving the CentOS-powered Enterprise through unknown thread levels. Life on board is not always easy, but all the crew (including me) are grateful for what you're doing here. This list is often too technical for me, and that's why I'm learning so much from it. Thanks for sharing the knowledge, and now let's talk about *nix things. Manuel
Barry L. Kline wrote:
Tom wrote:
I appreciate your contributions, Bryan; there will always be dickheads who don't understand, but please ignore them. Your input is valuable to me at least, and I have learned much from you.
regards tom
I agree with Tom.
BK
On Wed, 2005-09-14 at 19:10 +0200, Manuel BERTRAND wrote:
Bryan, you are Captain Kirk, and you and Preston are driving the CentOS-powered Enterprise through unknown thread levels. Life on board is not always easy, but all the crew (including me) are grateful for what you're doing here. This list is often too technical for me, and that's why I'm learning so much from it. Thanks for sharing the knowledge, and now let's talk about *nix things. Manuel
I agree. I apologize for my 8 contributions to this thread. But I honestly was just trying to talk sense into Bryan so he'd stay, but just chill. His knowledge is welcome. He doesn't need to be a martyr and leave the list. But threads like this must die. I know I wasn't helping, but.... Anyway, I'm sorry for my part. Just let it go.
Preston
On Wed, 2005-09-14 at 06:03 -0700, Bryan J. Smith wrote:
Johnny Hughes mailing-lists@hughesjr.com wrote:
Because of this fact (as Bryan has pointed out),
I think I'm just going to talk to people off-list and get people to answer for me. Apparently when it comes out of my mouth, instead of being a bunch of technical facts and possible solutions around the pros/cons, it comes out as "Hey everyone, look how smart I am?!?!?! Come worship me!"
And in the famous words of Captain James T. Kirk, "We're busy."
My comment was not meant to be negative. You were correct to point out that yum:
1. Doesn't look at files
2. Creates metadata that is only accurate when it is run.
I was just agreeing with you.
Johnny Hughes mailing-lists@hughesjr.com wrote:
My comment was not meant to be negative.
I know. I really appreciate you mentioning I was correct. Which is why I think I'll just get people like yourself to answer for me next time. ;->
You were correct to point out that yum:
- Doesn't look at files
- Creates metadata that is only accurate when it is run.
I was just agreeing with you.
I know, and I greatly appreciate it.
I don't think people realize that regardless of whatever solution, to do what most people want on the YUM client side, you still need an internal copy of the repository -- delta-based or not.
--- "Bryan J. Smith" b.j.smith@ieee.org wrote:
[snip]
Bryan,
if i was going to be a dick head and go after you for posting on here i would.....and you would know it too...because i would mention your name.....so get the f%^$ off your high horse; everyone is not after you.....i think it is time for you to take your psych meds again, you're getting a little paranoid again!!!!!
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."
Bryan, if i was going to be a dick head and go after you for posting on here i would.....and you would know it too...because i would mention your name.....so get the f%^$ off your high horse; everyone is not after you.....i think it is time for you to take your psych meds again, you're getting a little paranoid again!!!!!
I would just once like someone _else_ to be quoted.
The coincidence between the first complaint and my self-removal from the thread ... then several days later, I respond again and another complaint is made, is just getting old.
I'm not the only person in these threads.
But it appears that for whatever "consistency" -- my verbiage, my technical detail, my offering of solutions, my requests that people actually have first-hand experience before pondering (and refusing to see things another way from those that do), etc..., etc..., etc... -- I'm _always_ the one quoted (if not directly accused).
That's what drives "paranoia."
I _never_ said people are "out to get me." I just said it would be nice if people just didn't always pick my posts, quote me, etc... That's all. Sounds like _other_ people read more into things than I do. They believe what _they_ want to believe about my posts.
Which is my #1 pet-peeve, hypocrisy.
On Wed, 2005-09-14 at 08:58 -0700, Bryan J. Smith wrote:
Bryan, if i was going to be a dick head and go after you for posting on here i would.....and you would know it too...because i would mention your name.....so get the f%^$ off your high horse; everyone is not after you.....i think it is time for you to take your psych meds again, you're getting a little paranoid again!!!!!
Hear, hear.
I would just once like someone _else_ to be quoted.
And if someone else continually ruins this list by being pompous, I'm sure they will be quoted.
The coincidence between the first complaint and my self-removal from the thread ... then several days later, I respond again and another complaint is made, is just getting old.
These self-removals don't last long.
I'm not the only person in these threads.
But it appears that for whatever "consistency" -- my verbiage, my technical detail, my offering of solutions, my requests that people actually have first-hand experience before pondering (and refusing to see things another way from those that do), etc..., etc..., etc... -- I'm _always_ the one quoted (if not directly accused).
That's what drives "paranoia."
No. Being paranoid drives paranoia. Thus the reference to seeking help from a psychiatrist or listening to him/her if you're already seeking help.
Which is my #1 pet-peeve, hypocrisy.
Ummm.... Huh?
Preston
Preston Crawford me@prestoncrawford.com wrote:
And if someone else continually ruins this list by being pompous, I'm sure they will be quoted.
Thanx, you just reminded me that hypocrisy is alive and well on this list.
These self-removals don't last long.
Okay, I'll make it permanent.
No. Being paranoid drives paranoia. Thus the reference to seeking help from a psychiatrist or listening to him/her if you're already seeking help.
It's nice to know that select people here are as objective in their own minds as a psychiatrist. Again, the hypocrisy never ends.
Ummm.... Huh?
Yes, I thought so.
My apologies to the silent majority, but I will not bother anyone here anymore.
On Wed, 2005-09-14 at 09:12 -0700, Bryan J. Smith wrote:
Preston Crawford me@prestoncrawford.com wrote:
And if someone else continually ruins this list by being pompous, I'm sure they will be quoted.
Thanx, you just reminded me that hypocrisy is alive and well on this list.
Do you know the definition of the word hypocrisy? If so, please explain how I was being hypocritical for pointing out that there really isn't anyone else who continually does what you do.
These self-removals don't last long.
Okay, I'll make it permanent.
Martyr. Are we supposed to beg you to come back now? Or is this your attempt at trying to align the forces of the list against myself or others?
No. Being paranoid drives paranoia. Thus the reference to seeking help from a psychiatrist or listening to him/her if you're already seeking help.
It's nice to know that select people here are as objective in their own minds as a psychiatrist. Again, the hypocrisy never ends.
Once again, do you know the definition of that word? How is it hypocrisy to point out that it appears you are actually just paranoid, and that you can't rationally blame this on other people?
I'm not even sure what the line about "It's nice to know that select people here are as objective in their own minds as a psychiatrist" means, to be honest. I was just joking that your continued paranoia does make one wonder.
Ummm.... Huh?
Yes, I thought so.
My apologies to the silent majority, but I will not bother anyone here anymore.
There is another path other than martyrdom. You could just stop being paranoid. But even in silence you're trying to stir things up here.
Preston
Preston Crawford wrote:
On Wed, 2005-09-14 at 09:12 -0700, Bryan J. Smith wrote:
Preston Crawford me@prestoncrawford.com wrote:
And if someone else continually ruins this list by being pompous, I'm sure they will be quoted.
Thanx, you just reminded me that hypocrisy is alive and well on this list.
Do you know the definition of the word hypocrisy? If so, please explain how I was being hypocritical for pointing out that there really isn't anyone else who continually does what you do.
Ah, but there is at least one other...
This portion of the thread is suffering from serious topic drift.
[snip]
Mike
--- "Bryan J. Smith" b.j.smith@ieee.org wrote:
[snip]
well Bryan,
like i said, if i am going to go after anyone in particular i would mention names....otherwise i am speaking in general. Also, it just happened to be coincidence that your thread was the one i picked....But if you want me to go after you by name i can.. But i don't have any real beefs with you. talking about pet peeves, one of mine is when a thread has had the crap beaten out of it and keeps on going; that gets annoying, especially how this thread has gone. imho this thread has circled the bandwagon three times, and all three times it has said the same stuff again and again!
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."
On Wednesday 14 September 2005 05:23, Johnny Hughes wrote:
The more I look at this problem, the more I see that a local repo maintained by the local user is the right answer. It works right now, requires no changes, and lets you control EXACTLY what you want in your repo (including files from other places in a single repo).
You can freeze package xxxxx and its dependencies as you see fit, and add only tested packages to the repo. It is just the right way to do version control if you don't want to just use the version control that is published by the repo maintainer.
Sounds like an opportunity for someone to write a local repo management app that can pick rsync repos to mirror, selectively freeze packages, and such. This would be nice.
Lamar Owen lowen@pari.edu wrote:
Sounds like an opportunity for someone to write a local repo management app that can pick rsync repos to mirror, selectively freeze packages, and such. This would be nice.
But _who_ decides to "freeze"? I mean, that still does not solve the problem of someone wanting an older revision of the repository than even the "freeze."
Which is why I'm working on the simple hack to maintain all prior createrepo runs, and add the small logic required to support it on the YUM client.
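Roughly, the server half of that hack amounts to parking a dated copy of repodata/ after every createrepo run -- a sketch, with a directory layout that is pure assumption and that no current yum client understands:

import os, shutil, time

def snapshot_repodata(repo_root):
    """After createrepo finishes, keep a dated copy of repodata/ so a
    client could later resolve against the repo as of that moment."""
    src = os.path.join(repo_root, "repodata")
    dst = os.path.join(repo_root,
                       "repodata-" + time.strftime("%Y%m%d%H%M%S"))
    shutil.copytree(src, dst)
    return dst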
On Wednesday 14 September 2005 10:06, Bryan J. Smith wrote:
Lamar Owen lowen@pari.edu wrote:
Sounds like an opportunity for someone to write a local repo management app that can pick rsync repos to mirror, selectively freeze packages, and such. This would be nice.
But _who_ decides to "freeze"? I mean, that still does not solve the problem of someone wanting an older revision of the repository than even the "freeze."
The local repository manager would, perhaps, click on a checkbox that says 'freeze this package in my local repository http://some.local.machine/yum-repo-path'
Configuration would include the means by which you seed your local repository; yam or straight rsync, for instance. Pick some repositories (not unlike the baseurl you already have in yum.repos.d/random-repo.repo) to configure, then let it populate your local repo. When you want to freeze a package, you (the local repo manager) click 'freeze' next to the package (need some depsolving here, since updated versions of other packages might require the package to be thawed). Then the local yum clients use this local repo, and the frozen packages aren't updated until the local repo manager thaws the package and updates his local repo.
No, it doesn't solve time-travel issues. But it does allow a stop point to be set at a particular package(s).
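The freeze check itself is the easy part -- the depsolving around it is the hard part, as noted above. A sketch, with invented data shapes:

FROZEN = set(["kernel", "glibc"])          # hypothetical freeze list

def wanted_updates(available, installed):
    """available/installed: dicts mapping package name -> version.
    Return the updates to apply, skipping anything frozen."""
    picks = []
    for name, newver in available.items():
        if name in FROZEN:
            continue                       # frozen: stay at current version
        if installed.get(name) != newver:
            picks.append((name, newver))
    return picks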
Which is why I'm working on the simple hack to maintain all prior createrepo runs, and add the small logic required to support it on the YUM client.
And that's a laudable goal.
But the whole 'maintain a local repository' mantra begs for a utility to help one maintain a local repo. And maybe yam can already do all that; I don't know.
Johnny Hughes wrote:
[snip]
Using the date added to the mirror is not good. A copy with the wrong switches ... signing with a different key, etc. changes that (when the package is actually the same). Not to mention that we maintain several repos that get rebuilt at different times.
[snip]
It is a major change ... the entire repo is looked at as a whole at rebuild time for the metadata, not as 10,000 packages but as one entity. Because of this fact (as Bryan has pointed out), you would need to keep older entire repo snapshots of the metadata to use to resolve your dependencies separately.
The more I look at this problem, the more I see that a local repo maintained by the local user is the right answer. It works right now, requires no changes, and lets you control EXACTLY what you want in your repo (including files from other places in a single repo).
[snip]
Everyone who has actually done any real configuration management has said this exact thing several times in this thread, and it seems to do absolutely no good.
You can freeze package xxxxx and its dependencies as you see fit, and add only tested packages to the repo. It is just the right way to do version control if you don't want to just use the version control that is published by the repo maintainer.
This has been repeated until people are blue in the face, and it doesn't make a dent.
Mike
Mike McCarty mike.mccarty@sbcglobal.net wrote:
Everyone who has actually done any real configuration management has said this exact thing several times in this thread, and it seems to do absolutely no good.
I only re-entered it to offer a larger, 3-tier (Delta-Patch-Client) solution. But even then, it doesn't remove the fact that to do what you want, you still need to maintain an _internal_ repository.
This has been repeated until people are blue in the face, and it doesn't make a dent.
Well, the good news is that a new tangent is starting!
I'm anxiously waiting to find out why I think everyone is "out to get me." Apparently when I say it would be nice for someone other than myself to get quoted (if not directly criticized), it is taken as ...
"BS thinks everyone is out to get him."
Sorry my pointing out the obviousness of the truth is being taken as such. When you're the _only_ person that seems to get constantly questioned and _no_one_else_ involved does, that tends to drive the concept that I'm the "scapegoat" in people's eyes.
Even if it's subconscious and not deliberate by many, the repeat, repeat and more REPEAT examples cannot deny that's _exactly_ what happens. There have been claims made on-list and off-list -- both joking and serious -- that it is only because of my presence on the list that these things happen.
Well, I can stop posting and unsubscribe, but ... back when I was a "lurker" (pre mid-April), they _still_ happened!
Sorry, the more people don't stop to realize that I am not the only person in a thread, the more that gets "attributed" to me. It has already happened a couple of times on this list -- a quote of someone else gets attributed to me, because people don't stop to realize there are _other_ "negative/problematic/etc." posts than mine.
I just want it to _stop_ -- even if it's not deliberate. I'm not letting illegal suggestions, or even actual illegal acts, be attributed to me because people aren't conscious enough to realize it's not me stating or doing them -- that's what has happened on a number of lists in the past because of such "repeat/unicast assumptions/posts."
On Wed, 2005-09-14 at 09:07 -0700, Bryan J. Smith wrote:
Sorry my pointing out the obviousness of the truth is being taken as such. When you're the _only_ person that seems to get constantly questioned and _no_one_else_ involved does, that tends to drive the concept that I'm the "scapegoat" in people's eyes.
"Scapegoat"? For what? Paranoid man, back it off, seriously.
I just want it to _stop_ -- even if it's not deliberate. I'm not letting illegal suggestions, or even actual illegal acts, be attributed to me because people aren't conscious enough to realize it's not me stating or doing them -- that's what has happened on a number of lists in the past because of such "repeat/unicast assumptions/posts."
If this has happened in the past, ask yourself what the common denominator is. Because I wasn't on those lists. Nor were most posters here.
Preston
Preston Crawford me@prestoncrawford.com wrote:
If this has happened in the past, ask yourself what the common denominator is.
Yes, that "select people" will attribute everything to me.
Because I wasn't on those lists. Nor were most posters here.
Someone posted that he puts timebombs in programs and they go off if he doesn't get paid (and sends in the code to stop the timebomb). That was attributed to me, quite widely, even though I didn't say it.
And when someone's server got hacked, several people told the FBI I did it. I basically didn't sleep for several nights in fear of being raided and having all my equipment taken, given the circumstances involved. Especially given the fact that my wife earns a chunk of her living teaching on-line.
It's the "use Bryan by default" -- even if subconscience, that I can_not_ tolerate to start. It has cost me dearly in the past, because people do _exactly_ what you're doing.
On Wed, 2005-09-14 at 09:47 -0700, Bryan J. Smith wrote:
Preston Crawford me@prestoncrawford.com wrote:
If this has happened in the past, ask yourself what the common denominator is.
Yes, that "select people" will attribute everything to me.
So on every list you're on, there are always "select people"? Are they the same people? Because the first time I ever met you was on this list. So I'm thinking the common denominator may be you.
Because I wasn't on those lists. Nor were most posters here.
Someone posted that he puts timebombs in programs and they go off if he doesn't get paid (and sends in the code to stop the timebomb). That was attributed to me, quite widely, even though I didn't say it.
No idea what you're talking about.
And when someone's server got hacked, several people told the FBI I did it. I basically didn't sleep for several nights in fear of being raided and having all my equipment taken, given the circumstances involved. Especially given the fact that my wife earns a chunk of her living teaching on-line.
Okay, once again, no idea what you're talking about. That sounds terrible.
It's the "use Bryan by default" -- even if subconscience, that I can_not_ tolerate to start. It has cost me dearly in the past, because people do _exactly_ what you're doing.
But.... *sigh* don't you understand that for most of us, things like you mention above NEVER happen? So what's the common denominator again?
Preston
On Wed, 2005-09-14 at 09:07 -0700, Bryan J. Smith wrote:
Well, the good news is that a new tangent is starting!
I'm anxiously waiting to find out why I think everyone is "out to get me." Apparently when I say it would be nice for someone other than myself to get quoted (if not directly criticized), it is taken as ...
"BS thinks everyone is out to get him."
Sorry my pointing out the obviousness of the truth is being taken as such. When you're the _only_ person that seems to get constantly questioned and _no_one_else_ involved does, that tends to drive the concept that I'm the "scapegoat" in people's eyes.
---- It's the manner of delivery that you use.
It's the volume of your answers.
It's the focus on exactness and your absolute requirement that you define the terms which we use.
It's you jumping on questions not intended for you, which you could not possibly answer for the intended recipient, yet you feel compelled to inject yourself, to provide commentary and your inimitable analysis anyway.
You know I like you anyway. I can get used to you and simply delete your posts or the threads that I have finished with.
It's clear you aren't going to change, so I suppose you just keep on being you and we'll all adjust or deal with it in our own ways.
Craig
On Wed, 2005-09-14 at 09:17 -0700, Craig White wrote:
You know I like you anyway. I can get used to you and simply delete your posts or the threads that I have finished with.
I agree. Bryan (if you can cut through the volume and the presentation) is right about a lot of things and has important things to say. Most of the time these little tiffs are white noise to me too. It's just that it flares up into these battles on the list sometimes and it gets ridiculous eventually.
Preston
Preston Crawford me@prestoncrawford.com wrote:
I agree. Bryan (if you can cut through the volume and the presentation)
Sorry, I do hands-on training for people in real life. Due to the lack of a classroom, sometimes you've gotta just end it with, "go use it first-hand, then come back with questions." No amount of explanation will do, which is why I often stop responding.
is right about a lot of things and has important things to say.
Don't patronize me. You have already established the fact that I walk into a room and piss almost everyone off, then not wonder why. That is not true, and it's pure bait.
Most of the time these little tiffs are white noise to me too. It's just that it flares up into these battles on the list sometimes and it gets ridiculous eventually.
And that's when *I'm* the "Common Denominator," right? Sorry you are oblivious to the others that happen.
They've happened before I was here. They've happened when I wasn't in the thread. They will happen after I'm gone.
But keep up the psychoanalysis. And the notion that I want to be a martyr.
On Wed, 2005-09-14 at 09:54 -0700, Bryan J. Smith wrote:
Preston Crawford me@prestoncrawford.com wrote:
I agree. Bryan (if you can cut through the volume and the presentation)
Sorry, I do hands-on training for people in real life. Due to the lack of a classroom, sometimes you've gotta just end it with, "go use it first-hand, then come back with questions." No amount of explanation will do, which is why I often stop responding.
is right about a lot of things and has important things to say.
Don't patronize me. You have already established the fact that I walk into a room and piss almost everyone off, then not wonder why. That is not true, and it's pure bait.
It was a hypothetical example of how I personally would react if I was met with a reaction from other people, Bryan. I never said YOU did this. Come on. You're smarter than that. I'm not patronizing you.
Most of the time these little tiffs are white noise to me too. It's just that it flares up into these battles on the list sometimes and it gets ridiculous eventually.
And that's when *I'm* the "Common Denominator," right? Sorry you are oblivious to the others that happen.
They've happened before I was here. They've happened when I wasn't in the thread. They will happen after I'm gone.
But keep up the psychoanalysis. And the notion that I want to be a martyr.
????
Preston Crawford wrote:
On Wed, 2005-09-14 at 09:17 -0700, Craig White wrote:
You know I like you anyway. I can get used to you and simply delete your posts or the threads that I have finished with.
I agree. Bryan (if you can cut through the volume and the presentation) is right about a lot of things and has important things to say. Most of the time these little tiffs are white noise to me too. It's just that it flares up into these battles on the list sometimes and it gets ridiculous eventually.
It's been noted that all threads on the 'net eventually degrade into people calling each other "Nazi".
Mike
On Wed, 2005-09-14 at 13:20 -0500, Mike McCarty wrote:
Preston Crawford wrote:
On Wed, 2005-09-14 at 09:17 -0700, Craig White wrote:
You know I like you anyway. I can get used to you and simply delete your posts or the threads that I have finished with.
I agree. Bryan (if you can cut through the volume and the presentation) is right about a lot of things and has important things to say. Most of the time these little tiffs are white noise to me too. It's just that it flares up into these battles on the list sometimes and it gets ridiculous eventually.
It's been noted that all threads on the 'net eventually degrade into people calling each other "Nazi".
---- fwiw
:Godwin's Law: prov.
[Usenet] "As a Usenet discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches one." There is a tradition in many groups that, once this occurs, that thread is over, and whoever mentioned the Nazis has automatically lost whatever argument was in progress. Godwin's Law thus practically guarantees the existence of an upper bound on thread length in those groups. However, there is also a widely-recognized codicil that any intentional triggering of Godwin's Law in order to invoke its thread-ending effects will be unsuccessful. Godwin himself has discussed the subject.
Craig
Guys,
please stop this. You've all seen my "fight" with Bryan. He might annoy you. He might piss you off. You might think he passes things off as facts - I have accused him of all of that.
BUT HE DESERVES BETTER THAN THIS!
If you argue technically and are of a different opinion, that is one thing. Attacking him with stuff like his Captain Kirk comments is just silly.
And besides that - didn't we just have a talk about moderation and that everyone should be more civilized?
Peter.
On Wed, 14 Sep 2005, Peter Arremann wrote:
Guys,
please stop this. You've all seen my "fight" with Bryan. He might annoy you. He might piss you off. You might think he passes things off as facts - I have accused him of all of that.
You're the second person that's trying to put a stop to a fight that has ended.
BUT HE DESERVES BETTER THAN THIS!
If you argue technically and are of a different opinion, that is one thing. Attacking him with stuff like his Captain Kirk comments is just silly.
And besides that - didn't we just have a talk about moderation and that everyone should be more civilized?
Peter.
This only flows one direction, though? Everyone has to be civil when one member of the community is lashing out? Either way it's done. I'm a Nazi. There, thread over. Jebus!
Preston
Craig White craigwhite@azapple.com wrote:
It's you jumping on questions not intended for you, which you could not possibly answer for the intended recipient, yet you feel compelled to inject yourself, to provide commentary and your inimitable analysis anyway.
There are plenty of threads I _avoid_ because I am not knowledgeable enough. And I _do_ drop things many times.
But it always has to be my post. Flames, threads, tangents, etc. all around, but it's always _my_ posts that "select" people go after.
On Wed, 2005-09-14 at 10:03, Mike McCarty wrote:
Johnny Hughes wrote:
It is a major change ... the entire repo is looked at as a whole at rebuild time for the metadata, not as 10,000 packages but as one entity. Because of this fact (as Bryan has pointed out), you would need to keep older entire repo snapshots of the metadata to use to resolve your dependencies separately.
Yet I can look, for example, at: http://mirror.centos.org/centos/4.1/updates/i386/headers/ and have no trouble knowing exactly which files were there at any given date. Yum could be at least as smart...
The more I look at this problem, the more I see that a local repo maintained by the local user is the right answer. It works right now, requires no changes, and lets you control EXACTLY what you want in your repo (including files from other places in a single repo).
[snip]
Everyone who has actually done any real configuration management has said this exact thing several times in this thread, and it seems to do absolutely no good.
The Centos people are doing an excellent job of configuration management. If they say they are planning to start deleting and randomly modifying existing files in their repository instead of just adding newer ones, I'll give up on it being possible to tell what was previously present at the points the .hdr files were generated. Otherwise, while I agree that yum currently uses some repo metadata to quickly ignore .hdr files other than the latest, an option to work with timestamps could let it construct a view of what was there earlier, just as I could construct a copy of the whole repository as of a certain time simply by observing the timestamps of all the .hdr files - something that is already viewable.
On Wed, 2005-09-14 at 14:15 -0500, Les Mikesell wrote:
On Wed, 2005-09-14 at 10:03, Mike McCarty wrote:
Johnny Hughes wrote:
It is a major change ... the entire repo is looked at as a whole at rebuild time for the metadata, not as 10,000 packages but as one entity. Because of this fact (as Bryan has pointed out), you would need to keep older entire repo snapshots of the metadata to use to resolve your dependencies separately.
Yet I can look, for example, at: http://mirror.centos.org/centos/4.1/updates/i386/headers/ and have no trouble knowing exactly which files were there at any given date. Yum could be at least as smart...
That directory is for up2date (not yum); it would also work for yum prior to 2.1.x (not what we use in CentOS-4.x) ... this directory is for yum 2.1 and greater, which is what we use for CentOS-4.x:
http://mirror.centos.org/centos/4.1/updates/i386/repodata/
The more I look at this problem, the more I see that a local repo maintained by the local user is the right answer. It works right now, requires no changes, and lets you control EXACTLY what you want in your repo (including files from other places in a single repo).
[snip]
Everyone who has actually done any real configuration management has said this exact thing several times in this thread, and it seems to do absolutely no good.
The Centos people are doing an excellent job of configuration management. If they say they are planning to start deleting and randomly modifying existing files in their repository instead of just adding newer ones, I'll give up on it being possible to tell what was previously present at the points the .hdr files were generated.
I am the CentOS people :)
We currently remove all the update information every time a new point release is done ... for example, when 4.2 is released, the paths:
http://mirror.centos.org/centos/4/
and
http://mirror.centos.org/centos/4.2/
will be the same ... there will be no files in the updates tree. The 4.0 and 4.1 trees will look like this:
http://mirror.centos.org/centos/4.0/
This is the way we have been doing the trees since Jan 2004.
Otherwise, while I agree that yum currently uses some repo metadata to quickly ignore .hdr files other than the latest, an option to work with timestamps could let it construct a view of what was there earlier, just as I could construct a copy of the whole repository as of a certain time simply by observing the timestamps of all the .hdr files - something that is already viewable.
Johnny Hughes CentOS-4 Lead Developer
On Wed, 2005-09-14 at 14:33, Johnny Hughes wrote:
It is a major change ... the entire repo is looked at as a whole at rebuild time for the metadata, not as 10,000 packages but as one entity. Because of this fact (as Bryan has pointed out), you would need to keep older entire repo snapshots of the metadata to use to resolve your dependencies separately.
Yet I can look, for example, at: http://mirror.centos.org/centos/4.1/updates/i386/headers/ and have no trouble knowing exactly which files were there at any given date. Yum could be at least as smart...
That directory is for up2date (not yum); it would also work for yum prior to 2.1.x (not what we use in CentOS-4.x) ... this directory is for yum 2.1 and greater, which is what we use for CentOS-4.x:
I've mostly used the CentOS 3.x version so far, and believed the philosophy section of the yum README file where it said the dependency decisions were made based on the contents of the hdr files... Will any or all of apt/up2date/yum do the right thing if you ask it to install a version of a package that is in the repository but not the most current? I was under the impression that yum could do that, but have not been able to with the yum version in CentOS 3.5.
On Wed, 2005-09-14 at 19:00 -0500, Les Mikesell wrote:
On Wed, 2005-09-14 at 14:33, Johnny Hughes wrote:
It is a major change ... the entire repo is looked at as a whole at rebuild time for the metadata, not as 10,000 packages but as one entity. Because of this fact (as Bryan has pointed out), you would need to keep older entire repo snapshots of the metadata to use to resolve your dependencies separately.
Yet I can look, for example, at: http://mirror.centos.org/centos/4.1/updates/i386/headers/ and have no trouble knowing exactly which files were there at any given date. Yum could be at least as smart...
That directory is for up2date (not yum); it would also work for yum prior to 2.1.x (not what we use in CentOS-4.x) ... this directory is for yum 2.1 and greater, which is what we use for CentOS-4.x:
I've mostly used the CentOS 3.x version so far, and believed the philosophy section of the yum README file where it said the dependency decisions were made based on the contents of the hdr files... Will any or all of apt/up2date/yum do the right thing if you ask it to install a version of a package that is in the repository but not the most current? I was under the impression that yum could do that, but have not been able to with the yum version in CentOS 3.5.
The new versions of yum and smartpm use the repomd (repodata directory) info for updates. Neither will use the old (headers) directory structure.

Up2date still uses the old (headers) directory structure.

Apt (i386 only, because of no multi-lib arch support) uses this structure: http://mirror.centos.org/centos/4/apt/i386/
These are just symlinks to places in the main tree and apt metadata is in the base directory. Apt is no longer being maintained upstream (they are now doing smartpm), but we will maintain the repo for i386. Smartpm can use either the apt tree or the yum repomd tree.

We have an FAQ that discusses upgrades from CentOS-3.x to CentOS-4.x. The CentOS team does not recommend an upgrade via yum, apt or up2date. You can see our recommendation (and a link to a forum thread that explains how to do an upgrade via the non-recommended way too) here:
On Wed, 2005-09-14 at 19:44, Johnny Hughes wrote:
Will any or all of apt/up2date/yum do the right thing if you ask it to install a version of a package that is in the repository but not the most current? I was under the impression that yum could do that, but have not been able to with the yum version in CentOS 3.5.
The new versions of yum and smartpm use the repomd (repodata directory) info for updates. Neither will use the old (headers) directory structure.
It looks like under CentOS 4 I can: yum install nx-0-1.4.0-4.1.centos4, even though nx-0-1.5.0-0.centos4 is also in the repository and would be the default for 'yum install nx'.
But under CentOS 3.5 I can't do the equivalent yum install gaim-1-1.3.1-0.el3, or even specify the latest gaim-1-1.3.1-0.el3.3.
Is that a new feature or am I doing something wrong?
Les Mikesell wrote:
On Wed, 2005-09-14 at 10:03, Mike McCarty wrote:
Johnny Hughes wrote:
It is a major change ... the entire repo is looked at as a whole at rebuild time for the metadata, not as 10,000 packages but as one entity. Because of this fact (as Bryan has pointed out), you would need to keep older entire repo snapshots of the metadata to use to resolve your dependencies separately.
Yet I can look, for example, at: http://mirror.centos.org/centos/4.1/updates/i386/headers/ and have no trouble knowing exactly which files were there at any given date. Yum could be at least as smart...
As pointed out previously, these files are not used by yum in CentOS 4. http://lists.centos.org/pipermail/centos/2005-September/011912.html
The Centos people are doing an excellent job of configuration management. If they say they are planning to start deleting and randomly modifying existing files in their repository instead of just adding newer ones, I'll give up on it being possible to tell what was previously present at the points the .hdr files were generated.
As has been pointed out previously, old updates are already removed from the repos and moved into the "CentOS Vault". http://lists.centos.org/pipermail/centos/2005-September/011342.html
Les Mikesell wrote:
I want that functionality, but I was arguing that all it would take to get it is sequentially increasing timestamps on files being added to the repository and knowledge of the final timestamp of each consistent update set - and letting the yum client have that information to figure out the rest.
I suspect that a simple time-stamp freeze wouldn't be sufficient, but it is a good start. Knowing that what you tested, certified and are rolling out will stay constant is a good thing, and achievable today when you control your own repo. However, the problem of configuration management quickly gets more complex when you may want to add additional freezes on different branches.
For example - your developers have a dependency on App X @ v2-2-3.4.5 (Sept 4, 2005) and library Y @ v16.4-5 (Aug 11, 2005). You need to freeze that app and that library, but may be OK with, or want, a security fix on App Z released Sept 13, 2005 (again, achievable with your own repo). It gets plain ugly when your developers (or app owners) want incompatible freezes (lib v3.4 for Bob & lib v5.1 for Sue).
regards Dave www.hornfordassociates.com
Mike McCarty wrote:
I agree that what you suggest would be worthwhile. That's why there are people who make money doing it. An example is the LynxOS corporation. Other names come to mind.
Mike
Mike, who else comes to mind? Who is best-in-class with configuration management? (I'm OK with non-Linux answers for best-in-class; then I know what to look for.)
regards Dave www.hornfordassociates.com
On Fri, 9 Sep 2005, Lamar Owen wrote:
So you think I'm stupid for suggesting it. (That's how it comes across). Ok, I can deal with that.
It seems to be more important to win the discussion than to understand each other or resolve the differences. Otherwise the (public) name calling serves no purpose other than to try and shut up the opponent.
Please everybody involved, go private. Little has been added since the first few posts and it's getting boring. Besides, I'm sure as soon as this is a private discussion it dies silently as no honor has to be defended.
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
On Fri, 2005-09-09 at 11:01 -0400, Lamar Owen wrote:
On Thursday 08 September 2005 15:12, Les Mikesell wrote:
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there.
OK, but I basically want to include all official updates here but I just want to delay/control rolling them out to make sure there are no surprises.
One of the key reasons that CVS works so well for source is that, once the initial import is done, everything is done via diffs and patches. This makes the repository smaller, and automatically gives you the things CVS does well (multiple versions, consistent repository states). While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository.
Hmm. This sounds like the crux of the problem. If the mirroring software could be tricked into copying the repodata last, wouldn't this problem (and this thread) go away?
-David
On Sun, 11 Sep 2005, David Johnston wrote:
On Fri, 2005-09-09 at 11:01 -0400, Lamar Owen wrote:
On Thursday 08 September 2005 15:12, Les Mikesell wrote:
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there.
OK, but I basically want to include all official updates here but I just want to delay/control rolling them out to make sure there are no surprises.
One of the key reasons that CVS works so well for source is that, once the initial import is done, everything is done via diffs and patches. This makes the repository smaller, and automatically gives you the things CVS does well (multiple versions, consistent repository states). While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository.
Hmm. This sounds like the crux of the problem. If the mirroring software could be tricked into copying the repodata last, wouldn't this problem (and this thread) go away?
rsync does not allow you to specify an order; however, rsync has 2 options. --delay-updates will update the mirror at the end of the sync, which is near-atomic (this is functionality that Jeff Pitman wrote when I needed it for my repository), and there is an atomic script that comes with rsync that hardlinks the tree, makes updates in that new tree, and finally puts it all back atomically.
Mirrors (that copy data as well as metadata) should start using the --delay-updates option. It requires more diskspace during the sync though.
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
On Sun, 2005-09-11 at 14:54 +0200, Dag Wieers wrote:
On Sun, 11 Sep 2005, David Johnston wrote:
On Fri, 2005-09-09 at 11:01 -0400, Lamar Owen wrote:
On Thursday 08 September 2005 15:12, Les Mikesell wrote:
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there.
OK, but I basically want to include all official updates here but I just want to delay/control rolling them out to make sure there are no surprises.
One of the key reasons that CVS works so well for source is that, once the initial import is done, everything is done via diffs and patches. This makes the repository smaller, and automatically gives you the things CVS does well (multiple versions, consistent repository states). While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository.
Hmm. This sounds like the crux of the problem. If the mirroring software could be tricked into copying the repodata last, wouldn't this problem (and this thread) go away?
rsync does not allow you to specify an order; however, rsync has 2 options. --delay-updates will update the mirror at the end of the sync, which is near-atomic (this is functionality that Jeff Pitman wrote when I needed it for my repository), and there is an atomic script that comes with rsync that hardlinks the tree, makes updates in that new tree, and finally puts it all back atomically.
Mirrors (that copy data as well as metadata) should start using the --delay-updates option. It requires more diskspace during the sync though.
More disk space is an understatement :) ... it would take several gigabytes more space when we roll out a new tree. That might not be acceptable to most mirrors.
Another problem is that this requires a version of rsync newer than the one in CentOS-3 or CentOS-4. CentOS-3 has rsync-2.5.7-5.3E.src.rpm ... CentOS-4 has rsync-2.6.3-1. Neither has --delay-updates (at least it is not listed in man rsync).
On Sun, 11 Sep 2005, Johnny Hughes wrote:
On Sun, 2005-09-11 at 14:54 +0200, Dag Wieers wrote:
On Sun, 11 Sep 2005, David Johnston wrote:
On Fri, 2005-09-09 at 11:01 -0400, Lamar Owen wrote:
On Thursday 08 September 2005 15:12, Les Mikesell wrote:
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there.
OK, but I basically want to include all official updates here but I just want to delay/control rolling them out to make sure there are no surprises.
One of the key reasons that CVS works so well for source is that, once the initial import is done, everything is done via diffs and patches. This makes the repository smaller, and automatically gives you the things CVS does well (multiple versions, consistent repository states). While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository.
Hmm. This sounds like the crux of the problem. If the mirroring software could be tricked into copying the repodata last, wouldn't this problem (and this thread) go away?
rsync does not allow you to specify an order; however, rsync has 2 options. --delay-updates will update the mirror at the end of the sync, which is near-atomic (this is functionality that Jeff Pitman wrote when I needed it for my repository), and there is an atomic script that comes with rsync that hardlinks the tree, makes updates in that new tree, and finally puts it all back atomically.
Mirrors (that copy data as well as metadata) should start using the --delay-updates option. It requires more diskspace during the sync though.
More disk space is an understatement :) ... it would take several gigabytes more space when we roll out a new tree. That might not be acceptable to most mirrors.
If by new tree you mean 4.1 -> 4.2, then I would expect most of it to be hard-linked and therefore not consume much extra diskspace during transit. Only the amount of what has been updated plus the amount of what will be removed; in the CentOS case nothing is being removed (at least not right away), so actually no additional diskspace is required during transit.
If by new tree you mean a 5.0, then you're not consuming any additional diskspace during transit either, since what you're updating is exactly what will be consumed.
Unless I missed something.
Another problem is that this requires a version of rsync newer than the one in CentOS-3 or CentOS-4. CentOS-3 has rsync-2.5.7-5.3E.src.rpm ... CentOS-4 has rsync-2.6.3-1. Neither has --delay-updates (at least it is not listed in man rsync).
Correct, --delay-updates was added in 2.6.4.
If you look at the changelog you'll see that there are many benefits to upgrading to 2.6.6. (Better error reporting is one of the important changes, making it worthwhile if you use it in production with complex include/exclude lists.)
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
On Sun, 11 Sep 2005, David Johnston wrote:
On Fri, 2005-09-09 at 11:01 -0400, Lamar Owen wrote:
(multiple versions, consistent repository states). While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository.
Hmm. This sounds like the crux of the problem. If the mirroring software could be tricked into copying the repodata last, wouldn't this problem (and this thread) go away?
rsync does not allow you to specify an order; however, rsync has 2 options. --delay-updates will update the mirror at the end of the sync, which is near-atomic (this is functionality that Jeff Pitman wrote when I needed it for my repository), and there is an atomic script that comes with rsync that hardlinks the tree, makes updates in that new tree, and finally puts it all back atomically.
This one thing right there will help tremendously. Thanks for the pointer to --delay-updates (any idea in which version of rsync this first appeared?).
While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository.
Hmm. This sounds like the crux of the problem. If the mirroring software could be tricked into copying the repodata last, wouldn't this problem (and this thread) go away?
rsync does not allow you to specify an order; however, rsync has 2 options. --delay-updates will update the mirror at the end of the sync, which is near-atomic (this is functionality that Jeff Pitman wrote when I needed it for my repository), and there is an atomic script that comes with rsync that hardlinks the tree, makes updates in that new tree, and finally puts it all back atomically.
This one thing right there will help tremendously. Thanks for the pointer to --delay-updates (any idea in which version of rsync this first appeared?).
OK guys ... the answer is that it was added in version 2.6.4 of rsync ... and that is newer than both CentOS-3 and CentOS-4 :(
BUT - this is such a good feature, we have upgraded the CentOS mirrors to have version 2.6.6 of rsync from Dag's repo:
http://dag.wieers.com/packages/rsync/
We recommend that if you are rsyncing from centos.org, you get the .el3 or .el4 version of rsync (depending on the version of your mirror) from the above, and then add the --delay-updates switch to your rsync script when rsyncing from us.
Many thanks to Dag Wieers for the info on this issue, for submitting the request to get this included in rsync ... and for his outstanding repos.
On Sun, 2005-09-11 at 08:40 -0400, David Johnston wrote:
Hmm. This sounds like the crux of the problem. If the mirroring software could be tricked into copying the repodata last, wouldn't this problem (and this thread) go away?
That's an issue _solved_ between mirrors and their original by my hack suggestion described here: http://lists.centos.org/pipermail/centos/2005-September/011426.html
Les Mikesell wrote:
I'm looking for something like a CVS tag: something applied to the repository by someone who knows the state is consistent, which anyone else can then use to retrieve exactly that state regardless of ongoing changes.
Again, if you run the repository, you get to decide on the changes.
But you would have to install all of your own spec files and build all the rpms so you control the dependencies for that to help.
Huh?
Perhaps I'm misinterpreting your goals? I thought the stated goal was to keep package versions consistent between the QAed configuration and the production configuration. There is nothing in that statement that requires rebuilding anything.
The point of having your own repository is that you decide when newer packages get introduced, without having to worry about anyone accidentally upgrading. These packages still come from the upstream; there is no need to rebuild anything.
On 9/8/05, Les Mikesell lesmikesell@gmail.com wrote: <snip>
Les,
You are looking for a piece of software, made to update your system, to automagically do configuration management of multiple production servers located in places that make them harder to manage than your test server.
This is not a reasonable goal in my opinion, and while several people have provided good ideas to help get you closer to your goal, the bottom line is that config management is a hard thing and takes actual work.
There are people working on RHN replacements which would ease your task, but even that still takes a lot of work.
Regards, Greg
Les Mikesell lesmikesell@gmail.com wrote:
I provide the QA people with a machine with the latest updates and when they say everything works I try to duplicate those updates into the production boxes. As you imply, this is something nearly everyone has to do, so I find it surprising that the package management tools don't give you a simple way to do it.
??? Have you actually tried building a YUM repository ??? It's pretty straightforward.
And you can get your package list from RPM. A quick diff makes it cake.
Also note that there are as many risks in waiting for testing and QA approval as not, and you have to balance them.
I'm confused. You're setting up a system, then duplicating the same packages. How would maintaining a YUM repository internally be any different?
I typically maintain 3:
1. Rsync of the latest packages from an external source
2. Packages under test
3. Packages designated as production
Using rsync, yum and a few other tools (including RCS ci/co/diff), I find it rather simple and easy to do.
So, I consider the 'real' test to be the first small set of production boxes that are updated after QA's blessing, and watch for problems before rolling out to the rest.
Then maintain such a repository separate from the others.
On Tue, 2005-09-06 at 10:58 +1000, John Newbigin wrote:
I look after 3 versions of CentOS: 2, 3 & 4. Each has its own version of yum. Different versions have different command line parameters, different header formats, different config file layouts, etc.
That happens with any program over that period of time ... the configuration of apache or any other program over that period is also very different. We are basically talking about a span between RH-7.2 and FC-3. Look at the differences between gnome and kde for that span (for example).
Yum headers are also not very robust. You can't safely use yum while a) updating your mirror or b) running yum-arch (with -c, which takes a long time, especially on openoffice). This is a PITA when you are patching a lot of machines and want to obtain new software at the same time.
The safest thing to do is to create your own yum headers with yum-arch ... but we do not do that for the 14 mirrors that get mirrored from the master mirror. For at least CentOS-3 and 4, these work fine without running yum-arch (or createrepo) on any mirror except the master mirror and rsync'ing it to the rest.
I also think that yum needs a way to track certain packages only from a specific repository, rather than the entire repo (i.e. I want 1 package from Dag, not everything). (I don't know if new versions can do this...)
New versions of yum (> 2.2.x) have "includepkgs=" which allows you to specifically include only the packages that you want from a specific repo.
There is also "exclude=" to exclude specific packages from a repo if that is easier.
I also think yum is too slow.
That is what the sqlite database and the md5 caching for createrepo are addressing. Also, the whole repomd versus a header for each package addresses that slowness as well.
These are all in CentOS-4, as CentOS-2 and CentOS-3 have an older version of python and don't work with the newer yum versions.
All those issues aside, every other solution seems to have similar problems. On CentOS-2 I normally use arrghpm, which is a tool I wrote to do what I want. It does not rely on headers at all, but it is not designed to solve dependencies (because rpm already does that).
(OT side note: Mirroring updates for CentOS 3 & 4 is also a PITA because I need to have multiple directories, one for each point release. Is it just me???)
Using rsync on the /centos/4/ or /centos/3/ tree and adding the proper "--exclude" statements can do it in one command line.
The structure has positives and negatives.
One thing we wanted between 3 and 4 was consistency in the mirror structure, since the CentOS 3 mirror structure was already in place and operating for more than a year when CentOS-4 was released, and for nearly 2 years now.
Whether or not that was the best structure to adopt then is a different question.
At this point, I would not want to change the structure, because it does work and it is the way it has been since the beginning with C3.
On Tue, 6 Sep 2005, Johnny Hughes wrote:
On Tue, 2005-09-06 at 10:58 +1000, John Newbigin wrote:
Yum headers are also not very robust. You can't safely use yum while a) updating your mirror or b) running yum-arch (with -c, which takes a long time, especially on openoffice). This is a PITA when you are patching a lot of machines and want to obtain new software at the same time.
The safest thing to do is to create your own yum headers with yum-arch ... but we do not do that for the 14 mirrors that get mirrored from the master mirror. For at least CentOS-3 and 4, these work fine without running yum-arch (or createrepo) on any mirror except the master mirror and rsync'ing it to the rest.
Jeff Pitman added something to rsync I requested and it was accepted in rsync 2.6.4. It allows you to do near-atomic updates when mirroring: during rsyncing, all updates are made to a ghost directory, and at the end of the update all changes are applied together.
This narrows the window of opportunity where people might be using yum/apt/smart when the metadata has been replaced but not all the files are in place, or vice versa.
Before this change I had to be careful to sync my main mirror in between the intervals when the public mirrors were scheduled to sync (to avoid the same kind of conflict). With this functionality I have much more flexibility, as the metadata and data are updated almost atomically (instead of with a delay of sometimes 2 to 3 hours between my first file and my last file due to slow upstream bandwidth).
The functionality is called --delay-updates, and both server and client need support for it to work.
Kind regards, -- dag wieers, dag@wieers.com, http://dag.wieers.com/ -- [all I want is a warm bed and a kind word and unlimited power]
Todd Cary wrote:
I have seen messages posted on the Fedora oriented forums that imply that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
Unlike Linux gurus, who assume the rest of the user world is like themselves, there are a great many users who like ease of use and graphical applications, and yum is strictly command line. I know yumex exists, but it does not come automatically installed when one installs CentOS (nor does Synaptic, for that matter). So perhaps yum would be more popular if yumex, or whatever the popular GUI front end for yum is now, were also installed, and added to the menu system, when CentOS is installed.
I have used both yumex and synaptic, and both have strengths and weaknesses. But quite frankly, I find Synaptic easier and clearer to use. I realize that for experienced Linux users that argument holds no weight, but for the casual Linux desktop user it is very important.
For the sanity of the list and its members, take this to a private discussion. It's getting absurd here.
Top posting warranted. This is no longer a functional thread, but a plague - a cancer, if you will - on an otherwise useful list.
KILL THIS THREAD. KILL THIS THREAD. KILL THIS THREAD.
On 9/10/05, Edward Diener eddielee@tropicsoft.com wrote:
Todd Cary wrote:
I have seen messages posted on the Fedora oriented forums that imply that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
Unlike Linux gurus, who assume the rest of the user world is like themselves, there are a great many users who like ease of use and graphical applications, and yum is strictly command line. I know yumex exists, but it does not come automatically installed when one installs CentOS (nor does Synaptic, for that matter). So perhaps yum would be more popular if yumex, or whatever the popular GUI front end for yum is now, were also installed, and added to the menu system, when CentOS is installed.
I have used both yumex and synaptic, and both have strengths and weaknesses. But quite frankly, I find Synaptic easier and clearer to use. I realize that for experienced Linux users that argument holds no weight, but for the casual Linux desktop user it is very important.
On Sat, 2005-09-10 at 17:31 -0400, Edward Diener wrote:
Todd Cary wrote:
I have seen messages posted on the Fedora oriented forums that imply that "yum" is antiquated. Not being a Linux guru, I do not have the experience to make a thorough evaluation, but so far it has been just great.
Unlike Linux gurus, who assume the rest of the user world is like themselves, there are a great many users who like ease of use and graphical applications, and yum is strictly command line. I know yumex exists, but it does not come automatically installed when one installs CentOS (nor does Synaptic, for that matter). So perhaps yum would be more popular if yumex, or whatever the popular GUI front end for yum is now, were also installed, and added to the menu system, when CentOS is installed.
I have used both yumex and synaptic, and both have strengths and weaknesses. But quite frankly, I find Synaptic easier and clearer to use. I realize that for experienced Linux users that argument holds no weight, but for the casual Linux desktop user it is very important.
The reason apt and synaptic are not included in CentOS is that they do not work with multi-lib arches (that is, basically all arches except i386).
The reason yumex isn't included is that it sucks.