I am attempting to create a rpm of the latest version of a program. The rpm for the previous version contains a number of patch files that make numerous changes to various files in the tar.gz as downloaded from the project's website so it will work properly on Linux.
The latest version of the program has changed enough stuff that some of the patches now fail to apply. "1 out of 1 hunk FAILED" and so on. Upon comparing the previous version's files to the latest version, I see that the problem is that some of the files that need to be patched have had some stuff moved around a bit, just enough to (apparently) cause patch to fail.
By way of experimentation, I manually changed one of the files in the new version to match what the patch says it should be, then created a new patch file from that and it applies and appears to work fine. (I patched the previous version's file, compared the result to the original and made the same change in the new version's file.)
This method seems to work fine when the change is only one or two lines, but some of the patches are somewhat more involved than that.
It seems to me that there may be an automated way to handle this matter by somehow patching a into b, then comparing a and b and making the corresponding changes in c. Basically the same process that I just tried manually on a small patch file, without all of the labour and chance of a screw-up that would be involved in manually comparing the old files and rewriting the new file.
I have two questions:
First, am I going about this the right way? And if so, is there a way to automate the process as described in the previous paragraph?
Second, what is the proper convention for handling this in a rpm? The obvious solution seems to be to create new patch files and throw the old ones away, then build the rpm from that. Some of these patches appear to go back several versions, though, so is there a better or more proper way to handle this than just throwing them out and making a whole new set of patches?
I have learned a lot more about patch and diff tonight than I ever needed to know before. Very cool stuff, and very useful.
From: Frank Cox theatre@sasktel.net
... some of the files that need to be patched have had some stuff moved around a bit, just enough to (apparently) cause patch to fail... ... By way of experimentation, I manually changed one of the files ... is there a way to automate the process...
I am afraid patch is not able to automagically adapt an old patch to a heavily modified file... Did your manual experimentation involve any "fuzzy logic", pattern recognition or code interpretation...? Maybe check http://en.wikipedia.org/wiki/Patch_(Unix)#Advanced_diffs
JD
On Tue, 2010-06-15 at 02:44 -0700, John Doe wrote:
I am afraid patch is not able to automagically adapt an old patch to a heavily modified file...
That's what I was afraid of. I was hoping, however, that there might be some way to verify that everything in the patch has now been done in the new version. My best idea on that score is to inspect the contents of the old diff and the new diff to make sure that they are the same length and refer to the same stuff.
Did your manual experimentation involve any "fuzzy logic", pattern recognition or code interpretation...?
I didn't think so, but perhaps my idea of "fuzzy" is different than the computer's. It's simply a matter of finding a reference to dancing zebras and changing it to waltzing giraffes. In the one patch that I have re-created so far, the two lines that need to be changed appeared to be the same in both the old and new version of the program, but their positions in the file are different -- the change is now on line 165 instead of line 142 and so on.
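The zebras-to-giraffes case above, where the lines are the same but sit at a different position, can be handled by matching on content rather than on line number. What follows is only a rough sketch of that idea and not a description of how patch itself works. The names transplant_changes and find_block are made up for illustration, the three inputs are assumed to be lists of lines (old file, old file after patching, new file), and the first content match in the new file is assumed to be the right one.

```python
from difflib import SequenceMatcher


def find_block(haystack, needle):
    """Return the first index where the list `needle` occurs in `haystack`, or -1."""
    if not needle:
        return -1
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i + len(needle)] == needle:
            return i
    return -1


def transplant_changes(old, old_patched, new, context=1):
    """Re-apply the old -> old_patched changes to `new` by matching content.

    Returns (result_lines, failures). Each failure is a 1-based
    (first, last) range of lines in `old` whose text, together with
    `context` preceding lines, could not be found in `new`.
    """
    result = list(new)
    failures = []
    matcher = SequenceMatcher(a=old, b=old_patched, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Search for the changed lines plus a little leading context,
        # so that a pure insertion still has something to anchor on.
        lead = old[max(0, i1 - context):i1]
        position = find_block(result, lead + old[i1:i2])
        if position < 0:
            failures.append((i1 + 1, i2))
            continue
        start = position + len(lead)
        # Swap the old lines for their patched counterparts.
        result[start:start + (i2 - i1)] = old_patched[j1:j2]
    return result, failures
```

Anything reported in failures would still need to be handled by hand, and a block of text that occurs more than once in the new file could be changed in the wrong place, so the result should be diffed and reviewed before it is turned into a new patch.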
Maybe check http://en.wikipedia.org/wiki/Patch_(Unix)#Advanced_diffs
The original patch files appear to already be unified diffs.
I guess I'll just have to bite the bullet and rewrite some parts of this thing manually to match the old patch files. My major concern is that I'll get lost in the woods and miss something; hopefully comparing the old patch files to a new diff will allow me to check that.
Frank Cox wrote, On 06/15/2010 11:51 AM:
On Tue, 2010-06-15 at 02:44 -0700, John Doe wrote:
I am afraid patch is not able to automagically adapt an old patch to a heavily modified file...
That's what I was afraid of. I was hoping, however, that there might be some way to verify that everything in the patch has now been done in the new version. My best idea on that score is to inspect the contents of the old diff and the new diff to make sure that they are the same length and refer to the same stuff.
I guess I'll just have to bite the bullet and rewrite some parts of this thing manually to match the old patch files. My major concern is that I'll get lost in the woods and miss something; hopefully comparing the old patch files to a new diff will allow me to check that.
Frank, Some questions that you should probably think about for yourself, and might help those of us on the list help some more.
Where did the original SRPM come from? What was it of/for? Does the original source repository/group exist anymore? ... someone else may have already been here with the product you are looking at.
Does the person who is building the new SRPM understand _why_ the old patches were created, i.e., what did it fix? Does the person who is building the new SRPM understand in each patch case that either _what_ the patch 'fixed' has not been fixed in the upstream, or was fixed but not in the same way, i.e., contact upstream and ask if the reasons for the patches have gone away so you don't need to patch for it anymore? Would the upstream be interested in integrating the patches, or similar functionality changes, for you? Would the upstream be interested in integrating the spec file for you?
On Tue, 2010-06-15 at 13:06 -0400, Todd Denniston wrote:
Where did the original SRPM come from?
rpmfusion.
What was it of/for?
vice-2.1-3.src.rpm
Does the original source repository/group exist anymore? ... someone else may have already been here with the product you are looking at.
I emailed Hans to ask if he was planning to update to version 2.2 and he replied that he's a bit short of time. So I have decided to take a stab at it myself and see what develops.
Does the person who is building the new SRPM
That would be me.
understand _why_ the old patches were created, i.e., what did it fix?
That's part of what I'm working on figuring out.
Does the person who is building the new SRPM understand in each patch case that either _what_ the patch 'fixed' has not been fixed in the upstream, or was fixed but not in the same way, i.e., contact upstream and ask if the reasons for the patches have gone away so you don't need to patch for it anymore?
Most of the patches seem to have been there for quite some period of time. I initially tried creating and compiling a srpm/rpm that didn't include the patches to see what would happen. It compiled and installed, but didn't actually work. Therefore, at least some of the patches are still required.
Would the upstream be interested in integrating the patches, or similar functionality changes, for you? Would the upstream be interested in integrating the spec file for you?
That I don't know, but once it's whipped into shape I could ask them, I suppose.
On Tue, 15 Jun 2010, Frank Cox wrote:
By way of experimentation, I manually changed one of the files in the new version to match what the patch says it should be, then created a new patch file from that and it applies and appears to work fine. (I patched the previous version's file, compared the result to the original and made the same change in the new version's file.)
ugghhh --- doable, but laborious ... ;)
I have two questions:
First, am I going about this the right way?
no -- Usually one unrolls the old tree, applies the patches to the old; and then unrolls the new in a directory 'next to' the first, and diffs from a point above the top of each
This produces a new patch set, which may already have some of what the older patches formerly needed to do (or a wholly different approach, when two forks diverge)
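On the command line, the side-by-side tree comparison described above is usually done with something like "diff -urN old-tree new-tree" run from the parent directory. As a minimal sketch of the same idea in Python, assuming both trees contain text files only, the following walks two unpacked trees and collects a unified diff for every file that differs. The names tree_diff and read_lines are hypothetical helpers, not part of any packaging tool.

```python
import difflib
from pathlib import Path


def read_lines(path):
    """Return the file's lines with their newlines, or [] if the file is absent."""
    if not path.is_file():
        return []
    return path.read_text(errors="replace").splitlines(keepends=True)


def tree_diff(old_root, new_root):
    """Return unified-diff lines for every file that differs between two trees.

    A file that exists on one side only is compared against an empty
    file, much like the -N option of diff.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    relative = set()
    for root in (old_root, new_root):
        relative.update(p.relative_to(root) for p in root.rglob("*") if p.is_file())
    lines = []
    for rel in sorted(relative):
        lines.extend(difflib.unified_diff(
            read_lines(old_root / rel),
            read_lines(new_root / rel),
            fromfile=f"{old_root.name}/{rel.as_posix()}",
            tofile=f"{new_root.name}/{rel.as_posix()}",
        ))
    return lines
```

The output is meant for reading and cherry-picking, not for applying blindly. Identical files produce no output at all, so only the files that actually changed between the two trees show up.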
And if so, is there a way to automate the process as described in the previous paragraph?
Early automation of a partially understood technology seems like a premature optimization ;)
Second, what is the proper convention for handling this in a rpm? The obvious solution seems to be to create new patch files and throw the old ones away, then build the rpm from that. Some of these patches appear to go back several versions, though, so is there a better or more proper way to handle this than just throwing them out and making a whole new set of patches?
A serious developer will usually have available a complete copy of the master upstream, and local branches which are used and discarded without a second thought, once the 'fruit' from an approach is 'cherrypicked' [disk space has become inexpensive]; Mere re-packagers can usually get by with less, and simply pluck prior packages containing (in part) tarballs and patches, and diff between two points in time
This is to some degree a matter of taste and administrative approach. A big fat batch was used in the old and early kernel and libc days to distribute 'nightly deltas' which one would D/L and apply one after another against a periodic master tarball. As bandwidth availability has grown, this fell by the wayside, and later distributed version control systems ('VCS') have emerged as the approach favored there.
The world is moving to building from VCS as well as snap-shotting; for safety's sake, periodically rolling and signing a SRPM or saving a file containing a signed set of checksums for a backup tarball comes to mind as 'good practices'. See: http://www.unrealircd.com/ and the prior experience of the Linux kernel folks, as well as at Fedora and Red Hat, with the issue of detecting possible hostile substituted checkins.
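As a small illustration of the checksum half of that practice, here is a sketch that writes a SHA-256 manifest for a set of tarballs, using the same "hash, two spaces, filename" layout that sha256sum prints. The function names and the manifest file name are assumptions made for the example, and the signing step (for instance with GnuPG) is deliberately left out.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(paths, manifest_path):
    """Write one 'digest  filename' line per file and return those lines."""
    lines = [f"{sha256_of(p)}  {Path(p).name}\n" for p in paths]
    Path(manifest_path).write_text("".join(lines))
    return lines
```

A call such as write_manifest(["some-backup.tar.gz"], "SHA256SUMS") would produce a file that can later be signed and kept alongside the backup; both names here are placeholders.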
I have learned a lot more about patch and diff tonight than I ever needed to know before. Very cool stuff, and very useful.
I wrote this introduction to let people get an early success doing patching and SRPM building
http://www.owlriver.com/tips/patching_srpms/
and it is designed to be approachable
-- Russ herrold
On Tue, 2010-06-15 at 08:55 -0400, R P Herrold wrote:
First, am I going about this the right way?
no -- Usually one unrolls the old tree, applies the patches to the old; and then unrolls the new in a directory 'next to' the first, and diffs from a point above the top of each
What would that gain me? Following this procedure would get me a big diff showing the differences between the old (patched) version and the new (unpatched) version. But that would contain a list of all of the stuff in the new version which probably doesn't need to be changed, and revert the patches from the previous patched version. In other words, unless I'm looking at this backward somehow, I don't see the point.
This produces a new patch set, which may already have some of what the older patches formerly needed to do (or a wholly different approach, when two forks diverge)
It would also revert the old patch set, wouldn't it?
And if so, is there a way to automate the process as described in the previous paragraph?
Early automation of a partially understood technology seems like a premature optimization ;)
Ah.... but having the computer tell me that I forgot to include the change made to line X in the old patch in my newly rewritten file would be a lot easier (and probably more reliable) than the Mark I Eyeball method.
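One way to get the computer to do that check, along the lines of comparing the old patch with a freshly generated diff as mentioned earlier in the thread, is to compare only the added and removed lines of the two patches and ignore the line numbers. The sketch below assumes both inputs are unified diffs held as strings; changed_lines and missing_changes are hypothetical names, and leading or trailing whitespace differences are deliberately ignored.

```python
from collections import Counter


def changed_lines(patch_text):
    """Count the added and removed lines of a unified diff.

    File headers and @@ hunk headers are skipped, so line numbers play
    no part. Limitation: a removed line whose own text starts with "--"
    (or an added line starting with "++") looks like a file header and
    is skipped as well.
    """
    added, removed = Counter(), Counter()
    for line in patch_text.splitlines():
        if line.startswith(("+++", "---", "@@")):
            continue
        if line.startswith("+"):
            added[line[1:].strip()] += 1
        elif line.startswith("-"):
            removed[line[1:].strip()] += 1
    return added, removed


def missing_changes(old_patch, new_patch):
    """Report changes present in old_patch that new_patch does not make."""
    old_added, old_removed = changed_lines(old_patch)
    new_added, new_removed = changed_lines(new_patch)
    return {
        "additions_missing": sorted((old_added - new_added).elements()),
        "removals_missing": sorted((old_removed - new_removed).elements()),
    }
```

Two empty lists suggest that every line the old patch added or removed is also handled by the new one. A non-empty list is only a pointer to where the eyeball should look, since upstream may have legitimately changed the line in question.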
On Tue, 15 Jun 2010, Frank Cox wrote:
On Tue, 2010-06-15 at 08:55 -0400, R P Herrold wrote:
First, am I going about this the right way?
no -- Usually one unrolls the old tree, applies the patches to the old; and then unrolls the new in a directory 'next to' the first, and diffs from a point above the top of each
What would that gain me? Following this procedure would get me a big diff showing the differences between the old (patched) version and the new (unpatched) version. But that would contain a list of all of the stuff in the new version which probably doesn't need to be changed, and revert the patches from the previous patched version. In other words, unless I'm looking at this backward somehow, I don't see the point.
The point is to see where changes are happening, and to be able to cherry pick in a migration toward the latest [but being able to spot the deltas from the prior version], which, as I understood it, was your goal
I did not suggest applying that resulting diff, as a patch without review, but rather as a means to get visibility as to what changes were being 'upstreamed'
-- Russ herrold
On Tue, 2010-06-15 at 12:17 -0400, R P Herrold wrote:
The point is to see where changes are happening, and to be able to cherry pick in a migration toward the latest [but being able to spot the deltas from the prior version], which, as I understood it, was your goal
I can see that. However, I think 95% of the resulting diff would be irrelevant to what I'm trying to do, and it would become even easier to get lost in the weeds...
I did not suggest applying that resulting diff, as a patch without review, but rather as a means to get visibility as to what changes were being 'upstreamed'
The current patch files number about a half-dozen of varying sizes. They seem to be arranged by functionality, i.e. patch 1 modifies the location of the data directories, patch 2 modifies the variable names in parts of the screen handling, and so on.
A single monolithic diff of the entire tree would lose this functional separation of the patches, and it would be a lot more maintainable and understandable into the future if I could retain that instead.
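One possible way to keep that separation is to use the old patches as an index: list the files each one touches, then sort the pieces of any larger diff back into those groups. The following is a minimal sketch under the assumption that the patches are unified diffs applied with one leading path component stripped (as with patch -p1). The names files_touched and patches_by_file are invented for the example.

```python
def files_touched(patch_text, strip=1):
    """Return the paths named in the '+++ ' headers of a unified diff.

    `strip` leading path components are removed, similar to patch -pN.
    Limitation: an added line whose own text starts with "++ " would be
    mistaken for a header.
    """
    paths = set()
    for line in patch_text.splitlines():
        if not line.startswith("+++ "):
            continue
        # Drop the optional tab-separated timestamp after the file name.
        name = line[4:].split("\t")[0].strip()
        if name == "/dev/null":
            continue
        paths.add("/".join(name.split("/")[strip:]))
    return paths


def patches_by_file(patches, strip=1):
    """Map each touched file to the sorted names of the patches touching it.

    `patches` is a dict of {patch file name: patch text}.
    """
    index = {}
    for patch_name, text in patches.items():
        for path in files_touched(text, strip):
            index.setdefault(path, []).append(patch_name)
    return {path: sorted(names) for path, names in sorted(index.items())}
```

A file that shows up under more than one patch is a hint that the patches overlap there and that those hunks need to be assigned by hand.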
On Tue, 15 Jun 2010, Frank Cox wrote:
A single monolithic diff of the entire tree would lose this functional separation of the patches, and it would be a lot more maintainable and understandable into the future if I could retain that instead.
time for training of that Mk I eyeball ;)
some find cats easier to herd if all yoked together
- R
On Tue, 2010-06-15 at 12:17 -0400, R P Herrold wrote:
The point is to see where changes are happening, and to be able to cherry pick in a migration toward the latest [but being able to spot the deltas from the prior version], which, as I understood it, was your goal
I just found the slickest tool to compare files.
meld
"yum install meld" will get it for you from the epel repository.
It even does 3-way compares, which is just exactly what I need for this project.
On Tue, 15 Jun 2010, Frank Cox wrote:
I just found the slickest tool to compare files.
meld
"yum install meld" will get it for you from the epel repository.
I did not know that Mr. Spock had brought that back from Vulcan; next thing you know the secret of the nerve pinch will be revealed ;)
----- Original Message ----
From: R P Herrold herrold@centos.org To: CentOS mailing list centos@centos.org Sent: Tue, June 15, 2010 4:14:03 PM Subject: [CentOS] rpm - diff and patch updating
On Tue, 15 Jun 2010, Frank Cox wrote:
I just found the slickest tool to compare files.
meld
"yum install meld" will get it for you from the epel repository.
I did not know that Mr. Spock had brought that back from Vulcan; next thing you know the secret of the nerve pinch will be revealed ;)
Huh, am I seeing things correctly? Russ has a sense of humor? Have to put that in my log books. :-D
Frank Cox wrote:
I have two questions:
First, am I going about this the right way? And if so, is there a way to automate the process as described in the previous paragraph?
Second, what is the proper convention for handling this in a rpm? The obvious solution seems to be to create new patch files and throw the old ones away, then build the rpm from that. Some of these patches appear to go back several versions, though, so is there a better or more proper way to handle this than just throwing them out and making a whole new set of patches?
You probably can't automate this - but note that many of the patches included in RHEL/CentOS RPMs are to backport fixes from newer versions of the code without bringing in new/different features. So, if you start with newer base code you may not need many/most of the patches at all.