On Sun, 2009-11-15 at 13:44 -0500, Jeff Johnson wrote:
FWIW, there are deep fundamental design issues wrt Courgette that have nothing to do with whether Google is choosing Courgette for Chromium-peculiar updates.
For starters:
RFC 3229 at http://www.rfc-editor.org/rfc/rfc3229.txt
This is what Subversion uses (afaik) instead of xdelta. While I privately like xdelta _A LOT_, and I think that Josh MacDonald's master's thesis and xdelta[123] are the cat's pajamas, the xdelta code is quite obscure and hard to justify deploying generally. YMMV, everyone's does, but (objectively) Subversion chose vdelta (a la RFC 3229) rather than xdelta because the xdelta code is insanely difficult and uncommented.
FWIW, I think the delta algorithm is one of the smaller problems deltarpm needs to deal with right now.
- disassembling code to remove pointer entropy (as in Courgette)
may be a win for executables, but is not generally useful for presto (or packaging).
To clarify for others following this thread, most of the files in an rpm are data, *not* executables.
Deltarpm currently has two big problems that keep it from having hugely efficient deltas even when two rpms have barely changed.
1) Any colored binaries that aren't in a multilib directory (i.e. /usr/bin/*) are never delta'd at all. This was because we didn't want to lose the complete delta because some 32-bit package on a 64-bit machine was missing some file in /usr/bin. We may want to rethink this now, as 64-bit installs tend to have fewer 32-bit packages than when this decision was made.
2) A small change in an uncompressed file will result in a huge change after it's been compressed. Many of the larger packages have at least some compressed files, and those files are essentially not delta'd at all.
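To illustrate problem 2, here is a minimal sketch, assuming Python's zlib as a stand-in for whatever compression a packaged file actually uses. The payload and the helper names (make_payload, count_differing_bytes) are made up for the example and have nothing to do with deltarpm's own code. It flips one bit in the uncompressed data and counts how many bytes differ before and after compression.

```python
import random
import zlib


def count_differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions where a and b differ, plus any difference in length."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))


def make_payload(lines: int = 2000, seed: int = 0) -> bytes:
    """Build repetitive, text-like data (hypothetical stand-in for a packaged file)."""
    rng = random.Random(seed)
    words = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon", b"zeta"]
    return b"\n".join(
        b" ".join(rng.choice(words) for _ in range(8)) for _ in range(lines)
    )


original = make_payload()
changed = bytearray(original)
changed[100] ^= 0x01  # flip one bit in a single byte near the start
modified = bytes(changed)

# Compare the two versions byte by byte, first raw and then compressed.
raw_diff = count_differing_bytes(original, modified)
packed_original = zlib.compress(original)
packed_modified = zlib.compress(modified)
packed_diff = count_differing_bytes(packed_original, packed_modified)

print(f"uncompressed: {raw_diff} byte differs out of {len(original)}")
print(f"compressed:   {packed_diff} bytes differ out of {len(packed_original)}")
```

A byte-oriented delta of the uncompressed pair only has to encode the one changed byte, while the compressed pair diverges over a much wider span, which is why such files end up being shipped more or less whole.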
In my mind, at least, solving these two problems will have a far bigger effect in reducing deltarpm size than adopting Courgette.
Jonathan