[CentOS-devel] brand hunting utility

Wed Jun 4 17:36:45 UTC 2014
Kay Williams <kay at deployproject.org>

I've written a small utility (brand-hunter.py) to help with tracking down
branding issues. It does the following:

* accepts a list of srpms to search (if no srpms are listed, all srpms are
searched)
* downloads the srpms from ftp.redhat.com (across all rhel7 srpm repos)
* extracts srpm content, including any bz2 or gz tarfiles.
* searches text files (in multiline mode) for the pattern
'[Rh][Ee][Dd]\s?[Hh][Aa][Tt]'
* searches for any binary files
* writes a list of issues (by file and line) to an issues.txt file (per
srpm)
* writes a noissues.txt file listing any srpms for which no issues were
found.
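
For anyone who wants to experiment before the utility is posted, here is a
minimal sketch of the search step only. It is not the code from
brand-hunter.py. It assumes the leading character class is meant to be [Rr],
that a NUL byte is a good enough test for "binary", and that files decode as
UTF-8; find_issues and looks_binary are made-up names.

```python
import re

# Assumption: case-insensitive "red hat" with at most one whitespace
# character between the two words. \s also matches a newline, so the
# whole file is searched at once rather than line by line.
PATTERN = re.compile(r"[Rr][Ee][Dd]\s?[Hh][Aa][Tt]")


def looks_binary(data: bytes) -> bool:
    """Heuristic (an assumption): data containing a NUL byte is binary."""
    return b"\x00" in data


def find_issues(name: str, data: bytes) -> list:
    """Return one issue string per match, or one entry for a binary file."""
    if looks_binary(data):
        return [f"{name}: binary file"]
    text = data.decode("utf-8", errors="replace")
    issues = []
    for match in PATTERN.finditer(text):
        # Line number of the first character of the match (1-based).
        line_no = text.count("\n", 0, match.start()) + 1
        issues.append(f"{name}:{line_no}: {match.group(0)!r}")
    return issues
```

Calling find_issues("foo.spec", open("foo.spec", "rb").read()) would return a
list of strings that could be written to an issues.txt file; an empty list
would mean the file is clean.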

If folks are interested, please let me know a location where I can make the
utility available.

I ran it across the first 100 srpms (by yum sort order) and found only 7
srpms with no issues:

GreSQL-4.0-9.el7.src 
SOAPpy-0.11.6-17.el7.src 
akonadi-1.9.2-4.el7.src 
ant-antunit-1.2-10.el7.src 
aopalliance-1.0-8.el7.src 
apache-commons-exec-1.1-11.el7.src 
apache-parent-10-14.el7.src

The other 93 srpms had a range of issues, the most common of which are:

* redhat.com email addresses in patch files or author lists
* Red Hat copyright statements
* Binary files (of any kind; right now the utility flags them all as
potential issues)

The utility can be made smarter, say to ignore srpms whose only issues are
redhat.com email addresses or copyright statements. But exclusions like
these would need to be decided on a case-by-case basis, I imagine.
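
As a rough illustration of what such an exclusion could look like, the sketch
below sorts matched lines into "email", "copyright" or "other" and lets the
caller choose which kinds to ignore. None of this exists in the utility today;
the two patterns and the names classify and needs_review are assumptions made
up for the example.

```python
import re

# Hypothetical patterns for the two most common kinds of hit.
EMAIL = re.compile(r"[\w.+-]+@redhat\.com", re.IGNORECASE)
COPYRIGHT = re.compile(r"copyright.*red\s?hat", re.IGNORECASE)


def classify(line: str) -> str:
    """Label a matched line as 'email', 'copyright' or 'other'."""
    if EMAIL.search(line):
        return "email"
    if COPYRIGHT.search(line):
        return "copyright"
    return "other"


def needs_review(lines, ignore=frozenset()) -> bool:
    """True if any matched line falls outside the ignored kinds."""
    return any(classify(line) not in ignore for line in lines)
```

With ignore={"email", "copyright"}, an srpm whose only hits are author
addresses and copyright headers would drop out of the report, while one that
also mentions Red Hat in a user-visible string would still be flagged.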

Kay