On Wednesday 28 September 2005 20:18, Max wrote:
> Ignacio Vazquez-Abrams wrote:
>> Wow. Blaming the OS vendor for an ISV's issues.
>> That whizzing sound you hear is my eyes rolling over and over again,
>> really quickly.
> Yeah, wow... I've been using Red Hat products since 7.3. I've run 8, 9,
> and all the Fedora Core distros. Now I run CentOS, and we use RHEL 3 at
> my work. I've never had any distro, even the Fedora "testbeds", just
> stop working, unless I caused the issue myself from playing.
Hmmm, interesting. Now, understand that I use Linux daily, and that I have no
plans of switching around. However, reading the article I get the following
summary:
1.) Paying Customer had to use particular software components, including a
particular version of RHEL;
2.) Paying Customer had intermittent lockups of the machine that were
difficult to reproduce;
3.) Paying Customer got tired of Red Hat's 'WORKSFORME' bug resolution (that's
the typical bugzilla tag when such an irreproducible problem occurs);
4.) Paying Customer quit paying and switched to Windows, which worked better
for them (meaning, it didn't crash).
Now, just exactly what part of this is untrue or would require a Microsoft
payoff? I personally have seen instances of Red Hat's WORKSFORME attitude;
one example is tftp, which ships as version 0.39 in both RHEL3 and RHEL4.
Version 0.40 has been out for a while, and it fixes a nasty bug in the tftp file
translation (remapping) feature/misfeature (which I have had to use before on
certain releases of Cisco's 7960 IP telephone). Red Hat has a bug in
bugzilla on this issue for RHEL3 that predates U4, yet the bug won't be
addressed until U7! And even then the two-line patch has to be backported;
can't let the customer see a higher version number! (Bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143536 ) Follow the
thread; with such a simple bug taking so long, what kind of mess are bigger
bugs in?
And there are several kernel bugs that are like this, with strange lockups and
such that are either ignored (NOTABUG or WONTFIX resolution) or the reporter
is told to file 'upstream'. See the list at
https://bugzilla.redhat.com/bugzilla/buglist.cgi?product=Red+Hat+Enterprise+Linux&version=4&component=kernel&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&bug_status=CLOSED&bug_status=NEEDINFO&bug_status=MODIFIED&short_desc_type=allwordssubstr&short_desc=&long_desc_type=allwordssubstr&long_desc=
for yourself, particularly bug IDs 157170 (Closed: Wontfix), 162653 (Closed:
Wontfix because it's a CentOS kernel) and 163373 (Closed: Wontfix
megaRAID-old). Yeah, I can just see a bug posted to LKML by a Red Hat customer
getting priority attention from Linus and Co. I can see that about as well
as I can see myself walking on the moon tomorrow.
Let me put it this way: if I were paying $1,200 per server per year for RHEL
support, I would not tolerate the WORKSFORME attitude either; I would expect
and demand that Red Hat send someone out to my site and diagnose my problem
for those costs (and if they wouldn't do it, I would find a vendor that
would, for that price, which is exactly what this particular customer did).
Since I am not paying Red Hat for support at this point, I obviously cannot
have that attitude with bugs I find that are clearly Red Hat bugs; but I
certainly can understand this guy's attitude. We shouldn't just
automatically dismiss such a problem as being a troll. We should look
intelligently and logically at the problem.
And no matter what ISV's software is running, the OS should not LOCK UP!
Sounds like a RHEL kernel issue to me.
So, Jim Perrin, I think a HAHAHAHAHA is entirely inappropriate. This guy had
a real problem, and Red Hat couldn't fix it. Meaning CentOS still has the
problem on that same hardware under the same conditions, since CentOS is as
close to upstream RHEL as is possible under the trademark guidelines.
We're not told what 'diagnostic' Red Hat recommended be run on the box, but I
know that if I were in the same situation I would ask Red Hat to come to my
site and pull that data themselves. That's what a support contract pays for,
in my opinion. If I buy 'turnkey' I expect field circus to make it turnkey. (Of
course, old-timers will recognize the DEC jab there, and the whole idea of
turnkey is a Bad Idea in my opinion, but that is the world we're in. See
http://nemesis.lonestar.org/stories/stages.html for a very humorous tale from
an old-timer in the computer business. This guy, Frank Durda IV, wrote much
of the I/O processor code for the Tandy 6000 Xenix computer of the mid-1980s,
so he is quite a character.)
Further, as to 'automatic' updates, I totally agree with Mr. Horton in the
article. Automatic updates should Just Work and should not break the whole
system. Things like, oh, I don't know, caching-nameserver hosing files
(caching-nameserver is STILL INSTALLED BY DEFAULT EVEN WITH bind-server
installed, which is a recipe for this sort of problem), an httpd update that
fails to properly restart httpd afterwards, a kernel that quits working with
your hardware; you know, minor inconveniences that can only cost thousands of
dollars of downtime, nothing major. Everyone has seen some of the updates
come down in broken states; search the archives of nahant-list or taroon-list
for lots of those. I personally have turned off automatic updating at
several sites for these reasons and because of the poor track record of
automatic updates.
And, furthermore, I thought the whole idea of the RHEL platform was ABI
stability. If SAP thinks, as an ISV, that they need to recertify every
update, then something is badly wrong, and it's not necessarily at SAP. I
personally hypothesize that SAP perhaps got burned one too many times by a
poor update and now blanket recertifies for every update to make sure their
customers keep running. "Poor stupid ISV trying to serve their
customers...they should Get With the Program!" Of course, Windows has this
problem, too, but it's typically limited to Service Packs. Yuck, just typing
that phrase turns my stomach, thinking about what Win2k3 SP1 does to
many dozens of programs...
Sorry for the rant, but it is ridiculous to automatically dismiss a real-world
problem. Note that the same Mr. Horton is still using Linux for web services
and such, and likes and supports it in that role; it's just the SAP server he
had to transition to Windows due to a lowly business concern. Nope, the
business bottom line should never interfere with technological zealotry!
Oh, quick quiz: what happens to a CentOS box when you reboot after running out
of disk space on the / filesystem? You get a graphical login just fine, but
then you can't log in (you log in, and it returns you to the login screen). Not
hard to fix, but very mysterious to the newbie.
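For what it's worth, the full / filesystem in the quiz above is easy to spot
before a reboot. Below is a minimal sketch, assuming a Python 3 interpreter is
available on the box; the function names and the 5% threshold are my own
illustrative choices, not anything Red Hat or CentOS ships.

```python
import shutil


def is_low(free_bytes: int, total_bytes: int, min_free_fraction: float = 0.05) -> bool:
    """Return True when the free share of a filesystem is below the threshold.

    The 5% default is an arbitrary, hypothetical cutoff; tune it per site.
    """
    if total_bytes <= 0:
        # Treat an empty or unreadable filesystem as a problem.
        return True
    return free_bytes / total_bytes < min_free_fraction


def check_root(path: str = "/") -> bool:
    """Check the filesystem that holds `path` (the root filesystem by default)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return is_low(usage.free, usage.total)


if __name__ == "__main__":
    if check_root():
        print("WARNING: / is nearly full; free some space before rebooting")
    else:
        print("/ has enough free space")
```

Run from cron or by hand before a reboot, something like this would at least
turn the mystery into a warning message.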