Lamar Owen wrote:
On Wednesday 28 September 2005 20:18, Max wrote:
  
Ignacio Vazquez-Abrams wrote:
    
Wow. Blaming the OS vendor for an ISV's issues.

That whizzing sound you hear is my eyes rolling over and over again,
really quickly.
      

  
Yeah, wow...I've been using Red Hat products since 7.3. I've run 8, 9,
and all the Fedora Core distros. Now I run CentOS, and we use RHEL 3 at
my work. I've never had any distro, even the Fedora "testbeds", just
stop working, unless I caused the issue myself from playing.
    

Hmmm, interesting.  Now, understand that I use Linux daily, and that I have no 
plans of switching around.  However, reading the article I get the following 
summary:
1.)	Paying Customer had to use particular software components, including a 
particular version of RHEL;
2.)	Paying Customer had intermittent lockups of the machine that were 
difficult to reproduce;
3.)	Paying Customer got tired of Red Hat's 'WORKSFORME' bug resolution (that's 
the typical bugzilla tag when such an irreproducible problem occurs);
4.)	Paying Customer quit paying and switched to Windows, which worked better 
for them (meaning, it didn't crash).

Now, just exactly what part of this is untrue or would require a Microsoft 
payoff?  I personally have seen instances of Red Hat's WORKSFORME attitude; 
one example is the version of tftp being shipped with RHEL3 and 4, 0.39.  
0.40 has been out for a while, and it fixes a nasty bug in the tftp file 
translation (remapping) feature/misfeature (which I have had to use before on 
certain releases of Cisco's 7960 IP telephone).  Red Hat has a bug in 
bugzilla on this issue for RHEL3 that predates U4, yet the bug won't be 
addressed until U7!  And even then the two-line patch has to be backported; 
can't let the customer see a higher version number! (Bugzilla: 
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143536 )  Follow the 
thread; with such a simple bug taking so long, what kind of mess are bigger 
bugs in? 

And there are several kernel bugs that are like this, with strange lockups and 
such that are either ignored (NOTABUG or WONTFIX resolution) or the reporter 
is told to file 'upstream'. See the list at 
https://bugzilla.redhat.com/bugzilla/buglist.cgi?product=Red+Hat+Enterprise+Linux&version=4&component=kernel&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&bug_status=CLOSED&bug_status=NEEDINFO&bug_status=MODIFIED&short_desc_type=allwordssubstr&short_desc=&long_desc_type=allwordssubstr&long_desc= 
for yourself, particularly bug IDs 157170 (Closed: Wontfix), 162653 (Closed: 
Wontfix because it's a CentOS kernel) and 163373 (Closed: Wontfix 
megaRAID-old). Yeah, I can just see a bug filed in LKML by a Red Hat customer 
getting priority attention from Linus and Co.  I can see that about as well 
as I can see myself walking on the moon tomorrow.

Let me put it this way: if I were paying $1,200 per server per year for RHEL 
support, I would not tolerate the WORKSFORME attitude either; I would expect 
and demand that Red Hat send someone out to my site and diagnose my problem 
for those costs (and if they wouldn't do it, I would find a vendor that 
would, for that price, which is exactly what this particular customer did).  
Since I am not paying Red Hat for support at this point, I obviously cannot 
have that attitude with bugs I find that are clearly Red Hat bugs; but I 
certainly can understand this guy's attitude.  We shouldn't just 
automatically dismiss such a problem as being a troll.  We should look 
intelligently and logically at the problem.

And no matter what ISV's software is running, the OS should not LOCK UP!  
Sounds like a RHEL kernel issue to me. 

So, Jim Perrin, I think a HAHAHAHAHA is entirely inappropriate.  This guy had 
a real problem, and Red Hat couldn't fix it.  Meaning CentOS still has the 
problem on that same hardware under the same conditions, since CentOS is as 
close to upstream RHEL as is possible under the trademark guidelines.

We're not told what 'diagnostic' Red Hat recommended be run on the box, but I 
know that if I were in the same situation I would ask Red Hat to come to my 
site and pull that data themselves.  That's what I pay support for, in my 
opinion.  If I buy 'turnkey' I expect field circus to make it turnkey. (Of 
course, old-timers will recognize the DEC jab there, and the whole idea of 
turnkey is a Bad Idea in my opinion, but that is the world we're in.  See 
http://nemesis.lonestar.org/stories/stages.html for a very humorous tale from 
an old-timer in the computer business.  This guy, Frank Durda IV, wrote much 
of the I/O processor code for the Tandy 6000 Xenix computer of the mid-80's, 
so he is quite a character.)

Further, as to 'automatic' updates, I totally agree with Mr. Horton in the 
article.  Automatic updates should Just Work and should not break the whole 
system.  Things like, oh, I don't know: caching-nameserver hosing files 
(caching-nameserver is STILL INSTALLED BY DEFAULT EVEN WITH bind-server 
installed, which is a recipe for this sort of problem); an httpd update that 
fails to properly restart httpd after the update; the kernel quitting on 
your hardware.  You know, minor inconveniences that can only cost thousands of 
dollars of downtime, nothing major.  Everyone has seen some of the updates 
come down in broken states; search the archives of nahant-list or taroon-list 
for lots of those.  I personally have turned off automatic updating at 
several sites for these reasons and because of the poor track record of 
automatic updates.

And, furthermore, I thought the whole idea of the RHEL platform was ABI 
stability.  If SAP thinks, as an ISV, that they need to recertify every 
update then something is bad wrong, and it's not necessarily at SAP.  I 
personally hypothesize that SAP perhaps got burned one too many times by a 
poor update and now blanket recertifies for every update to make sure their 
customers keep running.  "Poor stupid ISV trying to serve their 
customers...they should Get With the Program!"  Of course, Windows has this 
problem, too, but it's typically limited to Service Packs.  Yuck, just typing 
that phrase turns my stomach, thinking about what Win2k3 SP1 does to many 
dozens of programs...

Sorry for the rant, but it is ridiculous to automatically dismiss a real-world 
problem.  Note that the same Mr. Horton is still using Linux for web services 
and such, and likes and supports it in that role; it's just the SAP server he 
had to transition to Windows due to a lowly business concern.  Nope, the 
business bottom line should never interfere with technological zealotry!

Oh, quick quiz: what happens to a CentOS box when you reboot after running out 
of disk space on the / filesystem?  You get a graphical login just fine, but 
then you can't log in (you log in, and it returns you to the login screen).  Not hard to 
fix, but very mysterious to the newbie.
  


I quite agree with this post, especially the part about 'automatic updates should just work' !!!! I am on the SuSE AMD64 list as well, and about 1/2 of the posts there start out with 'My machine updated itself last night w/ YOU (their auto-update package) & now it won't boot'. I came to Linux from SGI machines: *EXPENSIVE*, but darn close to 'it just works'. I realize that for the cost savings of Linux compared to SGI, I might have to sacrifice a bit of 'it just works', but the problem of apparently-incompletely-tested OS updates seems far too prevalent in the Linux world.

I am here to stay (Linux vs. SGI), make no mistake, but this is a situation that *SOMEONE* needs to drum up a fix for. I lay the problem mostly at the feet of the distro-providers, since they DO make $$$$ selling the software & would seem to be in the best position to police their own distro. I have also had a few of RH's bug-blow-offs in the more distant past (several years ago, circa 1999) & it does nothing for the Linux movement or for them professionally.

SGI uses a fairly sophisticated & probably proprietary suite of utilities to test new OS components and assure that at least bugs aren't re-introduced and that bugs that are claimed to be fixed actually are. I don't know who would assume this responsibility in a distributed development environment such as Linux flourishes under, but it seems increasingly obvious that *SOMEBODY* needs to. My $0.02 ....


There, I feel better now :-).


-- 
	William A. Mahaffey III
---------------------------------------------------------------------
	Remember, ignorance is bliss, but
	willful ignorance is LIBERALISM !!!!