On Wednesday 28 September 2005 20:18, Max wrote:
Ignacio Vazquez-Abrams wrote:
Wow. Blaming the OS vendor for an ISV's issues.
That whizzing sound you hear is my eyes rolling over and over again, really quickly.
Yeah, wow...I've been using Red Hat products since 7.3. I've run 8, 9, and all the Fedora Core distros. Now I run CentOS, and we use RHEL 3 at my work. I've never had any distro, even the Fedora "testbeds", just stop working, unless I caused the issue myself from playing.
Hmmm, interesting. Now, understand that I use Linux daily, and that I have no plans of switching around. However, reading the article I get the following summary: 1.) Paying Customer had to use particular software components, including a particular version of RHEL; 2.) Paying Customer had intermittent lockups of the machine that were difficult to reproduce; 3.) Paying Customer got tired of Red Hat's 'WORKSFORME' bug resolution (that's the typical bugzilla tag when such an irreproducible problem occurs); 4.) Paying Customer quit paying and switched to Windows, which worked better for them (meaning, it didn't crash).
Now, just exactly what part of this is untrue or would require a Microsoft payoff? I personally have seen instances of Red Hat's WORKSFORME attitude; one example is the version of tftp being shipped with RHEL3 and 4, 0.39. 0.40 has been out for a while, and it fixes a nasty bug in the tftp file translation (remapping) feature/misfeature (which I have had to use before on certain releases of Cisco's 7960 IP telephone). Red Hat has a bug in bugzilla on this issue for RHEL3 that predates U4, yet the bug won't be addressed until U7! And even then the two-line patch has to be backported; can't let the customer see a higher version number! (Bugzilla: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143536 ) Follow the thread; with such a simple bug taking so long, what kind of mess are bigger bugs in?
And there are several kernel bugs like this, with strange lockups and such, where either the bug is ignored (NOTABUG or WONTFIX resolution) or the reporter is told to file 'upstream'. See the list at https://bugzilla.redhat.com/bugzilla/buglist.cgi?product=Red+Hat+Enterprise+... for yourself, particularly bug IDs 157170 (Closed: Wontfix), 162653 (Closed: Wontfix because it's a CentOS kernel) and 163373 (Closed: Wontfix megaRAID-old). Yeah, I can just see a bug filed in LKML by a Red Hat customer getting priority attention from Linus and Co. I can see that about as well as I can see myself walking on the moon tomorrow.
Let me put it this way: if I were paying $1,200 per server per year for RHEL support, I would not tolerate the WORKSFORME attitude either; I would expect and demand that Red Hat send someone out to my site and diagnose my problem for those costs (and if they wouldn't do it, I would find a vendor that would, for that price, which is exactly what this particular customer did). Since I am not paying Red Hat for support at this point, I obviously cannot have that attitude with bugs I find that are clearly Red Hat bugs; but I certainly can understand this guy's attitude. We shouldn't just automatically dismiss such a problem as being a troll. We should look intelligently and logically at the problem.
And no matter what ISV's software is running, the OS should not LOCK UP! Sounds like a RHEL kernel issue to me.
So, Jim Perrin, I think a HAHAHAHAHA is entirely inappropriate. This guy had a real problem, and Red Hat couldn't fix it. Meaning CentOS still has the problem on that same hardware under the same conditions, since CentOS is as close to upstream RHEL as is possible under the trademark guidelines.
We're not told what 'diagnostic' Red Hat recommended be run on the box, but I know that if I were in the same situation I would ask Red Hat to come to my site and pull that data themselves. That's what I'm paying for with support, in my opinion. If I buy 'turnkey' I expect field circus to make it turnkey. (Of course, old-timers will recognize the DEC jab there, and the whole idea of turnkey is a Bad Idea in my opinion, but that is the world we're in. See http://nemesis.lonestar.org/stories/stages.html for a very humorous tale from an old-timer in the computer business. This guy, Frank Durda IV, wrote much of the I/O processor code for the Tandy 6000 Xenix computer of the mid-80's, so he is quite a character.)
Further, as to 'automatic' updates, I totally agree with Mr. Horton in the article. Automatic updates should Just Work and should not break the whole system (things like, oh, I don't know, caching-nameserver hosing files (caching-nameserver is STILL INSTALLED BY DEFAULT EVEN WITH bind-server installed, which is a recipe for this sort of problem), an httpd update that fails to properly restart httpd after update, the kernel quits working with your hardware, you know, minor inconveniences that can only cost thousands of dollars of downtime, nothing major). Everyone has seen some of the updates come down in broken states; search the archives of nahant-list or taroon-list for lots of those. I personally have turned off automatic updating at several sites for these reasons and because of the poor track record of automatic updates.
And, furthermore, I thought the whole idea of the RHEL platform was ABI stability. If SAP thinks, as an ISV, that they need to recertify every update then something is bad wrong, and it's not necessarily at SAP. I personally hypothesize that SAP perhaps got burned one too many times by a poor update and now blanket recertifies for every update to make sure their customers keep running. "Poor stupid ISV trying to serve their customers...they should Get With the Program!" Of course, Windows has this problem, too, but it's typically limited to Service Packs. Yuck, just typing that phrase turns my stomach, thinking about what Win2k3 SP1 does to many dozens of programs...
Sorry for the rant, but it is ridiculous to automatically dismiss a real-world problem. Note that the same Mr. Horton is still using Linux for web services and such, and likes and supports it in that role; it's just the SAP server he had to transition to Windows due to a lowly business concern. Nope, the business bottom line should never interfere with technological zealotry!
Oh, quick quiz: what happens to a CentOS box when you reboot after running out of disk space on the / filesystem? You get a graphical login just fine, but then you can't log in (you log in, and it drops you right back at the login screen). Not hard to fix, but very mysterious to the newbie.
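For what it's worth, here is a minimal sketch of a check you could run before rebooting to see whether / is close to full. It assumes Python 3 is on the box; the function names and the 5% threshold are my own inventions for illustration, not anything CentOS or Red Hat ships.

```python
import shutil


def free_fraction(path="/"):
    """Return the fraction of the filesystem at `path` that is still free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Pseudo-filesystems can report a zero size; treat them as full.
        return 0.0
    return usage.free / usage.total


def space_is_low(path="/", min_free_fraction=0.05):
    """True when free space drops below the (hypothetical) threshold."""
    return free_fraction(path) < min_free_fraction


if __name__ == "__main__":
    frac = free_fraction("/")
    print("free on /: {:.1%}".format(frac))
    if space_is_low("/"):
        print("WARNING: / is nearly full; clean up before rebooting")
```

Run it by hand or from cron; it only reports, it does not clean anything up.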