[CentOS] Re: Installation problem, possibly RAID -- [OT] Why OS installers always "suck"

Sun Sep 11 17:40:30 UTC 2005
Edward Diener <eddielee at tropicsoft.com>

Bryan J. Smith wrote:
> On Sun, 2005-09-11 at 07:06 -0400, Edward Diener wrote:
> 
>>First I received an error message box that told me nothing about what my problem 
>>was. I am a programmer myself and find such error messages to be, almost always, 
>>just lazy programming with little regard for the end user. The days when 
>>something wrong happens in code and the error message is essentially "something 
>>wrong happened" should have been over decades ago, so I find it pretty 
>>disappointing it persists, especially in OS code because OS code is crucial.
> 
> 
> As both a programmer and (wanna-be ;-) kernel developer, debugging at
> the OS level is far more complex than end-user programs.  In an OS
> installer, with all the variables, it's damn near impossible to figure
> out the exact sequence of events.  So it's most likely you are getting
> an error code/message that seems rather useless, but it's the only
> "common denominator" the installer can come up with.

My thought does not involve having to know the exact sequence of events at any 
level when an error occurs, but rather having that error recognized at the 
earliest possible time and propagated to the code which can put out an 
intelligent, even very technical if necessary, message to the end-user. That 
message would then give the end-user at least a fighting chance of either 
understanding what went wrong, or reporting the message so that a developer of 
the installer could explain it to the end-user.

I grant that OS code, especially in an installer, is almost certainly far more 
complicated than any application code. Still, there are techniques for reporting 
errors from the point at which they are discovered. The modern way is exception 
handling, but even if one uses the older error code technique, the error should 
translate into a narrower set of possibilities than the vague and generalized 
message which I received.
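
As a rough sketch of the exception-handling approach, assuming nothing about how 
any real installer is written: the low-level failure is caught where it is 
first seen, wrapped with the step that was running, and carried up to the one 
place that talks to the user. Every name here (InstallError, probe_raid_device, 
partition_disks, the hint text) is hypothetical and made up for illustration.

```python
class InstallError(Exception):
    """Hypothetical installer error carrying the failing step and a hint."""

    def __init__(self, step, detail, hint=""):
        super().__init__(f"{step}: {detail}")
        self.step = step
        self.detail = detail
        self.hint = hint


def probe_raid_device(device):
    # Hypothetical low-level probe; it always fails here to simulate a problem.
    raise OSError(f"no valid superblock on {device}")


def partition_disks(device):
    try:
        probe_raid_device(device)
    except OSError as exc:
        # Wrap the low-level error as early as possible and keep the cause.
        raise InstallError(
            step="disk partitioning",
            detail=str(exc),
            hint="try installing without the RAID set, or report this message",
        ) from exc


def format_report(err):
    # Build the message the end-user actually sees.
    lines = [f"Installation failed during {err.step}.", f"Cause: {err.detail}"]
    if err.hint:
        lines.append(f"Suggestion: {err.hint}")
    return "\n".join(lines)


def run_install(device):
    # The only place that turns an error into user-facing text.
    try:
        partition_disks(device)
    except InstallError as err:
        return format_report(err)
    return "Installation complete."


if __name__ == "__main__":
    print(run_install("/dev/md0"))
```

The point of the design is that the code which detects the failure does not 
need to know the whole sequence of events; it only needs to attach what it 
knows before passing the error upward.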

> 
> We're not talking an application installer on a known, good, usable OS
> that is already installed.  We are talking the OS itself, which is far,
> far, far more involved -- because the entire system is _not_ yet in a
> usable state.

Agreed.

> 
> 
>>Second, even with the vague error message I received I would think that some 
>>CentOS developer might tell me what the possibilities are that generate this 
>>message. Then it would be easier for me to find a workaround rather than having 
>>to spend some time experimenting on installation options to get the OS 
>>installed, and not even knowing if there was a way for me to succeed.
> 
> 
> Again, see my comment above.
> Bugzilla reports are the best means to find out more.

Is there a Bugzilla database for CentOS? If so, I sure do not see it anywhere 
off of the home page at http://www.centos.org/.

> 
> 
>>Actually an installation program must know these things as far as I am 
>>concerned. If the idea is "we have a great OS but our installer is not nearly as 
>>good", how does one expect an OS to attract users if the installer can not even 
>>install properly, or at least tell the end-user why it failed for the end-user's 
>>particular hardware.
> 
> 
> Again, you're asking for the _impossible_ -- even on Windows.  There is
> the terminology barrier, the plethora of combinations that could have
> caused the error, etc...  Installer issues are almost _never_ understood
> until they are repeated and documented in a bugzilla report -- and since
> hardware is _always_ changing, it's a "moving target."
> 
> Which is why any "sane" professional recommends that end-users _always_
> get their OS "pre-installed."  By pre-installed I mean either OEM, by a
> LUG (with users more familiar with the installer in use), etc...
> 
> Because not even experts who wrote the installer itself and have
> extensive kernel development experience can always figure out the
> massive set of combinations that _could_ be thrown at an installer.
> Which is why many installer type issues (like the kernel 2.6 / buggy
> BIOS geometry / NT 255/63 head/sector issue) were _not_ discovered until
> _after_ the installer was released.  Not everyone can test every single
> hardware combination out there.
> 
> Again, this is very, very, _very_different_ than installing an
> application.  When installing an application, most everyone has the same
> set of libraries, binaries, etc..., or they can easily bring the system
> to that same state as everyone else.  Again, remember the purpose of an
> OS -- to take radically different hardware and capabilities and present
> them for applications in a single set of common interfaces.  So, again,
> don't compare OS installers to applications or even application
> installers.

I agree with all you say above. My argument is not that any OS must install on 
my computer, or anyone else's computer, but that if it does not, the end-user 
should be given a decent indication why not. Even though OS code is generally 
more complicated than application code, I see no difference in its ability to 
give intelligent error messages, or even error message numbers which translate 
to an intelligent reason.
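
To illustrate the error-number variant, here is a minimal sketch, assuming a 
made-up set of codes and reasons (none of these numbers or texts come from any 
real installer). Even an unknown number yields something the end-user can 
report.

```python
from enum import IntEnum


class InstallErrorCode(IntEnum):
    # Hypothetical codes, invented for this example only.
    DISK_NOT_FOUND = 101
    RAID_PROBE_FAILED = 102
    VIDEO_UNSUPPORTED = 201


# Hypothetical lookup table from code to a human-readable reason.
REASONS = {
    InstallErrorCode.DISK_NOT_FOUND: "no usable hard disk was detected",
    InstallErrorCode.RAID_PROBE_FAILED: "the RAID controller or array could not be read",
    InstallErrorCode.VIDEO_UNSUPPORTED: "the video card is not supported by the graphical installer",
}


def explain(code):
    """Translate a numeric error code into a message for the end-user."""
    try:
        known = InstallErrorCode(code)
    except ValueError:
        # Unknown numbers are still reportable to a developer.
        return f"E{code}: unknown error; please report this number"
    return f"E{known.value}: {REASONS[known]}"


if __name__ == "__main__":
    print(explain(102))
    print(explain(999))
```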

> 
> 
>>From what I got investigating messages for FC4 installation black screens and 
>>white screens, the failure to support various video cards which worked 
>>flawlessly in FC3 was discovered soon after the FC4 release, and the reasons for 
>>this failure were well-known ( buggy Gnu C++ 4.0 code ).
> 
> 
> Again, it is _not_uncommon_ for Red Hat to use the "bleeding edge" code
> every 2-3 releases of their community release (Red Hat Linux, now Fedora
> Core).

I was spoiled by FC3 which installed and worked nearly flawlessly on my machine.

> 
> 
>>So why not just fix it and post an updated set of ISOs ? Not doing so just gives
>>a release a bad name from the start.
> 
> 
> Because who says one GNU C++ 4.0.x revision will fix them all?  So then
> the Fedora team keeps spinning out revision of the installer and
> corresponding CD after revision after revision.  Red Hat learned early
> on in Red Hat Linux that it cannot support respinning installers/CDs.
> In fact, I know of _no_ major vendor (including Microsoft) that respins
> new installers every few weeks -- only every 6-24 months.

You make a good point but it is annoying that what worked on a previous release 
could not even install on the next one. C'est la vie.

> 
> Again, these releases are no different than the "bad name" releases of
> Red Hat Linux 5.0, Red Hat Linux 7.0, Fedora Core 2.  Right now I've
> adopted the "reverse Star Trek" attitude on Fedora -- the even are bad,
> the odd are good.  On the evens, Red Hat changes things in Fedora Core
> majorly (like old RHL ".0" releases).  On the odds, Red Hat changes
> things minorly.
> 
> I'm looking forward to Fedora Core 5, just like I did Fedora Core 3.
> 
> 
>>Installers are the first thing one sees when using an OS. If the installer fails 
>>the user is not going to think much of the OS. If one is concerned with promoting 
>>an OS, the installer needs to be first-rate.
> 
> 
> Impossible.  See above commentary.
> 
> 
>>Do Linux developers study Windows drivers in order to create Linux device driver 
>>code ?
> 
> 
> No offense, but are you really a developer?  You're talking disassembly.
> You're talking machine code-level into assembler and "headache" level
> reverse engineering, and possible legal issues.

My question was a reply to your statement of

"In fact, the main problem isn't Linux, but the increase in "superstore-
designed hardware."  And that means cheap, poorly tested, Windows
version _specific_ drivers, and absolutely _no_ public specifications."

You seemed to be implying that poor Windows version _specific_ drivers were one 
of the reasons why Linux has trouble creating device drivers for hardware. So I 
asked the above as if to say that I find it hard to believe that Linux depends 
on disassembling Windows driver code in order to get at the internal hardware 
specs for a device.

> 
> And lastly, you're talking about _software_ based hardware.  You can't
> just "send codes X, Y and Z to a printer, scanner, etc..." but you have
> to build the entire _support_ code that the vendor probably licensed
> from a 3rd party before customizing.  Sometimes Linux comes up with
> replacement "subsystems" for Windows equivalents, but getting them to
> work with proprietary hardware is a long, painful process.
> 
> But some do it.  And it takes 6+ months.  Which means half-way through
> the product lifecycle of a "superstore product" (~12 months), once the
> Linux driver is finally written (if at all), the "superstore vendor" has
> already introduced a replacement product for the next revision of
> Windows, or whatever "technology" is being pushed.
> 
> There are so many things here -- I can't even begin.

I do not doubt the difficulty, but I have done some very difficult programming 
myself, so I do not doubt that a solution exists either.

> 
> 
>>I can understand that the public specs can be bad and I can understand, 
>>in an unfortunate way, that the hardware companies have been sold on the idea 
>>that only Microsoft should be able to create software for their hardware.
> 
> 
> The logic is 180 degrees.  Microsoft does _not_ create software for
> their hardware.  In fact, if Microsoft had to write their own drivers,
> Windows would have about 1/100th of the drivers available in the stock
> Linux kernel.  Microsoft would _die_overnight_ if vendors stopped
> producing Windows drivers for their hardware.
> 
> In fact, the reason why people upgrade Windows/applications is because
> hardware vendors force them to, and vice-versa.  It's the "superstore
> model" which has carried over from the decade-long, MS-driven OEM model.
> Hence why Microsoft has a stake in Best Buy (which really began it in
> the late '90s).
> 
> Hardware vendors support Microsoft because that addresses 90% of
> consumers, nearly 100% of consumers who shop at the superstore.  That's
> where their profit model is.

Thanks for pointing this out to me.

Microsoft does have a very good record in supporting hardware devices, and their 
install programs are very solid. If Linux is to compete against Windows, it must 
aim to be just as good, or at least be more informative when an error occurs.

> 
> 
>>In the 
>>latter case I sympathize with Linux device driver developers. I would not mind 
>>if an OS says that it can not support certain hardware in any way due to the 
>>lack of information by the hardware vendor, but I did not find any such 
>>information for CentOS ( or FC4 when I failed to install it ).
>>Thanks for your help and I am glad I got CentOS working.
> 
> 
> Again, I think your issues have more to do with going with a "bleeding-
> edge" Fedora Core adoption, just like others who used to try the latest
> Red Hat Linux prior.

I am now on CentOS 4.1, so I will leave bleeding edge Fedora behind. I did not 
appreciate the Fedora 4 answers I got when I brought up the video card problem 
on their forums, so I will go for greater stability instead. I never wanted to be 
bleeding edge, but Fedora 3 worked so easily I thought that Fedora 4 would be 
easy to set up and use. I was wrong about that.