[CentOS] kickstart problems

Wed Sep 3 20:41:50 UTC 2008
Paolo Supino <paolo.supino at gmail.com>

On Wed, Sep 3, 2008 at 5:52 PM, Marco Fretz <mailinglist at blah.li> wrote:

> hi,
>
> we had the same problem with newer HP PCs and servers (Broadcom NICs).
> PXE works well on Broadcom, the install does not. it doesn't matter if
> you're using kickstart or a manual install.
>
> the problem was in CentOS 4.2. after updating the install environment to
> 4.5 the problem was gone... so it was a driver issue! the install kernel
> is not exactly the same as the normal linux kernel, i think.
>
> if anaconda just says that it cannot find the install image, etc., the
> system has no connectivity at that point.
>
> hope this is helpful...
>
> bests
>  marco
>
> Paolo Supino wrote:
> >
> >
> > On Tue, Sep 2, 2008 at 3:07 PM, Romeo Ninov <rninov at gmail.com> wrote:
> >
> >
> >
> >     Paolo Supino wrote:
> >
> >
> >
> >         On Tue, Sep 2, 2008 at 2:17 PM, Romeo Ninov <rninov at gmail.com> wrote:
> >
> >
> >
> >            Paolo Supino wrote:
> >
> >
> >
> >                On Tue, Sep 2, 2008 at 8:14 AM, nate <centos at linuxpowered.net> wrote:
> >
> >                   Paolo Supino wrote:
> >                   > Hi Nate
> >                   >
> >
> >                   > 3: After the error comes up I get the HTTP setup
> >                   > configuration screen with the source website (in IP)
> >                   > and CentOS directory as I entered them in the
> >                   > pxeconfiguration file and as it appears in the
> >                   > kickstart configuration file and all I have to do is
> >                   > press the 'OK' button to continue the installation to
> >                   > a successful completion.
> >
> >                   If that's the case the next most likely culprit is
> >
> >                   > url --url http://192.168.11.1/source
> >
> >
> >                   Just because the PXE boot loader can download the
> >                   kickstart config does not mean that the installation
> >                   process will work with that NIC.
> >
> >                   Also I've had lots of broadcom systems not work with
> >                   kickstart over the years, it's not uncommon for newer
> >                   systems to have newer revs of the chipsets and those
> >                   revs not being supported by the installer.
> >
> >                   But it sounds like in your case it does work, so I
> >                   would look at the url above, as it likely is the cause
> >                   of the problem. Check the http access logs on the
> >                   server for 404s and similar errors.
> >
> >                   nate
> >
> >                   _______________________________________________
> >                   CentOS mailing list
> >                   CentOS at centos.org
> >                   http://lists.centos.org/mailman/listinfo/centos
> >
> >
> >
> >                Hi Nate
> >
> >                 After figuring out what I was doing wrong (see previous
> >                reply ...) I started going through each of my systems in
> >                order to boot them and install CentOS 5.2 on each. For the
> >                most part it works, but only for the most part: once every
> >                few boots (not machine specific) anaconda stops and either
> >                asks me which interface it needs to configure or fails to
> >                load 'stage2.img' from the web server on 192.168.11.1 ...
> >                All cables are good cables, the network switch is a Cisco
> >                3750G (with no configuration) and all the NICs are Broadcom
> >                with firmware 3.8.9. Can you throw a guess as to where the
> >                problem might lie (I hate inconsistencies)?
> >
> >
> >            Have you checked the Apache logs for anything? Also check the
> >            server messages.
> >
> >
> >         Hi Romeo
> >
> >          Yes I did, and nothing shows up in either access_log or
> >         error_log :-(
> >         I just had a node that stopped and asked me for IP configuration
> >         (twice), and only on the second time (checked on the server using
> >         tcpdump) did it actually try to contact the server to retrieve
> >         its network configuration. It then continued and successfully
> >         retrieved 'stage2.img' from the web server :-(
> >
> >     Paolo, what about the DHCP or bootp servers? Check their logs and
> >     flush the ARP cache on the server(s).
> >
> >
> >
> > Hi Romeo
> >
> >   The more systems I boot the more I'm starting to feel that it's a
> > hardware-related problem ... I just booted a system in which the ELOM
> > says that NIC0 has one MAC address, but when I booted the system I saw
> > a different MAC address altogether on the network ...
> >   I'm checking at the lowest level: on the wire (using tcpdump), so if
> > nothing shows in the capture I'm sure I won't find anything in the
> > logs :-(
> >
> >
> >
> >
> > --
> > TIA
> > Paolo
> >
> >
>


Hi Marco

 Thanks for the email. I've been debugging this problem for a few days, and
a few installs before I posted the first email in this thread I started
sniffing the network interface on the server (dhcp, tftp and http are all on
the same computer). I noticed that no communication reaches the server
between the PXE load and the retrieval error (I think I wrote about it in my
original post). Some people suggested that Linux might be getting confused
between the interfaces (the Sun X2200 M2 has 4 NICs), which I find hard to
believe (the Linux kernel is old enough that it probably got rid of these
kinds of bugs a long time ago). In some of the failures the kernel loaded,
retrieved the kickstart configuration file and then failed to retrieve
'stage2.img' (again, nothing appeared on the wire). I have a sneaking
feeling that the kickstart process assumes a lot of basic facts and doesn't
do any, or enough, sanity checking. Right now I need to get this cluster up
and running (I'm already 2 weeks behind schedule). After it's up I will try
to debug the process.
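
  On the 4-NIC question: one commonly suggested way to rule out interface
confusion is to pin the installer to the NIC that PXE booted from. As far as
I know, pxelinux with 'IPAPPEND 2' appends 'BOOTIF=01-<mac-with-dashes>' to
the kernel command line, and anaconda can be given 'ksdevice=bootif';
whether the CentOS 5.2 installer honours that on this hardware is an
assumption I have not verified. A minimal Python sketch of the MAC handling
involved (the function names and the sample MAC are made up for
illustration), which can also be used to compare the MAC the ELOM reports
with the one seen in tcpdump:

```python
import re


def normalize_mac(mac):
    """Return a MAC address as 12 lowercase hex digits, without separators."""
    digits = re.sub(r"[:.\-]", "", mac.strip()).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError("not a 48-bit MAC address: %r" % (mac,))
    return digits


def bootif_value(mac):
    """Format a MAC the way pxelinux writes it: '01-' plus dash-separated bytes."""
    digits = normalize_mac(mac)
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    return "01-" + "-".join(pairs)


def same_nic(mac_a, mac_b):
    """True if two MAC strings name the same address, whatever their notation."""
    return normalize_mac(mac_a) == normalize_mac(mac_b)


if __name__ == "__main__":
    # Hypothetical address; substitute the one the ELOM reports for NIC0.
    print(bootif_value("00:1B:24:AA:BB:CC"))
```

  If same_nic() says the ELOM address and the address on the wire differ,
the boot is probably coming from a different port than the one expected.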
  The situation got me so aggravated that I was contemplating resurrecting
my old private distro (not going to do that) that does things in a much
simpler way.
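
  For the 'stage2.img' failures, a quick check from another machine on the
same network can at least confirm the web server side. This is only a
sketch: it assumes the install tree behind 'url --url
http://192.168.11.1/source' keeps the image at 'images/stage2.img' (the
layout I would expect for a CentOS 5 tree, but check yours), and the
function names are made up.

```python
import urllib.error
import urllib.request


def stage2_url(tree_url, rel_path="images/stage2.img"):
    """Join the install tree URL and the (assumed) relative image path."""
    return tree_url.rstrip("/") + "/" + rel_path.lstrip("/")


def probe(url, timeout=5.0):
    """Send an HTTP HEAD request; return the status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError):
        # No answer at all: connection refused, timeout, no route, etc.
        return None


if __name__ == "__main__":
    url = stage2_url("http://192.168.11.1/source")
    print(url, "->", probe(url))
```

  A 200 here while the installer still fails would point away from the web
server and toward the client's network setup, which would fit with nothing
showing up on the wire.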





--
ttyl
Paolo