[CentOS] kickstart problems

Thu Sep 4 06:27:55 UTC 2008
Romeo Ninov <rninov at gmail.com>


Paolo Supino wrote:
>
>
> On Wed, Sep 3, 2008 at 5:52 PM, Marco Fretz <mailinglist at blah.li> wrote:
>
>     hi,
>
>     we had the same problem with newer HP PCs and servers (Broadcom NICs).
>     PXE works well on Broadcom, the install does not. it doesn't matter
>     if you're using kickstart or a manual install.
>
>     the problem was in CentOS 4.2. after updating the install environment
>     to 4.5 the problem was gone... so it was a driver issue! the install
>     kernel is not exactly the normal Linux kernel, i think.
>
>     if anaconda just says that it cannot find the install image, etc.,
>     the system has no connectivity at that point.
>
>     hope this is helpful...
>
>     bests
>      marco
>
>     Paolo Supino wrote:
>     >
>     >
>     > On Tue, Sep 2, 2008 at 3:07 PM, Romeo Ninov <rninov at gmail.com> wrote:
>     >
>     >
>     >
>     >     Paolo Supino wrote:
>     >
>     >
>     >
>     >         On Tue, Sep 2, 2008 at 2:17 PM, Romeo Ninov <rninov at gmail.com> wrote:
>     >
>     >
>     >
>     >            Paolo Supino wrote:
>     >
>     >
>     >
>     >                On Tue, Sep 2, 2008 at 8:14 AM, nate <centos at linuxpowered.net> wrote:
>     >
>     >                   Paolo Supino wrote:
>     >                   > Hi Nate
>     >                   >
>     >
>     >                   > 3: After the error comes up I get the HTTP setup
>     >                   > configuration screen with the source website (in IP)
>     >                   > and CentOS directory as I entered them in the PXE
>     >                   > configuration file and as it appears in the kickstart
>     >                   > configuration file, and all I have to do is press the
>     >                   > 'OK' button to continue the installation to a
>     >                   > successful completion.
>     >
>     >                   If that's the case the next most likely culprit is
>     >
>     >                   > url --url http://192.168.11.1/source
>     >
>     >
>     >                   Just because the PXE boot loader can download the
>     >                   kickstart config does not mean that the installation
>     >                   process will work with that NIC.
>     >
>     >                   Also I've had lots of Broadcom systems not work with
>     >                   kickstart over the years; it's not uncommon for newer
>     >                   systems to have newer revs of the chipsets and those
>     >                   revs not being supported by the installer.
>     >
>     >                   But it sounds like in your case it does work, so I
>     >                   would look at the url above, as it likely is the cause
>     >                   of the problem. Check the http access logs on the
>     >                   server for 404s and similar errors.
>     >
>     >                   nate
>     >
>     >                   _______________________________________________
>     >                   CentOS mailing list
>     >                   CentOS at centos.org
>     >                   http://lists.centos.org/mailman/listinfo/centos
>     >
>     >
>     >
>     >                Hi Nate
>     >
>     >                 After figuring out what I was doing wrong (see
>     >                previous reply ...) I started going through each of
>     >                my systems in order to boot them and install CentOS
>     >                5.2 on each. For the most part it works, but only for
>     >                the most part, because once every few boots (not
>     >                machine specific) anaconda stops and either asks me
>     >                which interface it needs to configure or fails to
>     >                load 'stage2.img' from the web server on 192.168.11.1
>     >                ... All cables are good cables. The network switch is
>     >                a Cisco 3750G (with no configuration) and all the
>     >                NICs are Broadcom with firmware 3.8.9. Can you throw
>     >                a guess where the problem might be lying (I hate
>     >                inconsistencies)?
>     >
>     >
>     >            Have you checked the Apache logs for anything? Also check
>     >            the server messages.
>     >
>     >
>     >
>     >         Hi Romeo
>     >
>     >          Yes I did, and nothing shows up in either access_log or
>     >         error_log :-(
>     >         I just had a node that stopped and asked me for IP
>     >         configuration (twice), and only the second time (checked on
>     >         the server using tcpdump) did it actually try to contact the
>     >         server to retrieve network configuration; it then continued
>     >         and successfully retrieved 'stage2.img' from the web server :-(
>     >
>     >     Paolo, what about the DHCP or bootp servers? Check the logs and
>     >     flush the ARP cache on the server(s).
>     >
>     >
>     >
>     > Hi Romeo
>     >
>     >   The more systems I boot, the more I'm starting to feel that it's
>     > a hardware-related problem ... I just booted a system in which the
>     > ELOM says that NIC0 has one MAC address, but when I booted the system
>     > I saw a different MAC address altogether on the network ...
>     >   I'm checking at the lowest level: on the wire (using tcpdump), so
>     > if nothing shows in the capture I'm sure I won't find anything in
>     > the logs :-(
>     >
>     >
>     >
>     >
>     > --
>     > TIA
>     > Paolo
>     >
>     >
>     >
>
>
>
> Hi Marco
>
>  Thanx for the email. I've been debugging this problem for a few days,
> and a few installs before I posted the first email in this thread I
> started sniffing the network interface on the server (dhcp, tftp, http
> are all on the same computer). I noticed that no communication
> reaches the server between the PXE load and the retrieval error (and I
> think I wrote about it in my original post). Some people suggested
> that Linux might be getting confused between the interfaces (the Sun
> X2200 M2 has 4 NICs), which I find hard to believe (the Linux kernel is
> old enough and probably got rid of this kind of bug a long time
> ago). In some of the failures the kernel loaded, retrieved the
> kickstart configuration file and then failed to retrieve 'stage2.img'
> (again nothing appeared on the wire). I have a sneaky feeling that the 
> kickstart process assumes a lot of basic facts and doesn't do 
> any/enough sanity checking. Right now I need to get this cluster up 
> and running (I'm already 2 weeks behind schedule). After it's up I 
> will try to debug the process.
>   The situation got me so aggravated that I was contemplating 
> resurrecting my old private distro (not going to do that) that does 
> things in a much simpler way.
>
>
Paolo,
Unfortunately CentOS/RHEL really do have a problem with the order in which
modules are loaded, especially with two identical NICs: the interface
names get assigned in a random order. I personally mitigate the problem
this way: in /etc/modprobe.conf I put the modprobe entry for the first
NIC first in the file and the entry for the second NIC last. After a
reboot the NIC->eth? mapping is always the same.