Paolo Supino wrote:
> On Thu, Sep 4, 2008 at 8:27 AM, Romeo Ninov <rninov at gmail.com> wrote:
> > Paolo Supino wrote:
> > > On Wed, Sep 3, 2008 at 5:52 PM, Marco Fretz <mailinglist at blah.li> wrote:
> > > > hi,
> > > >
> > > > we had the same problem with newer HP pcs and servers (broadcom nics).
> > > > pxe works well on broadcom, the install not. doesn't matter if you're
> > > > using kickstart or manual install.
> > > >
> > > > the problem was in centos 4.2. after updating the install environment
> > > > to 4.5 the problem was gone... so it was a driver issue! the install
> > > > kernel is not exactly the normal linux kernel i think.
> > > >
> > > > if anaconda just says that it cannot find install image, etc. the
> > > > system has no connectivity at this time.
> > > >
> > > > hope this is helpful...
> > > >
> > > > bests
> > > > marco
> > > >
> > > > Paolo Supino wrote:
> > > > > On Tue, Sep 2, 2008 at 3:07 PM, Romeo Ninov <rninov at gmail.com> wrote:
> > > > > > Paolo Supino wrote:
> > > > > > > On Tue, Sep 2, 2008 at 2:17 PM, Romeo Ninov <rninov at gmail.com> wrote:
> > > > > > > > Paolo Supino wrote:
> > > > > > > > > On Tue, Sep 2, 2008 at 8:14 AM, nate <centos at linuxpowered.net> wrote:
> > > > > > > > > > Paolo Supino wrote:
> > > > > > > > > > > Hi Nate
> > > > > > > > > > >
> > > > > > > > > > > 3: After the error comes up I get the HTTP setup configuration
> > > > > > > > > > > screen with the source website (in IP) and CentOS directory as I
> > > > > > > > > > > entered them in the pxeconfiguration file and as it appears in
> > > > > > > > > > > the kickstart configuration file and all I have to do is press
> > > > > > > > > > > the 'OK' button to continue the installation to a successful
> > > > > > > > > > > completion.
> > > > > > > > > >
> > > > > > > > > > If that's the case the next most likely culprit is
> > > > > > > > > >
> > > > > > > > > > > url --url http://192.168.11.1/source
> > > > > > > > > >
> > > > > > > > > > Just because the PXE boot loader can download the kickstart
> > > > > > > > > > config does not mean that the installation process will work
> > > > > > > > > > with that NIC.
> > > > > > > > > >
> > > > > > > > > > Also I've had lots of broadcom systems not work with kickstart
> > > > > > > > > > over the years, it's not uncommon for newer systems to have
> > > > > > > > > > newer revs of the chipsets and those revs not being supported
> > > > > > > > > > by the installer.
> > > > > > > > > >
> > > > > > > > > > But it sounds like in your case it does work, so I would look
> > > > > > > > > > at the url above, as it likely is the cause of the problem.
> > > > > > > > > > Check the http access logs on the server for 404s and similar
> > > > > > > > > > errors.
> > > > > > > > > >
> > > > > > > > > > nate
> > > > > > > > > >
> > > > > > > > > > _______________________________________________
> > > > > > > > > > CentOS mailing list
> > > > > > > > > > CentOS at centos.org
> > > > > > > > > > http://lists.centos.org/mailman/listinfo/centos
> > > > > > > > >
> > > > > > > > > Hi Nate
> > > > > > > > >
> > > > > > > > > After figuring what I was doing wrong (see previous reply ...) I
> > > > > > > > > started going through each of my systems in order to boot them
> > > > > > > > > and install CentOS 5.2 on each. For the most part it works, but
> > > > > > > > > only for the most part?
> > > > > > > > > Because once in a few boots (not machine specific) anaconda stops
> > > > > > > > > and either asks me what interface it needs to configure or fails
> > > > > > > > > to load 'stage2.img' from the web server on 192.168.11.1 ... All
> > > > > > > > > cables are good cables. The network switch is a Cisco 3750G (with
> > > > > > > > > no configuration) and all the NICs are broadcom with firmware
> > > > > > > > > 3.8.9. Can you throw a guess where the problem might be lying
> > > > > > > > > (I hate inconsistencies)?
> > > > > > > >
> > > > > > > > Have you checked the Apache logs for anything? Check the server
> > > > > > > > messages too.
> > > > > > >
> > > > > > > Hi Romeo
> > > > > > >
> > > > > > > Yes I did, and nothing shows up in either access_log or error_log :-(
> > > > > > > I just had a node that stopped and asked me for IP configuration
> > > > > > > (twice), and only the second time (checked on the server using
> > > > > > > tcpdump) did it actually try to contact the server to retrieve the
> > > > > > > network configuration; it then continued and successfully retrieved
> > > > > > > 'stage2.img' from the web server :-(
> > > > > >
> > > > > > Paolo, what about the DHCP or bootp servers?
> > > > > > Check the logs and flush the ARP cache on the server(s).
> > > > >
> > > > > Hi Romeo
> > > > >
> > > > > The more systems I boot the more I'm starting to feel that it's a
> > > > > hardware-related problem ... I just booted a system in which the ELOM
> > > > > says that NIC0 has one MAC address, but when I booted the system I saw
> > > > > a different MAC address altogether on the network ...
> > > > > I'm checking at the lowest level: on the wire (using tcpdump), so if
> > > > > nothing shows in the capture I'm sure I won't find anything in the
> > > > > logs :-(
> > > > >
> > > > > --
> > > > > TIA
> > > > > Paolo
> > >
> > > Hi Marco
> > >
> > > Thanx for the email. I've been debugging this problem for a few days,
> > > and a few installs before I posted the first email in this thread I
> > > started sniffing the network interface on the server (dhcp, tftp, http
> > > are all on the same computer) and I noticed that no communication
> > > reaches the server between the PXE load and the retrieval error (and I
> > > think I wrote about it in my original post).
> > > Some people suggested that it might be that Linux gets confused about
> > > the interfaces (the Sun X2200 M2 has 4 NICs), which I find hard to
> > > believe (the Linux kernel is old enough and probably got rid of these
> > > kinds of bugs a long time ago). In some of the failures the kernel
> > > loaded, retrieved the kickstart configuration file and then failed to
> > > retrieve 'stage2.img' (again nothing appeared on the wire). I have a
> > > sneaky feeling that the kickstart process assumes a lot of basic facts
> > > and doesn't do any/enough sanity checking. Right now I need to get this
> > > cluster up and running (I'm already 2 weeks behind schedule). After it's
> > > up I will try to debug the process.
> > > The situation got me so aggravated that I was contemplating
> > > resurrecting my old private distro (not going to do that) that does
> > > things in a much simpler way.
> > >
> > > Paolo
> >
> > Unfortunately CentOS/RHEL really do have a problem with the order in
> > which modules are loaded, especially with two identical NICs: the eth
> > numbering changes at random. I personally mitigate the problem this way:
> > in /etc/modprobe.conf, put the line for the first NIC at the top of the
> > file and the line for the second NIC at the bottom. After a reboot the
> > NIC->eth? mapping always stays in place.
>
> Hi Marco
>
> I didn't finish testing the way Nate asked me to so right now I don't
> have any conclusive answers about what exactly is going on, but in pasting
> my original email (that started this thread) I wrote that what I see
> happening is:
> anaconda prints an error message that it fails to retrieve 'stage2.img'
> from the HTTP server. I press 'OK' in the error message screen.
> The screen that comes after it is the HTTP setup screen with the
> information given by the 'ks' directive from pxelinux already in place, so
> that the only thing left for me to do is press the 'OK' button. When I
> press the 'OK' button anaconda successfully retrieves 'stage2.img' from
> the http server and goes on to successfully finish the unattended install
> (take a look at my original post). The only thing that makes sense is that
> the network configuration hadn't finished (yet) before anaconda tried to
> retrieve 'stage2.img'.
> Along the way I tried to change the configuration various times and I got
> all possible failures (or at least it feels like it): failed to retrieve
> the kickstart config file, failed to retrieve the 'stage2.img' file no
> matter how many times I pressed the 'OK' button in the HTTP setup screen,
> and probably a few more scenarios that I'm trying very hard to forget ;-)
> One thing I noticed is that anaconda reconfigures the network interface
> after the kernel has already configured it and successfully retrieved the
> kickstart config file from the web server (proved by sniffing the
> network). The question that goes through my mind when I see it is: why is
> it doing that??? It makes me feel that something is wrong in the
> assumptions and the install process ...
> Maybe you're right about the module loading issue (though it doesn't
> explain what I wrote in the original post): I resurrected my old distro
> (a heavily modified Slackware) to test the issue, and what I found is that
> with a no-module kernel (all needed drivers statically compiled in) and no
> initrd to mess things up, the issue simply didn't happen (tested 10
> times).
> On the other hand, if you were right about it then a RHEL/CentOS/Fedora
> installation would be unsuitable in any multihomed configuration, because
> it would map the eth devices differently (albeit once in a while), which
> means one would have to switch the cables because of network device
> remapping!!!
> and that isn't something users and corporations that use RHEL (and there
> are many of those) would be willing to live with :-)

Paolo, this problem occurs only in RHEL/CentOS/other RH-based distros and
not in Slack, SuSE, Debian, etc. I have not gone deeper into the problem,
but that is the reality. BTW: you can play with the MAC address in the
ifcfg files, but this applies only to an already installed machine.
About your remark on corporations and RH: you are right, but how often are
servers restarted? :-)
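
For the already-installed case, here is a minimal sketch of what pinning a MAC address in the ifcfg files could look like. It is not taken from anyone's actual setup in this thread: the helper name render_ifcfg and both MAC addresses are hypothetical, Python 3 is assumed on whatever machine generates the files, and the only thing relied on is that ifcfg-ethX files under /etc/sysconfig/network-scripts accept a HWADDR line that ties the device name to one card.

```python
def render_ifcfg(device: str, mac: str, bootproto: str = "dhcp") -> str:
    """Return ifcfg-<device> text that ties the device name to one MAC address."""
    lines = [
        f"DEVICE={device}",
        f"HWADDR={mac.upper()}",   # bind this eth name to this specific card
        f"BOOTPROTO={bootproto}",
        "ONBOOT=yes",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical MAC addresses, for illustration only.
nics = {"eth0": "00:1b:24:aa:bb:01", "eth1": "00:1b:24:aa:bb:02"}

for device, mac in nics.items():
    # Print the target path as a comment, then the file body.
    print(f"# /etc/sysconfig/network-scripts/ifcfg-{device}")
    print(render_ifcfg(device, mac))
```

The output would still have to be copied into place on each node (for example from a kickstart %post step), and as said above it does nothing for the installer itself, only for the installed system.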