From poknam at gmail.com Wed Apr 2 00:31:10 2008 From: poknam at gmail.com (PN) Date: Wed, 2 Apr 2008 15:31:10 +0800 Subject: [Warewulf] 3 questions in perceus In-Reply-To: <039501c890ee$98fd9170$cb00a8c0@terminal209> References: <92daa7bf0803270300l3eb4db3dr9b61137e972f9e6b@mail.gmail.com> <006101c8902f$29ee39c0$cb00a8c0@terminal209> <92daa7bf0803280745m36defb42y2646a33c913d0d1c@mail.gmail.com> <039501c890ee$98fd9170$cb00a8c0@terminal209> Message-ID: <92daa7bf0804020031h761bf032u8735d88b06ede9a9@mail.gmail.com> 2008/3/29, Arthur Stevens : > > 1) cool. by default we run light. > 2) Rapid Boot allows the full perceus client, ib drivers, etc to be > embedded on the motherboard. Zepher is a different solution with lots of > info available via google, and it supports the full perceus and can be used > by a lot of vendors. Think of it as a compromise between embedded and a usb > key. The Mellanox package I could not tell you about, you might give > Mellanox a call. If I recall correctly, it lets you netboot over pxe but it > is not the same as embedded perceus. The perceus embedded cards will have > the full perceus client on them. > 3) Looks like you have something else going on of which I do not have > enough data to troubleshoot. Without even knowing the board vendor, it's > hard to tell where to start. You might want to contact Infiscale if in a > hurry with this one. Otherwise gather all your data and hit the list here > with your pastebin ;) > Thanks for your reply. My motherboard vendor is Iwill which use Broadcom Corporation NetXtreme BCM5721 chipset. The driver used is tg3. I found that the tg3.ko already comes with perceus kernel, under /lib/modules/2.6.21-perceus/kernel/drivers/net. However, it seems that it was not loaded properly. I've tried to boot the compute node in debug mode. it shows: Provisiong from 11.1.0.1... 
Begining provisioning with debug level 1 Now provisioning: node0000 VNFS: GTC Group: cluster Node ID: 00:14:25:00:04:48 Mounting via NFS 11.1.0.1: /var/lib/perceus/ Downloading file via NFS: vnfs/GTC/vnfs.img Downloading file via NFS: vnfs/GTC/rootfs//boot/vmlinuz-2.6.18-53.el5 Un-mounting 11.1.0.1:/var/lib/perceus Un-loading device drivers: \-> ib_ipoib ata_piix ib_mthca tg3 <<<--- is it normal to unload the tg3 driver here?? Total provision time: 19s Sleeping for 10 seconds for debug evaluation... Ramdisks not supported with generic elf arguments Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Requesting DHCP configuration via ib0 spoofing eth0: ERROR: Could not determine hardware address of device 'eth0' Provisioning from 11.1.0.1... Waiting 30 seconds, and rebooting... Press [ENTER] to inturrupt reboot and get a shell. 
After I get into the shell and run "ifconfig eth0", it shows: ifconfig: eth0: error fetching interface information: Device not found However, if I run the command "insmod tg3.ko" under /lib/modules/2.6.21-perceus/kernel/drivers/net, the network driver comes up and can communicate. Also, i've tried to enable the perceus modprobe module and add a line to the /etc/perceus/modules/modprobe all: tg3 sd_mod xfs ext3 sunrpc nfs_acl lockd nfs But the result is the same. Any idea about it? Thanks a lot. PN > hope this helps, > > Arthur > > > > .Sent on a Blackberry using BBpro Mail v2. > > ----- Original Message ----- > *From:* PN > *To:* Arthur Stevens ; The Warewulf Cluster > Toolkit > *Sent:* Friday, March 28, 2008 7:45 AM > *Subject:* Re: [Warewulf] 3 questions in perceus > > > Hi, > > 2008/3/28, Arthur Stevens : > > > Replies to 1-3.... > > > > 1) you can go into your confs and adjust that. we run skinny by default. > > > > OK, I will try this later. > > 2) yes you are :) IB by default does not support net booting without our > > love. You need our Rapid Boot Payload or our Zepher, or our embedded usb to > > boot directly from IB. Also the new Perceus imbedded IB cards will support > > that as well. 1 wire IB is a commercial offering at this time. > > > > There is a Boot over IB software package provided by mellanox. Is Rapid > Boot Payload or Zepher similar to that? > > > > > > > 3) You need to add your driver that is needed to the perceus > > kernel/vnfs then restart stuff and all should be fine. Just add support for > > your tg3. Infiscale offers that as a service if needed or you just don't > > have the time. > > > > It's ok for me to add the module. Actually the tg3 module comes along with > the vnfs kernel. So do i simply edit the /etc/perceus/modules/modprobe is > enough? > > It seems that perceus 1.3.6 and 1.3.7 are using the same perceus kernel. > Is there any reason why the node cannot boot with the same vnfs image? 
> > Thanks a lot, > PN > > > > > > > > Arthur > > > > ----- Original Message ----- > > *From:* PN > > *To:* The Warewulf Cluster Toolkit > > *Sent:* Thursday, March 27, 2008 3:00 AM > > *Subject:* [Warewulf] 3 questions in perceus > > > > > > hi all, > > > > i have 3 questions in perceus: > > > > 1) the syslog redirection seems not working, the default is redirected > > to the master node, but i can't find any compute node information in the > > master node's /var/log/messages, except the TFTP messages. > > > > 2) it seems that perceus can use IB as dhcp and provisioning, but after > > i set the /etc/perceus/dnsmasq.conf using ib0 and restart perceus, the > > compute node still cannot boot from IB. it just cannot find any dhcp server, > > am i missing something? > > > > 3) when using perceus-1.3.6, the compute node can boot up sucessfully. > > however in 1.3.7, the compute node cannot boot. it seems that the driver > > is not working in the new version. > > my compute node uses the tg3 network driver. > > > > Provisioning from 192.168.30.20 <= it is the internet IP, different > > from version 1.3.6, which shows the internal IP 11.1.0.1. > > > > Now provisiong: node0001 > > VNFS: GT8000 > > Group: cluster > > Node ID: 00:14:25:00:04 > > > > + cat /found_nics <= these messages did not > > appear in perceus 1.3.6 > > + ifconfig eth0 down > > + ifconfig eth1 down > > + ifconfig ib0 down > > + ifconfig ib1 down > > + [ ! -f /sbin/detect ] > > + . /etc/functions > > + . 
/etc/initramfs.conf > > + DEVS=eth0 eth1 eth2 eth3 eth4 eth5 eth6 eth7 ib0 ib1 > > + MAX_TRIES=5 > > + echo Un-loading device drivers: > > + echo -ne \-> > > \->+ unload_module scsi_mod > > + grep -q ^scsi_mod /proc/modules > > + PATH=/sbin rmmod scsi_mod > > + unload_module ib_ipoib > > + grep -q ^ib_ipoib /proc/modules > > + PATH=/sbin rmmod ib_ipoib > > + echo -n ib_ipoib > > ib_ipoib+ cat /etc/modulerc > > + read i > > + /sbin/detect -q > > + read i > > + unload_module uhci_hcd > > + grep -q ^uhci_hcd /proc/modules > > + read i > > + unload_module uhci-hcd > > + grep -q ^uhci-hcd /proc/modules > > + read i > > + unload_module ehci-hcd > > + grep -q ^ehci-hcd /proc/modules > > + read i > > + unload_module ata_piix > > + grep -q ^ata_piix /proc/modules > > + PATH=/sbin rmmod ata_piix > > + echo -n ata_piix > > ata_piix+ read i > > + unload_module piix > > + grep -q ^piix /proc/modules > > + PATH=/sbin remmod piix > > + read i > > + unload_module ib_mthca > > + grep -q ^ib_mthca /proc/modules > > + PATH=/sbin rmmod ib_mthca > > + echo -n ib_mthca > > ib_mthca+ read i > > + unload_module tg3 > > + grep -q ^tg3 /proc/modules > > + PATH=/sbin rmmod tg3 > > + echo -n tg3 > > tg3+ read i > > + unload_module tg3 > > + grep -q ^tg3 /proc/modules > > + read i > > + echo > > .... > > > > thanks, > > PN > > > > ------------------------------ > > > > _______________________________________________ > > Warewulf mailing list > > Warewulf at caoslinux.org > > http://lists.caosity.org/mailman/listinfo/warewulf > > > > > > _______________________________________________ > > Warewulf mailing list > > Warewulf at caoslinux.org > > http://lists.caosity.org/mailman/listinfo/warewulf > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080402/07e830ba/attachment.html

From Darin.Perusich at cognigencorp.com Thu Apr 3 07:27:48 2008
From: Darin.Perusich at cognigencorp.com (Darin Perusich)
Date: Thu, 03 Apr 2008 10:27:48 -0400
Subject: [Warewulf] HA Perceus servers
Message-ID: <47F4E964.4030206@cognigencorp.com>

Hello,

I'm wondering what others are doing to provide availability for Perceus
servers in the case of a failure. Section 2.2.3 of the user guide states
that it is possible but doesn't go into any detail other than using a
remote state in section 2.4.2. My concern with sharing the state
information over NFS is the DB files in /var/lib/perceus/database/ and
file locking, since multiple servers need to write to the DB files. This
also brings into question the 'vnfs transfer master' value in
perceus.conf: if you have multiple Perceus master servers, what would
this value be? The local Perceus host, or all of the Perceus servers?

--
Darin Perusich
Unix Systems Administrator
Cognigen Corporation
395 Youngs Rd.
Williamsville, NY 14221
Phone: 716-633-3463
Email: darinper at cognigencorp.com

From astevens at gravitypark.com Thu Apr 3 09:23:27 2008
From: astevens at gravitypark.com (Arthur Stevens)
Date: Thu, 3 Apr 2008 09:23:27 -0700
Subject: [Warewulf] HA Perceus servers
References: <47F4E964.4030206@cognigencorp.com>
Message-ID: <004001c895a7$0c7217e0$cb00a8c0@terminal209>

Abstractual, our commercial offering has more advanced HA abilities
including failover, migration, and a slew of nifty tricks to make it so you
can literally go unplug the control server and not skip a beat.

More documentation for doing this with Perceus will be available soon. We
can also provide custom installs from Infiscale with extreme high
availability in mind.

So yes it is done quite often, but currently nobody has the time to document
how, hehe.
Arthur

----- Original Message -----
From: "Darin Perusich"
To: "The Warewulf Cluster Toolkit"
Sent: Thursday, April 03, 2008 7:27 AM
Subject: [Warewulf] HA Perceus servers

> Hello,
>
> I'm wondering what others are doing to provide availability for Perceus
> servers in the case of a failure. Section 2.2.3 of the user guide states
> that it is possible but doesn't go into any detail other than using a
> remote state in section 2.4.2. My concern with sharing the state
> information over NFS is the DB files in /var/lib/perceus/database/ and
> file locking, since multiple servers need to write to the DB files. This
> also brings into question the 'vnfs transfer master' value in
> perceus.conf: if you have multiple Perceus master servers, what would
> this value be? The local Perceus host, or all of the Perceus servers?
>
> --
> Darin Perusich
> Unix Systems Administrator
> Cognigen Corporation
> 395 Youngs Rd.
> Williamsville, NY 14221
> Phone: 716-633-3463
> Email: darinper at cognigencorp.com

From Darin.Perusich at cognigencorp.com Thu Apr 3 09:58:21 2008
From: Darin.Perusich at cognigencorp.com (Darin Perusich)
Date: Thu, 03 Apr 2008 12:58:21 -0400
Subject: [Warewulf] HA Perceus servers
In-Reply-To: <004001c895a7$0c7217e0$cb00a8c0@terminal209>
References: <47F4E964.4030206@cognigencorp.com> <004001c895a7$0c7217e0$cb00a8c0@terminal209>
Message-ID: <47F50CAD.1090500@cognigencorp.com>

Hi Arthur,

Arthur Stevens wrote:
> Abstractual, our commercial offering has more advanced HA abilities
> including failover, migration, and a slew of nifty tricks to make it so you
> can literally go unplug the control server and not skip a beat.

Are we talking about an active-active or multi-master setup, or active-passive?
So that I at least have some fail-over, I'll likely go with an
active-passive configuration using heartbeat and drbd, which is what I'm
currently doing. It may not be the most efficient way to utilize
resources, but it works well. Then once the documentation becomes
available I can reevaluate and move to something else.

> More documentation for doing this with Perceus will be available soon. We
> can also provide custom installs from Infiscale with extreme high
> availability in mind.

Great, I'll be waiting for it!

> So yes it is done quite often, but currently nobody has the time to document
> how, hehe.

Documentation always suffers when time is limited ;-).

--
Darin Perusich
Unix Systems Administrator
Cognigen Corporation
395 Youngs Rd.
Williamsville, NY 14221
Phone: 716-633-3463
Email: darinper at cognigencorp.com

From Darin.Perusich at cognigencorp.com Mon Apr 7 06:36:23 2008
From: Darin.Perusich at cognigencorp.com (Darin Perusich)
Date: Mon, 07 Apr 2008 09:36:23 -0400
Subject: [Warewulf] perceus-1.3.7 build error
Message-ID: <47FA2357.8090407@cognigencorp.com>

I'm doing an rpmbuild of 1.3.7, and when building the detect module I get
the following error, which causes the compile to fail.

Making all in detect
make[3]: Entering directory `/usr/src/packages/BUILD/perceus-1.3.7/src/detect'
if gcc -DHAVE_CONFIG_H -I. -I. -I../../src -I../../src/detect -I../../src/ -I../.. -O2 -g -m32 -march=i586 -mtune=i686 -fmessage-length=0 -D_FORTIFY_SOURCE=2 -MT detect-detect.o -MD -MP -MF ".deps/detect-detect.Tpo" -c -o detect-detect.o `test -f 'detect.c' || echo './'`detect.c; \
then mv -f ".deps/detect-detect.Tpo" ".deps/detect-detect.Po"; else rm -f ".deps/detect-detect.Tpo"; exit 1; fi
detect.c:23:27: error: linux/pci_ids.h: No such file or directory
make[3]: *** [detect-detect.o] Error 1

pci_ids.h exists in 3rd_party/_work/kernel/linux-2.6.21.5/include/linux/
but it's not being included. Any thoughts on why this might be happening?

I'm building on OpenSUSE 10.3.
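
None of the -I flags on the compile line above point into 3rd_party/, so the bundled copy of pci_ids.h is probably never searched; the compiler would only look in the listed directories and the system include path. A minimal sketch for confirming where the header is and is not present follows. The function name find_header is made up for illustration and is not part of Perceus or rpmbuild.

```shell
#!/bin/sh
# Hypothetical helper (not part of Perceus or rpmbuild): report, for each
# directory given, whether it contains the named header.
# $1 = header file name, remaining arguments = directories to check
find_header() {
    header="$1"
    shift
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            echo "found: $dir/$header"
        else
            echo "absent: $dir/$header"
        fi
    done
}

# Example, run from the top of the perceus-1.3.7 build tree:
# find_header pci_ids.h /usr/include/linux 3rd_party/_work/kernel/linux-2.6.21.5/include/linux
```

If the header shows up only under 3rd_party/, that would be consistent with the "No such file or directory" error.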
-- Darin Perusich Unix Systems Administrator Cognigen Corporation 395 Youngs Rd. Williamsville, NY 14221 Phone: 716-633-3463 Email: darinper at cognigencorp.com From nilssa at kth.se Mon Apr 7 09:02:36 2008 From: nilssa at kth.se (nilssa at kth.se) Date: Mon, 7 Apr 2008 18:02:36 +0200 (CEST) Subject: [Warewulf] Warewulf: Nodegroup configuration Message-ID: <52027.130.237.70.23.1207584156.squirrel@webmail.sys.kth.se> I have a warewulf cluster (WW 2.6.2) consisting of 21 nodes. It is set up in three nodegroups, with the intention that a typical user will have access to only one nodegroup, according to definitions in /etc/warewulf/nodes/nodegroup1/config, etc. If I run wwnodes --sync, all nodes are synced. However, all nodes become accessible according to the same rules, namely those of the nodegroup that was synced most recently. If I run wwnodes --info node00nn, then the correct list of users is displayed (according to the config-file), but I was expecting, for instance, that wwmpirun would send jobs by user1 only to nodes in user1:s nodegroup. Have I misunderstood something, or is something incorrectly set up? All nodes are running the same vnfs. Also, I run debian, and therefore would prefer to stay with WW 2.6.2, now that (almost) everything is working. Thanks, Nils From timattox at open-mpi.org Mon Apr 7 10:13:44 2008 From: timattox at open-mpi.org (Tim Mattox) Date: Mon, 7 Apr 2008 13:13:44 -0400 Subject: [Warewulf] Warewulf: Nodegroup configuration In-Reply-To: <52027.130.237.70.23.1207584156.squirrel@webmail.sys.kth.se> References: <52027.130.237.70.23.1207584156.squirrel@webmail.sys.kth.se> Message-ID: Hi Nils, I hope someone else can chime in here, but since I have a few moments... The nodegroup stuff in the last release Warewulf wasn't fully developed, as far as I can remember. You are likely just running into a bug. 
:-( If you are able to fix the perl-scripts, feel free to post a patch, and I would think Greg could roll a new bugfix release, though most of his time I'm sure is now devoted to Perceus, and the soon to be reborn Warewulf-as-cluster-monitoring-kit. On Mon, Apr 7, 2008 at 12:02 PM, wrote: > I have a warewulf cluster (WW 2.6.2) consisting of 21 nodes. It is set up > in three nodegroups, with the intention that a typical user will have > access to only one nodegroup, according to definitions in > /etc/warewulf/nodes/nodegroup1/config, etc. If I run wwnodes --sync, all > nodes are synced. However, all nodes become accessible according to the > same rules, namely those of the nodegroup that was synced most recently. > > If I run wwnodes --info node00nn, then the correct list of users is > displayed (according to the config-file), but I was expecting, for > instance, that wwmpirun would send jobs by user1 only to nodes in user1:s > nodegroup. Have I misunderstood something, or is something incorrectly set > up? > > All nodes are running the same vnfs. Also, I run debian, and therefore > would prefer to stay with WW 2.6.2, now that (almost) everything is > working. > > Thanks, > > Nils > > > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > -- Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/ tmattox at gmail.com || timattox at open-mpi.org I'm a bright... http://www.the-brights.net/ From nilssa at kth.se Tue Apr 8 02:34:30 2008 From: nilssa at kth.se (nilssa at kth.se) Date: Tue, 8 Apr 2008 11:34:30 +0200 (CEST) Subject: [Warewulf] Warewulf: Nodegroup configuration In-Reply-To: References: <52027.130.237.70.23.1207584156.squirrel@webmail.sys.kth.se> Message-ID: <52364.130.237.70.23.1207647270.squirrel@webmail.sys.kth.se> Hi Tim, thanks for a quick response. I suspected that it may be a bug (or just not developed yet). 
I have been digging into the perl-scripts and have a rough idea of how things work right now. My problem was that it was unclear to me how things were supposed to work. But your reply encourages me to modify the appropriate script(s) and post my solution when the problem is straightened out. - Nils > Hi Nils, > I hope someone else can chime in here, but since I have a few moments... The nodegroup stuff in the last release Warewulf wasn't fully developed, as > far as I can remember. You are likely just running into a bug. :-( If you are able > to fix the perl-scripts, feel free to post a patch, and I would think Greg > could > roll a new bugfix release, though most of his time I'm sure is now devoted > to Perceus, and the soon to be reborn Warewulf-as-cluster-monitoring-kit. > > On Mon, Apr 7, 2008 at 12:02 PM, wrote: >> I have a warewulf cluster (WW 2.6.2) consisting of 21 nodes. It is set up >> in three nodegroups, with the intention that a typical user will have access to only one nodegroup, according to definitions in >> /etc/warewulf/nodes/nodegroup1/config, etc. If I run wwnodes --sync, >> all >> nodes are synced. However, all nodes become accessible according to the >> same rules, namely those of the nodegroup that was synced most >> recently. >> If I run wwnodes --info node00nn, then the correct list of users is displayed (according to the config-file), but I was expecting, for instance, that wwmpirun would send jobs by user1 only to nodes in >> user1:s >> nodegroup. Have I misunderstood something, or is something incorrectly >> set >> up? >> All nodes are running the same vnfs. Also, I run debian, and therefore would prefer to stay with WW 2.6.2, now that (almost) everything is working. >> Thanks, >> Nils >> _______________________________________________ >> Warewulf mailing list >> Warewulf at caoslinux.org >> http://lists.caosity.org/mailman/listinfo/warewulf > > > > -- > Tim Mattox, Ph.D. 
- http://homepage.mac.com/tmattox/ > tmattox at gmail.com || timattox at open-mpi.org > I'm a bright... http://www.the-brights.net/ > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > From Darin.Perusich at cognigencorp.com Mon Apr 14 11:45:04 2008 From: Darin.Perusich at cognigencorp.com (Darin Perusich) Date: Mon, 14 Apr 2008 14:45:04 -0400 Subject: [Warewulf] auto-provisioning and hostname matching Message-ID: <4803A630.10109@cognigencorp.com> Does or can perceus perform any type of MAC address to hostname matching when nodes are being automatically provisioned? It appears to me that perceus takes the MAC of an incoming host and assigns whatever the next available node### is without preforming any type of arp/hostname matching. I can see where this might be advantageous if the clients haven't been registered in any existing dns/dhcp infrastructure but that isn't the case. I don't mind having to manually add new nodes in perceus, I was just hoping to eliminate a step in the process ;-). -- Darin Perusich Unix Systems Administrator Cognigen Corporation 395 Youngs Rd. Williamsville, NY 14221 Phone: 716-633-3463 Email: darinper at cognigencorp.com From gwleong at gmail.com Tue Apr 15 20:43:42 2008 From: gwleong at gmail.com (Gary Leong) Date: Tue, 15 Apr 2008 23:43:42 -0400 Subject: [Warewulf] auto-provisioning and hostname matching In-Reply-To: <4803A630.10109@cognigencorp.com> References: <4803A630.10109@cognigencorp.com> Message-ID: you can preassign the hostname according to mac address by doing something like... e.g. perceus node add ?i 00:19:B9:E4:CB:0A blizzard On Mon, Apr 14, 2008 at 2:45 PM, Darin Perusich wrote: > Does or can perceus perform any type of MAC address to hostname matching > when nodes are being automatically provisioned? 
It appears to me that > perceus takes the MAC of an incoming host and assigns whatever the next > available node### is without preforming any type of arp/hostname > matching. I can see where this might be advantageous if the clients > haven't been registered in any existing dns/dhcp infrastructure but that > isn't the case. > > I don't mind having to manually add new nodes in perceus, I was just > hoping to eliminate a step in the process ;-). > > -- > Darin Perusich > Unix Systems Administrator > Cognigen Corporation > 395 Youngs Rd. > Williamsville, NY 14221 > Phone: 716-633-3463 > Email: darinper at cognigencorp.com > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > From Darin.Perusich at cognigencorp.com Wed Apr 16 04:49:14 2008 From: Darin.Perusich at cognigencorp.com (Darin Perusich) Date: Wed, 16 Apr 2008 07:49:14 -0400 Subject: [Warewulf] auto-provisioning and hostname matching In-Reply-To: References: <4803A630.10109@cognigencorp.com> Message-ID: <4805E7BA.6020208@cognigencorp.com> This is what I'm currently doing, but the auto-provisioning mechanism does not, it adds the node as the next available unprovisioned node. Gary Leong wrote: > you can preassign the hostname according to mac address by doing > something like... > > e.g. > > perceus node add ?i 00:19:B9:E4:CB:0A blizzard > > > On Mon, Apr 14, 2008 at 2:45 PM, Darin Perusich > wrote: >> Does or can perceus perform any type of MAC address to hostname matching >> when nodes are being automatically provisioned? It appears to me that >> perceus takes the MAC of an incoming host and assigns whatever the next >> available node### is without preforming any type of arp/hostname >> matching. I can see where this might be advantageous if the clients >> haven't been registered in any existing dns/dhcp infrastructure but that >> isn't the case. 
>> >> I don't mind having to manually add new nodes in perceus, I was just >> hoping to eliminate a step in the process ;-). >> >> -- >> Darin Perusich >> Unix Systems Administrator >> Cognigen Corporation >> 395 Youngs Rd. >> Williamsville, NY 14221 >> Phone: 716-633-3463 >> Email: darinper at cognigencorp.com >> _______________________________________________ >> Warewulf mailing list >> Warewulf at caoslinux.org >> http://lists.caosity.org/mailman/listinfo/warewulf >> > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf -- Darin Perusich Unix Systems Administrator Cognigen Corporation 395 Youngs Rd. Williamsville, NY 14221 Phone: 716-633-3463 Email: darinper at cognigencorp.com From travis.minor at wsm.com Thu Apr 24 10:18:21 2008 From: travis.minor at wsm.com (Travis Minor) Date: Thu, 24 Apr 2008 10:18:21 -0700 Subject: [Warewulf] Missing Cluster Address Message-ID: <4810C0DD.9080603@wsm.com> HI all, I'm running into a weird problem where the Cluster network address is not being picked up by the system using WW 2.6.3 on CentOS 5.1. I've defined the admin, sharedfs, and cluster networks in /etc/warewulf/nodes/default/"NODENAME", but after running wwnodes --sync my /etc/hosts file is missing the cluster network IP. 
------------------------------------------------------------------------
# Node 'NODENAME' configuration (group: nodegroup1)
192.168.150.12 NODENAME-admin
NODENAME NODENAME-cluster
192.168.150.12 NODENAME-sharedfs
------------------------------------------------------------------------

Similar results from wwnodes --info NODENAME
------------------------------------------------------------------------
wwnodes --info NODENAME

=== NODE CONFIGIRATION ===

Node name: NODENAME
Description: Default node group

Node group: nodegroup1 VNFS: default
Kernel: kernel wwinitrd: wwinitrd.img
Boot MAC: MACADDR Secondary MAC:
Users: gmk, god

User Groups: group1

=== NETWORK CONFIGURATION ===

Admin dev: eth0 Admin address: 192.168.150.12/255.255.0.0
SFS dev: eth0 SFS address: 192.168.150.12/255.255.0.0
Cluster dev: eth0 Cluster address: /255.255.0.0

------------------------------------------------------------------------

The node configuration file specifies the cluster address as
192.168.150.12 as well.
------------------------------------------------------------------------
cat /etc/warewulf/nodes/nodegroup1/NODENAME | grep cluster
cluster ipaddr = 192.168.150.12
------------------------------------------------------------------------

I've replaced the /usr/lib/warewulf/modules/sync_defaults script with
one from a working WW 2.6.3 cluster to no effect. I'm pretty sure I've
just screwed up something simple that I'm overlooking.

Any thoughts?

-Travis

From travis.minor at wsm.com Thu Apr 24 10:34:28 2008
From: travis.minor at wsm.com (Travis Minor)
Date: Thu, 24 Apr 2008 10:34:28 -0700
Subject: [Warewulf] Missing Cluster Address
In-Reply-To: <4810C0DD.9080603@wsm.com>
References: <4810C0DD.9080603@wsm.com>
Message-ID: <4810C4A4.9040303@wsm.com>

Nevermind.
If anyone's interested, the problem was here:

> cat /etc/warewulf/nodes/nodegroup1/NODENAME | grep cluster
> cluster  ipaddr = 192.168.150.12
          ^
I had two spaces between "cluster" and "ipaddr", so the config line was
being ignored.

Thanks anyway.

Travis Minor wrote:
> Hi all,
>
> I'm running into a weird problem where the Cluster network address is
> not being picked up by the system using WW 2.6.3 on CentOS 5.1.
>
> I've defined the admin, sharedfs, and cluster networks in
> /etc/warewulf/nodes/default/"NODENAME", but after running wwnodes --sync
> my /etc/hosts file is missing the cluster network IP.
> ------------------------------------------------------------------------
> # Node 'NODENAME' configuration (group: nodegroup1)
> 192.168.150.12 NODENAME-admin
> NODENAME NODENAME-cluster
> 192.168.150.12 NODENAME-sharedfs
> ------------------------------------------------------------------------
>
> Similar results from wwnodes --info NODENAME
> ------------------------------------------------------------------------
> wwnodes --info NODENAME
>
> === NODE CONFIGIRATION ===
>
> Node name: NODENAME
> Description: Default node group
>
> Node group: nodegroup1 VNFS: default
> Kernel: kernel wwinitrd: wwinitrd.img
> Boot MAC: MACADDR Secondary MAC:
> Users: gmk, god
>
> User Groups: group1
>
> === NETWORK CONFIGURATION ===
>
> Admin dev: eth0 Admin address: 192.168.150.12/255.255.0.0
> SFS dev: eth0 SFS address: 192.168.150.12/255.255.0.0
> Cluster dev: eth0 Cluster address: /255.255.0.0
>
> ------------------------------------------------------------------------
>
> The node configuration file specifies the cluster address as
> 192.168.150.12 as well.
> ------------------------------------------------------------------------
> cat /etc/warewulf/nodes/nodegroup1/NODENAME | grep cluster
> cluster ipaddr = 192.168.150.12
> ------------------------------------------------------------------------
>
> I've replaced the /usr/lib/warewulf/modules/sync_defaults script with
> one from a working WW 2.6.3 cluster to no effect, I'm pretty sure I've
> just screwed up something simple I'm overlooking.
>
> Any thoughts?
>
> -Travis

--
Travis Minor
Technical Engineer
travis.minor at wsm.com
Western Scientific, Inc
http://www.wsm.com
5444 Napa Street
San Diego, CA 92110
Tel 619.220.6580 x219
Fax 619.220.6590

From jsalbinson at arcastro.co.uk Thu Apr 24 12:43:33 2008
From: jsalbinson at arcastro.co.uk (James Albinson)
Date: Thu, 24 Apr 2008 20:43:33 +0100
Subject: [Warewulf] Suse 10.3 detect.c build failure - a solution... Also Warewulf 2.6.3 gzip build failure - a solution.
Message-ID: <200804242043.34418.jsalbinson@arcastro.co.uk>

Bug report for perceus 1.3.7 on Suse 10.3 i386 (not 64bit!).
make fails for detect.c as one of the required includes is missing
from /usr/include/linux.
The following thump fixes it... (from ~wherever/perceus.1.3.7)

cp ./3rd_party/_work/kernel/linux-2.6.21.5/include/linux/pci_ids.h /usr/include/linux

and then make performs as advertised...

My machine is a Piii 900MHz/768Mb
uname -a gives...
Linux cogito 2.6.22.17-0.1-default #1 SMP 2008/02/10 20:01:04 UTC i686 i686 i386 GNU/Linux

Cheers, James Albinson

PS. A big thanks for putting up the warewulf 3 sources. These compile fine
under Suse 10.3 (i386). I haven't tested an install yet...
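
A slightly more cautious form of the copy above, as a sketch only. It copies the header only when the destination does not already have one, so a header shipped by the distribution is never overwritten. The function name install_pci_ids and its two arguments are made up for illustration; nothing here is part of Perceus.

```shell
#!/bin/sh
# Hypothetical helper (not part of Perceus): copy pci_ids.h from an unpacked
# kernel tree into an include directory, but only if it is not already there.
# $1 = kernel include/linux directory, $2 = destination include/linux directory
install_pci_ids() {
    src="$1/pci_ids.h"
    dest="$2/pci_ids.h"
    if [ -f "$dest" ]; then
        echo "present: $dest"
        return 0
    fi
    if [ ! -f "$src" ]; then
        echo "missing source: $src" >&2
        return 1
    fi
    cp "$src" "$dest" && echo "copied: $dest"
}

# Example call with the paths from the message above (run as root from the
# top of the perceus source tree):
# install_pci_ids 3rd_party/_work/kernel/linux-2.6.21.5/include/linux /usr/include/linux
```

Removing the copied file again afterwards restores the system include directory to what the distribution shipped.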
From jsalbinson at arcastro.co.uk Thu Apr 24 12:49:12 2008
From: jsalbinson at arcastro.co.uk (James Albinson)
Date: Thu, 24 Apr 2008 20:49:12 +0100
Subject: [Warewulf] Problem with warewulf 2.6.3 - gzip.c has a typo ???
Message-ID: <200804242049.12820.jsalbinson@arcastro.co.uk>

Bug report for warewulf 2.6.3
Part of the make output...

gcc -I/home/jsa/Astro/warewulf-2.6.3/src/busybox/include -I/home/jsa/Astro/warewulf-2.6.3/src/busybox/include -I/home/jsa/Astro/warewulf-2.6.3/src/busybox/libbb -Wall -Wstrict-prototypes -Wshadow -Os -march=i386 -mpreferred-stack-boundary=2 -falign-functions=0 -falign-jumps=0 -falign-loops=0 -fomit-frame-pointer -D_GNU_SOURCE -DNDEBUG -c -o /home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.o /home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c
/home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c:2146: warning: type qualifiers ignored on function return type
/home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c:2146: error: conflicting types for 'build_bl_tree'
/home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c:1623: error: previous declaration of 'build_bl_tree' was here
make[1]: *** [/home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.o] Error 1
make[1]: Leaving directory `/home/jsa/Astro/warewulf-2.6.3/src/busybox'
make: *** [wwinitrd] Error 2

Looking at gzip.c at the declaration of build_bl_tree, we see:

static void init_block(void);
static void pqdownheap(ct_data * tree, int k);
static void gen_bitlen(tree_desc * desc);
static void gen_codes(ct_data * tree, int max_code);
static int build_bl_tree(void); <<<<<<<<<<<<<<<<<<<<<<<<<

Run make in the src/...../archival directory after you make the edit.
Otherwise gzip.c gets refreshed from somewhere else and frustrates you.

Cheers, James Albinson

too much coffee, too little sleep...
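
The compiler is comparing two spellings of the same function, the prototype at line 1623 and the definition at line 2146. One way to put them side by side is to list every line that mentions the symbol, with line numbers. This is only a sketch: show_symbol is a made-up helper name, and the file name in the example is taken from the make output above.

```shell
#!/bin/sh
# Hypothetical helper: list every line of a C file that mentions a symbol,
# with line numbers, so the prototype and the definition can be compared.
# $1 = symbol name, $2 = C source file
show_symbol() {
    grep -n -w -- "$1" "$2"
}

# Example, from the busybox archival directory of the warewulf-2.6.3 tree:
# show_symbol build_bl_tree gzip.c
```

Whichever of the two lines is edited, the return type and qualifiers need to match in both places for the "conflicting types" error to go away.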
From gmkurtzer at gmail.com Thu Apr 24 15:14:10 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Thu, 24 Apr 2008 15:14:10 -0700
Subject: [Warewulf] Suse 10.3 detect.c build failure - a solution... Also Warewulf 2.6.3 gzip build failure - a solution.
In-Reply-To: <200804242043.34418.jsalbinson@arcastro.co.uk>
References: <200804242043.34418.jsalbinson@arcastro.co.uk>
Message-ID: <571f1a060804241514h6a4a184dqe4d93c5b69ddd12d@mail.gmail.com>

Yes, this is a known issue with compiling on SuSE. For some reason
SuSE doesn't seem to include the pci_ids.h header in the glibc kernel
headers (/usr/include/linux/). Potential solutions:

- Crude hack and incorporate a pci_ids.h file from a recent kernel in
the detect source (ewwwww)
- Document a workaround of copying a kernel source tree's pci_ids.h to
the right location
- Post a bug report with SuSE

If SuSE has another header that should be used in place of this, that
would be fine but as we don't develop on SuSE we would want someone to
follow up with that, or have SuSE check in with us (I heard some
rumors that they have some Perceus packages somewhere too).

Thanks for letting us know, and if you have any other ideas or
solutions, please let us know too.

Thanks!
Greg

On Thu, Apr 24, 2008 at 12:43 PM, James Albinson wrote:
> Bug report for perceus 1.3.7 on Suse 10.3 i386 (not 64bit!).
> make fails for detect.c as one of the required includes is missing
> from /usr/include/linux.
> The following thump fixes it... (from ~werever/perceus.1.3.7)
>
> cp ./3rd_party/_work/kernel/linux-2.6.21.5/include/linux/pci_ids.h /usr/include/linux
>
> and then make performs as advertised...
>
> My machine is a PIII 900MHz/768MB
> uname -a gives...
> Linux cogito 2.6.22.17-0.1-default #1 SMP 2008/02/10 20:01:04 UTC i686 i686
> i386 GNU/Linux
>
> Cheers, James Albinson
>
> PS. A big thanks for putting up the warewulf 3 sources. These compile fine
> under Suse 10.3 (i386). I haven't tested an install yet...
> _______________________________________________
> Warewulf mailing list
> Warewulf at caoslinux.org
> http://lists.caosity.org/mailman/listinfo/warewulf
>

--
Greg Kurtzer
http://www.runlevelzero.net/

From gmkurtzer at gmail.com Thu Apr 24 15:20:06 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Thu, 24 Apr 2008 15:20:06 -0700
Subject: [Warewulf] Problem with warewulf 2.6.3 - gzip.c has a typo ???
In-Reply-To: <200804242049.12820.jsalbinson@arcastro.co.uk>
References: <200804242049.12820.jsalbinson@arcastro.co.uk>
Message-ID: <571f1a060804241520v69213dc2gfd6de4ffd310c4cb@mail.gmail.com>

All versions of Warewulf < 3 are no longer maintained, but this is a
good catch and may help some of our legacy users. :)

Thanks!
Greg

On Thu, Apr 24, 2008 at 12:49 PM, James Albinson wrote:
> Bug report for warewulf 2.6.3
> Part of the make output...
>
> gcc -I/home/jsa/Astro/warewulf-2.6.3/src/busybox/include -I/home/jsa/Astro/warewulf-2.6.3/src/busybox/include -I/home/jsa/Astro/warewulf-2.6.3/src/busybox/libbb -Wall -Wstrict-prototypes -Wshadow -Os -march=i386 -mpreferred-stack-boundary=2 -falign-functions=0 -falign-jumps=0 -falign-loops=0 -fomit-frame-pointer -D_GNU_SOURCE -DNDEBUG -c -o /home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.o /home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c
> /home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c:2146: warning: type
> qualifiers ignored on function return type
> /home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c:2146: error:
> conflicting types for 'build_bl_tree'
> /home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.c:1623: error:
> previous declaration of 'build_bl_tree' was here
> make[1]: *** [/home/jsa/Astro/warewulf-2.6.3/src/busybox/archival/gzip.o]
> Error 1
> make[1]: Leaving directory `/home/jsa/Astro/warewulf-2.6.3/src/busybox'
> make: *** [wwinitrd] Error 2
>
>
> Looking at gzip.c at the declaration of build_bl_tree, we see:
>
> static void init_block(void);
> static void
pqdownheap(ct_data * tree, int k);
> static void gen_bitlen(tree_desc * desc);
> static void gen_codes(ct_data * tree, int max_code);
> static int build_bl_tree(void); <<<<<<<<<<<<<<<<<<<<<<<<<
> static void scan_tree(ct_data * tree, int max_code);
> static void send_tree(ct_data * tree, int max_code);
> static int build_bl_tree(void); <<<<<<<<<<<<<<<<<<<<<<<<< also here???Eh?????
> static void send_all_trees(int lcodes, int dcodes, int blcodes);
> static void compress_block(ct_data * ltree, ct_data * dtree);
> static void set_file_type(void);
>
>
> Now further down see here...
>
> /* ===========================================================================
> * Construct the Huffman tree for the bit lengths and return the index in
> * bl_order of the last bit length code to send.
> */
> static const int build_bl_tree() <<<<<<< 'const' has crept in +++++++++++++
> {
> int max_blindex; /* index of last bit length code of non zero
> freq */
>
> ******************************************************************
> IF you zap the 'const' it all compiles...
>
> Cheers, James Albinson
> _______________________________________________
> Warewulf mailing list
> Warewulf at caoslinux.org
> http://lists.caosity.org/mailman/listinfo/warewulf
>

--
Greg Kurtzer
http://www.runlevelzero.net/

From gas at knc.ru Fri Apr 25 05:55:20 2008
From: gas at knc.ru (Grigory Shamov)
Date: Fri, 25 Apr 2008 16:55:20 +0400
Subject: [Warewulf] Problem with building warewulf 2.6.3 on PowerPC
In-Reply-To: <200804242049.12820.jsalbinson@arcastro.co.uk>
References: <200804242049.12820.jsalbinson@arcastro.co.uk>
Message-ID: <4811D4B8.9060305@knc.ru>

Dear All,

I have trouble building Warewulf 2.6 for a PowerPC cluster (IBM pSeries).
That is, busybox seems to ship some Intel binaries (the conf and mconf
things), which causes make to fail. Has anybody built it for PowerPC?
Any advice on whether it is possible at all? Thank you!
I know that Warewulf is superseded by Perceus, but the latter seems to
require nasm, which is AFAIK not available for the Power arch.

--
WBR, Grigory Shamov
Kazan Science Centre of RAS
Kazan, Russia

From Darin.Perusich at cognigencorp.com Fri Apr 25 05:57:21 2008
From: Darin.Perusich at cognigencorp.com (Darin Perusich)
Date: Fri, 25 Apr 2008 08:57:21 -0400
Subject: [Warewulf] Suse 10.3 detect.c build failure - a solution... Also Warewulf 2.6.3 gzip build failure - a solution.
In-Reply-To: <571f1a060804241514h6a4a184dqe4d93c5b69ddd12d@mail.gmail.com>
References: <200804242043.34418.jsalbinson@arcastro.co.uk> <571f1a060804241514h6a4a184dqe4d93c5b69ddd12d@mail.gmail.com>
Message-ID: <4811D531.9090609@cognigencorp.com>

The workaround for this is actually rather straightforward and only
requires adding a single line to the perceus.spec file; let's not forget
that Perceus compiles the kernel during build ;-). In the %configure
section add the following line before %{__make} %{?mflags} and Perceus
will build successfully. Granted, if the kernel version changes the path
will need to be updated, but that's easily done. This is what I've done
for my openSUSE 10.3 build.

%configure
ln -s ../../3rd_party/_work/kernel/linux-2.6.21.5/include/linux \
src/detect
%{__make} %{?mflags}

I'll submit a bug report to SuSE's bugzilla about this header not being
included. They've been getting a lot from me lately as I've been building
my HA Perceus servers, so what's one more!

Greg Kurtzer wrote:
> Yes, this is a known issue with compiling on SuSE. For some reason
> SuSE doesn't seem to include the pci_ids.h header in the glibc kernel
> headers (/usr/include/linux/).
Potential solutions:
>
> - Crude hack and incorporate a pci_ids.h file from a recent kernel in
> the detect source (ewwwww)
> - Document a workaround of copying a kernel source tree's pci_ids.h to
> the right location
> - Post a bug report with SuSE
>
> If SuSE has another header that should be used in place of this, that
> would be fine but as we don't develop on SuSE we would want someone to
> follow up with that, or have SuSE check in with us (I heard some
> rumors that they have some Perceus packages somewhere too).
>
> Thanks for letting us know, and if you have any other ideas or
> solutions, please let us know too.
>
> Thanks!
> Greg
>
> On Thu, Apr 24, 2008 at 12:43 PM, James Albinson
> wrote:
>> Bug report for perceus 1.3.7 on Suse 10.3 i386 (not 64bit!).
>> make fails for detect.c as one of the required includes is missing
>> from /usr/include/linux.
>> The following thump fixes it... (from ~werever/perceus.1.3.7)
>>
>> cp ./3rd_party/_work/kernel/linux-2.6.21.5/include/linux/pci_ids.h /usr/include/linux
>>
>> and then make performs as advertised...
>>
>> My machine is a PIII 900MHz/768MB
>> uname -a gives...
>> Linux cogito 2.6.22.17-0.1-default #1 SMP 2008/02/10 20:01:04 UTC i686 i686
>> i386 GNU/Linux
>>
>> Cheers, James Albinson
>>
>> PS. A big thanks for putting up the warewulf 3 sources. These compile fine
>> under Suse 10.3 (i386). I haven't tested an install yet...
>> _______________________________________________
>> Warewulf mailing list
>> Warewulf at caoslinux.org
>> http://lists.caosity.org/mailman/listinfo/warewulf
>>
>
>

--
Darin Perusich
Unix Systems Administrator
Cognigen Corporation
395 Youngs Rd.
Williamsville, NY 14221
Phone: 716-633-3463
Email: darinper at cognigencorp.com

From Carsten.Bellon at bam.de Wed Apr 30 07:54:23 2008
From: Carsten.Bellon at bam.de (Dr. Carsten Bellon)
Date: Wed, 30 Apr 2008 16:54:23 +0200
Subject: [Warewulf] Newbie question
Message-ID: <4818881F.9040403@bam.de>

I have been using Warewulf 2.6 for quite a while.
Now I am trying to use Perceus 1.3.7 for
the first time (on my new TyanPSC T-600).

The head node is running CentOS 5.1 and I have installed Perceus 1.3.7.

When starting the first compute node, PXE on eth0 works fine. Then on the
node Perceus says:

Requesting DHCP configuration via ib0 spoofing eth0:

ERROR: Could not obtain network configuration

Any help?
How do I have to install/configure the InfiniBand network?
Can I force Perceus not to use InfiniBand for the moment?
Where can I check the DHCP configuration of Perceus?

Many thanks!
Carsten

--
Dr.-Ing. BAM Berlin * VIII.36 * D-12200 Berlin
Carsten Bellon Tel/Fax: ++49 30 8104-3658 / -1837
Carsten.Bellon at bam.de

From Carsten.Bellon at bam.de Wed Apr 30 09:21:51 2008
From: Carsten.Bellon at bam.de (Dr. Carsten Bellon)
Date: Wed, 30 Apr 2008 18:21:51 +0200
Subject: [Warewulf] Perceus 1.3.7: kernel/vnfs error
Message-ID: <48189C9F.1040905@bam.de>

I newly installed Perceus 1.3.7 on CentOS 5.1. While booting the first
compute node I get:

Ramdisk not supported with generic elf arguments

Thanks for any help.
Carsten

--
Dr.-Ing. BAM Berlin * VIII.36 * D-12200 Berlin
Carsten Bellon Tel/Fax: ++49 30 8104-3658 / -1837
Carsten.Bellon at bam.de

From Carsten.Bellon at bam.de Wed Apr 30 10:50:10 2008
From: Carsten.Bellon at bam.de (Dr. Carsten Bellon)
Date: Wed, 30 Apr 2008 19:50:10 +0200
Subject: [Warewulf] Perceus 1.3.7: kernel/vnfs error
In-Reply-To: <48189C9F.1040905@bam.de>
References: <48189C9F.1040905@bam.de>
Message-ID: <4818B152.2060503@bam.de>

This problem is RHEL5 related. It needs

KEXEC_ARGS="--args-linux"

in .../vnfs//config

Now provisioning works fine. Sorry for my silly question, which I have
now answered myself. Thanks for Perceus!!!

Carsten

Dr. Carsten Bellon wrote:
> I newly installed Perceus 1.3.7 on CentOS 5.1. While booting the first
> compute node I get:
>
> Ramdisk not supported with generic elf arguments
>
> Thanks for any help.
> Carsten
>

--
Dr.-Ing.
BAM Berlin * VIII.36 * D-12200 Berlin
Carsten Bellon Tel/Fax: ++49 30 8104-3658 / -1837
Carsten.Bellon at bam.de

From gmkurtzer at gmail.com Wed Apr 30 11:54:06 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Wed, 30 Apr 2008 11:54:06 -0700
Subject: [Warewulf] Newbie question
In-Reply-To: <4818881F.9040403@bam.de>
References: <4818881F.9040403@bam.de>
Message-ID: <571f1a060804301154y1b882805u8ba82066193ee583@mail.gmail.com>

It looks like it isn't finding your network device. Get a debug shell,
and type "ifconfig -a". Do any ethernet devices show up? What kind of
network card do you have in that node?

On Wed, Apr 30, 2008 at 7:54 AM, Dr. Carsten Bellon wrote:
> I have been using Warewulf 2.6 for quite a while. Now I am trying to use
> Perceus 1.3.7 for the first time (on my new TyanPSC T-600).
>
> The head node is running CentOS 5.1 and I have installed Perceus 1.3.7.
>
> When starting the first compute node, PXE on eth0 works fine. Then on the
> node Perceus says:
>
> Requesting DHCP configuration via ib0 spoofing eth0:
>
> ERROR: Could not obtain network configuration
>
> Any help?
> How do I have to install/configure the InfiniBand network?
> Can I force Perceus not to use InfiniBand for the moment?
> Where can I check the DHCP configuration of Perceus?
>
> Many thanks!
> Carsten
>
> --
> Dr.-Ing. BAM Berlin * VIII.36 * D-12200 Berlin
> Carsten Bellon Tel/Fax: ++49 30 8104-3658 / -1837
> Carsten.Bellon at bam.de
>
> _______________________________________________
> Warewulf mailing list
> Warewulf at caoslinux.org
> http://lists.caosity.org/mailman/listinfo/warewulf
>

--
Greg Kurtzer
http://www.runlevelzero.net/

From jsalbinson at arcastro.co.uk Wed Apr 30 12:52:28 2008
From: jsalbinson at arcastro.co.uk (James Albinson)
Date: Wed, 30 Apr 2008 20:52:28 +0100
Subject: [Warewulf] Perceus 1.3.7 - no response to node dhcp requests.
In-Reply-To:
References:
Message-ID: <200804302052.29161.jsalbinson@arcastro.co.uk>

Another scenario...
The situation.

I have a test clusterhost, Athlon 1GHz, 768MB, 20GB HD.
I have installed Scientific Linux 5.1 (the latest DVD) i386 cpu version.
I have d/l perceus 1.3.7 source and built from scratch, installing the
dependencies as configure burped... It all built OK in the end.
Make install went OK, I ran perceus init all, and it seemed OK.
I did /etc/init.d/perceus restart a few times - Seems OK. Also provisiond
restart. OK.

Then I put the pxe floppy in the client node - connected to a switch, and
then to the cluster host. I used wireshark to monitor the eth1 on the
clusterhost, and the node is there knocking on the door.

But it doesn't register with perceus-dnsmasq - the lease file is
non-existent/empty.

I went through the make install, perceus init all, and restart daemons twice.
Still no joy.

Can anyone shed some light on this? I am deeply puzzled. What other
tests/monitor can I run?

Cheers and thanks,
James Albinson

From gmkurtzer at gmail.com Wed Apr 30 17:05:49 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Wed, 30 Apr 2008 17:05:49 -0700
Subject: [Warewulf] Perceus 1.3.7 - no response to node dhcp requests.
In-Reply-To: <200804302052.29161.jsalbinson@arcastro.co.uk>
References: <200804302052.29161.jsalbinson@arcastro.co.uk>
Message-ID: <571f1a060804301705w1fbadd8diebb6b95fc1fcb9dd@mail.gmail.com>

Are there any syslog messages from perceus-dnsmasq on the master?

On Wed, Apr 30, 2008 at 12:52 PM, James Albinson wrote:
> Another scenario...
> The situation.
>
> I have a test clusterhost, Athlon 1GHz, 768MB, 20GB HD.
> I have installed Scientific Linux 5.1 (the latest DVD) i386 cpu version.
> I have d/l perceus 1.3.7 source and built from scratch, installing the
> dependencies as configure burped... It all built OK in the end.
> Make install went OK, I ran perceus init all, and it seemed OK.
> I did /etc/init.d/perceus restart a few times - Seems OK. Also provisiond
> restart. OK.
>
> Then I put the pxe floppy in the client node - connected to a switch, and
> then to the cluster host. I used wireshark to monitor the eth1 on the
> clusterhost, and the node is there knocking on the door.
>
> But it doesn't register with perceus-dnsmasq - the lease file is
> non-existent/empty.
>
> I went through the make install, perceus init all, and restart daemons twice.
> Still no joy.
>
> Can anyone shed some light on this? I am deeply puzzled. What other
> tests/monitor can I run?
>
> Cheers and thanks,
> James Albinson
>
> _______________________________________________
> Warewulf mailing list
> Warewulf at caoslinux.org
> http://lists.caosity.org/mailman/listinfo/warewulf
>

--
Greg Kurtzer
http://www.runlevelzero.net/
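
One way to answer the syslog question above is to filter the master's system log for dnsmasq entries. A minimal sketch, with assumptions stated up front: the helper name `show_dnsmasq_log` is made up, /var/log/messages is only the usual syslog file on RHEL-style systems and may differ on yours, and the daemon is assumed to log under a tag containing "dnsmasq".

```shell
#!/bin/sh
# Hypothetical helper: print the most recent syslog lines that mention
# dnsmasq.
#   $1 = log file to search (assumed default: /var/log/messages)
show_dnsmasq_log() {
    logfile=${1:-/var/log/messages}

    # Reading the system log normally needs root.
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi

    # Case-insensitive match so differently capitalised tags still show
    # up; keep only the last 50 matching lines.
    grep -i 'dnsmasq' "$logfile" | tail -n 50
}
```

If this prints nothing while wireshark still shows DHCP requests arriving on the master, that points at the daemon not running or not listening on that interface, which is worth checking next.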