From cmorse at unm.edu Tue Jul 8 12:35:15 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 8 Jul 2008 13:35:15 -0600
Subject: [Warewulf] Provisioning hangs after unloading modules
Message-ID:

I have a machine that is hanging right after all of the modules are unloaded
during the provisioning process. The output from the script isn't very
useful, so I would like to modify it for more verbose output. But, I'm not
sure where the script is located.

--
Caleb

From griznog at gmail.com Tue Jul 8 13:21:14 2008
From: griznog at gmail.com (John Hanks)
Date: Tue, 8 Jul 2008 14:21:14 -0600
Subject: [Warewulf] Provisioning hangs after unloading modules
In-Reply-To:
References:
Message-ID:

On Tue, Jul 8, 2008 at 1:35 PM, Caleb Morse wrote:
> I have a machine that is hanging right after all of the modules are unloaded
> during the provisioning process. The output from the script isn't very
> useful, so I would like to modify it for more verbose output. But, I'm not
> sure where the script is located.
>

I've been looking for an excuse to put these notes somewhere where I
wouldn't lose them.

http://www.perceus.org/portal/faq/6#6n130

init is probably the script you want to modify but I've also had to
add debug stuff to the hw_load scripts to figure some things out. It's
pretty straightforward how it all works

Hope that helps.

jbh

From gmkurtzer at gmail.com Tue Jul 8 14:04:46 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Tue, 8 Jul 2008 14:04:46 -0700
Subject: [Warewulf] Provisioning hangs after unloading modules
In-Reply-To:
References:
Message-ID: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com>

On Tue, Jul 8, 2008 at 1:21 PM, John Hanks wrote:
> On Tue, Jul 8, 2008 at 1:35 PM, Caleb Morse wrote:
>> I have a machine that is hanging right after all of the modules are unloaded
>> during the provisioning process. The output from the script isn't very
>> useful, so I would like to modify it for more verbose output. But, I'm not
>> sure where the script is located.

What kind of hardware is this, and more specifically, what network
card/driver is being used?

It might be worth while to test drive the 1.4 release. Contact me
offline and I can coordinate with you on this, and build a beta
release tarball/RPMS to see if newer drivers, and scripts help.

>>
>
> I've been looking for an excuse to put these notes somewhere where I
> wouldn't lose them.
>
> http://www.perceus.org/portal/faq/6#6n130
>
> init is probably the script you want to modify but I've also had to
> add debug stuff to the hw_load scripts to figure some things out. It's
> pretty straightforward how it all works

ego++; hehe

As far as the upcoming 1.4 release, most of the node script structures
have undergone some general optimizations, but the internal stage 1
initialization scripts are getting reworked to support more complex
nodescripts. Now is the time to voice all opinions so let us know
what's missing!
:)

Thanks,
Greg

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From william.strossman at ucr.edu Tue Jul 8 14:20:42 2008
From: william.strossman at ucr.edu (william.strossman at ucr.edu)
Date: Tue, 8 Jul 2008 14:20:42 -0700 (PDT)
Subject: [Warewulf] provisioning problem with 1.3.7
Message-ID:

I am having a problem that has been mentioned many times in the past,
but none of those solutions have helped. The nodes hang forever after the
message "Provisioning from 192.168.10.1". Running perceusd in debug mode
shows that 6 nodescripts are run and then the socket is closed and it goes
back to eval(). This is repeated ad nauseam. Running provisiond in
debug mode reveals no information whatsoever. Adding debug=4 to the
pxelinux.cfg/default file shows that a SIGALRM is received twice and after
the second time an "unable to connect to 192.168.10.1, bad file descriptor"
(it connects to port 987 the first time). This sequence is also repeated
ad nauseam. I have attached a portion of the output from
perceusd in debug mode. There are a few lines like "Parsing provisiond's
arguments (^@)" in there (a null character? which would be odd because there
are options after it in the /etc/init.d/provisiond file) Any ideas or need
any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the
provided script to create the vnfs capsule.

Thanks,

Bill S.

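A side note on the "^@" mentioned in the message above: in caret notation "^@" stands for the NUL byte (0x00), while a newline is byte 0x0a (written "^J" in caret notation). The sketch below is only an illustration of that mapping, assuming a POSIX printf and od and a cat that supports -v; the helper names caret and hexbytes are made up here and are not part of Perceus, and the string 'init ' is just the command seen in the debug log.

```shell
#!/bin/sh
# Illustration only: caret and hexbytes are hypothetical helper names.

# Show the bytes produced by a printf format string the way "cat -v" renders them.
caret() { printf "$1" | cat -v; }

# Show the same bytes as a continuous hex string.
hexbytes() { printf "$1" | od -An -tx1 | tr -d ' \n'; }

caret 'init \000'; echo       # a trailing NUL is rendered as ^@
hexbytes 'init \000'; echo    # the last byte is 00
hexbytes 'init \n'; echo      # a trailing newline would be 0a instead
```

If the receiving side prints "^@" after the command, the extra byte is therefore a NUL rather than a newline.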
-------------- next part -------------- DEBUG [perceusd/47/MAIN()]: Starting MAIN() DEBUG [Sanity.pm/46/sanity_check()]: Entered function DEBUG [Util.pm/357/parse_config()]: Entered function DEBUG [Util.pm/87/getargs()]: Processed argument for Perceus::Util::parse_config: '/etc/perceus/perceus.conf' DEBUG [Util.pm/376/parse_config()]: config key: 'vnfs transfer master', value: '192.168.10.1' DEBUG [Util.pm/376/parse_config()]: config key: 'vnfs transfer method', value: 'nfs' DEBUG [Util.pm/378/parse_config()]: omitting config line: 'vnfs transfer prefix =' DEBUG [Util.pm/376/parse_config()]: config key: 'node timeout', value: '100' DEBUG [Util.pm/384/parse_config()]: Returning function with hash array DEBUG [Sanity.pm/64/sanity_check()]: Returning function DEBUG [Util.pm/357/parse_config()]: Entered function DEBUG [Util.pm/87/getargs()]: Processed argument for Perceus::Util::parse_config: '/etc/perceus/perceus.conf' DEBUG [Util.pm/376/parse_config()]: config key: 'vnfs transfer master', value: '192.168.10.1' DEBUG [Util.pm/376/parse_config()]: config key: 'vnfs transfer method', value: 'nfs' DEBUG [Util.pm/378/parse_config()]: omitting config line: 'vnfs transfer prefix =' DEBUG [Util.pm/376/parse_config()]: config key: 'node timeout', value: '100' DEBUG [Util.pm/384/parse_config()]: Returning function with hash array DEBUG [perceusd/88/MAIN()]: Creating TCP socket at port 987 DEBUG [perceusd/109/MAIN()]: Debug enabled, not gonna fork() DEBUG [perceusd/125/MAIN()]: New connection opened, jumping into eval() DEBUG [perceusd/148/(eval)]: setting alarm to: 5 DEBUG [perceusd/155/(eval)]: Recieved string: 'init ' DEBUG [Util.pm/389/addr2bin()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Util::addr2bin: '192.168.10.128' DEBUG [Util.pm/394/addr2bin()]: Returning function with: 3232238208 DEBUG [perceusd/163/(eval)]: Parsing command arguments 'init ' DEBUG [perceusd/167/(eval)]: Parsing provisiond's arguments () DEBUG 
[perceusd/183/(eval)]: Doing arp lookup on '192.168.10.128' Use of uninitialized value in subroutine entry at /usr/sbin/perceusd line 184. DEBUG [Nodes.pm/824/nodeid2nodename()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2nodename: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'hostname' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/hostname.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'hostname' database (n0000) DEBUG [DB.pm/384/db_get()]: Returning function with: n0000 DEBUG [Nodes.pm/829/nodeid2nodename()]: Returning function with: n0000 DEBUG [perceusd/190/(eval)]: Entering node init (NodeName: n0000) DEBUG [perceusd/192/(eval)]: Got node 'init' from NodeName:n0000, NodeID=00:15:17:4E:BE:DC DEBUG [Nodes.pm/824/nodeid2nodename()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2nodename: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'hostname' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/hostname.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'hostname' database (n0000) DEBUG [DB.pm/384/db_get()]: Returning function with: n0000 DEBUG [Nodes.pm/829/nodeid2nodename()]: Returning function with: n0000 DEBUG [Nodes.pm/838/nodeid2vnfs()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2vnfs: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'vnfs' DEBUG [Util.pm/69/getarg()]: Processed 
argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/vnfs.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'vnfs' database (centos-5.1-1.stateless.x86_64) DEBUG [DB.pm/384/db_get()]: Returning function with: centos-5.1-1.stateless.x86_64 DEBUG [Nodes.pm/843/nodeid2vnfs()]: Returning function with: centos-5.1-1.stateless.x86_64 DEBUG [Nodes.pm/852/nodeid2group()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2group: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'group' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/group.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'group' database (cluster) DEBUG [DB.pm/384/db_get()]: Returning function with: cluster DEBUG [Nodes.pm/857/nodeid2group()]: Returning function with: cluster DEBUG [Nodes.pm/880/nodeid2enabled()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2enabled: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'enabled' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/enabled.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'enabled' database (1) DEBUG [DB.pm/384/db_get()]: Returning function with: 1 DEBUG [Nodes.pm/885/nodeid2enabled()]: Returning function with: 1 DEBUG [Nodes.pm/866/nodeid2debug()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2debug: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG 
[Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'debug' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/debug.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'debug' database () DEBUG [DB.pm/387/db_get()]: Returning function undefined DEBUG [Nodes.pm/874/nodeid2debug()]: Returning function undefined DEBUG [Nodes.pm/799/nodestatus()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodestatus: '00:15:17:4E:BE:DC' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodestatus: 'init' DEBUG [DB.pm/115/db_put()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: 'status' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: '00:15:17:4E:BE:DC' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: 'init' DEBUG [DB.pm/124/db_put()]: opening /var/lib/perceus//database/status.db DEBUG [DB.pm/129/db_put()]: Adding key: 00:15:17:4E:BE:DC, value: init to 'status' database DEBUG [DB.pm/147/db_put()]: Returning function undefined DEBUG [DB.pm/115/db_put()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: 'lastcontact' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: '00:15:17:4E:BE:DC' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: '1215537620' DEBUG [DB.pm/124/db_put()]: opening /var/lib/perceus//database/lastcontact.db DEBUG [DB.pm/129/db_put()]: Adding key: 00:15:17:4E:BE:DC, value: 1215537620 to 'lastcontact' database DEBUG [DB.pm/147/db_put()]: Returning function undefined DEBUG [Nodes.pm/807/nodestatus()]: Returning function DEBUG [Nodes.pm/812/nodeipaddr()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeipaddr: '00:15:17:4E:BE:DC' DEBUG 
[Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeipaddr: '3232238208' DEBUG [DB.pm/115/db_put()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: 'ipaddr' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: '00:15:17:4E:BE:DC' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_put: '3232238208' DEBUG [DB.pm/124/db_put()]: opening /var/lib/perceus//database/ipaddr.db DEBUG [DB.pm/129/db_put()]: Adding key: 00:15:17:4E:BE:DC, value: 3232238208 to 'ipaddr' database DEBUG [DB.pm/147/db_put()]: Returning function undefined DEBUG [Nodes.pm/819/nodeipaddr()]: Returning function Use of uninitialized value in scalar assignment at /usr/sbin/perceusd line 247. DEBUG [Nodes.pm/683/nodescriptlist()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodescriptlist: '00:15:17:4E:BE:DC' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodescriptlist: 'init' DEBUG [Nodes.pm/824/nodeid2nodename()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2nodename: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'hostname' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/hostname.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'hostname' database (n0000) DEBUG [DB.pm/384/db_get()]: Returning function with: n0000 DEBUG [Nodes.pm/829/nodeid2nodename()]: Returning function with: n0000 DEBUG [Nodes.pm/838/nodeid2vnfs()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2vnfs: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'vnfs' DEBUG 
[Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/vnfs.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'vnfs' database (centos-5.1-1.stateless.x86_64) DEBUG [DB.pm/384/db_get()]: Returning function with: centos-5.1-1.stateless.x86_64 DEBUG [Nodes.pm/843/nodeid2vnfs()]: Returning function with: centos-5.1-1.stateless.x86_64 DEBUG [Nodes.pm/852/nodeid2group()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Nodes::nodeid2group: '00:15:17:4E:BE:DC' DEBUG [DB.pm/359/db_get()]: Entered function DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: 'group' DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::DB::db_get: '00:15:17:4E:BE:DC' DEBUG [DB.pm/366/db_get()]: opening /var/lib/perceus//database/group.db DEBUG [DB.pm/372/db_get()]: Getting value for key: 00:15:17:4E:BE:DC from 'group' database (cluster) DEBUG [DB.pm/384/db_get()]: Returning function with: cluster DEBUG [Nodes.pm/857/nodeid2group()]: Returning function with: cluster DEBUG [Nodes.pm/704/nodescriptlist()]: script list now: DEBUG [Nodes.pm/705/nodescriptlist()]: Checking: /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64 for scripts DEBUG [Nodes.pm/708/nodescriptlist()]: script list now: /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh DEBUG [Nodes.pm/712/nodescriptlist()]: Checking: /var/lib/perceus//nodescripts/init/all for scripts DEBUG [Nodes.pm/715/nodescriptlist()]: script list now: /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh 
/var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh /var/lib/perceus//nodescripts/init/all/00-init.sh /var/lib/perceus//nodescripts/init/all/80-close.sh DEBUG [Nodes.pm/719/nodescriptlist()]: Checking: /var/lib/perceus//nodescripts/init/group/cluster for scripts DEBUG [Nodes.pm/722/nodescriptlist()]: script list now: /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh /var/lib/perceus//nodescripts/init/all/00-init.sh /var/lib/perceus//nodescripts/init/all/80-close.sh DEBUG [Nodes.pm/726/nodescriptlist()]: Checking: /var/lib/perceus//nodescripts/init/node/n0000 for scripts DEBUG [Nodes.pm/729/nodescriptlist()]: script list now: /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh /var/lib/perceus//nodescripts/init/all/00-init.sh /var/lib/perceus//nodescripts/init/all/80-close.sh DEBUG [Nodes.pm/738/nodescriptlist()]: reordering /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh (10-vnfsinit.sh) DEBUG [Nodes.pm/738/nodescriptlist()]: reordering /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh (89-vnfsinit.sh) DEBUG [Nodes.pm/738/nodescriptlist()]: reordering /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh (99-vnfsinit.sh) DEBUG [Nodes.pm/738/nodescriptlist()]: reordering /var/lib/perceus//nodescripts/init/all/00-init.sh (00-init.sh) DEBUG [Nodes.pm/738/nodescriptlist()]: reordering /var/lib/perceus//nodescripts/init/all/80-close.sh (80-close.sh) DEBUG [Nodes.pm/742/nodescriptlist()]: sorted 00-init.sh 
(/var/lib/perceus//nodescripts/init/all/00-init.sh) DEBUG [Nodes.pm/742/nodescriptlist()]: sorted 10-vnfsinit.sh (/var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh) DEBUG [Nodes.pm/742/nodescriptlist()]: sorted 80-close.sh (/var/lib/perceus//nodescripts/init/all/80-close.sh) DEBUG [Nodes.pm/742/nodescriptlist()]: sorted 89-vnfsinit.sh (/var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh) DEBUG [Nodes.pm/742/nodescriptlist()]: sorted 99-vnfsinit.sh (/var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh) DEBUG [Nodes.pm/745/nodescriptlist()]: returning(nodescriptlist): /var/lib/perceus//nodescripts/init/all/00-init.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh /var/lib/perceus//nodescripts/init/all/80-close.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh DEBUG [Nodes.pm/747/nodescriptlist()]: Returning function with: /var/lib/perceus//nodescripts/init/all/00-init.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh /var/lib/perceus//nodescripts/init/all/80-close.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh DEBUG [perceusd/253/(eval)]: Running nodescript: /var/lib/perceus//nodescripts/init/all/00-init.sh DEBUG [perceusd/253/(eval)]: Running nodescript: /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/10-vnfsinit.sh DEBUG [perceusd/253/(eval)]: Running nodescript: /var/lib/perceus//nodescripts/init/all/80-close.sh DEBUG [perceusd/253/(eval)]: Running nodescript: /var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/89-vnfsinit.sh DEBUG [perceusd/253/(eval)]: Running nodescript: 
/var/lib/perceus//nodescripts/init/vnfs/centos-5.1-1.stateless.x86_64/99-vnfsinit.sh
DEBUG [perceusd/270/(eval)]: Shutting down the client socket
DEBUG [perceusd/272/(eval)]: Connection closed
DEBUG [perceusd/125/MAIN()]: New connection opened, jumping into eval()
DEBUG [perceusd/148/(eval)]: setting alarm to: 5
DEBUG [perceusd/155/(eval)]: Recieved string: 'init '
DEBUG [Util.pm/389/addr2bin()]: Entered function
DEBUG [Util.pm/69/getarg()]: Processed argument for Perceus::Util::addr2bin: '192.168.10.128'
DEBUG [Util.pm/394/addr2bin()]: Returning function with: 3232238208
DEBUG [perceusd/163/(eval)]: Parsing command arguments 'init '
DEBUG [perceusd/167/(eval)]: Parsing provisiond's arguments ()
DEBUG [perceusd/183/(eval)]: Doing arp lookup on '192.168.10.128'

--- this is repeated endlessly. Had to truncate to get past size requirements.

From cmorse at unm.edu Tue Jul 8 14:25:33 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 8 Jul 2008 15:25:33 -0600
Subject: [Warewulf] Provisioning hangs after unloading modules
In-Reply-To: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com>
References: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com>
Message-ID:

On Tue, Jul 8, 2008 at 15:04, Greg Kurtzer wrote:

> On Tue, Jul 8, 2008 at 1:21 PM, John Hanks wrote:
> > On Tue, Jul 8, 2008 at 1:35 PM, Caleb Morse wrote:
> >> I have a machine that is hanging right after all of the modules are unloaded
> >> during the provisioning process. The output from the script isn't very
> >> useful, so I would like to modify it for more verbose output. But, I'm not
> >> sure where the script is located.
>
> What kind of hardware is this, and more specifically, what network
> card/driver is being used?
>
> It might be worth while to test drive the 1.4 release. Contact me
> offline and I can coordinate with you on this, and build a beta
> release tarball/RPMS to see if newer drivers, and scripts help.

I was able to narrow down the hardware that was causing the problem.
Disabling the SCSI controller (it isn't even needed) fixed the hang.

Unfortunately I now get the following error message.

*******************
Total provision time: 8 s

Ramdisks not supported with generic elf arguments
*******************

Then I get the "Waiting 30 seconds and rebooting" message.

I would be willing to test drive the 1.4 release. By "contact me offline"
did you just mean email you directly?

>
>
> >>
> >
> > I've been looking for an excuse to put these notes somewhere where I
> > wouldn't lose them.
> >
> > http://www.perceus.org/portal/faq/6#6n130
> >
> > init is probably the script you want to modify but I've also had to
> > add debug stuff to the hw_load scripts to figure some things out. It's
> > pretty straightforward how it all works
>
> ego++; hehe
>
> As far as the upcoming 1.4 release, most of the node script structures
> have undergone some general optimizations, but the internal stage 1
> initialization scripts are getting reworked to support more complex
> nodescripts. Now is the time to voice all opinions so let us know
> what's missing! :)
>
> Thanks,
> Greg
>
> --
> Greg Kurtzer
> http://www.infiscale.com/
> http://www.runlevelzero.net/
> http://www.perceus.org/
> http://www.caoslinux.org/
> _______________________________________________
> Warewulf mailing list
> Warewulf at caoslinux.org
> http://lists.caosity.org/mailman/listinfo/warewulf
>

From gmkurtzer at gmail.com Tue Jul 8 14:26:30 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Tue, 8 Jul 2008 14:26:30 -0700
Subject: [Warewulf] provisioning problem with 1.3.7
In-Reply-To:
References:
Message-ID: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com>

What version of perl-Net-ARP do you have installed?

Also, I don't really like the extra characters (e.g. newline) after
the init command. Not sure this is causing your direct problem, but
let's start there...

Greg


On Tue, Jul 8, 2008 at 2:20 PM, wrote:
> I am having a problem that has been mentioned many times in the past,
> but none of those solutions have helped. The nodes hang forever after the
> message "Provisioning from 192.168.10.1". Running perceusd in debug mode
> shows that 6 nodescripts are run and then the socket is closed and it goes
> back to eval(). This is repeated ad nauseam. Running provisiond in
> debug mode reveals no information whatsoever. Adding debug=4 to the
> pxelinux.cfg/default file shows that a SIGALRM is received twice and after
> the second time an "unable to connect to 192.168.10.1, bad file descriptor"
> (it connects to port 987 the first time). This sequence is also repeated
> ad nauseam. I have attached a portion of the output from
> perceusd in debug mode. There are a few lines like "Parsing provisiond's
> arguments (^@)" in there (a null character? which would be odd because there
> are options after it in the /etc/init.d/provisiond file) Any ideas or need
> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the
> provided script to create the vnfs capsule.
>
> Thanks,
>
> Bill S.
>

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From william.strossman at ucr.edu Tue Jul 8 14:29:49 2008
From: william.strossman at ucr.edu (william.strossman at ucr.edu)
Date: Tue, 8 Jul 2008 14:29:49 -0700 (PDT)
Subject: [Warewulf] provisioning problem with 1.3.7
In-Reply-To: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com>
References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com>
Message-ID:

Hi Greg:
perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to mention that these
are Opterons, if that makes a difference. So those "^@" are newlines?

Thanks,

Bill S.
UC Riverside

On Tue, 8 Jul 2008, Greg Kurtzer wrote:

> What version of perl-Net-ARP do you have installed?
>
> Also, I don't really like the extra characters (e.g. newline) after
> the init command. Not sure this is causing your direct problem, but
> let's start there...
>
> Greg
>
>
> On Tue, Jul 8, 2008 at 2:20 PM, wrote:
>> I am having a problem that has been mentioned many times in the past,
>> but none of those solutions have helped. The nodes hang forever after the
>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode
>> shows that 6 nodescripts are run and then the socket is closed and it goes
>> back to eval(). This is repeated ad nauseam. Running provisiond in
>> debug mode reveals no information whatsoever. Adding debug=4 to the
>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after
>> the second time an "unable to connect to 192.168.10.1, bad file descriptor"
>> (it connects to port 987 the first time). This sequence is also repeated
>> ad nauseam. I have attached a portion of the output from
>> perceusd in debug mode.
>> There are a few lines like "Parsing provisiond's
>> arguments (^@)" in there (a null character? which would be odd because there
>> are options after it in the /etc/init.d/provisiond file) Any ideas or need
>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the
>> provided script to create the vnfs capsule.
>>
>> Thanks,
>>
>> Bill S.
>>
>
> --
> Greg Kurtzer
> http://www.infiscale.com/
> http://www.runlevelzero.net/
> http://www.perceus.org/
> http://www.caoslinux.org/
>

From cmorse at unm.edu Tue Jul 8 14:45:01 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 8 Jul 2008 15:45:01 -0600
Subject: [Warewulf] Provisioning hangs after unloading modules
In-Reply-To:
References:
Message-ID:

Could you add how to create the initramfs image? I found these commands
online that worked for me.

$ find . | cpio -H newc -o > ../initramfs.cpio
$ cd ..
$ cat initramfs.cpio | gzip > initramfs.igz

--
Caleb

On Tue, Jul 8, 2008 at 14:21, John Hanks wrote:

> On Tue, Jul 8, 2008 at 1:35 PM, Caleb Morse wrote:
> > I have a machine that is hanging right after all of the modules are unloaded
> > during the provisioning process. The output from the script isn't very
> > useful, so I would like to modify it for more verbose output. But, I'm not
> > sure where the script is located.
> >
>
> I've been looking for an excuse to put these notes somewhere where I
> wouldn't lose them.
>
> http://www.perceus.org/portal/faq/6#6n130
>
> init is probably the script you want to modify but I've also had to
> add debug stuff to the hw_load scripts to figure some things out. It's
> pretty straightforward how it all works
>
> Hope that helps.
>
> jbh
>

From gmkurtzer at gmail.com Tue Jul 8 14:49:09 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Tue, 8 Jul 2008 14:49:09 -0700
Subject: [Warewulf] Provisioning hangs after unloading modules
In-Reply-To:
References: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com>
Message-ID: <571f1a060807081449h2af8282bre45d56a1ce383ea0@mail.gmail.com>

On Tue, Jul 8, 2008 at 2:25 PM, Caleb Morse wrote:
> On Tue, Jul 8, 2008 at 15:04, Greg Kurtzer wrote:
>>
>> On Tue, Jul 8, 2008 at 1:21 PM, John Hanks wrote:
>> > On Tue, Jul 8, 2008 at 1:35 PM, Caleb Morse wrote:
>> >> I have a machine that is hanging right after all of the modules are unloaded
>> >> during the provisioning process. The output from the script isn't very
>> >> useful, so I would like to modify it for more verbose output. But, I'm not
>> >> sure where the script is located.
>>
>> What kind of hardware is this, and more specifically, what network
>> card/driver is being used?
>>
>> It might be worth while to test drive the 1.4 release. Contact me
>> offline and I can coordinate with you on this, and build a beta
>> release tarball/RPMS to see if newer drivers, and scripts help.
>
> I was able to narrow down the hardware that was causing the problem.
> Disabling the SCSI controller (it isn't even needed) fixed the hang.
>
> Unfortunately I now get the following error message.
>
> *******************
> Total provision time: 8 s
>
> Ramdisks not supported with generic elf arguments
> *******************

Check the VNFS configuration under KEXEC_ARGS. If you're using RHEL5, you
may need to add --args-linux.

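The KEXEC_ARGS suggestion can be sketched as follows. This is a minimal, hypothetical fragment that assumes the VNFS configuration is a shell-style file in which KEXEC_ARGS is an ordinary variable; the file's location and any value already set in it are not given in this thread, so check your own capsule before copying anything.

```shell
# Hypothetical VNFS configuration fragment (file name and existing contents
# are assumptions, not taken from the thread).
# Keep whatever KEXEC_ARGS already holds and append --args-linux, the kexec
# option for passing Linux kernel style options instead of generic ELF
# arguments.
KEXEC_ARGS="${KEXEC_ARGS:+$KEXEC_ARGS }--args-linux"
echo "KEXEC_ARGS=$KEXEC_ARGS"
```

Appending rather than overwriting is a deliberate choice here, so that options the capsule already relies on are not lost.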
>
> Then I get the "Waiting 30 seconds and rebooting" message.
>
> I would be willing to test drive the 1.4 release. By "contact me offline"
> did you just mean email you directly?

Yes.

>
>>
>>
>>
>> >
>> > I've been looking for an excuse to put these notes somewhere where I
>> > wouldn't lose them.
>> >
>> > http://www.perceus.org/portal/faq/6#6n130
>> >
>> > init is probably the script you want to modify but I've also had to
>> > add debug stuff to the hw_load scripts to figure some things out. It's
>> > pretty straightforward how it all works
>>
>> ego++; hehe
>>
>> As far as the upcoming 1.4 release, most of the node script structures
>> have undergone some general optimizations, but the internal stage 1
>> initialization scripts are getting reworked to support more complex
>> nodescripts. Now is the time to voice all opinions so let us know
>> what's missing! :)
>>
>> Thanks,
>> Greg
>>
>> --
>> Greg Kurtzer
>> http://www.infiscale.com/
>> http://www.runlevelzero.net/
>> http://www.perceus.org/
>> http://www.caoslinux.org/
>
>

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From gmkurtzer at gmail.com Tue Jul 8 15:09:18 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Tue, 8 Jul 2008 15:09:18 -0700
Subject: [Warewulf] provisioning problem with 1.3.7
In-Reply-To:
References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com>
Message-ID: <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com>

That Net-ARP looks new enough, but just for kicks can you test out the
version here:

http://www.perceus.org/downloads/perceus/v1.x/dependencies/

If that works, I will
make sure that 1.4 is updated for the newer version of Net::ARP. Greg On Tue, Jul 8, 2008 at 2:29 PM, wrote: > Hi Greg: > perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these > are Opterons, if that makes a difference. So those "^@" are newlines? > > Thanks, > > Bill S. > UC Riverside > > On Tue, 8 Jul 2008, Greg Kurtzer wrote: > >> What version of perl-Net-ARP do you have installed? >> >> Also, I don't really like the extra characters (e.g. newline) after >> the init command. Not sure this is causing your direct problem, but >> lets start there... >> >> Greg >> >> >> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>> I am having a problem that has been mention many times in the past, >>> but none of those solutions have helped. The nodes hang forever after the >>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>> shows that 6 nodescripts are run and then the socket is closed and it goes >>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>> debug mode reveals no information whatsoever. Adding debug=4 to the >>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>> (it connects to port 987 the first time). This sequence is also repeated >>> adnauseum. I have attached a portion of the output from >>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>> arguments (^@)" in there (a null character? which would be odd because there >>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>> provided script to create the vnfs capsule. >>> >>> Thanks, >>> >>> Bill S. 
-- Greg Kurtzer http://www.infiscale.com/ http://www.runlevelzero.net/ http://www.perceus.org/ http://www.caoslinux.org/ From william.strossman at ucr.edu Tue Jul 8 15:38:27 2008 From: william.strossman at ucr.edu (william.strossman at ucr.edu) Date: Tue, 8 Jul 2008 15:38:27 -0700 (PDT) Subject: [Warewulf] provisioning problem with 1.3.7 In-Reply-To: <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com> References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com> <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com> Message-ID: It looks like I get the same result. One other thing though: I rebuilt the vnfs capsule, this time only adding the perceus-provisiond package (and chkconfig'ing it on) and I no longer get the "bad file descriptor" message. I will try to post the repeated output of provisiond. Thanks, Bill S. UC Riverside On Tue, 8 Jul 2008, Greg Kurtzer wrote: > That Net-ARP looks new enough, but just for kicks can you test out the > version here: > > http://www.perceus.org/downloads/perceus/v1.x/dependencies/ > > If that works, I will make sure that 1.4 is updated for the newer > version of Net::ARP. > > Greg > > > > On Tue, Jul 8, 2008 at 2:29 PM, wrote: >> Hi Greg: >> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these >> are Opterons, if that makes a difference.
So those "^@" are newlines? >> >> Thanks, >> >> Bill S. >> UC Riverside >> >> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >> >>> What version of perl-Net-ARP do you have installed? >>> >>> Also, I don't really like the extra characters (e.g. newline) after >>> the init command. Not sure this is causing your direct problem, but >>> lets start there... >>> >>> Greg >>> >>> >>> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>>> I am having a problem that has been mention many times in the past, >>>> but none of those solutions have helped. The nodes hang forever after the >>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>>> shows that 6 nodescripts are run and then the socket is closed and it goes >>>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>>> debug mode reveals no information whatsoever. Adding debug=4 to the >>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>>> (it connects to port 987 the first time). This sequence is also repeated >>>> adnauseum. I have attached a portion of the output from >>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>>> arguments (^@)" in there (a null character? which would be odd because there >>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>>> provided script to create the vnfs capsule. >>>> >>>> Thanks, >>>> >>>> Bill S. 
From william.strossman at ucr.edu Tue Jul 8 15:46:32 2008 From: william.strossman at ucr.edu (william.strossman at ucr.edu) Date: Tue, 8 Jul 2008 15:46:32 -0700 (PDT) Subject: [Warewulf] provisioning problem with 1.3.7 In-Reply-To: References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com> <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com> Message-ID: Here is what gets endlessly repeated at the node end + provisiond -d -s /bin/sh 192.168.10.1 DEBUG: interval = '0' DEBUG: shell = 'bin/sh' DEBUG: command = "init' DEBUG: connect to 192.168.10.1 (192.168.10.1) on port 987 DEBUG: connection established. DEBUG: Reading from socket... DEBUG: SIGALARM received DEBUG: last command result: 0 + sleep 1 + [ ! -f /next ] + [ 4 -eq 0 ] + [ 4 -eq 1 ] + provisiond -d -s /bin/sh 192.168.10.1 ...and so on... Thanks, Bill S. UC Riverside On Tue, 8 Jul 2008 william.strossman at ucr.edu wrote: > It looks like I get the same result.
One other thing though: I rebuilt > the vnfs capsule, this time only adding the perceus-provisiond package > (and chkconfig'ing it on) and I no longer get the "bad file descriptor" > message. I will try to post the repeated output of provisiond. > > Thanks, > > Bill S. > UC Riverside > > > On Tue, 8 Jul 2008, Greg Kurtzer wrote: > >> That Net-ARP looks new enough, but just for kicks can you test out the >> version here: >> >> http://www.perceus.org/downloads/perceus/v1.x/dependencies/ >> >> If that works, I will make sure that 1.4 is updated for the newer >> version of Net::ARP. >> >> Greg >> >> >> >> On Tue, Jul 8, 2008 at 2:29 PM, wrote: >>> Hi Greg: >>> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these >>> are Opterons, if that makes a difference. So those "^@" are newlines? >>> >>> Thanks, >>> >>> Bill S. >>> UC Riverside >>> >>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>> >>>> What version of perl-Net-ARP do you have installed? >>>> >>>> Also, I don't really like the extra characters (e.g. newline) after >>>> the init command. Not sure this is causing your direct problem, but >>>> lets start there... >>>> >>>> Greg >>>> >>>> >>>> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>>>> I am having a problem that has been mention many times in the past, >>>>> but none of those solutions have helped. The nodes hang forever after the >>>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>>>> shows that 6 nodescripts are run and then the socket is closed and it goes >>>>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>>>> debug mode reveals no information whatsoever. Adding debug=4 to the >>>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>>>> (it connects to port 987 the first time). This sequence is also repeated >>>>> adnauseum. 
I have attached a portion of the output from >>>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>>>> arguments (^@)" in there (a null character? which would be odd because there >>>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>>>> provided script to create the vnfs capsule. >>>>> >>>>> Thanks, >>>>> >>>>> Bill S. From cmorse at unm.edu Tue Jul 8 15:57:00 2008 From: cmorse at unm.edu (Caleb Morse) Date: Tue, 8 Jul 2008 16:57:00 -0600 Subject: [Warewulf] Provisioning hangs after unloading modules In-Reply-To: <571f1a060807081449h2af8282bre45d56a1ce383ea0@mail.gmail.com> References: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com>
<571f1a060807081449h2af8282bre45d56a1ce383ea0@mail.gmail.com> Message-ID: Awesome, that did it. Thanks, -- Caleb On Tue, Jul 8, 2008 at 15:49, Greg Kurtzer wrote: > On Tue, Jul 8, 2008 at 2:25 PM, Caleb Morse wrote: > > On Tue, Jul 8, 2008 at 15:04, Greg Kurtzer wrote: > >> > >> On Tue, Jul 8, 2008 at 1:21 PM, John Hanks wrote: > >> > On Tue, Jul 8, 2008 at 1:35 PM, Caleb Morse wrote: > >> >> I have a machine that is hanging right after all of the modules are > >> >> unloaded > >> >> during the provisioning process. The output from the script isn't > very > >> >> useful, so I would like to modify it for more verbose output. But, > I'm > >> >> not > >> >> sure where the script is located. > >> > >> What kind of hardware is this, and more specifically, what network > >> card/driver is being used? > >> > >> It might be worth while to test drive the 1.4 release. Contact me > >> offline and I can coordinate with you on this, and build a beta > >> release tarball/RPMS to see if newer drivers, and scripts help. > > > > I was able to narrow down the hardware that was causing the problem. > > Disabling the SCSI controller (it isn't even needed) fixed the hang. > > > > Unfortunately I now get the following error message. > > > > ******************* > > Total provision time: 8 s > > > > Ramdisks not supported with generic elf arguemnts > > ******************* > > Check the VNFS configuration under KEXEC_ARGS. If your using RHEL5, > you may need to add --args-linux. > > > > > Then I get the "Waiting 30 seconds and rebooting message. > > > > I would be willing to test drive the 1.4 release. By "contact me offline" > > did you just mean email you directly? > > Yes. > > > > > > > > > >> > >> >> > >> > > >> > I've been looking for an excuse to put these notes somewhere where I > >> > wouldn't lose them. 
> >> > > >> > http://www.perceus.org/portal/faq/6#6n130 > >> > > >> > init is probably the script you want to modify but I've also had to > >> > add debug stuff to the hw_load scripts to figure some things out. It's > >> > pretty straightforward how it all works > >> > >> ego++; hehe > >> > >> As far as the upcoming 1.4 release, most of the node script structures > >> have undergone some general optimizations, but the internal stage 1 > >> initialization scripts are getting reworked to support more complex > >> nodescripts. Now is the time to voice all opinions so let us know > >> what's missing! :) > >> > >> Thanks, > >> Greg > >> > >> -- > >> Greg Kurtzer > >> http://www.infiscale.com/ > >> http://www.runlevelzero.net/ > >> http://www.perceus.org/ > >> http://www.caoslinux.org/ > >> _______________________________________________ > >> Warewulf mailing list > >> Warewulf at caoslinux.org > >> http://lists.caosity.org/mailman/listinfo/warewulf > > > > > > _______________________________________________ > > Warewulf mailing list > > Warewulf at caoslinux.org > > http://lists.caosity.org/mailman/listinfo/warewulf > > > > > > > > -- > Greg Kurtzer > http://www.infiscale.com/ > http://www.runlevelzero.net/ > http://www.perceus.org/ > http://www.caoslinux.org/ > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080708/2ace137f/attachment.html From gmkurtzer at gmail.com Tue Jul 8 16:51:38 2008 From: gmkurtzer at gmail.com (Greg Kurtzer) Date: Tue, 8 Jul 2008 16:51:38 -0700 Subject: [Warewulf] provisioning problem with 1.3.7 In-Reply-To: References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com> <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com> Message-ID: <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com> OK, something is concerning me. You are running this from a provisioned system with the "init" command into /bin/sh? That is scary.... Very scary. ;) "init" is a hard-coded provisionary state that is responsible for the installation of the VNFS. Once a node boots it should be run in the "ready" provisionary state. # provisiond -d -s /bin/cat 192.168.10.1 ready would be safer and would cat out any shell commands for viewing (note, output is dependent on which Perceus modules are activated). So let's back up several steps. hehe When and where exactly did you see the bad file descriptor error? Good luck. Greg On Tue, Jul 8, 2008 at 3:46 PM, wrote: > Here is what gets endlessly repeated at the node end > > + provisiond -d -s /bin/sh 192.168.10.1 > DEBUG: interval = '0' > DEBUG: shell = 'bin/sh' > DEBUG: command = "init' > DEBUG: connect to 192.168.10.1 (192.168.10.1) on port 987 > DEBUG: connection established. > DEBUG: Reading from socket... > DEBUG: SIGALARM received > DEBUG: last command result: 0 > + sleep 1 > + [ ! -f /next ] > + [ 4 -eq 0 ] > + [ 4 -eq 1 ] > + provisiond -d -s /bin/sh 192.168.10.1 > > ...and so on... > > Thanks, > > Bill S. > UC Riverside > > > On Tue, 8 Jul 2008 william.strossman at ucr.edu wrote: > >> It looks like I get the same result. One other thing though: I rebuilt >> the vnfs capsule, this time only adding the perceus-provisiond package >> (and chkconfig'ing it on) and I no longer get the "bad file descriptor" >> message.
I will try to post the repeated output of provisiond. >> >> Thanks, >> >> Bill S. >> UC Riverside >> >> >> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >> >>> That Net-ARP looks new enough, but just for kicks can you test out the >>> version here: >>> >>> http://www.perceus.org/downloads/perceus/v1.x/dependencies/ >>> >>> If that works, I will make sure that 1.4 is updated for the newer >>> version of Net::ARP. >>> >>> Greg >>> >>> >>> >>> On Tue, Jul 8, 2008 at 2:29 PM, wrote: >>>> Hi Greg: >>>> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these >>>> are Opterons, if that makes a difference. So those "^@" are newlines? >>>> >>>> Thanks, >>>> >>>> Bill S. >>>> UC Riverside >>>> >>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>> >>>>> What version of perl-Net-ARP do you have installed? >>>>> >>>>> Also, I don't really like the extra characters (e.g. newline) after >>>>> the init command. Not sure this is causing your direct problem, but >>>>> lets start there... >>>>> >>>>> Greg >>>>> >>>>> >>>>> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>>>>> I am having a problem that has been mention many times in the past, >>>>>> but none of those solutions have helped. The nodes hang forever after the >>>>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>>>>> shows that 6 nodescripts are run and then the socket is closed and it goes >>>>>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>>>>> debug mode reveals no information whatsoever. Adding debug=4 to the >>>>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>>>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>>>>> (it connects to port 987 the first time). This sequence is also repeated >>>>>> adnauseum. I have attached a portion of the output from >>>>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>>>>> arguments (^@)" in there (a null character? 
which would be odd because there >>>>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>>>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>>>>> provided script to create the vnfs capsule. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Bill S. -- Greg Kurtzer http://www.infiscale.com/ http://www.runlevelzero.net/ http://www.perceus.org/ http://www.caoslinux.org/ From william.strossman at ucr.edu Tue Jul 8 17:50:19 2008 From: william.strossman at ucr.edu (william.strossman at ucr.edu) Date: Tue, 8 Jul 2008 17:50:19 -0700 (PDT) Subject: [Warewulf] provisioning problem with 1.3.7 In-Reply-To:
<571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com> References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com> <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com> <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com> Message-ID: Yes, I was wondering about that too. The thing is, I cannot locate where that provisiond command is being launched from, so I cannot change the options. Is it built into the initial ramdisk image or something? The bad file descriptor message was in the output of provisiond on the node side. I was getting that after making a lot of changes to the vnfs capsule. The output would alternate between getting a "connection established" line to an "unable to connect to 192.168.10.1, bad file descriptor" line. I haven't seen it since starting with a fresh vnfs capsule. Any ideas? Thanks, Bill S. UC Riverside On Tue, 8 Jul 2008, Greg Kurtzer wrote: > # provisiond -d -s /bin/cat 192.168.10.1 ready > > would be safer and would cat out any shell commands for viewing (note, > output is dependent on which Perceus modules are activated). > > So lets backup several steps. hehe > > When and where exactly did you see the bad file descriptor error? > > Good luck. > Greg > > > On Tue, Jul 8, 2008 at 3:46 PM, wrote: >> Here is what gets endlessly repeated at the node end >> >> + provisiond -d -s /bin/sh 192.168.10.1 >> DEBUG: interval = '0' >> DEBUG: shell = 'bin/sh' >> DEBUG: command = "init' >> DEBUG: connect to 192.168.10.1 (192.168.10.1) on port 987 >> DEBUG: connection established. >> DEBUG: Reading from socket... >> DEBUG: SIGALARM received >> DEBUG: last command result: 0 >> + sleep 1 >> + [ ! -f /next ] >> + [ 4 -eq 0 ] >> + [ 4 -eq 1 ] >> + provisiond -d -s /bin/sh 192.168.10.1 >> >> ...and so on... >> >> Thanks, >> >> Bill S. >> UC Riverside >> >> >> On Tue, 8 Jul 2008 william.strossman at ucr.edu wrote: >> >>> It looks like I get the same result. 
One other thing though: I rebuilt >>> the vnfs capsule, this time only adding the perceus-provisiond package >>> (and chkconfig'ing it on) and I no longer get the "bad file descriptor" >>> message. I will try to post the repeated output of provisiond. >>> >>> Thanks, >>> >>> Bill S. >>> UC Riverside >>> >>> >>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>> >>>> That Net-ARP looks new enough, but just for kicks can you test out the >>>> version here: >>>> >>>> http://www.perceus.org/downloads/perceus/v1.x/dependencies/ >>>> >>>> If that works, I will make sure that 1.4 is updated for the newer >>>> version of Net::ARP. >>>> >>>> Greg >>>> >>>> >>>> >>>> On Tue, Jul 8, 2008 at 2:29 PM, wrote: >>>>> Hi Greg: >>>>> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these >>>>> are Opterons, if that makes a difference. So those "^@" are newlines? >>>>> >>>>> Thanks, >>>>> >>>>> Bill S. >>>>> UC Riverside >>>>> >>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>>> >>>>>> What version of perl-Net-ARP do you have installed? >>>>>> >>>>>> Also, I don't really like the extra characters (e.g. newline) after >>>>>> the init command. Not sure this is causing your direct problem, but >>>>>> lets start there... >>>>>> >>>>>> Greg >>>>>> >>>>>> >>>>>> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>>>>>> I am having a problem that has been mention many times in the past, >>>>>>> but none of those solutions have helped. The nodes hang forever after the >>>>>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>>>>>> shows that 6 nodescripts are run and then the socket is closed and it goes >>>>>>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>>>>>> debug mode reveals no information whatsoever. Adding debug=4 to the >>>>>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>>>>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>>>>>> (it connects to port 987 the first time). 
This sequence is also repeated >>>>>>> adnauseum. I have attached a portion of the output from >>>>>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>>>>>> arguments (^@)" in there (a null character? which would be odd because there >>>>>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>>>>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>>>>>> provided script to create the vnfs capsule. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Bill S. >>>>>>> _______________________________________________ >>>>>>> Warewulf mailing list >>>>>>> Warewulf at caoslinux.org >>>>>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Greg Kurtzer >>>>>> http://www.infiscale.com/ >>>>>> http://www.runlevelzero.net/ >>>>>> http://www.perceus.org/ >>>>>> http://www.caoslinux.org/ >>>>>> _______________________________________________ >>>>>> Warewulf mailing list >>>>>> Warewulf at caoslinux.org >>>>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>>>> >>>>> _______________________________________________ >>>>> Warewulf mailing list >>>>> Warewulf at caoslinux.org >>>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>>> >>>> >>>> >>>> >>>> -- >>>> Greg Kurtzer >>>> http://www.infiscale.com/ >>>> http://www.runlevelzero.net/ >>>> http://www.perceus.org/ >>>> http://www.caoslinux.org/ >>>> _______________________________________________ >>>> Warewulf mailing list >>>> Warewulf at caoslinux.org >>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>> >>> _______________________________________________ >>> Warewulf mailing list >>> Warewulf at caoslinux.org >>> http://lists.caosity.org/mailman/listinfo/warewulf >>> >> _______________________________________________ >> Warewulf mailing list >> Warewulf at caoslinux.org >> http://lists.caosity.org/mailman/listinfo/warewulf >> > > > > -- > Greg Kurtzer > http://www.infiscale.com/ > 
http://www.runlevelzero.net/ > http://www.perceus.org/ > http://www.caoslinux.org/ > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > From gmkurtzer at gmail.com Tue Jul 8 20:07:46 2008 From: gmkurtzer at gmail.com (Greg Kurtzer) Date: Tue, 8 Jul 2008 20:07:46 -0700 Subject: [Warewulf] provisioning problem with 1.3.7 In-Reply-To: References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com> <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com> <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com> Message-ID: <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com> Again (just so I am clear), this is after provisioning and during runtime of the VNFS, correct? Where do you see the error? In Perceus 1.3, provisiond is started via an init script from within the VNFS capsule. If this is not within the VNFS capsule, then the sequence of events is very different which is why I keep stressing the question. Where do you see the error? Maybe it would help to get a screenshot or a shell clipping. Thanks, Greg On Tue, Jul 8, 2008 at 5:50 PM, wrote: > Yes, I was wondering about that too. The thing is, I cannot > locate where that provisiond command is being launched from, so I cannot > change the options. Is it built into the initial ramdisk image or > something? > The bad file descriptor message was in the output of provisiond on > the node side. I was getting that after making a lot of changes to the > vnfs capsule. The output would alternate between getting a "connection > established" line to an "unable to connect to 192.168.10.1, bad file > descriptor" line. I haven't seen it since starting with a fresh vnfs > capsule. > Any ideas? > > Thanks, > > Bill S. 
> UC Riverside > > > On Tue, 8 Jul 2008, Greg Kurtzer wrote: > >> # provisiond -d -s /bin/cat 192.168.10.1 ready >> >> would be safer and would cat out any shell commands for viewing (note, >> output is dependent on which Perceus modules are activated). >> >> So lets backup several steps. hehe >> >> When and where exactly did you see the bad file descriptor error? >> >> Good luck. >> Greg >> >> >> On Tue, Jul 8, 2008 at 3:46 PM, wrote: >>> Here is what gets endlessly repeated at the node end >>> >>> + provisiond -d -s /bin/sh 192.168.10.1 >>> DEBUG: interval = '0' >>> DEBUG: shell = 'bin/sh' >>> DEBUG: command = "init' >>> DEBUG: connect to 192.168.10.1 (192.168.10.1) on port 987 >>> DEBUG: connection established. >>> DEBUG: Reading from socket... >>> DEBUG: SIGALARM received >>> DEBUG: last command result: 0 >>> + sleep 1 >>> + [ ! -f /next ] >>> + [ 4 -eq 0 ] >>> + [ 4 -eq 1 ] >>> + provisiond -d -s /bin/sh 192.168.10.1 >>> >>> ...and so on... >>> >>> Thanks, >>> >>> Bill S. >>> UC Riverside >>> >>> >>> On Tue, 8 Jul 2008 william.strossman at ucr.edu wrote: >>> >>>> It looks like I get the same result. One other thing though: I rebuilt >>>> the vnfs capsule, this time only adding the perceus-provisiond package >>>> (and chkconfig'ing it on) and I no longer get the "bad file descriptor" >>>> message. I will try to post the repeated output of provisiond. >>>> >>>> Thanks, >>>> >>>> Bill S. >>>> UC Riverside >>>> >>>> >>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>> >>>>> That Net-ARP looks new enough, but just for kicks can you test out the >>>>> version here: >>>>> >>>>> http://www.perceus.org/downloads/perceus/v1.x/dependencies/ >>>>> >>>>> If that works, I will make sure that 1.4 is updated for the newer >>>>> version of Net::ARP. >>>>> >>>>> Greg >>>>> >>>>> >>>>> >>>>> On Tue, Jul 8, 2008 at 2:29 PM, wrote: >>>>>> Hi Greg: >>>>>> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these >>>>>> are Opterons, if that makes a difference. 
So those "^@" are newlines? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Bill S. >>>>>> UC Riverside >>>>>> >>>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>>>> >>>>>>> What version of perl-Net-ARP do you have installed? >>>>>>> >>>>>>> Also, I don't really like the extra characters (e.g. newline) after >>>>>>> the init command. Not sure this is causing your direct problem, but >>>>>>> lets start there... >>>>>>> >>>>>>> Greg >>>>>>> >>>>>>> >>>>>>> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>>>>>>> I am having a problem that has been mention many times in the past, >>>>>>>> but none of those solutions have helped. The nodes hang forever after the >>>>>>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>>>>>>> shows that 6 nodescripts are run and then the socket is closed and it goes >>>>>>>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>>>>>>> debug mode reveals no information whatsoever. Adding debug=4 to the >>>>>>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>>>>>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>>>>>>> (it connects to port 987 the first time). This sequence is also repeated >>>>>>>> adnauseum. I have attached a portion of the output from >>>>>>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>>>>>>> arguments (^@)" in there (a null character? which would be odd because there >>>>>>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>>>>>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>>>>>>> provided script to create the vnfs capsule. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Bill S. 
-- Greg Kurtzer http://www.infiscale.com/ http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From griznog at gmail.com Tue Jul 8 21:12:38 2008
From: griznog at gmail.com (John Hanks)
Date: Tue, 8 Jul 2008 22:12:38 -0600
Subject: [Warewulf] Provisioning hangs after unloading modules
In-Reply-To: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com>
References: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com>
Message-ID:

On Tue, Jul 8, 2008 at 3:04 PM, Greg Kurtzer wrote:
> As far as the upcoming 1.4 release, most of the node script structures
> have undergone some general optimizations, but the internal stage 1
> initialization scripts are getting reworked to support more complex
> nodescripts. Now is the time to voice all opinions so let us know
> what's missing! :)

Well, since you asked, the ability to manage multiple perceus boot
kernel/initramfs setups (or any pxe/etherboot images) similar to the
way vnfs capsules are managed, assigning them to nodes within perceus,
import/export, etc. That may only be useful for me, but I keep all my
pxe/etherboot foo on a single perceus server, do pxe kickstarts from
it, netboot interactive installs, etc. and it'd be nice to have a
slick way to keep track of it all rather than my current method of
manually shuffling links.

jbh

From william.strossman at ucr.edu Wed Jul 9 02:04:37 2008
From: william.strossman at ucr.edu (william.strossman at ucr.edu)
Date: Wed, 9 Jul 2008 02:04:37 -0700 (PDT)
Subject: [Warewulf] provisioning problem with 1.3.7
In-Reply-To: <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com>
References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com>
 <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com>
 <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com>
 <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com>
Message-ID:

> Again (just so I am clear), this is after provisioning and during
> runtime of the VNFS, correct?

Well, I am not sure that the provisioning ever completes.

> Where do you see the error?

The node goes through its normal pxe boot. Then it gets to
"Etherlink found, requesting DHCP configuration via eth0" and then to
"Provisioning from 192.168.10.1" where it stays in the "init" state
indefinitely. There are no real error messages. The attachment in my
first email contains the output of perceusd in debug mode on the
master. The output I referred to two messages ago is what appears on
the screen of node n0000 when a monitor is plugged into the vga port.
/var/log/messages only contains dozens of instances of
"Provisioning 'n0000' now...". perceus node status shows the node
remaining in the "init" state forever. I hope I haven't made things
more confusing here. Does this help?

Thanks,

Bill S.
UCR

> In Perceus 1.3, provisiond is started via an init script from within
> the VNFS capsule.

> On Tue, Jul 8, 2008 at 5:50 PM, wrote:
>> Yes, I was wondering about that too. The thing is, I cannot
>> locate where that provisiond command is being launched from, so I cannot
>> change the options. Is it built into the initial ramdisk image or
>> something?
>> The bad file descriptor message was in the output of provisiond on
>> the node side. I was getting that after making a lot of changes to the
>> vnfs capsule. The output would alternate between getting a "connection
>> established" line to an "unable to connect to 192.168.10.1, bad file
>> descriptor" line. I haven't seen it since starting with a fresh vnfs
>> capsule.
>> Any ideas?
>>
>> Thanks,
>>
>> Bill S.
>> UC Riverside
>>
>>
>> On Tue, 8 Jul 2008, Greg Kurtzer wrote:
>>
>>> # provisiond -d -s /bin/cat 192.168.10.1 ready
>>>
>>> would be safer and would cat out any shell commands for viewing (note,
>>> output is dependent on which Perceus modules are activated).
>>>
>>> So let's back up several steps. hehe
>>>
>>> When and where exactly did you see the bad file descriptor error?
>>>
>>> Good luck.
>>> Greg
>>>
>>>
>>> On Tue, Jul 8, 2008 at 3:46 PM, wrote:
>>>> Here is what gets endlessly repeated at the node end
>>>>
>>>> + provisiond -d -s /bin/sh 192.168.10.1
>>>> DEBUG: interval = '0'
>>>> DEBUG: shell = 'bin/sh'
>>>> DEBUG: command = "init'
>>>> DEBUG: connect to 192.168.10.1 (192.168.10.1) on port 987
>>>> DEBUG: connection established.
>>>> DEBUG: Reading from socket...
>>>> DEBUG: SIGALARM received
>>>> DEBUG: last command result: 0
>>>> + sleep 1
>>>> + [ ! -f /next ]
>>>> + [ 4 -eq 0 ]
>>>> + [ 4 -eq 1 ]
>>>> + provisiond -d -s /bin/sh 192.168.10.1
>>>>
>>>> ...and so on...
>>>>
>>>> Thanks,
>>>>
>>>> Bill S.
>>>> UC Riverside
>>>>
>>>>
>>>> On Tue, 8 Jul 2008 william.strossman at ucr.edu wrote:
>>>>
>>>>> It looks like I get the same result. One other thing though: I rebuilt
>>>>> the vnfs capsule, this time only adding the perceus-provisiond package
>>>>> (and chkconfig'ing it on) and I no longer get the "bad file descriptor"
>>>>> message. I will try to post the repeated output of provisiond.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Bill S.
>>>>> UC Riverside
>>>>>
>>>>>
>>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote:
>>>>>
>>>>>> That Net-ARP looks new enough, but just for kicks can you test out the
>>>>>> version here:
>>>>>>
>>>>>> http://www.perceus.org/downloads/perceus/v1.x/dependencies/
>>>>>>
>>>>>> If that works, I will make sure that 1.4 is updated for the newer
>>>>>> version of Net::ARP.
>>>>>>
>>>>>> Greg
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 8, 2008 at 2:29 PM, wrote:
>>>>>>> Hi Greg:
>>>>>>> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to mention that these
>>>>>>> are Opterons, if that makes a difference. So those "^@" are newlines?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Bill S.
>>>>>>> UC Riverside
>>>>>>>
>>>>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote:
>>>>>>>
>>>>>>>> What version of perl-Net-ARP do you have installed?
>>>>>>>>
>>>>>>>> Also, I don't really like the extra characters (e.g. newline) after
>>>>>>>> the init command. Not sure this is causing your direct problem, but
>>>>>>>> let's start there...
>>>>>>>>
>>>>>>>> Greg
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 8, 2008 at 2:20 PM, wrote:
>>>>>>>>> I am having a problem that has been mentioned many times in the past,
>>>>>>>>> but none of those solutions have helped. The nodes hang forever after the
>>>>>>>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode
>>>>>>>>> shows that 6 nodescripts are run and then the socket is closed and it goes
>>>>>>>>> back to eval(). This is repeated ad nauseam forever. Running provisiond in
>>>>>>>>> debug mode reveals no information whatsoever. Adding debug=4 to the
>>>>>>>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after
>>>>>>>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor"
>>>>>>>>> (it connects to port 987 the first time). This sequence is also repeated
>>>>>>>>> ad nauseam. I have attached a portion of the output from
>>>>>>>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's
>>>>>>>>> arguments (^@)" in there (a null character? which would be odd because there
>>>>>>>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need
>>>>>>>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the
>>>>>>>>> provided script to create the vnfs capsule.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Bill S.
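
On the "^@" question quoted above: in caret notation, as printed by tools such as cat -v, "^@" stands for the NUL byte (0x00), not a newline, which is byte 0x0a. The sketch below shows the difference with a made-up string; "init" here is only an illustration, not captured perceusd output, and the example says nothing about why a NUL ended up in the log.

```shell
#!/bin/sh
# Render stdin with non-printing characters in caret notation.
caret() { cat -v; }

# Print the bytes of stdin as one run of hex digits.
hex() { od -An -tx1 | tr -d ' \n'; }

printf 'init\000' | caret; echo   # NUL is shown as ^@
printf 'init\000' | hex; echo     # the trailing byte is 00
printf 'init\n'   | hex; echo     # a newline is byte 0a instead
```

If the string logged by perceusd really ends in a NUL, piping the raw log through the same kind of hex dump would show a 00 byte at that position.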

From william.strossman at ucr.edu Wed Jul 9 10:03:15 2008
From: william.strossman at ucr.edu (william.strossman at ucr.edu)
Date: Wed, 9 Jul 2008 10:03:15 -0700 (PDT)
Subject: [Warewulf] provisioning problem with 1.3.7
In-Reply-To: <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com>
References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com>
 <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com>
 <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com>
 <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com>
Message-ID:

I just realized something. The master does not have any record of
nfs mount requests in /var/log/messages. This must mean that the vnfs
transfer is not being done. I am going to download a pre-made capsule and
see if the problem is with perceus or the vnfs capsule.

Thanks,

Bill S.
UCR

On Tue, 8 Jul 2008, Greg Kurtzer wrote:

> Again (just so I am clear), this is after provisioning and during
> runtime of the VNFS, correct?
>
> Where do you see the error?
>
> In Perceus 1.3, provisiond is started via an init script from within
> the VNFS capsule.
>
> If this is not within the VNFS capsule, then the sequence of events is
> very different which is why I keep stressing the question. Where do
> you see the error? Maybe it would help to get a screenshot or a shell
> clipping.
>
> Thanks,
> Greg
>
> On Tue, Jul 8, 2008 at 5:50 PM, wrote:
>> Yes, I was wondering about that too. The thing is, I cannot
>> locate where that provisiond command is being launched from, so I cannot
>> change the options. Is it built into the initial ramdisk image or
>> something?
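
Bill's observation that the master's /var/log/messages shows no NFS mount requests can be checked with a small grep wrapper. This is only a sketch: the phrase "mount request" is an assumption about how rpc.mountd words its log lines on this system, and the client address in the usage comment is hypothetical.

```shell
#!/bin/sh
# Count log lines that mention a mount request from one client address.
#   $1 = log file to scan
#   $2 = client IP address
# -w keeps 192.168.10.1 from also matching 192.168.10.100.
count_mount_requests() {
    grep -F "mount request" "$1" | grep -cwF "$2"
}

# Example invocation on the master (address is hypothetical):
#   count_mount_requests /var/log/messages 192.168.10.100
```

A count of 0 only says that nothing matching was logged; it does not by itself say whether the node never asked or the request was dropped before mountd saw it.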

From william.strossman at ucr.edu Wed Jul 9 11:17:50 2008
From: william.strossman at ucr.edu (william.strossman at ucr.edu)
Date: Wed, 9 Jul 2008 11:17:50 -0700 (PDT)
Subject: [Warewulf] provisioning problem with 1.3.7
In-Reply-To:
References: <571f1a060807081426j2430331bsc3ab0abf89769786@mail.gmail.com>
 <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com>
 <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com>
 <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com>
Message-ID:

No dice on the different vnfs capsule. I used
caos-nsa-node-0.9-40-1.stateless.x86_64, which I downloaded premade, and
centos-5.1-1.stateless.x86_64, which I created with the script provided. I
am beginning to think that the kernel in /var/lib/perceus/tftp is having
trouble with the hardware. Can one simply drop in a newer kernel and
initrd, renaming them to kernel and initramfs.img respectively?

Thanks,

Bill S.
UCR

On Wed, 9 Jul 2008 william.strossman at ucr.edu wrote:

> I just realized something. The master does not have any record of
> nfs mount requests in /var/log/messages. This must mean that the vnfs
> transfer is not being done. I am going to download a pre-made capsule and
> see if the problem is with perceus or the vnfs capsule.
>
> Thanks,
>
> Bill S.
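
Greg's earlier suggestion in this thread, running provisiond with -s /bin/cat instead of -s /bin/sh, relies on a simple idea: the same text is executed when handed to a shell and merely displayed when handed to cat. The sketch below illustrates that idea only; fake_stream is a hypothetical stand-in, not real Perceus nodescript output, and how provisiond actually feeds its shell is not shown in the thread.

```shell
#!/bin/sh
# Hypothetical stand-in for the command text a node would receive.
fake_stream() {
    printf 'echo "configuring node"\n'
}

# Handed to a shell, the text is executed as commands.
run_it() { fake_stream | /bin/sh; }

# Handed to cat, the same text is only printed; nothing runs.
show_it() { fake_stream | /bin/cat; }

run_it    # prints: configuring node
show_it   # prints: echo "configuring node"
```

That is why the /bin/cat variant is the safer one for debugging: the commands can be read without any of them touching the node.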

From gmkurtzer at gmail.com Wed Jul 9 12:06:53 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Wed, 9 Jul 2008 12:06:53 -0700
Subject: [Warewulf] provisioning problem with 1.3.7
In-Reply-To:
References: <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com>
 <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com>
 <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com>
Message-ID: <571f1a060807091206sb0bf92fqfcc9faf9a1cdb971@mail.gmail.com>

If there is no log of the attempted mount, then (usually) either the
vnfs* configuration settings in perceus.conf are wrong, or the master
is denying access (via a firewall, tcp_wrappers, or the services being
down).

This is not a VNFS issue.... Not yet at least. ;)

As far as the Perceus kernel goes, it is still too early to say, but
this would be the first record of that. ;)

Greg

On Wed, Jul 9, 2008 at 11:17 AM, wrote:
> No dice on the different vnfs capsule. I used
> caos-nsa-node-0.9-40-1.stateless.x86_64 which I downloaded premade and
> centos-5.1-1.stateless.x86_64 which I created with the script provided. I
> am beginning to think that the kernel in /var/lib/perceus/tftp is having
> trouble with the hardware. Can one simply drop in a newer kernel and
> initrd, renaming them to kernel and initramfs.img respectively?
>
> Thanks,
>
> Bill S.
> UCR
>
> On Wed, 9 Jul 2008 william.strossman at ucr.edu wrote:
>
>> I just realized something. The master does not have any record of
>> nfs mount requests in /var/log/messages. This must mean that the vnfs
>> transfer is not being done. I am going to download a pre-made capsule and
>> see if the problem is with perceus or the vnfs capsule.
>>
>> Thanks,
>>
>> Bill S.
>> UCR
>>>>> Greg >>>>> >>>>> >>>>> On Tue, Jul 8, 2008 at 3:46 PM, wrote: >>>>>> Here is what gets endlessly repeated at the node end >>>>>> >>>>>> + provisiond -d -s /bin/sh 192.168.10.1 >>>>>> DEBUG: interval = '0' >>>>>> DEBUG: shell = 'bin/sh' >>>>>> DEBUG: command = "init' >>>>>> DEBUG: connect to 192.168.10.1 (192.168.10.1) on port 987 >>>>>> DEBUG: connection established. >>>>>> DEBUG: Reading from socket... >>>>>> DEBUG: SIGALARM received >>>>>> DEBUG: last command result: 0 >>>>>> + sleep 1 >>>>>> + [ ! -f /next ] >>>>>> + [ 4 -eq 0 ] >>>>>> + [ 4 -eq 1 ] >>>>>> + provisiond -d -s /bin/sh 192.168.10.1 >>>>>> >>>>>> ...and so on... >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Bill S. >>>>>> UC Riverside >>>>>> >>>>>> >>>>>> On Tue, 8 Jul 2008 william.strossman at ucr.edu wrote: >>>>>> >>>>>>> It looks like I get the same result. One other thing though: I rebuilt >>>>>>> the vnfs capsule, this time only adding the perceus-provisiond package >>>>>>> (and chkconfig'ing it on) and I no longer get the "bad file descriptor" >>>>>>> message. I will try to post the repeated output of provisiond. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Bill S. >>>>>>> UC Riverside >>>>>>> >>>>>>> >>>>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>>>>> >>>>>>>> That Net-ARP looks new enough, but just for kicks can you test out the >>>>>>>> version here: >>>>>>>> >>>>>>>> http://www.perceus.org/downloads/perceus/v1.x/dependencies/ >>>>>>>> >>>>>>>> If that works, I will make sure that 1.4 is updated for the newer >>>>>>>> version of Net::ARP. >>>>>>>> >>>>>>>> Greg >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 8, 2008 at 2:29 PM, wrote: >>>>>>>>> Hi Greg: >>>>>>>>> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these >>>>>>>>> are Opterons, if that makes a difference. So those "^@" are newlines? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Bill S. 
>>>>>>>>> UC Riverside >>>>>>>>> >>>>>>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>>>>>>> >>>>>>>>>> What version of perl-Net-ARP do you have installed? >>>>>>>>>> >>>>>>>>>> Also, I don't really like the extra characters (e.g. newline) after >>>>>>>>>> the init command. Not sure this is causing your direct problem, but >>>>>>>>>> lets start there... >>>>>>>>>> >>>>>>>>>> Greg >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>>>>>>>>>> I am having a problem that has been mention many times in the past, >>>>>>>>>>> but none of those solutions have helped. The nodes hang forever after the >>>>>>>>>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>>>>>>>>>> shows that 6 nodescripts are run and then the socket is closed and it goes >>>>>>>>>>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>>>>>>>>>> debug mode reveals no information whatsoever. Adding debug=4 to the >>>>>>>>>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>>>>>>>>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>>>>>>>>>> (it connects to port 987 the first time). This sequence is also repeated >>>>>>>>>>> adnauseum. I have attached a portion of the output from >>>>>>>>>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>>>>>>>>>> arguments (^@)" in there (a null character? which would be odd because there >>>>>>>>>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>>>>>>>>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>>>>>>>>>> provided script to create the vnfs capsule. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Bill S. 
-- Greg Kurtzer http://www.infiscale.com/ http://www.runlevelzero.net/ http://www.perceus.org/ http://www.caoslinux.org/ From cmorse at unm.edu Wed Jul 9 13:23:34 2008 From: cmorse at unm.edu (Caleb Morse) Date: Wed, 9 Jul 2008 14:23:34 -0600 Subject: [Warewulf] Provisioning hangs after unloading modules In-Reply-To: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com> References: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com> Message-ID: On Tue, Jul 8, 2008 at 15:04, Greg Kurtzer wrote: > On Tue, Jul 8, 2008 at 1:21 PM, John Hanks wrote: > > On Tue, Jul 8, 2008 at 1:35 PM, Caleb Morse wrote: > >> I have a machine that is hanging right after all of the modules are > unloaded > >> during the provisioning process. The output from the script isn't very > >> useful, so I would like to modify it for more verbose output. But, I'm > not > >> sure where the script is located. > > What kind of hardware is this, and more specifically, what network > card/driver is being used? > > It might be worth while to test drive the 1.4 release. 
Contact me > offline and I can coordinate with you on this, and build a beta > release tarball/RPMS to see if newer drivers, and scripts help. How would I go about contacting you offline? > > > >> > > > > I've been looking for an excuse to put these notes somewhere where I > > wouldn't lose them. > > > > http://www.perceus.org/portal/faq/6#6n130 > > > > init is probably the script you want to modify but I've also had to > > add debug stuff to the hw_load scripts to figure some things out. It's > > pretty straightforward how it all works > > ego++; hehe > > As far as the upcoming 1.4 release, most of the node script structures > have undergone some general optimizations, but the internal stage 1 > initialization scripts are getting reworked to support more complex > nodescripts. Now is the time to voice all opinions so let us know > what's missing! :) > > Thanks, > Greg > > -- > Greg Kurtzer > http://www.infiscale.com/ > http://www.runlevelzero.net/ > http://www.perceus.org/ > http://www.caoslinux.org/ > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > -- Caleb -------------- next part -------------- An HTML attachment was scrubbed... URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080709/ec7f09d2/attachment.html From griznog at gmail.com Wed Jul 9 14:37:39 2008 From: griznog at gmail.com (John Hanks) Date: Wed, 9 Jul 2008 15:37:39 -0600 Subject: [Warewulf] Provisioning hangs after unloading modules In-Reply-To: References: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com> Message-ID: On Wed, Jul 9, 2008 at 2:23 PM, Caleb Morse wrote: > > How would I go about contacting you offline? > > Greg is easy to contact offline. Just go to http://www.caoslinux.org/index.html and grab the image with the squiggly lines. 
Print it on a color transparency, tape that to a strong flashlight, go outside on a dark cloudy night and shine it into the sky. Wait for a while and try not to look shaken when he suddenly appears out of nowhere and says "what?". jbh From cmorse at unm.edu Wed Jul 9 14:41:35 2008 From: cmorse at unm.edu (Caleb Morse) Date: Wed, 9 Jul 2008 15:41:35 -0600 Subject: [Warewulf] Provisioning hangs after unloading modules In-Reply-To: References: <571f1a060807081404h49610ae9j84b27a262ae0084e@mail.gmail.com> Message-ID: Haha, Batman style, awesome. -- Caleb On Wed, Jul 9, 2008 at 15:37, John Hanks wrote: > On Wed, Jul 9, 2008 at 2:23 PM, Caleb Morse wrote: > > > > How would I go about contacting you offline? > > > > > Greg is easy to contact offline. Just go to > http://www.caoslinux.org/index.html and grab the image with the > squiggly lines. Print it on a color transparency, tape that to a > strong flashlight, go outside on a dark cloudy night and shine it into > the sky. Wait for a while and try not to look shaken when he suddenly > appears out of nowhere and says "what?". > > jbh > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080709/06822d29/attachment.html From william.strossman at ucr.edu Wed Jul 9 15:03:36 2008 From: william.strossman at ucr.edu (william.strossman at ucr.edu) Date: Wed, 9 Jul 2008 15:03:36 -0700 (PDT) Subject: [Warewulf] provisioning problem with 1.3.7 In-Reply-To: <571f1a060807091206sb0bf92fqfcc9faf9a1cdb971@mail.gmail.com> References: <571f1a060807081509j7c9c7824nbfffb1a2930d9b88@mail.gmail.com> <571f1a060807081651hc5083d3h43e84b2ab27c5cbf@mail.gmail.com> <571f1a060807082007v32ec8ad3p1ebb382835a04180@mail.gmail.com> <571f1a060807091206sb0bf92fqfcc9faf9a1cdb971@mail.gmail.com> Message-ID: Well, it works now! You were right. It wasn't a vnfs problem. The problem was apparently a difference between perceus-1.3.7-1683 (which I got when I did an rpmbuild -ta on the source tarball from perceus.org) and perceus-1.3.7-1735 (which I got when I did an rpmbuild -bb on the perceus-1.3.7-1735.src.rpm which I downloaded from altruistic.lbl.gov in the centos/5 repository). The former works and the latter does not. Still not sure what would make it hang during step 4 (on page 7 of the documentation). Any ideas? Thanks, Bill S. UCR On Wed, 9 Jul 2008, Greg Kurtzer wrote: > If there is no log of the attempted mount, then (usually) it is either > the vnfs* configuration settings in the perceus.conf being wrong, or > the master is denying the access (either via firewall, tcp_wrappers, > or the services being down). > > This is not a VNFS issue.... Not yet at least. ;) > > As far as the Perceus kernel, it is still too early to say, but this > would be the first record of that. ;) > > Greg > > > > On Wed, Jul 9, 2008 at 11:17 AM, wrote: >> No dice on the different vnfs capsule. I used >> caos-nsa-node-0.9-40-1.stateless.x86_64 which I downloaded premade and >> centos-5.1-1.stateless.x86_64 which I created with the script provided. 
I >> am beginning to think that the kernel in /var/lib/perceus/tftp is having >> trouble with the hardware. Can one simply drop in a newer kernel and >> initrd, renaming them to kernel and initramfs.img respectively? >> >> Thanks, >> >> Bill S. >> UCR >> >> On Wed, 9 Jul 2008 william.strossman at ucr.edu wrote: >> >>> I just realized something. The master does not have any record of >>> nfs mount requests in /var/log/messages. This must mean that the vnfs >>> transfer is not being done. I am going do download a pre-made capsule and >>> see if the problem is with perceus or the vnfs capsule. >>> >>> Thanks, >>> >>> Bill S. >>> UCR >>> >>> >>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>> >>>> Again (just so I am clear), this is after provisioning and during >>>> runtime of the VNFS, correct? >>>> >>>> Where do you see the error? >>>> >>>> In Perceus 1.3, provisiond is started via an init script from within >>>> the VNFS capsule. >>>> >>>> If this is not within the VNFS capsule, then the sequence of events is >>>> very different which is why I keep stressing the question. Where do >>>> you see the error? Maybe it would help to get a screenshot or a shell >>>> clipping. >>>> >>>> Thanks, >>>> Greg >>>> >>>> On Tue, Jul 8, 2008 at 5:50 PM, wrote: >>>>> Yes, I was wondering about that too. The thing is, I cannot >>>>> locate where that provisiond command is being launched from, so I cannot >>>>> change the options. Is it built into the initial ramdisk image or >>>>> something? >>>>> The bad file descriptor message was in the output of provisiond on >>>>> the node side. I was getting that after making a lot of changes to the >>>>> vnfs capsule. The output would alternate between getting a "connection >>>>> established" line to an "unable to connect to 192.168.10.1, bad file >>>>> descriptor" line. I haven't seen it since starting with a fresh vnfs >>>>> capsule. >>>>> Any ideas? >>>>> >>>>> Thanks, >>>>> >>>>> Bill S. 
>>>>> UC Riverside >>>>> >>>>> >>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>>> >>>>>> # provisiond -d -s /bin/cat 192.168.10.1 ready >>>>>> >>>>>> would be safer and would cat out any shell commands for viewing (note, >>>>>> output is dependent on which Perceus modules are activated). >>>>>> >>>>>> So lets backup several steps. hehe >>>>>> >>>>>> When and where exactly did you see the bad file descriptor error? >>>>>> >>>>>> Good luck. >>>>>> Greg >>>>>> >>>>>> >>>>>> On Tue, Jul 8, 2008 at 3:46 PM, wrote: >>>>>>> Here is what gets endlessly repeated at the node end >>>>>>> >>>>>>> + provisiond -d -s /bin/sh 192.168.10.1 >>>>>>> DEBUG: interval = '0' >>>>>>> DEBUG: shell = 'bin/sh' >>>>>>> DEBUG: command = "init' >>>>>>> DEBUG: connect to 192.168.10.1 (192.168.10.1) on port 987 >>>>>>> DEBUG: connection established. >>>>>>> DEBUG: Reading from socket... >>>>>>> DEBUG: SIGALARM received >>>>>>> DEBUG: last command result: 0 >>>>>>> + sleep 1 >>>>>>> + [ ! -f /next ] >>>>>>> + [ 4 -eq 0 ] >>>>>>> + [ 4 -eq 1 ] >>>>>>> + provisiond -d -s /bin/sh 192.168.10.1 >>>>>>> >>>>>>> ...and so on... >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Bill S. >>>>>>> UC Riverside >>>>>>> >>>>>>> >>>>>>> On Tue, 8 Jul 2008 william.strossman at ucr.edu wrote: >>>>>>> >>>>>>>> It looks like I get the same result. One other thing though: I rebuilt >>>>>>>> the vnfs capsule, this time only adding the perceus-provisiond package >>>>>>>> (and chkconfig'ing it on) and I no longer get the "bad file descriptor" >>>>>>>> message. I will try to post the repeated output of provisiond. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Bill S. 
>>>>>>>> UC Riverside >>>>>>>> >>>>>>>> >>>>>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>>>>>> >>>>>>>>> That Net-ARP looks new enough, but just for kicks can you test out the >>>>>>>>> version here: >>>>>>>>> >>>>>>>>> http://www.perceus.org/downloads/perceus/v1.x/dependencies/ >>>>>>>>> >>>>>>>>> If that works, I will make sure that 1.4 is updated for the newer >>>>>>>>> version of Net::ARP. >>>>>>>>> >>>>>>>>> Greg >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Jul 8, 2008 at 2:29 PM, wrote: >>>>>>>>>> Hi Greg: >>>>>>>>>> perl-Net-ARP-1.0.2-1.el5.rf. Also, I forgot to metnion that these >>>>>>>>>> are Opterons, if that makes a difference. So those "^@" are newlines? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Bill S. >>>>>>>>>> UC Riverside >>>>>>>>>> >>>>>>>>>> On Tue, 8 Jul 2008, Greg Kurtzer wrote: >>>>>>>>>> >>>>>>>>>>> What version of perl-Net-ARP do you have installed? >>>>>>>>>>> >>>>>>>>>>> Also, I don't really like the extra characters (e.g. newline) after >>>>>>>>>>> the init command. Not sure this is causing your direct problem, but >>>>>>>>>>> lets start there... >>>>>>>>>>> >>>>>>>>>>> Greg >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Jul 8, 2008 at 2:20 PM, wrote: >>>>>>>>>>>> I am having a problem that has been mention many times in the past, >>>>>>>>>>>> but none of those solutions have helped. The nodes hang forever after the >>>>>>>>>>>> message "Provisioning from 192.168.10.1". Running perceusd in debug mode >>>>>>>>>>>> shows that 6 nodescripts are run and then the socket is closed and it goes >>>>>>>>>>>> back to eval(). This is repeated ad nauseum forever. Running provisiond in >>>>>>>>>>>> debug mode reveals no information whatsoever. Adding debug=4 to the >>>>>>>>>>>> pxelinux.cfg/default file shows that a SIGALRM is received twice and after >>>>>>>>>>>> the second time an "unable to connect to 192.168.10.1, bad file descriptor" >>>>>>>>>>>> (it connects to port 987 the first time). 
This sequence is also repeated >>>>>>>>>>>> adnauseum. I have attached a portion of the output from >>>>>>>>>>>> perceusd in debug mode. There are a few lines like "Parsing provisiond's >>>>>>>>>>>> arguments (^@)" in there (a null character? which would be odd because there >>>>>>>>>>>> are options after it in the /etc/init.d/provisiond file) Any ideas or need >>>>>>>>>>>> any more info? This is perceus 1.3.7 on CentOS 5.1 x86_64. I used the >>>>>>>>>>>> provided script to create the vnfs capsule. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Bill S. >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> Warewulf mailing list >>>>>>>>>>>> Warewulf at caoslinux.org >>>>>>>>>>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Greg Kurtzer >>>>>>>>>>> http://www.infiscale.com/ >>>>>>>>>>> http://www.runlevelzero.net/ >>>>>>>>>>> http://www.perceus.org/ >>>>>>>>>>> http://www.caoslinux.org/ >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Warewulf mailing list >>>>>>>>>>> Warewulf at caoslinux.org >>>>>>>>>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Warewulf mailing list >>>>>>>>>> Warewulf at caoslinux.org >>>>>>>>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Greg Kurtzer >>>>>>>>> http://www.infiscale.com/ >>>>>>>>> http://www.runlevelzero.net/ >>>>>>>>> http://www.perceus.org/ >>>>>>>>> http://www.caoslinux.org/ >>>>>>>>> _______________________________________________ >>>>>>>>> Warewulf mailing list >>>>>>>>> Warewulf at caoslinux.org >>>>>>>>> http://lists.caosity.org/mailman/listinfo/warewulf >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Warewulf mailing list >>>>>>>> Warewulf at caoslinux.org >>>>>>>> 
http://lists.caosity.org/mailman/listinfo/warewulf From harry.mangalam at uci.edu Thu Jul 10 07:41:39 2008 From: harry.mangalam at uci.edu (Harry Mangalam) Date: Thu, 10 Jul 2008 07:41:39 -0700 Subject: [Warewulf] New article about Perceus Message-ID: <200807100741.39717.harry.mangalam@uci.edu> 
Y'all may already know about this, but Linux Mag just posted this article, which is a pretty good intro to Perceus. Should be linked to on the Perceus site. http://www.linux-mag.com/id/6386 Best wishes Harry -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824-0084(o), 949 285-4487(c) -- [A Nation of Sheep breeds a Government of Wolves. Edward R. Murrow] From bernard at vanhpc.org Thu Jul 10 11:26:01 2008 From: bernard at vanhpc.org (Bernard Li) Date: Thu, 10 Jul 2008 11:26:01 -0700 Subject: [Warewulf] New article about Perceus In-Reply-To: <200807100741.39717.harry.mangalam@uci.edu> References: <200807100741.39717.harry.mangalam@uci.edu> Message-ID: Hi Harry: On Thu, Jul 10, 2008 at 7:41 AM, Harry Mangalam wrote: > Y'all may already know about this, but Linux Mag just posted this > article, which is a pretty good intro to Perceus. Should be linked > to on the Perceus site. > > http://www.linux-mag.com/id/6386 It'd be nice if I have a Linux Magazine account ;-) Cheers, Bernard From lucas.wagner at gmail.com Thu Jul 10 11:32:25 2008 From: lucas.wagner at gmail.com (Lucas Wagner) Date: Thu, 10 Jul 2008 11:32:25 -0700 Subject: [Warewulf] New article about Perceus In-Reply-To: References: <200807100741.39717.harry.mangalam@uci.edu> Message-ID: <23252e110807101132k4e0f557fu62e815b2d3115018@mail.gmail.com> Hi! Try bugmenot.com, you may find it useful. Cheers, Lucas On Thu, Jul 10, 2008 at 11:26 AM, Bernard Li wrote: > Hi Harry: > > On Thu, Jul 10, 2008 at 7:41 AM, Harry Mangalam wrote: > >> Y'all may already know about this, but Linux Mag just posted this >> article, which is a pretty good intro to Perceus. Should be linked >> to on the Perceus site. 
>> >> http://www.linux-mag.com/id/6386 > > It'd be nice if I have a Linux Magazine account ;-) > > Cheers, > > Bernard > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > From jsquyres at cisco.com Thu Jul 10 11:33:37 2008 From: jsquyres at cisco.com (Jeff Squyres) Date: Thu, 10 Jul 2008 14:33:37 -0400 Subject: [Warewulf] New article about Perceus In-Reply-To: References: <200807100741.39717.harry.mangalam@uci.edu> Message-ID: <66A0A204-C470-408F-A33A-F7B88197BDD8@cisco.com> It's a free account; what do you care? :-) On Jul 10, 2008, at 2:26 PM, Bernard Li wrote: > Hi Harry: > > On Thu, Jul 10, 2008 at 7:41 AM, Harry Mangalam > wrote: > >> Y'all may already know about this, but Linux Mag just posted this >> article, which is a pretty good intro to Perceus. Should be linked >> to on the Perceus site. >> >> http://www.linux-mag.com/id/6386 > > It'd be nice if I have a Linux Magazine account ;-) > > Cheers, > > Bernard > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf -- Jeff Squyres Cisco Systems From harry.mangalam at uci.edu Thu Jul 10 11:34:45 2008 From: harry.mangalam at uci.edu (Harry Mangalam) Date: Thu, 10 Jul 2008 11:34:45 -0700 Subject: [Warewulf] New article about Perceus In-Reply-To: References: <200807100741.39717.harry.mangalam@uci.edu> Message-ID: <200807101134.46328.harry.mangalam@uci.edu> I believe it's a free signup (not uncommon) and they do occasionally have decent articles/whitepapers...like Perceus :) I'm pretty sure I've not paid a subscription for this mag. Harry On Thursday 10 July 2008, Bernard Li wrote: > Hi Harry: > > On Thu, Jul 10, 2008 at 7:41 AM, Harry Mangalam wrote: > > Y'all may already know about this, but Linux Mag just posted this > > article, which is a pretty good intro to Perceus. 
Should be > > linked to on the Perceus site. > > > > http://www.linux-mag.com/id/6386 > > It'd be nice if I have a Linux Magazine account ;-) > > Cheers, > > Bernard > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824-0084(o), 949 285-4487(c) -- [A Nation of Sheep breeds a Government of Wolves. Edward R. Murrow] From bernard at vanhpc.org Thu Jul 10 11:39:58 2008 From: bernard at vanhpc.org (Bernard Li) Date: Thu, 10 Jul 2008 11:39:58 -0700 Subject: [Warewulf] New article about Perceus In-Reply-To: <66A0A204-C470-408F-A33A-F7B88197BDD8@cisco.com> References: <200807100741.39717.harry.mangalam@uci.edu> <66A0A204-C470-408F-A33A-F7B88197BDD8@cisco.com> Message-ID: Hi Jeff: On Thu, Jul 10, 2008 at 11:33 AM, Jeff Squyres wrote: > It's a free account; what do you care? :-) Bah, I didn't know that :) Thanks for the heads up. Cheers, Bernard From landman at scalableinformatics.com Thu Jul 10 11:45:09 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Thu, 10 Jul 2008 14:45:09 -0400 Subject: [Warewulf] New article about Perceus In-Reply-To: References: <200807100741.39717.harry.mangalam@uci.edu> <66A0A204-C470-408F-A33A-F7B88197BDD8@cisco.com> Message-ID: <487658B5.4070703@scalableinformatics.com> Bernard Li wrote: > Hi Jeff: > > On Thu, Jul 10, 2008 at 11:33 AM, Jeff Squyres wrote: > >> It's a free account; what do you care? :-) > > Bah, I didn't know that :) > > Thanks for the heads up. > > Cheers, > > Bernard Of course, if you are *really* unhappy with registration, there is always the "BugMeNot" firefox plug in ... 
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From laytonjb at charter.net Thu Jul 10 11:44:33 2008 From: laytonjb at charter.net (laytonjb at charter.net) Date: Thu, 10 Jul 2008 11:44:33 -0700 Subject: [Warewulf] New article about Perceus In-Reply-To: Message-ID: <20080710144433.MULB4.6428.root@mp20> Everyone, Thanks for the compliments about the article. I hope it was useful for everyone. If you find any mistakes or anything, let me know. In addition, I wrote an article about CAOS+Perceus that never got published. I may post that one on Cluster Monkey. If anyone has any article ideas or requests, let me know. Sometimes when those deadlines approach I go brain dead :) Thanks! Jeff ---- Bernard Li wrote: > Hi Jeff: > > On Thu, Jul 10, 2008 at 11:33 AM, Jeff Squyres wrote: > > > It's a free account; what do you care? :-) > > Bah, I didn't know that :) > > Thanks for the heads up. > > Cheers, > > Bernard > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf From jason at redbarncomputers.com Thu Jul 10 11:49:58 2008 From: jason at redbarncomputers.com (Jason Somers) Date: Thu, 10 Jul 2008 14:49:58 -0400 Subject: [Warewulf] New article about Perceus In-Reply-To: <200807101134.46328.harry.mangalam@uci.edu> References: <200807100741.39717.harry.mangalam@uci.edu> <200807101134.46328.harry.mangalam@uci.edu> Message-ID: <487659D6.8000806@redbarncomputers.com> Speaking of magazines... Hey Greg/Art - ever hear about that article in Linux Journal that was supposed to come about from the interviews we gave at SC07? 
-Jason Publicity is good ;-) ============================ Jason Somers Network Administrator Red Barn Computers 1235 Front Street - Suite #3 Binghamton, NY 13905 (607) 772-1888 x222 Harry Mangalam wrote: > I believe it's a free signup (not uncommon) and they do occasionally > have decent articles/whitepapers...like Perceus :) > > I'm pretty sure I've not paid a subscription for this mag. > > Harry > > On Thursday 10 July 2008, Bernard Li wrote: > >> Hi Harry: >> >> On Thu, Jul 10, 2008 at 7:41 AM, Harry Mangalam >> > wrote: > >>> Y'all may already know about this, but Linux Mag just posted this >>> article, which is a pretty good intro to Perceus. Should be >>> linked to on the Perceus site. >>> >>> http://www.linux-mag.com/id/6386 >>> >> It'd be nice if I have a Linux Magazine account ;-) >> >> Cheers, >> >> Bernard >> _______________________________________________ >> Warewulf mailing list >> Warewulf at caoslinux.org >> http://lists.caosity.org/mailman/listinfo/warewulf >> > > > > From cmorse at unm.edu Mon Jul 14 08:10:42 2008 From: cmorse at unm.edu (Caleb Morse) Date: Mon, 14 Jul 2008 09:10:42 -0600 Subject: [Warewulf] dnsmasq name resolution Message-ID: I'm trying to get the dnsmasq server in Perceus to resolve node names. It's resolving the head node just fine, but if I ping n0000 it doesn't work. Am I missing something simple here? -- Caleb -------------- next part -------------- An HTML attachment was scrubbed... URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080714/01e60be4/attachment.html From griznog at gmail.com Mon Jul 14 10:35:05 2008 From: griznog at gmail.com (John Hanks) Date: Mon, 14 Jul 2008 11:35:05 -0600 Subject: [Warewulf] dnsmasq name resolution In-Reply-To: References: Message-ID: What does /etc/resolv.conf look like on your head node and compute nodes? jbh On Mon, Jul 14, 2008 at 9:10 AM, Caleb Morse wrote: > I'm trying to get the dnsmasq server in Perceus to resolve node names. 
It's > resolving the head node just fine, but if I ping n0000 it doesn't work. Am I > missing something simple here? > > -- Caleb > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > > From cmorse at unm.edu Mon Jul 14 10:50:55 2008 From: cmorse at unm.edu (Caleb Morse) Date: Mon, 14 Jul 2008 11:50:55 -0600 Subject: [Warewulf] dnsmasq name resolution In-Reply-To: References: Message-ID: It looks like this on the compute node:
***********
; generated by /sbin/dhclient-script
search domain.com
nameserver 10.0.0.1
***********
Head node:
***********
; generated by /sbin/dhclient-script
search lanl.gov
nameserver 128.165.4.4
nameserver 128.165.11.88
***********
-- Caleb On Mon, Jul 14, 2008 at 11:35, John Hanks wrote: > What does /etc/resolv.conf look like on your head node and compute nodes? > > jbh > > On Mon, Jul 14, 2008 at 9:10 AM, Caleb Morse wrote: > > I'm trying to get the dnsmasq server in Perceus to resolve node names. > It's > > resolving the head node just fine, but if I ping n0000 it doesn't work. > Am I > > missing something simple here? > > > > -- Caleb > > _______________________________________________ > > Warewulf mailing list > > Warewulf at caoslinux.org > > http://lists.caosity.org/mailman/listinfo/warewulf > > > > > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080714/a9172432/attachment.html From cmorse at unm.edu Mon Jul 14 13:31:48 2008 From: cmorse at unm.edu (Caleb Morse) Date: Mon, 14 Jul 2008 14:31:48 -0600 Subject: [Warewulf] dnsmasq name resolution In-Reply-To: References: Message-ID: By the way, I tried changing the search and nameserver lines on the head node to match the compute node settings and this did not fix it. -- Caleb On Mon, Jul 14, 2008 at 11:50, Caleb Morse wrote: > It looks like this on the compute node: > > *********** > ; generated by /sbin/dhclient-script > search domain.com > nameserver 10.0.0.1 > *********** > > Head node: > > *********** > ; generated by /sbin/dhclient-script > search lanl.gov > nameserver 128.165.4.4 > nameserver 128.165.11.88 > *********** > > -- Caleb > > On Mon, Jul 14, 2008 at 11:35, John Hanks wrote: > >> What does /etc/resolv.conf look like on your head node and compute nodes? >> >> jbh >> >> On Mon, Jul 14, 2008 at 9:10 AM, Caleb Morse wrote: >> > I'm trying to get the dnsmasq server in Perceus to resolve node names. >> It's >> > resolving the head node just fine, but if I ping n0000 it doesn't work. >> Am I >> > missing something simple here? >> > >> > -- Caleb >> > _______________________________________________ >> > Warewulf mailing list >> > Warewulf at caoslinux.org >> > http://lists.caosity.org/mailman/listinfo/warewulf >> > >> > >> _______________________________________________ >> Warewulf mailing list >> Warewulf at caoslinux.org >> http://lists.caosity.org/mailman/listinfo/warewulf >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080714/4ed8c62a/attachment.html

From gmkurtzer at gmail.com Mon Jul 14 16:24:16 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Mon, 14 Jul 2008 16:24:16 -0700
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To:
References:
Message-ID: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>

Are you running DHCP on the head node's public interface? The comment
makes it look like it.

Try adding:

nameserver 127.0.0.1

to the top of the master's /etc/resolv.conf.

With that said, you really should be using static IP addresses for the
nodes. With Perceus 1.3, static IP addresses over DHCP are outlined in
the docs; with Perceus 1.4, use the "ipaddr" Perceus module (it has
more features in 1.4 compared to 1.3).

Good luck!

Greg

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From cmorse at unm.edu Tue Jul 15 07:11:31 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 15 Jul 2008 08:11:31 -0600
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
Message-ID:

No, I am definitely not running the dhcp server on the public interface.

I added "nameserver 127.0.0.1" to the master's /etc/resolv.conf. It still
isn't working. Here's some output from the command line that might help
narrow this down.

****************
[root at hyrax1 ~]# perceus node list
n0000
[root at hyrax1 ~]# nslookup
> n0000.domain.com
Server: 127.0.0.1
Address: 127.0.0.1#53

** server can't find n0000.domain.com: NXDOMAIN
> hyrax1.domain.com
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: hyrax1.domain.com
Address: 10.0.0.1
*****************

-- Caleb

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080715/5774427a/attachment.html

From stefano.bridi at gmail.com Tue Jul 15 07:26:09 2008
From: stefano.bridi at gmail.com (Stefano Bridi)
Date: Tue, 15 Jul 2008 16:26:09 +0200
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To:
References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
Message-ID:

And what is the content of the following files:
/etc/perceus/dnsmasq.conf
/etc/hosts

Is hyrax1 the hostname of the master?

bye
stef

From cmorse at unm.edu Tue Jul 15 07:31:09 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 15 Jul 2008 08:31:09 -0600
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To:
References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
Message-ID:

*********************
[root at hyrax1 ~]# cat /etc/perceus/dnsmasq.conf
interface=eth2
enable-tftp
tftp-root=/var/lib/perceus/tftp
dhcp-boot=pxelinux.0
local=//
domain=domain.com
expand-hosts
dhcp-range=10.0.0.5,10.0.0.254
dhcp-lease-max=21600
read-ethers

[root at hyrax1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
10.0.0.1 hyrax1
*******************

Yes, hyrax1 is the hostname of the master.

-- Caleb

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080715/7e2baf7c/attachment.html

From stefano.bridi at gmail.com Tue Jul 15 07:55:49 2008
From: stefano.bridi at gmail.com (Stefano Bridi)
Date: Tue, 15 Jul 2008 16:55:49 +0200
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To:
References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
Message-ID:

As Greg suggested, you can try to set up the nodes with "static" IP addresses.
If I remember correctly, you can do this by populating /etc/hosts on
the master, adding:

10.0.0.2 n0000
10.0.0.3 n0001
10.0.0.4 n0002
10.0.0.5 n0003
10.0.0.6 n0004
10.0.0.7 n0005

and then restarting perceus and then the nodes.
I hope I didn't forget some step... ;)

bye
stef

From cmorse at unm.edu Tue Jul 15 09:36:02 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 15 Jul 2008 10:36:02 -0600
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To:
References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
Message-ID:

Ok, it is resolving the names correctly now. Problem is that the dhcp server
is not following the IP addresses outlined in the /etc/hosts file.

-- Caleb

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080715/9d5470c0/attachment.html From stefano.bridi at gmail.com Tue Jul 15 09:44:40 2008 From: stefano.bridi at gmail.com (Stefano Bridi) Date: Tue, 15 Jul 2008 18:44:40 +0200 Subject: [Warewulf] dnsmasq name resolution In-Reply-To: References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com> Message-ID: On Tue, Jul 15, 2008 at 6:36 PM, Caleb Morse wrote: > Ok, it is resolving the names correctly now. Problem is that the dhcp server > is not following the ip addresses outline in the /etc/hosts file. Maybe I have forgotten some steps... in "/etc/perceus/ethers" I have something like # DO NOT EDIT THIS FILE! # It is generated automatically by Perceus 00:04:23:D8:15:12 n0000 00:04:23:D8:74:42 n0001 00:04:23:D8:77:62 n0002 and you? I don't remember how I have configured that aspect :( tomorrow I'll check...) Maybe try to remove and re-add a node anche check in this file. bye stef > > -- Caleb > > On Tue, Jul 15, 2008 at 08:55, Stefano Bridi > wrote: >> >> As suggest you Greg, you can try to set up the nodes with "static" ip >> address. >> If I remember correctly you can do this populating the /etc/hosts on >> the master adding: >> >> 10.0.0.2 n0000 >> 10.0.0.3 n0001 >> 10.0.0.4 n0002 >> 10.0.0.5 n0003 >> 10.0.0.6 n0004 >> 10.0.0.7 n0005 >> >> and then restart perceus and then the nodes. >> I hope i didn't forget some step... ;) >> >> bye >> stef >> >> >> On Tue, Jul 15, 2008 at 4:31 PM, Caleb Morse wrote: >> > ********************* >> > [root at hyrax1 ~]# cat /etc/perceus/dnsmasq.conf >> > interface=eth2 >> > enable-tftp >> > tftp-root=/var/lib/perceus/tftp >> > dhcp-boot=pxelinux.0 >> > local=// >> > domain=domain.com >> > expand-hosts >> > dhcp-range=10.0.0.5,10.0.0.254 >> > dhcp-lease-max=21600 >> > read-ethers >> > >> > [root at hyrax1 ~]# cat /etc/hosts >> > # Do not remove the following line, or various programs >> > # that require network functionality will fail. 
>> > 127.0.0.1 localhost.localdomain localhost >> > ::1 localhost6.localdomain6 localhost6 >> > 10.0.0.1 hyrax1 >> > ******************* >> > >> > Yes, hyrax1 is the hostname of the master. >> > >> > -- Caleb >> > >> > On Tue, Jul 15, 2008 at 08:26, Stefano Bridi >> > wrote: >> >> >> >> And what is the content of the following files: >> >> /etc/perceus/dnsmasq.conf >> >> /etc/hosts >> >> >> >> Is hyrax1 the hostname of the master? >> >> >> >> bye >> >> stef >> >> >> >> >> >> On Tue, Jul 15, 2008 at 4:11 PM, Caleb Morse wrote: >> >> > No, I am definitely not running the dhcp server on the public >> >> > interface. >> >> > >> >> > I added "nameserver 127.0.0.1" to the master's /etc/resolve.conf. It >> >> > still >> >> > isn't working. Here's some output from the command line that might >> >> > help >> >> > narrow this down. >> >> > >> >> > **************** >> >> > [root at hyrax1 ~]# perceus node list >> >> > n0000 >> >> > [root at hyrax1 ~]# nslookup >> >> >> n0000.domain.com >> >> > Server: 127.0.0.1 >> >> > Address: 127.0.0.1#53 >> >> > >> >> > ** server can't find n0000.domain.com: NXDOMAIN >> >> >> hyrax1.domain.com >> >> > Server: 127.0.0.1 >> >> > Address: 127.0.0.1#53 >> >> > >> >> > Name: hyrax1.domain.com >> >> > Address: 10.0.0.1 >> >> > ***************** >> >> > >> >> > -- Caleb >> >> > >> >> > On Mon, Jul 14, 2008 at 17:24, Greg Kurtzer >> >> > wrote: >> >> >> >> >> >> Are you running DHCP on the head node's public interface? The >> >> >> comment >> >> >> makes it look like it. >> >> >> >> >> >> Try adding: >> >> >> >> >> >> nameserver 127.0.0.1 >> >> >> >> >> >> to the top of the master's /etc/resolv.conf. >> >> >> >> >> >> With that said, you really should be using static IP addresses for >> >> >> the >> >> >> nodes. With Perceus 1.3 static IP addresses over DHCP is outlined in >> >> >> the docs, and Perceus 1.4 using the "ipaddr" Perceus module (it has >> >> >> more features in 1.4 compared to 1.3). >> >> >> >> >> >> Good luck! 
>> >> >> >> >> >> Greg >> >> >> >> >> >> On Mon, Jul 14, 2008 at 1:31 PM, Caleb Morse wrote: >> >> >> > By the way, I tried changing the search and nameserver lines on >> >> >> > the >> >> >> > head >> >> >> > node to match the compute node settings and this did not fix it. >> >> >> > >> >> >> > -- Caleb >> >> >> > >> >> >> > On Mon, Jul 14, 2008 at 11:50, Caleb Morse wrote: >> >> >> >> >> >> >> >> It looks like this on the compute node: >> >> >> >> >> >> >> >> *********** >> >> >> >> ; generated by /sbin/dhclient-script >> >> >> >> search domain.com >> >> >> >> nameserver 10.0.0.1 >> >> >> >> *********** >> >> >> >> >> >> >> >> Head node: >> >> >> >> >> >> >> >> *********** >> >> >> >> ; generated by /sbin/dhclient-script >> >> >> >> search lanl.gov >> >> >> >> nameserver 128.165.4.4 >> >> >> >> nameserver 128.165.11.88 >> >> >> >> *********** >> >> >> >> >> >> >> >> -- Caleb >> >> >> >> >> >> >> >> On Mon, Jul 14, 2008 at 11:35, John Hanks >> >> >> >> wrote: >> >> >> >>> >> >> >> >>> What does /etc/resolv.conf look like on your head node and >> >> >> >>> compute >> >> >> >>> nodes? >> >> >> >>> >> >> >> >>> jbh >> >> >> >>> >> >> >> >>> On Mon, Jul 14, 2008 at 9:10 AM, Caleb Morse >> >> >> >>> wrote: >> >> >> >>> > I'm trying to get the dnsmasq server in Perceus to resolve >> >> >> >>> > node >> >> >> >>> > names. >> >> >> >>> > It's >> >> >> >>> > resolving the head node just fine, but if I ping n0000 it >> >> >> >>> > doesn't >> >> >> >>> > work. >> >> >> >>> > Am I >> >> >> >>> > missing something simple here? 
>> >> >> >>> >
>> >> >> >>> > -- Caleb
>> >> >> >>> > _______________________________________________
>> >> >> >>> > Warewulf mailing list
>> >> >> >>> > Warewulf at caoslinux.org
>> >> >> >>> > http://lists.caosity.org/mailman/listinfo/warewulf
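A note on the two files this thread keeps coming back to: /etc/perceus/ethers maps a MAC address to a node name, while /etc/hosts maps a node name to an IP address. As far as the dnsmasq documentation describes read-ethers, a node that is named in the first file but missing from the second has no static address for dnsmasq to hand out. The sketch below is one way to spot such nodes. It is only an illustration: missing_hosts is a hypothetical helper, not part of Perceus or dnsmasq, and it assumes both files use plain whitespace-separated fields.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Perceus or dnsmasq): print every node
# name found in an ethers-style file (MAC address, then hostname) that has no
# entry in a hosts-style file (IP address, then one or more names).
missing_hosts() {
    ethers_file=$1
    hosts_file=$2
    # The second field of each non-comment, non-blank ethers line is the name.
    awk '!/^[[:space:]]*#/ && NF >= 2 { print $2 }' "$ethers_file" |
    while read -r node; do
        # Drop comments from the hosts file, then search fields 2..NF for the name.
        if ! awk -v n="$node" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$hosts_file"; then
            printf '%s\n' "$node"
        fi
    done
}
```

With the paths mentioned in the thread, this could be run as `missing_hosts /etc/perceus/ethers /etc/hosts`; any name it prints would be a candidate for a static entry of the kind Stefano suggested.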
From cmorse at unm.edu Tue Jul 15 09:48:21 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 15 Jul 2008 10:48:21 -0600
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To:
References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
Message-ID:

Hmm, interesting. My /etc/perceus/ethers file looks similar to yours. But I
was getting an "access denied" error in /var/log/messages for
/etc/perceus/ethers. I fixed the permissions and now the nodes can correctly
ping each other. But I still cannot resolve the node names from the head
node.

-- Caleb

On Tue, Jul 15, 2008 at 10:44, Stefano Bridi wrote:
> On Tue, Jul 15, 2008 at 6:36 PM, Caleb Morse wrote:
> > Ok, it is resolving the names correctly now. Problem is that the dhcp
> > server is not following the ip addresses outlined in the /etc/hosts file.
>
> Maybe I have forgotten some steps...
>
> in "/etc/perceus/ethers" I have something like
>
> # DO NOT EDIT THIS FILE!
> # It is generated automatically by Perceus
> 00:04:23:D8:15:12 n0000
> 00:04:23:D8:74:42 n0001
> 00:04:23:D8:77:62 n0002
>
> and you?
>
> I don't remember how I have configured that aspect :( tomorrow I'll
> check...)
>
> Maybe try to remove and re-add a node and check this file.
>
> bye
> stef
>
> _______________________________________________
> Warewulf mailing list
> Warewulf at caoslinux.org
> http://lists.caosity.org/mailman/listinfo/warewulf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080715/1899474c/attachment.html

From cmorse at unm.edu Wed Jul 16 06:56:39 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Wed, 16 Jul 2008 07:56:39 -0600
Subject: [Warewulf] dnsmasq name resolution
In-Reply-To:
References: <571f1a060807141624h79be7b10xe51b508a58111d53@mail.gmail.com>
Message-ID:

I figured it out. After changing /etc/resolv.conf I needed to restart nscd to
flush the DNS cache.

Thanks for the help!

-- Caleb

On Tue, Jul 15, 2008 at 10:48, Caleb Morse wrote:
> Hmm, interesting. My /etc/perceus/ethers file looks similar to yours. But I
> was getting an "access denied" error in /var/log/messages for
> /etc/perceus/ethers. I fixed the permissions and now the nodes can correctly
> ping each other. But I still cannot resolve the node names from the head
> node.
>
> -- Caleb
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080716/a69515ad/attachment.html

From gmkurtzer at gmail.com Tue Jul 22 21:12:08 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Tue, 22 Jul 2008 21:12:08 -0700
Subject: [Warewulf] Linux World 2008 in SF!
Message-ID: <571f1a060807222112u1539f88bsddf1f2a46a5418fe@mail.gmail.com>

Join us at Linux World this year in Moscone Center August 4-7th!

Infiscale will be in the .Org section in booth 17 demonstrating the latest
Perceus 1.4 (including embedded via Intel Rapid Boot), Caos NSA, and some
previews of cloud computing with Abstractual. We will also have some cool
give-aways (including an Intel mini cluster: dual systems enclosed in a
single enclosure) as well as guest vendor representatives.

Our Linux World booths have always been successful and great fun to hang out
at. Bring your questions, comments and experiences to the Infiscale booth and
show your support for Infiscale!

Make arrangements before it's too late to stop by and enjoy the show. If
anyone needs complimentary VIP passes please use the priority code "VPL472"
when registering for Linux World for free exhibits and keynote or 20% off
conference price and VIP checkin.

http://www.linuxworldexpo.com

See ya there!
Greg

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From gmkurtzer at gmail.com Wed Jul 23 01:01:01 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Wed, 23 Jul 2008 01:01:01 -0700
Subject: [Warewulf] Perceus 1.3.8 Released!
Message-ID: <571f1a060807230101v2b0fdb79g91f22063e4cc9d06@mail.gmail.com>

The Infiscale team is happy to announce the release of Perceus 1.3.8. This is
a minor bugfix and feature release in the 1.3 tree and can be used to
non-destructively update any of the previous 1.3 releases.

Changes include:

* Daemon overflow potential fixup
* User guide cleanups
* Minor "hostname" Perceus module bug fix
* Minor "modprobe" Perceus module bug fix
* Typo fix in chroot2stateless.sh
* Bug where "vnfs list" showed extraneous "stuff"
* If there are no devices defined in a private network, default to eth0
  during initialization

among others!

Enjoy!

note: Perceus 1.4 is in last stages of testing and preparations for release
very shortly now! :)

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From abbyzcool at gmail.com Thu Jul 24 13:21:01 2008
From: abbyzcool at gmail.com (Abhishek K)
Date: Thu, 24 Jul 2008 14:21:01 -0600
Subject: [Warewulf] Perceus on Ubuntu
Message-ID: <223eadbc0807241321u4f4b9eacs5b2f9283a63a600@mail.gmail.com>

Hello,

Trying out Perceus 1.4 (r1887) on an Ubuntu machine, the configure fails to
identify the correct linux distribution.
Looking into it, the release file for Ubuntu is /etc/lsb-release (and not
/etc/ubuntu-release).

I am not aware if the same file is used by any other distribution, but
there's still a check in the configure to check if DISTRIB_ID matches Ubuntu.

Patch attached.

-- Abhishek
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080724/1a3991ba/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perceus_ubuntu.patch
Type: text/x-diff
Size: 1084 bytes
Desc: not available
Url : http://altruistic.infiscale.org/pipermail/perceus/attachments/20080724/1a3991ba/attachment.bin

From gmkurtzer at gmail.com Thu Jul 24 13:26:48 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Thu, 24 Jul 2008 13:26:48 -0700
Subject: [Warewulf] Perceus on Ubuntu
In-Reply-To: <223eadbc0807241321u4f4b9eacs5b2f9283a63a600@mail.gmail.com>
References: <223eadbc0807241321u4f4b9eacs5b2f9283a63a600@mail.gmail.com>
Message-ID: <571f1a060807241326r60eb3cccj58545672f6eb1bc@mail.gmail.com>

Since /etc/lsb-release isn't specific to Ubuntu, it might be better to
grep it to identify the correct OS.

Can you send the contents of lsb-release on ubuntu?

Thanks!

On Thu, Jul 24, 2008 at 1:21 PM, Abhishek K wrote:
> Hello,
>
> Trying out Perceus 1.4 (r1887) on an Ubuntu machine, the configure fails to
> identify the correct linux distribution.
> Looking into it, the release file for Ubuntu is /etc/lsb-release (and not
> /etc/ubuntu-release).
>
> I am not aware if the same file is used by any other distribution, but
> there's still a check in the configure to check if DISTRIB_ID matches
> Ubuntu.
>
> Patch attached.
>
> -- Abhishek

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From abbyzcool at gmail.com Thu Jul 24 13:37:35 2008
From: abbyzcool at gmail.com (Abhishek K)
Date: Thu, 24 Jul 2008 14:37:35 -0600
Subject: [Warewulf] Perceus on Ubuntu
In-Reply-To: <571f1a060807241326r60eb3cccj58545672f6eb1bc@mail.gmail.com>
References: <223eadbc0807241321u4f4b9eacs5b2f9283a63a600@mail.gmail.com>
 <571f1a060807241326r60eb3cccj58545672f6eb1bc@mail.gmail.com>
Message-ID: <223eadbc0807241337w5ee98630p515e53343de737d7@mail.gmail.com>

Greg,

Sure, here goes --

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=8.04
DISTRIB_CODENAME=hardy
DISTRIB_DESCRIPTION="Ubuntu 8.04.1"

I did grep it in my patch.

-- Abhishek

On Thu, Jul 24, 2008 at 2:26 PM, Greg Kurtzer wrote:
> Since /etc/lsb-release isn't specific to Ubuntu, it might be better to
> grep it to identify the correct OS.
>
> Can you send the contents of lsb-release on ubuntu?
>
> Thanks!
>
> On Thu, Jul 24, 2008 at 1:21 PM, Abhishek K wrote:
> > Hello,
> >
> > Trying out Perceus 1.4 (r1887) on an Ubuntu machine, the configure fails
> > to identify the correct linux distribution.
> > Looking into it, the release file for Ubuntu is /etc/lsb-release (and not
> > /etc/ubuntu-release).
> >
> > I am not aware if the same file is used by any other distribution, but
> > there's still a check in the configure to check if DISTRIB_ID matches
> > Ubuntu.
> >
> > Patch attached.
> >
> > -- Abhishek
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080724/49d6b949/attachment.html

From gmkurtzer at gmail.com Thu Jul 24 13:58:45 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Thu, 24 Jul 2008 13:58:45 -0700
Subject: [Warewulf] Perceus on Ubuntu
In-Reply-To: <223eadbc0807241337w5ee98630p515e53343de737d7@mail.gmail.com>
References: <223eadbc0807241321u4f4b9eacs5b2f9283a63a600@mail.gmail.com>
 <571f1a060807241326r60eb3cccj58545672f6eb1bc@mail.gmail.com>
 <223eadbc0807241337w5ee98630p515e53343de737d7@mail.gmail.com>
Message-ID: <571f1a060807241358r4bcaf7ddr101b5da5ad2b0350@mail.gmail.com>

Ahh, yes you did. I was thinking of doing that grep at the top of the
conditional like this:

elif test -f /etc/lsb-release && grep -qi "^DISTRIB_ID=Ubuntu" /etc/lsb-release; then
... blah blah ...

I will have that in SVN by this evening.

Thanks!
Greg

On Thu, Jul 24, 2008 at 1:37 PM, Abhishek K wrote:
> Greg,
>
> Sure, here goes --
>
> $ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=8.04
> DISTRIB_CODENAME=hardy
> DISTRIB_DESCRIPTION="Ubuntu 8.04.1"
>
> I did grep it in my patch.
>
> -- Abhishek
>
> On Thu, Jul 24, 2008 at 2:26 PM, Greg Kurtzer wrote:
>> Since /etc/lsb-release isn't specific to Ubuntu, it might be better to
>> grep it to identify the correct OS.
>>
>> Can you send the contents of lsb-release on ubuntu?
>>
>> Thanks!
>>
>> On Thu, Jul 24, 2008 at 1:21 PM, Abhishek K wrote:
>> > Hello,
>> >
>> > Trying out Perceus 1.4 (r1887) on an Ubuntu machine, the configure fails
>> > to identify the correct linux distribution.
>> > Looking into it, the release file for Ubuntu is /etc/lsb-release (and
>> > not /etc/ubuntu-release).
>> >
>> > I am not aware if the same file is used by any other distribution, but
>> > there's still a check in the configure to check if DISTRIB_ID matches
>> > Ubuntu.
>> >
>> > Patch attached.
>> >
>> > -- Abhishek

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From gmkurtzer at gmail.com Fri Jul 25 08:55:54 2008
From: gmkurtzer at gmail.com (Greg Kurtzer)
Date: Fri, 25 Jul 2008 08:55:54 -0700
Subject: [Warewulf] Perceus 1.4.0 release!
Message-ID: <571f1a060807250855p3682035r90c4647728de2239@mail.gmail.com>

The Infiscale team is happy to make public the grand release of Perceus
1.4.0! This is the first release in the 1.4 series and includes a lot of
changes. The primary focus of this release is scalability and ease of use.

Non-specific major changes include:

* Pre-forking Perceus daemon, able to scale to thousands of nodes.
* Scalable VNFS transfer mechanism using XGET
* Rewrite of the back end database subsystem. Now it can utilize Hash,
  Btree, and MySQL. Other options are now possible (e.g. flat text
  configuration file like Warewulf used, other database structures like
  the ones used in Oscar, Rocks, xCat, etc.)
* Stage one bootstrap OS has been reworked to support more types of
  installs, hardware, and reliability.
* Extensions to the "ipaddr" module so that it is much more capable of
  doing all IP addressing: static, dynamic and automatic based on probing
  /etc/hosts.
* New Perceus module for XCPU support right from the Perceus stage one
  bootstrap OS.
* Slaves (via Perceus node client daemon) now capable of load balancing
  and fail over between Perceus masters.
* General fixups, cleanups, code design, optimizing based on profiling,
  etc.

This release contains many changes so please report bugs, issues and
concerns as soon as possible to help it mature quickly.

Thank you.
The Infiscale team.

--
Greg Kurtzer
http://www.infiscale.com/
http://www.runlevelzero.net/
http://www.perceus.org/
http://www.caoslinux.org/

From cmorse at unm.edu Tue Jul 29 08:21:10 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 29 Jul 2008 09:21:10 -0600
Subject: [Warewulf] Error when building rpm for Perceus 1.4
Message-ID:

I'm getting an RPM build error when I try to build perceus 1.4 with the
following command.

rpmbuild -ta --nodeps perceus-1.4.0.tar.gz

The error is kind of long so I made it an attachment.

-- Caleb
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080729/d6707b9a/attachment.html
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: perceus_error.txt
Url: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080729/d6707b9a/attachment.txt

From tru at pasteur.fr Tue Jul 29 09:37:01 2008
From: tru at pasteur.fr (Tru Huynh)
Date: Tue, 29 Jul 2008 18:37:01 +0200
Subject: [Warewulf] Error when building rpm for Perceus 1.4
In-Reply-To:
References:
Message-ID: <20080729163701.GA15600@sillage.bis.pasteur.fr>

On Tue, Jul 29, 2008 at 09:21:10AM -0600, Caleb Morse wrote:
> I'm getting an RPM build error when I try to build perceus 1.4 with the
> following command.
...
> kexec/arch/i386/kexec-multiboot-x86.c: In function 'multiboot_x86_load':
> kexec/arch/i386/kexec-multiboot-x86.c:216: warning: pointer targets in passing argument 3 of 'elf_rel_build_load' differ in signedness
> kexec/arch/i386/kexec-multiboot-x86.c:344: error: 'PAGE_SIZE' undeclared (first use in this function)
> kexec/arch/i386/kexec-multiboot-x86.c:344: error: (Each undeclared identifier is reported only once
> kexec/arch/i386/kexec-multiboot-x86.c:344: error: for each function it appears in.)
...

Known issue for the CentOS-5 build (you will also need to have openssl-devel
and elfutils-libelf-devel installed). The developers already have a patch
committed in the svn version.

I have attached a possible fix.

Cheers.
Tru
--
Dr Tru Huynh | http://www.pasteur.fr/recherche/unites/Binfs/
mailto:tru at pasteur.fr | tel/fax +33 1 45 68 87 37/19
Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France
-------------- next part --------------
diff -uNr perceus-1.4.0.ori/3rd_party/Makefile.in perceus-1.4.0.ok/3rd_party/Makefile.in
--- perceus-1.4.0.ori/3rd_party/Makefile.in 2008-07-25 17:27:13.000000000 +0200
+++ perceus-1.4.0.ok/3rd_party/Makefile.in 2008-07-25 19:44:00.000000000 +0200
@@ -35,7 +35,7 @@
 KEXEC_LEGACY_VERSION = 1.101
 KEXEC_LEGACY_SOURCE = kexec-tools-$(KEXEC_LEGACY_VERSION).tar.gz
 KEXEC_LEGACY_DIR = kexec-tools-$(KEXEC_LEGACY_VERSION)
-KEXEC_LEGACY_PATCHES = kexec-tools-1.101-perceusfixes.patch
+KEXEC_LEGACY_PATCHES = kexec-tools-1.101-perceusfixes.patch kexec-tools-1.101-PAGE_SIZE.patch
 
 DNSMASQ_VERSION = 2.38
 DNSMASQ_SOURCE = dnsmasq-$(DNSMASQ_VERSION).tar.gz
diff -uNr perceus-1.4.0.ori/3rd_party/patches/kexec-tools-1.101-PAGE_SIZE.patch perceus-1.4.0.ok/3rd_party/patches/kexec-tools-1.101-PAGE_SIZE.patch
--- perceus-1.4.0.ori/3rd_party/patches/kexec-tools-1.101-PAGE_SIZE.patch 1970-01-01 01:00:00.000000000 +0100
+++ perceus-1.4.0.ok/3rd_party/patches/kexec-tools-1.101-PAGE_SIZE.patch 2008-07-25 19:12:17.000000000 +0200
@@ -0,0 +1,24 @@
+diff -uNr kexec-tools-1.101.ori/kexec/arch/i386/kexec-multiboot-x86.c kexec-tools-1.101/kexec/arch/i386/kexec-multiboot-x86.c
+--- kexec-tools-1.101.ori/kexec/arch/i386/kexec-multiboot-x86.c 2005-01-24 20:58:04.000000000 +0100
++++ kexec-tools-1.101/kexec/arch/i386/kexec-multiboot-x86.c 2008-07-25 19:10:54.000000000 +0200
+@@ -47,7 +47,9 @@
+ #include
+ #include
+ #include
++/* Tru http://osdir.com/ml/boot-loaders.fastboot.general/2007-02/msg00307.html
+ #include
++*/
+ #include
+ #include "../../kexec.h"
+ #include "../../kexec-elf.h"
+@@ -341,7 +343,10 @@
+ /* Pick the next aligned spot to load it in */
+ freespace = add_buffer(info,
+ buf, mod_size, mod_size,
++ getpagesize(), 0, 0xffffffffUL, 1);
++/* Tru http://osdir.com/ml/boot-loaders.fastboot.general/2007-02/msg00307.html
+ PAGE_SIZE, 0, 0xffffffffUL, 1);
++*/
+
+ /* Add the module command line */
+ sprintf(mod_clp, "%s", mod_command_line);

From cmorse at unm.edu Tue Jul 29 12:12:05 2008
From: cmorse at unm.edu (Caleb Morse)
Date: Tue, 29 Jul 2008 13:12:05 -0600
Subject: [Warewulf] Error when building rpm for Perceus 1.4
In-Reply-To: <20080729163701.GA15600@sillage.bis.pasteur.fr>
References: <20080729163701.GA15600@sillage.bis.pasteur.fr>
Message-ID:

I'm not familiar with .patch files. How would I go about applying this patch to the tar file so that I can still use rpmbuild?

-- Caleb

On Tue, Jul 29, 2008 at 10:37, Tru Huynh wrote:
> On Tue, Jul 29, 2008 at 09:21:10AM -0600, Caleb Morse wrote:
> > I'm getting an RPM build error when I try to build perceus 1.4 with the
> > following command.
> ...
> > kexec/arch/i386/kexec-multiboot-x86.c: In function 'multiboot_x86_load':
> > kexec/arch/i386/kexec-multiboot-x86.c:216: warning: pointer targets in
> passing argument 3 of 'elf_rel_build_load' differ in signedness
> > kexec/arch/i386/kexec-multiboot-x86.c:344: error: 'PAGE_SIZE' undeclared
> (first use in this function)
> > kexec/arch/i386/kexec-multiboot-x86.c:344: error: (Each undeclared
> identifier is reported only once
> > kexec/arch/i386/kexec-multiboot-x86.c:344: error: for each function it
> appears in.)
> ...
>
> known issue for CentOS-5 build (you will also need to have openssl-devel
> elfutils-libelf-devel installed). The developers already have a patch
> committed in the svn version.
>
> I have attached a possible fix.
>
> Cheers.
> > Tru > -- > Dr Tru Huynh | http://www.pasteur.fr/recherche/unites/Binfs/ > mailto:tru at pasteur.fr | tel/fax +33 1 45 68 87 37/19 > Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France > > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080729/6ecd6346/attachment.html From gt3rx3 at visi.com Tue Jul 29 14:04:13 2008 From: gt3rx3 at visi.com (Michael Allen) Date: Tue, 29 Jul 2008 16:04:13 -0500 Subject: [Warewulf] The patch seems to be failing, too Message-ID: <488F85CD.2040608@visi.com> Apparently the patch is failing too, with this message: [root at godelsrevenge perceus-1.4.0]# patch References: <488F85CD.2040608@visi.com> Message-ID: <571f1a060807291510i6a177758tc448b4156cf51d15@mail.gmail.com> The goal is to release 1.4.1 this evening which should include a fix for this. Greg On Tue, Jul 29, 2008 at 2:04 PM, Michael Allen wrote: > Apparently the patch is failing too, with this message: > > [root at godelsrevenge perceus-1.4.0]# patch > patching file Makefile.in > Hunk #1 FAILED at 35. > 1 out of 1 hunk FAILED -- saving rejects to file Makefile.in.rej > The next patch would create the file kexec-tools-1.101-PAGE_SIZE.patch, > which already exists! Assume -R? [n] > Apply anyway? [n] > Skipping patch. > 1 out of 1 hunk ignored -- saving rejects to file > kexec-tools-1.101-PAGE_SIZE.patch.rej > > All this is being done in /usr/src/redhat/BUILD/perceus-1.4.0 > > I'm running Fedora 9 on a dual-processor box with Opteron 248 chips, 2 > GB RAM, etc. and three NICs configured. I also have double-checked to > make sure I have installed required additional packages. 
> > After moving the patch file to the installed directory (above) I ran > "patch > Michael Allen > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > -- Greg Kurtzer http://www.infiscale.com/ http://www.runlevelzero.net/ http://www.perceus.org/ http://www.caoslinux.org/ From gt3rx3 at visi.com Tue Jul 29 18:58:24 2008 From: gt3rx3 at visi.com (Michael Allen) Date: Tue, 29 Jul 2008 20:58:24 -0500 Subject: [Warewulf] Patch failure Message-ID: <488FCAC0.3050508@visi.com> Thanks Greg. This gets to be confusing, and there must be a bazillion different systems out there. I, for one, really appreciate it. Thanks very much. Michael Allen From cmorse at unm.edu Wed Jul 30 06:32:22 2008 From: cmorse at unm.edu (Caleb Morse) Date: Wed, 30 Jul 2008 07:32:22 -0600 Subject: [Warewulf] Error when building rpm for Perceus 1.4 In-Reply-To: References: <20080729163701.GA15600@sillage.bis.pasteur.fr> Message-ID: Since it looks like Greg is planning on releasing 1.4.1 in the near future I will just wait for that release. -- Caleb On Tue, Jul 29, 2008 at 13:12, Caleb Morse wrote: > I'm not familiar with .patch files. How would I go about applying this > patch to the tar file so that I can still use rpmbuild? > > -- Caleb > > > On Tue, Jul 29, 2008 at 10:37, Tru Huynh wrote: > >> On Tue, Jul 29, 2008 at 09:21:10AM -0600, Caleb Morse wrote: >> > I'm getting an RPM build error when I try to build perceus 1.4 with the >> > following command. >> ... 
>> > kexec/arch/i386/kexec-multiboot-x86.c: In function 'multiboot_x86_load': >> > kexec/arch/i386/kexec-multiboot-x86.c:216: warning: pointer targets in >> passing argument 3 of 'elf_rel_build_load' differ in signedness >> > kexec/arch/i386/kexec-multiboot-x86.c:344: error: 'PAGE_SIZE' undeclared >> (first use in this function) >> > kexec/arch/i386/kexec-multiboot-x86.c:344: error: (Each undeclared >> identifier is reported only once >> > kexec/arch/i386/kexec-multiboot-x86.c:344: error: for each function it >> appears in.) >> ... >> >> known issue for CentOS-5 build (you will also need to have openssl-devel >> elfutils-libelf-devel installed). The developers already have a patch >> committed in the svn version. >> >> I have attached a possible fix. >> >> Cheers. >> >> Tru >> -- >> Dr Tru Huynh | http://www.pasteur.fr/recherche/unites/Binfs/ >> mailto:tru at pasteur.fr | tel/fax +33 1 45 68 87 37/19 >> Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France >> >> _______________________________________________ >> Warewulf mailing list >> Warewulf at caoslinux.org >> http://lists.caosity.org/mailman/listinfo/warewulf >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://altruistic.infiscale.org/pipermail/perceus/attachments/20080730/cab2a5a5/attachment.html From gmkurtzer at gmail.com Wed Jul 30 07:18:31 2008 From: gmkurtzer at gmail.com (Greg Kurtzer) Date: Wed, 30 Jul 2008 07:18:31 -0700 Subject: [Warewulf] Perceus 1.4.1 has been released! Message-ID: <571f1a060807300718l3dfcf0a7ve7fe0947bb123217@mail.gmail.com> The Infiscale team is happy to announce the release of Perceus 1.4.1. This is a minor build and bugfix release in the 1.4 tree and can be used to non-destructively update the previous 1.4.0 release. Enjoy. 
-- Greg Kurtzer http://www.infiscale.com/ http://www.runlevelzero.net/ http://www.perceus.org/ http://www.caoslinux.org/ From gmkurtzer at gmail.com Wed Jul 30 08:54:10 2008 From: gmkurtzer at gmail.com (Greg Kurtzer) Date: Wed, 30 Jul 2008 08:54:10 -0700 Subject: [Warewulf] Patch failure In-Reply-To: <488FCAC0.3050508@visi.com> References: <488FCAC0.3050508@visi.com> Message-ID: <571f1a060807300854v78a492e5nf27dfd3e2cabb3ce@mail.gmail.com> Thanks for the thanks! :) Please let me know how the 1.4.1 release works for ya. Greg On Tue, Jul 29, 2008 at 6:58 PM, Michael Allen wrote: > Thanks Greg. This gets to be confusing, and there must be a bazillion > different systems out there. I, for one, really appreciate it. > > Thanks very much. > > Michael Allen > _______________________________________________ > Warewulf mailing list > Warewulf at caoslinux.org > http://lists.caosity.org/mailman/listinfo/warewulf > -- Greg Kurtzer http://www.infiscale.com/ http://www.runlevelzero.net/ http://www.perceus.org/ http://www.caoslinux.org/ From gt3rx3 at visi.com Wed Jul 30 09:03:13 2008 From: gt3rx3 at visi.com (Michael Allen) Date: Wed, 30 Jul 2008 11:03:13 -0500 Subject: [Warewulf] You're welcome! Message-ID: <489090C1.1070607@visi.com> I won't have time to try it until this afternoon, but I'll keep you advised!! Goofy computers... I always note how stupid they are; they don't do what I want them to do, they do what I TELL them to do. Can you imagine? Michael Allen From gt3rx3 at visi.com Wed Jul 30 10:10:37 2008 From: gt3rx3 at visi.com (Michael Allen) Date: Wed, 30 Jul 2008 12:10:37 -0500 Subject: [Warewulf] Hmmm. Still Failing. Log documents enclosed Message-ID: <4890A08D.2070503@visi.com> Greg: (or list moderator): It still fails. I placed perceus-1.4.1 in /root/Download, then ran " # export TAR_OPTIONS=--wildcards # rpmbuild -ta perceus-1.4.0.tar.gz from that directory. 
All scripts started normally and created /perceus-1.4.1 in /usr/src/redhat/BUILD/ as I expected it to do. However, it appears to be breaking at the make step. I'm enclosing three files to help sort this out: perceus.spec (which I'm sure you know very well) configure.log make.log The last two files, of course, are the output from the two steps. Am I committing some grievous sin? I'm running this on a master node, dual processor (Opteron 248) box with Fedora 9 with all required files. I have enabled three NICs on the master and two prospective slave nodes. The master has 2 GB RAM and 750 GB hard disk memory. Am I missing some step? Michael Allen From mej at caoslinux.org Wed Jul 30 11:04:47 2008 From: mej at caoslinux.org (Michael Jennings) Date: Wed, 30 Jul 2008 11:04:47 -0700 Subject: [Warewulf] Hmmm. Still Failing. Log documents enclosed In-Reply-To: <4890A08D.2070503@visi.com> References: <4890A08D.2070503@visi.com> Message-ID: <20080730180446.GY11633@kainx.org> On Wednesday, 30 July 2008, at 12:10:37 (-0500), Michael Allen wrote: > Greg: (or list moderator): > > It still fails. I placed perceus-1.4.1 in /root/Download, then ran " > # export TAR_OPTIONS=--wildcards > # rpmbuild -ta perceus-1.4.0.tar.gz > > from that directory. All scripts started normally and created > /perceus-1.4.1 in /usr/src/redhat/BUILD/ as I expected it to do. > > However, it appears to be breaking at the make step. I'm enclosing > three files to help sort this out: > > perceus.spec (which I'm sure you know very well) > configure.log > make.log > > The last two files, of course, are the output from the two steps. Am I > committing some grievous sin? > > I'm running this on a master node, dual processor (Opteron 248) box with > Fedora 9 with all required files. I have enabled three NICs on the > master and two prospective slave nodes. The master has 2 GB RAM and 750 > GB hard disk memory. > > Am I missing some step? You mean like actually attaching the logs? 
;-) Seriously, I've built Perceus 1.4.1 on CentOS 4, CentOS 5, and Caos. All work fine out-of-the-box. Michael -- Michael Jennings (a.k.a. KainX) http://www.kainx.org/ Linux Server/Cluster Admin, LBL.gov Author, Eterm (www.eterm.org) ----------------------------------------------------------------------- "The world isn't run by weapons any more, or energy, or money. It's run by little 1's and 0's, little bits of data. It's all just electrons! There's a war out there, old friend -- a world war, and it's not about who's got the most bullets. It's about who controls the information: what we see and hear, how we work, what we think. It's all about the information." -- Cosmo (Ben Kingsley), "Sneakers" From gt3rx3 at visi.com Wed Jul 30 12:54:36 2008 From: gt3rx3 at visi.com (Michael Allen) Date: Wed, 30 Jul 2008 14:54:36 -0500 Subject: [Warewulf] Ah, well. I'll Take a Close Look. Message-ID: <4890C6FC.3070102@visi.com> Who knows. I may be able to actually attach the logs!! No, but I may be able to decide what's missing (aside from my shortage of experience, that is!) I'll let you know if I find an answer. Michael Allen