Hello to all,
I have been battling this situation now for 3 days and still have not found a resolution. I appeal to any and all for help.
Here are the facts as far as I can tell.
1) I moved a 66-node Rocks-based cluster to a diskless cluster running the latest version of CentOS, with all updates in place.
2) Users are added with their home directory mounted across the nodes of the cluster, so a user's home directory sits on /export/home with a symlink from /home on the head node.
3) ssh-keygen is used to create a public/private key pair in .ssh so that the user has passwordless access to all nodes, allowing SGE jobs to run across nodes.
(If I have left any information out above, please let me know.)
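For concreteness, the layout and per-user setup in 2) and 3) amount to roughly the following. This is only a sketch; the username jdoe, the key type, and the exact options are illustrative, not the actual script:

    # home directories live on the NFS export; /home is a symlink on the head node
    ls -ld /export/home/jdoe
    ls -ld /home                      # -> /export/home

    # per-user passwordless key pair, valid on every node via the shared home dir
    su - jdoe -c 'ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa'
    su - jdoe -c 'cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys'
    su - jdoe -c 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'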
So here is the background:
All of my current user base has had no issues whatsoever with the current arrangement, and up until about a month ago I could create a user, use the script I have to build the keys, and run an expect script out to each node answering yes to add each host to known_hosts, etc., with permissions correctly set on the user's .ssh directory and files.
Now the problem:
I build a new user and run my script, and when the user runs ssh c33 (the name of a node) they get a password challenge. I have tinkered with the sshd_config on the nodes, and no matter what I do I keep getting either the password challenge or a permission error on publickey.
I have run ssh -vvv c33 and see "sending packet", but there is no reply from OpenSSH for the publickey method, and it then falls through to the next authentication method with no result.
I have compared permissions against users that have no issue, and as far as I can tell there is no difference.
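For reference, the permission comparison plus a server-side debug run would look roughly like this; the spare port 2222 is arbitrary, and running sshd in debug mode on the node is an assumption about how to capture the rejection, not something already done above:

    # on the head node, as the affected user: ownership and permissions
    ls -ld ~ ~/.ssh ~/.ssh/authorized_keys ~/.ssh/id_rsa

    # on the failing node (c33 here): one-off sshd in debug mode on a spare port,
    # to see why the offered key is rejected
    /usr/sbin/sshd -d -p 2222

    # then, from the head node:
    ssh -vvv -p 2222 c33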
Now here is the really strange thing:
I take a user that has already been on the system with no issues and delete the .ssh directory. Then I re-run my keybuilder bash script, rebuilding the keys and setting known_hosts, and I get seamless ssh to all nodes. If I do the same thing with one of the users created in the last month, I get the issues above.
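The keybuilder script itself is not shown in this thread; a script of this kind usually boils down to something like the following. The node names c0..c65 and the use of ssh-keyscan in place of the interactive expect step are assumptions made for the sake of the sketch:

    #!/bin/bash
    # regenerate the user's key pair and pre-seed known_hosts for every node
    rm -f ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/.ssh/known_hosts
    ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_rsa

    for n in $(seq 0 65); do
        # collect each node's host key instead of answering "yes" interactively
        ssh-keyscan "c${n}" >> ~/.ssh/known_hosts 2>/dev/null
    done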
I am totally confused. Has something changed with an update? Do I need to do something different when building a new user that I did not have to do before? Do I have to do something in particular in my sshd_config file?
Please give me any and all observations - I really need to resolve this issue.
Thanks.
For my cluster, I have set it up this way:
- on the master node, set up /etc/ssh/ssh_known_hosts so that it has a line for each node in the cluster, with the key taken from that node's /etc/ssh/ssh_host_rsa_key.pub
- copy master:/etc/ssh to each node in the cluster
- the new user script only adds the user to the master node and then pushes passwd and shadow and possibly group onto the other cluster nodes.
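A minimal sketch of this scheme, assuming the 66 nodes answer to names like c0..c65 and using ssh-keyscan to gather the host keys (the same keys could equally be collected from each node's /etc/ssh/ssh_host_rsa_key.pub):

    # on the master: one cluster-wide known_hosts with every node's host key
    for n in $(seq 0 65); do
        ssh-keyscan -t rsa "c${n}"
    done > /etc/ssh/ssh_known_hosts

    # push the ssh configuration and the account files out to each node
    for n in $(seq 0 65); do
        scp -r /etc/ssh "c${n}:/etc/"
        scp /etc/passwd /etc/shadow /etc/group "c${n}:/etc/"
    done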
That's it. Hope that helps. Best regards. Robi
That's a very good idea - I will give it a try.