Hello to all,
I have been battling this situation now for 3 days and still have not found a resolution. I appeal to any and all for help.
Here are the facts as far as I can tell.
1) I moved a 66-node Rocks-based cluster to a diskless cluster running the latest version of CentOS with all updates in place.
2) Users are added with their home directory mounted across the nodes of the cluster, so a user's home directory sits on /export/home with a symlink from home on the head node.
3) ssh-keygen is used to create a public/private key pair in .ssh so that the user has passwordless (key-based) access to all nodes, allowing SGE jobs to run across nodes.
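Step 3 can be sketched roughly as follows. This is a minimal illustration, not the actual script used here: the function name build_keys, the RSA key type, and the id_rsa file name are assumptions. It is meant to be run as the user in question so that file ownership comes out right.

```shell
#!/bin/bash
# Hypothetical helper: build_keys is an illustrative name, not the real script.
# Usage: build_keys <home-directory>
build_keys() {
    local home_dir="$1"
    local ssh_dir="$home_dir/.ssh"
    local key="$ssh_dir/id_rsa"   # assumed key file name

    mkdir -p "$ssh_dir" || return 1

    # Generate a passphrase-less key pair only if none exists yet.
    if [ ! -f "$key" ]; then
        ssh-keygen -q -t rsa -N "" -f "$key" || return 1
    fi

    # Authorize the user's own public key. Because the home directory is
    # shared, every node sees the same authorized_keys file.
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxFf "$key.pub" "$ssh_dir/authorized_keys"; then
        cat "$key.pub" >> "$ssh_dir/authorized_keys"
    fi

    # With StrictModes enabled, sshd rejects keys if these paths are
    # writable by group or others.
    chmod go-w "$home_dir"
    chmod 700 "$ssh_dir"
    chmod 600 "$key" "$ssh_dir/authorized_keys"
    chmod 644 "$key.pub"
}
```

The function is safe to re-run: an existing key is kept and the public key is appended to authorized_keys only once.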
( if I have left any information out above - please let me know )
So here is the problem:
All of my current users have had no issues whatsoever with the current arrangement. Up until about a month ago I could create a user, use the script that I have to build the keys, and run an expect script out to each node answering yes to add the host to known_hosts, etc., with permissions correctly set on the user's .ssh directory and files.
Now the problem:
I build a new user and run my script, and when the user runs ssh c33 ( the name of a node ) they get a password challenge. I have tinkered with the sshd_config for the nodes, and no matter what I do I keep getting the password challenge or a permission error on publickey.
I have run ssh -vvv c33 and I get "sending packet" but no reply from OpenSSH; it then falls through to the next method with no results.
I have compared permissions against users that have no issue, and as far as I can tell there is no issue.
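For anyone wanting to repeat the permission comparison, here is a minimal sketch of the kind of check involved. The function name check_ssh_perms is hypothetical, and it assumes GNU stat and that sshd runs with StrictModes enabled (the default), in which case a group- or world-writable home directory, .ssh directory, or authorized_keys file causes the key to be ignored.

```shell
#!/bin/bash
# Hypothetical helper (not part of the keybuilder script): lists and flags
# the paths that sshd's StrictModes check looks at for one home directory.
# Usage: check_ssh_perms <home-directory>
check_ssh_perms() {
    local home_dir="$1"
    local status=0
    local path mode

    for path in "$home_dir" "$home_dir/.ssh" "$home_dir/.ssh/authorized_keys"; do
        if [ ! -e "$path" ]; then
            echo "MISSING $path"
            status=1
            continue
        fi
        # Octal mode, owner:group, path. -L follows symlinks (GNU stat).
        stat -L -c '%a %U:%G %n' "$path"
        mode=$(stat -L -c '%a' "$path")
        # Test the group-write and other-write bits (octal 022).
        if [ $(( 0$mode & 022 )) -ne 0 ]; then
            echo "WRITABLE by group/other: $path"
            status=1
        fi
    done
    return "$status"
}
```

Running it against one working account and one failing account (for example check_ssh_perms /export/home/olduser and check_ssh_perms /export/home/newuser, both made-up names) puts the modes and owners side by side. The owner column matters too, since sshd expects these paths to belong to the user or to root.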
Now here is the real strange thing:
I take a user that has already been on the system with no issues and delete the .ssh directory. Then I re-run my keybuilder bash script, rebuilding the keys and setting the known_hosts, and I get seamless ssh to all nodes. With the users created in the last month, I do the same thing and I get the issues above.
I am totally confused. Has something changed with an update? Do I need to do something different with the build of a new user that I did not have to do before? Do I have to do something in particular with my sshd_config file?
Please give me any and all observations - I really need to resolve this issue.
Thanks.