Using PyCharm with an interactive SLURM job

This tutorial focuses on using PyCharm with the ML Cloud computing resources. PyCharm uses the cluster as a project mirror for Deployment.

Note

You will need to repeat the steps marked (Always) below every time you use PyCharm with the cluster. With this setup, you develop your code on your personal computer, and PyCharm automatically mirrors that code to the cluster.

In addition, this tutorial will keep an interactive compute-node terminal available to run your deployed code.

Download and [install PyCharm following the instructions on the website][600].

Setting up PyCharm on the compute nodes

  • (Always) First, open a terminal window and log into the login node over SSH. In the login node terminal, run

    salloc --job-name pycharm --partition=2080-galvani --gres=gpu:1 --time 8:00:00 --no-shell
    
    which will create an 8-hour resource allocation on a compute node for PyCharm, requesting one GPU on the 2080-galvani (2080 Ti GPU) partition.

  • Write down the Job ID and the name of the allocated compute node from the salloc output. The rest of this tutorial refers to that temporary node name as $NODE.

Note

By running squeue --me --name pycharm -l on the login node you can check on the state of the allocation. If you no longer need the job, then simply execute scancel JOB_ID to cancel it.
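
The Job ID and node name can also be read back from squeue instead of being copied by hand. The following is a minimal sketch, assuming the job name pycharm from the salloc command above and a single matching job. The helper name parse_alloc and the sample values 123456 and galvani-cn001 are hypothetical. It relies on squeue's --noheader and --format options, where %i is the job ID and %N is the node list.

```shell
#!/bin/sh
# parse_alloc: hypothetical helper that splits one "JOBID NODE" line
# (as printed by: squeue --me --name pycharm --noheader --format "%i %N")
# into the variables JOB_ID and NODE.
parse_alloc() {
    JOB_ID=${1%% *}   # text before the first space
    NODE=${1#* }      # text after the first space
}

# On the login node you would pass the real squeue output:
#   parse_alloc "$(squeue --me --name pycharm --noheader --format '%i %N')"
# Here a made-up sample line stands in for that output.
parse_alloc "123456 galvani-cn001"
echo "JOB_ID=$JOB_ID NODE=$NODE"
```

If more than one job named pycharm is queued, squeue prints several lines and this sketch would need to pick one of them first.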

  • (Once) You may want to create a PyCharm directory by running mkdir $WORK/pycharm in the still-open login node session. Run echo $WORK/pycharm and (always) remember the printed path as your PyCharm directory.

  • You can close the connection to the login node.

  • (Always) Run the following tunnel commands on your personal computer.

    PYCHARM_PORT=11122 # alternatively, pick any number between 10000 and 40000; it only needs to be a free port on your PC
    ssh -AL $PYCHARM_PORT:$NODE:22 $YOURLOGIN@134.2.168.43

    Note down your $PYCHARM_PORT value for later.

Note

Don't forget to replace $YOURLOGIN with your ML Cloud username, and $NODE with the temporary compute node name.
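
To make the substitution explicit, the tunnel command can be assembled from its three inputs. This is only a sketch: the helper name tunnel_cmd, the node name galvani-cn001, and the username alice are hypothetical, and the range check simply mirrors the port range suggested above. It prints the command rather than running it, so you can inspect it first.

```shell
#!/bin/sh
# tunnel_cmd: hypothetical helper that prints the tunnel command from this
# tutorial for a given local port, compute node, and username.
tunnel_cmd() {
    port=$1; node=$2; login=$3
    # The tutorial suggests a port between 10000 and 40000.
    if [ "$port" -lt 10000 ] || [ "$port" -gt 40000 ]; then
        echo "port $port is outside 10000-40000" >&2
        return 1
    fi
    echo "ssh -AL $port:$node:22 $login@134.2.168.43"
}

# Made-up node and username; substitute your own values.
tunnel_cmd 11122 galvani-cn001 alice
```

Copy the printed line into a terminal on your personal computer to open the tunnel.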

Congratulations, you've established the SSH tunnel needed for PyCharm.

Note

Each new allocation may land on a different compute node, so update $NODE to the node you were actually allocated. In rare cases, if you have another job on the same compute node, you might be connected to that job instead of the idle pycharm allocation. To check, run ssh $NODE env | grep SLURM_JOB_ID on the login node and confirm that it returns the correct Job ID. If it returns the wrong one, cancel the allocation above and start again from the top.
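
The Job ID comparison can be scripted so that it does not depend on reading the number by eye. This is a sketch under stated assumptions: the helper name same_job and the sample environment are hypothetical, and the expected Job ID is whatever salloc reported for your allocation.

```shell
#!/bin/sh
# same_job: hypothetical helper. Reads "env" output on stdin and succeeds
# only if it contains SLURM_JOB_ID=<expected id> as a whole line.
same_job() {
    grep -qx "SLURM_JOB_ID=$1"
}

# On the login node you would feed it the real environment:
#   ssh "$NODE" env | same_job "$JOB_ID" && echo "correct allocation"
# Here a made-up environment stands in for that output.
if printf 'HOME=/home/alice\nSLURM_JOB_ID=123456\n' | same_job 123456; then
    echo "correct allocation"
else
    echo "wrong allocation: cancel it and start again"
fi
```

Matching the whole line with -x avoids a false positive when one Job ID is a prefix of another.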

Configure PyCharm to Deploy Code to the Server

  • Start PyCharm and, in the settings, go to the Build, Execution, Deployment tab, and then the Deployment subtab.

  • Click the (+) in the upper left and choose SFTP. This creates a new connection and presents a new view whose first line is "Type: SFTP". The second line is "SSH Configuration"; click the [...] button next to it to open the SSH Configuration pop-up window.

  • In the SSH Configuration window, click the (+) in the upper left and enter the following parameters:

    • Host: localhost
    • Port: the value of $PYCHARM_PORT
    • Username: your cluster username
    • Authentication type: Key pair
    • Private key file: the private key file you use to log in to the cluster, often found in your home directory as ~/.ssh/id_ed25519.
    • Passphrase: The passphrase to your private key file.
  • Now, press "Test Connection". If the tunnel command mentioned earlier is still running, the test should pass. Click "OK" to close the window and return to the Deployment subtab.

  • Within the Deployment subtab, for "SSH Configuration", select the connection you just made from the SSH Configuration dropdown.

  • For "Root Path", click the folder icon. This will show the directory structure on the compute node. Navigate to your PyCharm directory path from above and select it. This will be your PyCharm root path.

This completes the PyCharm connection setup. For each PyCharm project, you will also need to complete the mappings stage.
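
If "Test Connection" fails, it can help to try the same login from a terminal on your personal computer, outside PyCharm. The sketch below prints an ssh command that uses the same host, port, username, and key as the SSH Configuration above and runs hostname on the remote side. The helper name check_cmd and the username alice are hypothetical. With the tunnel running, the printed command is expected to report the compute node's name.

```shell
#!/bin/sh
# check_cmd: hypothetical helper that prints an ssh command which logs in
# through the local end of the tunnel and runs "hostname" on the other side.
check_cmd() {
    port=$1; login=$2; key=$3
    echo "ssh -p $port -i $key $login@localhost hostname"
}

# Made-up username; the key path is the example from the list above.
check_cmd 11122 alice "$HOME/.ssh/id_ed25519"
```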

Note

For guidance on this, please see [this explanatory documentation][601]. Note that the documentation there refers to virtualenvs; you can install the virtualenv package into your compute-node conda environment by following the instructions above to conda activate your remote environment and then running conda install virtualenv. There are several different but similar ways to manage virtual environments in Python; our suggestion is to pick one, read the documentation for that approach, and be consistent. Whichever you pick, you will want the same virtual environment configuration on both the compute node and your personal computer.

Using PyCharm on the compute nodes

Each time you want to use PyCharm to connect to the server, run the steps marked (Always) above. Remember to always use the same PyCharm port number and to update the temporary compute node name. The connection will not work if you have not started the tunnel, if you specified the wrong compute node, or if your interactive job has ended.


Last update: September 9, 2024
Created: September 9, 2024