2 Getting started
This section will help you get started using the lab servers.
2.1 Connecting to the VPN
If you’re not currently in the 401 Park / Landmark building, you’ll have to first connect to the Harvard VPN using your HarvardKey credentials to access the lab servers. If you don’t have the VPN client already, you can download it at https://vpn.harvard.edu. AnyConnect functionality comes pre-installed in Linux systems (OpenConnect -> Cisco AnyConnect in the NetworkManager VPN manager).
The VPN address is vpn.harvard.edu/hsph
and your username is your Harvard email. You will be prompted to enter two passwords: first password will be your HarvardKey password and the second password will be push
, which will activate a Duo push on your phone.
Some notes on the VPN:
- If you only have a FASRC VPN account, you’ll need to get a separate, generic Harvard VPN account. FASRC VPN accounts seem to be unable to access the lab servers.
- If you’re still having trouble, try to email IT and ask to be added to the HSPH tunnel as your account might not yet have access.
2.2 Topology of the lab servers
The lab’s servers consist primarily of two categories of servers:
- Work servers, which have high CPU and memory capacity, and
- Storage servers, which have high disk capacity but low CPU and memory capacity.
Each of the storage servers is accessible from the /media/
directory of each work server. For example, the qnap5
storage server is available by accessing /media/qnap5/
on any of the work servers, Isilon
is available at /media/Isilon
, etc.
The structure of /media/
is set up such that it is identical on each of the work servers, so code that references e.g. /media/qnap4/some-file.txt
will be able to run without modification on any of the work servers.
For detailed information about what servers are available and what resources each possesses, see the section on server attributes.
2.3 Best practices
In general, you will want to do your work (i.e. running R code) on one of the work servers and save data to / load data from one or more of the storage servers. The connections of the servers are fast enough that speed shouldn’t be an issue.
The work servers have limited local space, and this space should generally be reserved for R libraries, RStudio temporary files, etc. which are necessary for the RStudio Server to function. It’s OK to keep a few small files in your /home/
directory (which is unique to each of the work servers), but don’t go too crazy!
For guidance on which server to pick for your work, see the section on how to pick a server to work on.
2.4 Connecting to RStudio
RStudio is available on port 8787
of any of the work. You can access RStudio directly by clicking on one of the links below (helpful to bookmark!):
- tiamat: http://sph-tiamat.sph.harvard.edu:8787/ or http://10.174.192.10:8787/
- mithra: http://sph-mithra.sph.harvard.edu:8787/ or http://10.174.192.11:8787/
- dagda: http://sph-dagda.sph.harvard.edu:8787/ or http://10.174.192.12:8787/
- shiva: http://sph-shiva.sph.harvard.edu:8787/ or http://10.174.192.14:8787/
- GCMC: http://sph-gcmc-js.sph.harvard.edu:8787/ or http://10.174.192.15:8787/
Accounts on GCMC must be requested from William Kessler.
2.5 Connecting to the command line of a server
In some cases, you may need to connect directly to the server command lines rather than the RStudio Servers, e.g. to kill jobs, install software, move around a lot of files, etc. RStudio Server has a Terminal tab next to the Console tab that you can use, but it may be more convenient to connect directly via SSH (“Secure Shell”, the protocol for connecting to Linux and Unix servers).
To connect via SSH on macOS or Linux, open the terminal application and type ssh your_username@server_ip
, e.g. ssh jharvard@10.174.192.10
.
To connect via SSH on Windows, you can either:
- Install PuTTY, MobaXTerm, or some other terminal client and then enter the server IP address into the main dialog, or
- Install Linux and then type
ssh your_username@server_ip
from the Linux terminal