Running statistical software on Unix servers

If you have statistical programs that take a long time to run, or several statistical programs that you want to run simultaneously, you may be able to save time and free up your own computer by running them on a Unix server. The basic idea (see below for details) is that you upload your data and code to your account on the server and then run the program from there. If done correctly, your program will keep running even after you have logged out. I list some information below on: obtaining Unix accounts at Guelph; software to connect to a Unix account from a Windows PC; running long jobs on Unix servers; and additional Unix commands.

Obtaining an account on a statistical server at the University of Guelph

  • Stats-machine: If you have a Guelph central login, you should also have an account on stats.uoguelph.ca with the same logon and password. You can use the software listed below to connect to it. Software available on the Stats-machine includes MATLAB (type 'matlab') and Stata (type 'stata'). The command line in the stats server will be much more user friendly if you type 'bash' and then hit enter when you log in.

  • Sharcnet: For jobs that take an excessively long time on the Stats-machine, it may be useful to sign up for an account on Sharcnet, which allows you to distribute the job over a large number of computers. Contact information for Sharcnet can be found at https://www.sharcnet.ca/my/contact. A list of the software available on Sharcnet can be found at http://www.sharcnet.ca/Facilities/software/softwarePage.php. Instructions for obtaining an account on Sharcnet can be found at https://www.sharcnet.ca/my/help/faq#Getting_An_Account. Instructions for running MATLAB on Sharcnet can be found at https://www.sharcnet.ca/help/index.php/Using_MATLAB. Also see MATLAB's documentation on parallel processing.

    Software for using Unix servers from Windows

    To log in securely to most Unix servers you will need one of the software programs listed below.

    X-Win-32

    This software is useful because it allows you to work in a Unix windowing (X Window) environment rather than just on a command line. That means, for example, that by typing matlab you will get a MATLAB interface similar to the one you see on your Windows PC and not just a command-line prompt.

    SSH Communications Security

    Also a secure telnet/ftp program. This gives you a command line. Its secure FTP feature is also particularly convenient for moving files back and forth between your computer and the Unix server.

    This can be obtained on the web at

    ftp://ftp.ssh.com/pub/ssh/

    Download SSHSecureShellClient-3.2.3.exe

    or the latest version that looks like this and ends in .exe.

    (Note: In case the location changes, go to the home page at http://www.ssh.com/, click on Products, then SSH Secure Shell for Workstations, and then, under non-commercial downloads, click on SSH Secure Shell for Workstations 3.2.)

    Putty

    To download go to:

    http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

    Then download putty.exe

    Logging in via (unsecured) FTP

  • FileZilla software (free download)

    Running long jobs on Unix servers

    There are a few differences relative to the way statistical software is often run for short jobs on Windows PCs. In describing them below I use MATLAB examples.

  • You will need a separate program file (e.g. myprogram.m) that contains all the commands you want to run, rather than typing commands in one at a time. (Also a good practice on a PC.)

  • Rather than opening up the statistical software first and then running your program, you run it directly from the command line, feeding the program file to the software with '<' (e.g. muncher [25]% matlab -nodisplay < myprogram.m)

  • You also want to run the program in such a way that it keeps running even after you log out. This is usually done by adding 'nohup' before the command, so that the program ignores the hangup signal sent at logout, and '&' at the end of the command, so that it runs in the background (e.g. muncher [25]% nohup matlab -nodisplay < myprogram.m &)

  • However, since the server is a shared resource, for long jobs you may need to lower your job's priority so that it does not get in other people's way. On some servers your job may be terminated if you don't do this. This can be done using the command nice, where 19 is the lowest priority (e.g. muncher [25]% nice -n 19 nohup matlab -nodisplay < myprogram.m &)

  • You may also want to save your output to a file (e.g. log_file) by adding '> log_file' to your command (e.g. muncher [25]% nice -n 19 nohup matlab -nodisplay < myprogram.m > log_file &)

  • Although the above commands are general and can work with most statistical software, MATLAB also provides some MATLAB-specific options for running jobs from the command line. For instance, the syntax in the last bullet point could be replaced by: nice --adjustment=10 nohup matlab -nodesktop -nosplash -logfile log_file -r 'myprogram; exit;' &
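
    The steps in the bullets above can be tried out in a single shell session. This is only a minimal sketch, assuming a POSIX shell such as bash. Because MATLAB may not be installed where you try it, a tiny shell script (the hypothetical myprogram.sh) stands in for myprogram.m, and sh stands in for matlab -nodisplay. The names workdir, job_pid and job_status are illustrative as well.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for myprogram.m: a program file holding all commands.
cat > myprogram.sh <<'EOF'
for i in 1 2 3; do
    echo "step $i"
done
EOF

# nice -n 19 : run at the lowest scheduling priority
# nohup      : ignore the hangup signal sent when you log out
# < file     : feed the program file to the interpreter
# > log_file : save output (2>&1 also captures error messages)
# &          : run in the background
nice -n 19 nohup sh < myprogram.sh > log_file 2>&1 &
job_pid=$!
echo "started job $job_pid"

# Only needed in this demonstration: wait for the short job to finish.
wait "$job_pid"
job_status=$?
cat log_file
```

    With a real job you would skip the wait, log out, and later log back in and look at log_file (for example with tail log_file) to check on progress.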

    Other Unix commands

    You will need a few commands to get around in Unix. These include:
    ls -lt | less -- lists the files in a directory page by page, starting with the most recent
    cd a -- changes directory to the directory called a
    cp a b -- copies the file named a to a file named b
    chmod o+r a.html -- makes a.html readable by other users (useful for web pages)
    man commandname -- gives a description of the command commandname
    apropos topic -- lists commands related to the word topic
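
    As a safe way to try the commands above, here is a minimal sketch that runs them in a throwaway directory. It assumes a POSIX shell. The scratch directory, the directory a and the files a.html and b.html are illustrative only, and head replaces less so that the example also runs non-interactively.

```shell
# Create a throwaway directory and move into it.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Make a small file, then copy it to a second name.
echo "<html><body>hello</body></html>" > a.html
cp a.html b.html

# Let other users read a.html (e.g. so a web server can show it).
chmod o+r a.html

# List files, newest first; head is used here instead of less.
ls -lt | head

# Character 8 of the permission string is the "others may read" flag.
other_read=$(ls -l a.html | cut -c8)
echo "others-read flag for a.html: $other_read"

# Change into a subdirectory called a, then back out again.
mkdir a
cd a || exit 1
cd ..
```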

    For a more thorough introduction and list of commands, try a Google search for "Unix commands" or "Unix introduction".