KLC offers the same large library of scientific computing software, or "modules", that you find on the campus-wide Quest system. At Kellogg, the most popular programs are R, Stata, Matlab, SAS, and Python. See the Software on Quest page for a full list of titles. This page describes how to launch and monitor your jobs on KLC.


Modules

In order to run licensed software on KLC or Quest, you must first "load" that software module. For example, if you want to run a Python program that uses the Anaconda release of Python version 3.6, these two commands will do that:


[Image: interactive python]

If you want to check which modules are available to load, you can see the full list (which is very long!) by using module avail. You can also filter the results by adding a case-sensitive search term after module avail. For example, if you wanted to see all the versions of Stata available, you could type:


module avail stata

Notice that, in this example, the default version of Stata is version 14. This is the version that you would get if you simply typed module load stata. We recommend that you specify an explicit version whenever you load software on KLC. This is especially important when you use open-source programs like Python or R.
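As a minimal sketch of that recommendation, the hypothetical helper below refuses to load a module unless a version is given. The name load_pinned is not a KLC or Quest command, and the module name stata/15 is an assumption for illustration; check module avail for the exact names on your system.

```shell
# Hypothetical helper, not a KLC or Quest command: only load modules
# that are written in the explicit name/version form.
load_pinned() {
    case "$1" in
        ?*/?*) ;;  # looks like name/version, e.g. stata/15
        *)
            echo "refusing to load '$1': no version given" >&2
            return 1
            ;;
    esac
    if command -v module >/dev/null 2>&1; then
        module load "$1"
    else
        # Off-cluster fallback so the sketch still runs anywhere
        echo "module command not found; would run: module load $1"
    fi
}

# Assumed module name; confirm it with: module avail stata
load_pinned stata/15
```

Calling load_pinned stata (with no version) prints an error and returns a non-zero status instead of silently picking up the default version.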

If you then want to run Stata version 15 with a graphical interface, you would just type:


[Image: Launching Stata15]

Similarly, to run RStudio with version 3.3.3 of R, you would just type:

[Image: Launch RStudio]

Command Line Syntax

Often you will want to run programs in batch mode - that is, executing a program you have already written in a language such as Stata without using any graphical or interactive interface. Running in batch mode makes your work easier when you need to run the same pieces of code repeatedly, and it helps make your work more reproducible.

Suppose you have a program called "my_program" that is written in Stata, SAS, or one of the other popular languages for statistical analysis. Below you can find the recommended command line syntax for running "my_program" in batch mode:

Matlab
[~]$ matlab -nodisplay -nodesktop -nosplash -r "run('my_program.m');exit;" &
R
[~]$ Rscript my_program.R &
SAS
[~]$ sas -work /kellogg/tmp -memsize 175G my_program.sas &

Note: This example sets the maximum working memory for SAS to 175 GB. We override the default memory allocation that Quest has set because the Quest limit tends to be much lower than the memory typically available on the KLC servers. You may need to adjust the number up or down depending on the size of the data you want SAS to use.

Stata
[~]$ stata -b do my_program.do &
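
Each command above ends with &, which runs the job in the background. As a rough sketch, the lines below show one way to capture a background job's output in a log file and record its process ID. The sh -c command is only a stand-in for a real batch command, and the file name my_program.log is an assumption.

```shell
# Stand-in for a real batch command such as: Rscript my_program.R
sh -c 'echo "job started"; sleep 1; echo "job finished"' > my_program.log 2>&1 &

# $! holds the process ID of the most recent background job
job_pid=$!
echo "started background job with PID $job_pid"

# Block until the job ends and keep its exit status
wait "$job_pid"
job_status=$?
echo "exit status: $job_status"
```

In everyday use you would normally skip the wait step and check on the job later; it is included here only to show how the exit status can be retrieved.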

Long Running Jobs

One advantage of running jobs on KLC is that your long-running jobs can continue for several days, if necessary, to complete. It is important for you to understand when KLC will automatically keep your jobs running and when it will automatically terminate your jobs.

If you make an ssh connection and launch jobs (either batch or interactive), then those jobs will die abruptly when you log out of your session. To keep your jobs running after you log out, you can either...

  • Run your jobs inside a screen session, or
  • Use nohup before running your batch process. For example: nohup stata -b do my_program.do &
To check on the status of your jobs after you have logged out, you must log in to the same KLC node as before. Then you can issue the command
  ps -u your-netid 
...to see all of the processes running under your NetID on that node.
You can terminate any unwanted or orphaned processes with the command
  kill -9 the-PID-number
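
The nohup, ps, and kill steps above can be sketched together as follows. Here sleep 300 is only a stand-in for a real batch command, and the file names my_program.out and my_program.pid are assumptions made for this example.

```shell
# 'sleep 300' stands in for a real batch command, e.g. stata -b do my_program.do
nohup sleep 300 > my_program.out 2>&1 &
echo "$!" > my_program.pid   # save the PID so it can be found after logging back in

# Later, on the same node: read the PID back and test whether the job is alive.
# kill -0 sends no signal; it only checks that the process exists.
pid=$(cat my_program.pid)
if kill -0 "$pid" 2>/dev/null; then
    status_before="running"
else
    status_before="not running"
fi
echo "job $pid is $status_before"

# List this user's processes with their elapsed run time
ps -u "$(id -un)" -o pid,etime,comm

# Stop the job: try a plain kill (SIGTERM) first so the program can exit cleanly
kill "$pid"
# Reap the child; this only works in the shell that launched the job
wait "$pid" 2>/dev/null || true
# Fall back to kill -9 only if the process is still there
if kill -0 "$pid" 2>/dev/null; then
    kill -9 "$pid"
fi
```

Saving the PID to a file is a convenience, not a requirement; ps -u shows the same number.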

When you make a FastX connection, the behavior is the opposite. FastX will continue to run your jobs even after you close your browser window and reboot your computer. Your connections will remain active until you explicitly Terminate the FastX session. For this reason, we recommend you take care to terminate sessions after your work is complete.

One final piece of advice: Even though it is possible to run jobs on KLC for a very long time, your jobs will not always run indefinitely. Sometimes servers need to reboot for maintenance. Other times, a server could go offline because of an unplanned crash. It is always a good idea to write your programs in such a way that your work will not be completely lost if your jobs end early. Long, iterative jobs should write out intermediate output periodically and be written in a way that you could easily restart them from their last checkpoint.
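
A minimal sketch of the checkpoint idea, written as a shell loop, might look like the following. The file names checkpoint.txt and results.txt and the function run_steps are hypothetical, and the echo line stands in for one real unit of work.

```shell
# Hypothetical file names, chosen for this sketch only
checkpoint_file="checkpoint.txt"
results_file="results.txt"

run_steps() {
    total_steps=$1
    # Resume after the last completed step if a checkpoint exists
    if [ -f "$checkpoint_file" ]; then
        step=$(( $(cat "$checkpoint_file") + 1 ))
    else
        step=1
    fi
    while [ "$step" -le "$total_steps" ]; do
        echo "result of step $step" >> "$results_file"  # stand-in for real work
        # Write to a temporary file, then rename, so an interruption
        # never leaves a half-written checkpoint behind
        echo "$step" > "$checkpoint_file.tmp"
        mv "$checkpoint_file.tmp" "$checkpoint_file"
        step=$(( step + 1 ))
    done
}

rm -f "$checkpoint_file" "$results_file"   # start clean for the demonstration
run_steps 3   # pretend the job was interrupted after step 3
run_steps 5   # a restart picks up at step 4 instead of step 1
```

If a job is interrupted after writing a result but before updating the checkpoint, that one step runs again on restart, so each step should be safe to repeat.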


Job Limits

Although you can submit as many jobs as you like, each user is allowed up to 8 CPU cores concurrently across all the KLC nodes at normal priority. If you exceed this limit, all of your processes run at reduced priority. This is how we protect users from having their work slow down because somebody else is using too much of the system.
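
One way to stay within the limit when you have many independent tasks is to cap how many run at once. The sketch below assumes that each task is single-threaded (so one process uses one core) and uses a sleep and echo as a stand-in for real work; the file name tasks.log is an assumption.

```shell
max_cores=8   # per-user limit at normal priority, as stated above

# Twelve stand-in tasks; xargs keeps at most $max_cores running at a time.
# -P is supported by GNU and BSD xargs but is not part of POSIX.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 |
    xargs -n 1 -P "$max_cores" sh -c 'sleep 1; echo "task $1 done"' _ > tasks.log

finished=$(grep -c "done" tasks.log)
echo "$finished tasks finished"
```

Programs that start their own threads or workers can use more than one core per process, in which case the cap would need to be lower.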


If your work needs more than eight CPU cores at a time, please ask Kellogg Research Support to advise you on your options.


Kellogg School of Management