How to Use Condor

HTCondor is an open-source high-throughput computing software framework for coarse-grained distributed parallelization of computationally intensive tasks. It can be used to manage workload on a dedicated cluster of computers, and/or to farm out work to idle desktop computers. HTCondor runs on Linux, Unix, Mac OS X, FreeBSD, and contemporary Windows operating systems. HTCondor can seamlessly integrate both dedicated resources (rack-mounted clusters) and non-dedicated desktop machines (cycle scavenging) into one computing environment.

The CIS Condor pool is made up of CIS Linux desktops and grid nodes - currently 1018 64-bit cores are participating. Condor jobs can be run from your Linux desktop or Eniac.

Note: Condor can run 32- and 64-bit executables in the vanilla universe, and can compile 32-bit executables in the standard universe, but we do not have support for compiling 64-bit executables for the standard universe.

Will Condor jobs affect the performance of my own desktop?

Condor will stop jobs on a workstation when it detects that the machine is in use; on managed Linux workstations, keyboard or mouse activity is what triggers this (see below).

You can prevent Condor from starting jobs on your workstation by running the condor_off command. Run condor_on to turn it back on; Condor will also turn itself back on the next time the system is rebooted.
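For example, assuming the Condor commands live under /usr/local/condor/sbin as the condor_kbdd path below suggests (adjust the path if the commands are already on your PATH):

$ /usr/local/condor/sbin/condor_off    # stop Condor from running jobs on this machine
$ /usr/local/condor/sbin/condor_on     # allow Condor to run jobs again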

On managed Linux workstations, CETS has configured KDE and Gnome to automatically stop Condor when the keyboard or mouse are used. If you use a desktop environment other than KDE or Gnome, add the following command to your startup script:

/usr/local/condor/sbin/condor_kbdd -l $HOME/.kbdd.log -pidfile $HOME/.kbdd.pid
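For example, if your session runs a shell startup script such as ~/.xsession (the exact file depends on your desktop environment), the line can be added there:

# ~/.xsession (or whichever startup script your environment uses)
# Start condor_kbdd so keyboard and mouse activity stops Condor jobs
/usr/local/condor/sbin/condor_kbdd -l $HOME/.kbdd.log -pidfile $HOME/.kbdd.pid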

To test condor_compile jobs, use this simple hello world program and procedure:

  1. Copy the example text below and save it as a file named hello.c:

       #include <stdio.h>

       int main(void)
       {
           printf("hello, Condor\n");
           return 0;
       }

  2. To compile Condor support into the program, run the command "condor_compile gcc hello.c -m32 -o hello".
  3. Copy the example file below and save it as submit.hello:

       Executable = hello
       Universe = standard
       Output = hello.out
       Log = /tmp/hello.log
       Error = hello.err
       Queue

  4. Run "condor_submit submit.hello" to submit your job.

The job should be accepted by the scheduler (you can see it in the queue by running the command "condor_q"). You can monitor the job by watching the output, error, and log files defined in submit.hello. You should receive an email from Condor about your job when it completes, whether or not it ran successfully.
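For example, after submitting you might check on the job with the following commands (the cluster number and log contents will vary):

$ condor_q                  # show your jobs in the queue
$ tail -f /tmp/hello.log    # watch the event log named by Log in submit.hello
$ cat hello.out             # view the program's output after it finishes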

To test jobs that do not have Condor support compiled in (i.e., "vanilla universe" jobs):

  1. Copy the example file below and save it as submit.vanilla:

       Executable = [path to your executable]
       Universe = vanilla
       Output = vanilla.out
       Log = vanilla.log
       Error = vanilla.err
       Queue

  2. Create WORLD WRITABLE vanilla.out, vanilla.log, and vanilla.err files.
  3. Run "condor_submit submit.vanilla".

Since Condor support is not compiled into the program, the job does not "run" as your uid and will not be able to write its output to these files unless they are world writable; see the sketch below for one way to create them.
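A minimal sketch of step 2, assuming you submit from the directory containing submit.vanilla:

$ touch vanilla.out vanilla.log vanilla.err
$ chmod 666 vanilla.out vanilla.log vanilla.err   # readable and writable by everyone
$ condor_submit submit.vanilla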

How to Run Java on Condor

$ cat > Hello.java
public class Hello {
	public static void main( String [] args ) {
		System.out.println("Hello, world!\n");
	}
}

$ javac Hello.java

$ cat > submit.hello
universe	 = java
executable	 = Hello.class
arguments	 = Hello
output	 = Hello.output
error 	 = Hello.error
queue

$ condor_submit submit.hello
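As with the other examples, condor_q shows the job while it is queued or running, and the program's output appears in Hello.output (the file named by "output" above) once it finishes:

$ condor_q
$ cat Hello.output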

Parallel Processing with Condor

Condor works well for embarrassingly parallel jobs. Here's an example. The Python program demo.py will count the number of lines in a file. I'm going to run this on 23 different files in parallel. I'm only printing the host name to make it clear that the jobs are being distributed over multiple hosts.
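The data directory below already holds the 23 input files; purely for illustration, one hypothetical way to generate similar test files (not part of the original example) is:

$ mkdir -p data
$ for i in $(seq 0 22); do seq 1 $((RANDOM % 20)) > data/in.$i; done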

$ ls data

in.0  in.10  in.12  in.14  in.16  in.18  in.2	in.21  in.3  in.5  in.7 
in.9  in.1  in.11  in.13  in.15  in.17	in.19  in.20  in.22  in.4  in.6 
in.8

$ more demo.py		   

#!/usr/bin/env python

import socket
import fileinput

# Print the host name so it is clear which machine ran the job
hostname = socket.gethostname()

print("Host name: ", hostname)

count = 0

# With no file arguments, fileinput reads from standard input, which
# Condor connects to the file named by Input in the submit description
for line in fileinput.input():
    count += 1

print("number of lines: ", count)


$ more submit.demo

Executable = demo.py
Universe = vanilla

Input  = data/in.$(Process)
Output = data/out.$(Process)
Error = data/err.$(Process)

Log = /tmp/demo.log
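
# Queue 23 submits 23 jobs in one cluster; Condor expands $(Process)
# to each job's index (0 through 22), so every job reads a different
# input file and writes its own output and error files.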

Queue 23

$ condor_submit submit.demo

Submitting job(s).......................
Logging submit event(s).......................
23 job(s) submitted to cluster 17.

$ ls data

err.0	err.15	err.21	err.8  in.13  in.2   in.6    out.11  out.18  out.4
err.1	err.16	err.22	err.9  in.14  in.20  in.7    out.12  out.19  out.5
err.10	err.17	err.3	in.0   in.15  in.21  in.8    out.13  out.2   out.6
err.11	err.18	err.4	in.1   in.16  in.22  in.9    out.14  out.20  out.7
err.12	err.19	err.5	in.10  in.17  in.3   out.0   out.15  out.21  out.8
err.13	err.2	err.6	in.11  in.18  in.4   out.1   out.16  out.22  out.9
err.14	err.20	err.7	in.12  in.19  in.5   out.10  out.17  out.3

$ more data/out.0

Host name:  CUPIDITY
number of lines:  0

$ more data/out.7

Host name:  hustle
number of lines:  9