Guide for End-UsersAnalysis Scope ConstructionPredefined Analyses

Predefined Analyses

Chord provides many standard analyses. This chapter first explains how to run any such analysis and then provides descriptions of various predefined analyses.

Running an Analysis

Each analysis in Chord is written modularly, independent of other analyses, along with lightweight annotations specifying the inputs and outputs of the analysis. Chord's runtime automatically computes dependencies between analyses (e.g., determines which analysis produces as output a result that is needed as input by another analysis). Before running a desired analysis, Chord recursively runs other analyses until the inputs to the desired analysis have been computed; it finally runs the desired analysis to produce the outputs of that analysis.

Each analysis in Chord has a unique name that can be used to run the analysis from the command-line. You can obtain the names of all analyses predefined in Chord by running the following command:

ant -Dchord.out.dir=<OUT_DIR> -Dchord.print.project=true run

This command produces files targets_sortby_name.html, targets_sortby_kind.html, and targets_sortby_producers.html in directory <OUT_DIR>. These files publish the names and descriptions of all analyses that are predefined in Chord, along with the inputs and outputs of each analysis.

The following command runs the analysis named <ANALYSIS_NAME> on the program specified by directory <WORK_DIR> (see Chapter * for how to setup a program):

ant -Dchord.work.dir=<WORK_DIR> -Dchord.run.analyses=<ANALYSIS_NAME> run

For instance, the following command runs a basic may-alias and call-graph analysis (called 0CFA) provided in Chord:

ant -Dchord.work.dir=<WORK_DIR> -Dchord.run.analyses=cipa-0cfa-dlog run

This instructs Chord to run the analysis named cipa-0cfa-dlog, which is defined in file main/src/chord/analyses/alias/cipa_0cfa.dlog.

The output of the above command is of the form:

Buildfile: build.xml

run:
     [java] Chord run initiated at: Mar 13, 2011 10:31:08 PM
     [java] ENTER: cipa-0cfa-dlog
     [java] ENTER: T
     [java] ENTER: RTA
     [java] Iteration: 0
     [java] Iteration: 1
     [java] Iteration: 2
     [java] LEAVE: RTA
     [java] SAVING dom T size: 1386
     [java] LEAVE: T
     [java] ENTER: F
     [java] SAVING dom F size: 4120
     [java] LEAVE: F
     ...
     [java] ENTER: MputStatFldInst
     [java] SAVING rel MputStatFldInst size: 739
     [java] LEAVE: MputStatFldInst
     [java] ENTER: statIM
     [java] SAVING rel statIM size: 3359
     [java] LEAVE: statIM
     [java] Starting command: 'java ... chord_analyses_alias_cipa_0cfa.dlog'
     [java] Relation VH: 541 nodes, 449.0 elements (V0,H0)
     [java] Relation FH: 137 nodes, 8.0 elements (H0,F0)
     [java] Relation HFH: 199 nodes, 35.0 elements (H0,F0,H1)
     [java] Relation IM: 590 nodes, 69.0 elements (I0,M0)
     [java] Finished command: 'java ... chord_analyses_alias_cipa_0cfa.dlog'
     [java] LEAVE: cipa-0cfa-dlog
     [java] Chord run completed at: Mar 13, 2011 10:31:36 PM
     [java] Total time: 00:00:27:671 hh:mm:ss:ms

BUILD SUCCESSFUL
Total time: 28 seconds

The 0CFA analysis consumes and produces multiple program relations. The consumed program relations include MputStatFldInst and statIM, each of which is produced by a separate analysis with the corresponding name. The produced program relations include VH, FH, HFH, and IM. We next briefly discuss these relations.

The program relations consumed by the 0CFA analysis contain basic program facts. For instance, MputStatFldInst is a relation containing each tuple (m,f,v) such that method m in the input Java program contains a putstatic instruction of the form "f = v", while statIM is a relation containing each tuple (i,m) such that m is the target method of invokestatic instruction i.

The program relations produced by the 0CFA analysis represent points-to information and the call graph of the input Java program as computed by the analysis. Specifically, relations VH, FH, and HFH represent points-to information for local variables, static fields, and instance fields, respectively, while relation IM represents the call graph, namely, it contain each tuple (i,m) such that m is a possible target method of call site i.

Metavariables m, f, i, and v above range over values in so-called program domains M, F, I, and V, respectively. A program domain is a set of values of a fixed kind in the input Java program. For instance, M is the domain representing the set of all methods in the input Java program, F is the domain representing the set of all fields, I is the domain representing the set of all call sites in methods in M, and V is the domain representing the set of all local variables of reference type in methods in M. Each of these domains is produced by a separate analysis with the corresponding name. Notice that the analyses producing these domains run upfront because these domains are consumed by the analyses that produce relations such as MputStatFldInst and statIM, which in turn are consumed by the desired 0CFA analysis.

During execution, Chord dumps intermediate and final results to files in the directory specified by property chord.out.dir, whose default value is [chord.work.dir]/chord_output/ and typically does not need to be changed by users. For the above example, this directory is <WORK_DIR>/chord_output/.

The verbosity of Chord's output is controlled by property chord.verbose, whose default value is 1. At verbosity level 0, the above command produces less voluminous output of the form:

Buildfile: build.xml

run:
     [java] Chord run initiated at: Mar 13, 2011 10:35:01 PM
     [java] Chord run completed at: Mar 13, 2011 10:35:28 PM
     [java] Total time: 00:00:26:297 hh:mm:ss:ms

BUILD SUCCESSFUL
Total time: 26 seconds

Points-to and Call-Graph Analyses

Chord offers several choices for computing points-to and call-graph information of Java programs. In each of these choices, the points-to and call-graph information is computed simultaneously (called "on-the-fly call-graph construction" in the literature in contrast to "ahead-of-time call-graph construction" in which the call graph is computed first followed by points-to information). On-the-fly approaches are more precise because, in a dynamically dispatching language like Java, as more points-to facts are discovered, more (dynamically dispatched) methods are deemed reachable, thereby growing the call graph; the code in these newly added methods in turn results in more points-to facts.

Flow-insensitive analysis computes a single abstract heap whereas flow-sensitive analysis computes per-program-point abstract heaps. Context-insensitive analysis analyzes each method at most once (i.e. in a single abstract context), whereas context-sensitive analyses potentially analyze each method multiple times, in different abstract contexts. Thus, flow- and context-sensitive analyses are more precise but less scalable than flow- and context-insensitive analyses, respectively.

Flow-sensitive analysis does not offer much precision over flow-insensitive analysis in practice, especially in the absence of strong updates and in the presence of SSA (Static Single Assignment form), a program representation that renders a flow-insensitive analysis almost as precise as a flow-sensitive analysis. Since analyses in Chord currently perform only weak updates, and since they all operate on an SSA form of the input Java program by default, the rest of this section focuses only on flow-insensitive analysis, which is the predominant kind of points-to/call-graph analysis in Chord.

We describe context-insensitive analysis first because understanding the concepts behind it will help understand the more sophisticated context-sensitive analyses. We first recall some relevant program domains:

Context-Insensitive Analysis

The context-insensitive points-to/call-graph analysis treats each object allocation site as a separate abstract memory location; in other words, it can distinguish objects created at different sites but not those created at the same site. Additionally, it is field-sensitive, in that it distinguishes between different instance fields of the same object, but array-insensitive, in that it cannot distinguish between different elements of the same array; all array elements are modeled using a distinguished hypothetical instance field (that has index 0 in domain F).

The following command can be used to run this analysis on the program specified by directory <WORK_DIR> (see Chapter * for how to setup a program):

ant -Dchord.work.dir=<WORK_DIR> -Dchord.run.analyses=cipa-0cfa-dlog run

This analysis outputs the following relations:

Context-Sensitive Analysis

In a context-sensitive analysis, there is no longer just one abstract memory location per object allocation site. Rather, the set of objects a reference can point to depends on the context in which the method containing the reference is called. Whereas a context-insensitive analysis talks about the domain of methods (M) and the domain of allocation sites (H), a context-sensitive analysis talks about the domain of abstract contexts, labeled C. Elements of domain C contain both abstract calling contexts and abstract objects. (These are merged for reasons described below.)

Chord has several context-sensitive analyses, but they all expose the same relations, which are described below:

Under Construction: explain object-sensitive vs context-sensitive, plus how to invoke.

Static Datarace Analysis

The following command can be used to run Chord's static datarace analysis on the program specified by directory <WORK_DIR> (see Chapter * for how to setup a program):

ant -Dchord.work.dir=<WORK_DIR> -Dchord.run.analyses=datarace-java run

Directory examples/datarace_test/ provides a toy Java program on which to run the datarace analysis. First run ant in that directory (in order to compile the program's .java files to .class files) and then run the above command with <WORK_DIR> replaced by examples/datarace_test/. Upon successful completion, the following files should be produced in directory examples/datarace_test/chord_output/:

Static Deadlock Analysis

The following command can be used to run Chord's static deadlock analysis on the program specified by directory <WORK_DIR> (see Chapter * for how to setup a program):

ant -Dchord.work.dir=<WORK_DIR> -Dchord.run.analyses=deadlock-java run

Directory examples/deadlock_test/ provides a toy Java program on which to run the deadlock analysis. First run ant in that directory (in order to compile the program's .java files to .class files) and then run the above command with <WORK_DIR> replaced by examples/deadlock_test/. Upon successful completion, the file deadlocks.html should be produced in directory examples/deadlock_test/chord_output/.


mhn@cs.stanford.edu

Guide for End-UsersAnalysis Scope ConstructionPredefined Analyses