HW1-无代写
时间:2025-02-13
1Dataflow Analysis in Practice:
Program Analysis Frameworks,
Analysis Scope and Approximation
1
Announcements
n HW1 due today
n HW2 posted
n Your task is to set the infrastructure locally
n Start as soon as you can
n Ask questions on Submitty, in class, in office
hours
n Run Soot on toy programs and study Jimple IR
CSCI 4450/6450, A Milanova 2
2
2So Far and Moving On…
n Dataflow analysis
n Four classical dataflow problems
n Dataflow frameworks
n CFGs, lattices, transfer functions and properties,
worklist algorithm, MFP vs. MOP solutions
n Non-distributive analysis
n Constant propagation
n Points-to analysis (will cover in catchup week!)
n Program analysis in practice
CSCI 4450/6450, A Milanova 3
3
Outline of Today’s Class
n Constant propagation (recap)
n Program analysis in practice
n Program analysis frameworks
n Soot program analysis framework
n Ghidra framework
n Analysis scope and approximation
n Class analysis
CSCI 4450/6450, A Milanova 4
4
3Constant Propagation fits into
Monotone Dataflow Framework
n Property space
n Product lattice L = Lx x Ly x … x Lz
n L satisfies the ACC
and
n Function space F: Là L is monotone
n Thus, analysis fits into the monotone
dataflow framework and can be solved using
the worklist algorithm
CSCI 4450/6450, A Milanova 5
T
T
… -2 -1 0 1 2 ...
Lx:
5
Example
CSCI 4450/6450, A Milanova 6
2.
x=1
y=2
3.
x=2
y=1
4.
z=x+y
1.
if (b>0)
5.
w=10*z
out(2): out(3):
in(4):
out(4):
in(5):
in(1) is T =
6
4Constant Propagation is
Monotone but Not Distributive!
n f4(f2(f1(T))) computes zà 3
n f4(f3(f1(T))) computes zà 3
n Thus, MOP at 5
f4(f2(f1(T))) V f4(f3(f1(T)))
computes zà 3
MFP at 5 computes zà T
(i.e., z is NOT a const)
7
2.
x=1
y=2
3.
x=2
y=1
4.
z=x+y
1.
if (b>0)
5.
w=10*z
out(2): xà1, yà2 out(3): xà2, yà1
in(4): xà T, yà T
out(4): zà T
in(5): zà T
in(1) is T
7
More Product Lattices
n Problem statement: Is integer variable x odd
or even at program point n?
n Lx:
CSCI 4450/6450, A Milanova (Example program from MIT OCW Program Analysis) 8
x=x+1
y=y+2

if (x≥10) T
F
xà T, yà T
y=0
T
T
odd even
xà T, yà even
xà T, yà even
xà T, yà even
8
5More Product Lattices
n Problem statement: What sign does a
variable hold at a given program point, i.e., is
it positive, negative, or 0
n Lx:
E.g., < xà+,yàT,zà0 >
CSCI 4450/6450, A Milanova 9
T
T
+ 0 -
9
So far and moving on
n Intraprocedural dataflow analysis
n CFGs, lattices, transfer functions, worklist
algorithm, etc.
n Classical analyses
n Interprocedural analysis
n Analysis scope and approximation
CSCI 4450/6450, A Milanova 10
10
6Program Analysis in Practice
n Program analysis frameworks
n LLVM
n Ghidra
n Soot
n WALA, other
CSCI 4450/6450, A Milanova 11
11
Soot: a framework for analysis and
optimization of Java/Dalvik bytecode
n https://soot-oss.github.io/soot/
n History
n Overview of Soot
n From Java bytecode/Dalvik bytecode to typed
3-address code (Jimple)
n 3-address code analysis and optimization
n From Jimple to Java/Dalvik
n Jimple
n Analysis
12
12
7History
n https://soot-oss.github.io/soot/
n Started by Prof. Laurie Hendren at McGill
n First paper on Soot came in 1999
n Patrick Lam
n Ondřej Lhoták
n Eric Bodden
n and other…
n Now developed by Eric Bodden and his
group: https://github.com/soot-oss/soot
13CSCI 4450/6450, A Milanova
13
Overview of Soot
Class files/APK
JIMPLIFY
Some IR
ANALYSIS/
OPTIMIZATION
Optimized jimple
Class files/APK
14CSCI 4450/6450, A Milanova
14
8Advantages of Jimple and Soot
n Jimple
n Typed local variables
n 16 simple 3-address statements (1 operator per
statement). Bridges gap from analysis
abstraction to analysis implementation
n Soot provides
n Itraprocedural dataflow analysis framework
n Points-to analysis for Java
n IR from Dalvik and taint analysis
n Other analyses and optimizations 15
15
n Run soot: java soot.Main –jimple A
(need paths)
public class A {
main(String[] args) {
A a = new A();
a.m();
}
public void m() {
}
}
public class A extends java.lang.Object
{
public void () {
A r0;
r0 := @this: A;
specialinvoke r0.
()>();
return;
}

(continues on next slide…)
16
Jimple
CSCI 4450/6450, A Milanova
16
9public class A {
main(String[] args) {
A a = new A();
a.m();
}
public void m() {
}
}

public void m() {
A r0;
r0 := @this: A;
return;
}

Jimple
CSCI 4450/6450, A Milanova 17
Java: Jimple:
17
public class A {
main(String[] args) {
A a = new A();
a.m();
}
public void m() {
}
}

main(java.lang.String[]) {
java.lang.String[] r0;
A $r1, r2;
r0 := @parameter0: java.lang.String[];
$r1 = new A;
specialinvoke $r1.()>();
r2 = $r1;
virtualinvoke r2.();
return;
}
}
Jimple
CSCI 4450/6450, A Milanova 18
Java: Jimple:
18
10
n Abstracts program constructs
n Some basic Soot classes and interfaces
n SootClass
n SootMethod
n SootMethod sm; sm.isMain(), sm.isStatic(), etc.
n Local
n Local l; … l.getType()
n InstanceInvokeExpr
n Represents an instance (as opposed to static) invoke
expression
n InstanceInvokeExpr iie; … receiver = iie.getBase();
Soot Abstractions. Look up API!
CSCI 4450/6450, A Milanova 19
19
n Github project:
https://github.com/soot-oss/soot
n Javadoc:
https://soot-build.cs.uni-paderborn.de/public/origin/develop/soot/soot-
develop/jdoc/ (old)
https://javadoc.io/doc/ca.mcgill.sable/soot/latest/index.html
Resources
CSCI 4450/6450, A Milanova 20
20
11
n Constructor/Super Call:
n Virtual Call:
n Static Call:
n Interface Call:
1. We should not need to worry about dynamicInvoke. (Soot
does support it.)
A a = new A(); $r1 = new A;
specialinvoke $r1.()>();
a.m(); virtualinvoke r2.();
sm(); staticinvoke ();
interfaceinvoke r0.();x.m();
4 Kinds of Calls1
21
21
Outline of Today’s Class
n Program analysis in practice
n Program analysis frameworks
n Soot program analysis framework
n Ghidra framework
n Analysis scope and approximation
n Overview of class analysis framework (HW2)
n Class analysis
CSCI 4450/6450, A Milanova 22
22
12
Analysis Scope
n Intraprocedural analysis
n Scope is the CFG of a single subroutine
n Assumes no call and returns in routine, or
models calls and returns
n What we did so far
n Interprocedural analysis
n Scope of analysis is the ICFG (Interprocedural
CFG), which models flow of control between
routines
CSCI 4450/6450, A Milanova 23
23
Analysis Scope
n Whole-program analysis
n Usually, assumes entry point “main”
n Application code + libraries
n Intricate interdependences, e.g., Android apps
n Modular analysis
n Scope either a library without entry point
n or application code with missing libraries
n … or a library that depends on other missing
libraries
CSCI 4450/6450, A Milanova 24
24
13
Approximations
n Once we tackle the “whole program”
maintaining a solution per program point (i.e.,
in(j) and out(j) sets) becomes too expensive
n Approximations
n Transfer function space
n Lattice
n Context sensitivity
n Flow sensitivity
CSCI 4450/6450, A Milanova 25
25
Context Sensitivity
n So far, we studied intraprocedural analysis
n Once we extend to interprocedural analysis
the issue of “context sensitivity” comes up
n Interprocedural analysis can be context-
insensitive or context-sensitive
n In our Java homework, we’ll work with context-
insensitive analyses
n We’ll talk more about context-sensitive analysis
26CSCI 4450/6450, A Milanova
26
14
Context Insensitivity
n Context-insensitive analysis makes one big
CFG; reduces the problem to standard
dataflow, which we know how to solve
n Treats implicit assignment of actual-to-
parameter and return-to-left_hand_side as
explicit assignment
n E.g., x = id(y) where id: int id(int p) { return p; }
adds p = y // flow of values from arg to param
and x = ret // flow of return to left_hand_side
n Can be flow-sensitive or flow-insensitive 27
27
Context Insensitivity
int id(int p) {
return p;
}
a = 5;
2: b = id(a);
x = b*b;
c = 6;
5: d = id(c);
CSCI 4450/6450, A Milanova 28
1. a = 5
2. p = a
call id
3. return id
b = ret
5. p = c
call id
6. return id
d = ret
4. x = b*b
c = 6
7. entry id
8. ret = p
9. exit id
28
15
Flow Sensitivity
n Flow-sensitive vs. flow-insensitive analysis
n Flow-sensitive analysis maintains the CFG
and computes a solution per each node in
CFG (i.e. each program point)
n Standard dataflow analysis is flow-sensitive
n For large programs, maintaining CFG and
solution per program point does not scale
CSCI 4450/6450, A Milanova 29
29
Flow Insensitivity
n Flow-insensitive analysis discards CFG
edges and computes a single solution S
n A “declarative” definition, i.e., specification:
n Least solution S of equations S = fj(S) V S
n Points-to analysis is an example where such a
solution makes sense!
CSCI 4450/6450, A Milanova 30
30
16
Flow Insensitivity
n An “operational” definition. A worklist
algorithm:
S = 0, W = { 1, 2, … n } /* all nodes */
while W ≠ Ø do {
remove j from W
S = fj(S) V S
if S changed then
W = W U { k | k is ”successor” of j }
}
n “successor” is not CFG successor nodes, but
more generally, nodes k whose transfer
function fk may be affected as a result of the
change in S by j 31
31
Your Homework
n A bunch of flow-insensitive, context-
insensitive analyses for Java
n RTA, 0-CFA, PTA, other
n Simple property space
n Simple transfer functions
n E.g., in fact, RTA gets rid of most CFG nodes,
processes just 2 kinds of nodes!
n Millions of lines of code in seconds
CSCI 4450/6450, A Milanova
32
32
17
Homework
n Install and run starter code
n Please let me as soon as possible if you have
issues
n Frameworks are very fragile. They anger a lot
n Look into your git_repo/sootOutput directory
and study Jimple
n Study framework code and API
n Soot API
n Class analysis framework API
CSCI 4450/6450, A Milanova 33
33
Homework
n Overview of class analysis framework
n We’ll discuss more on Thursday
n Come prepared with questions
CSCI 4450/6450, A Milanova 34
34


essay、essay代写