Archive for April, 2008

This time, maybe?

30 April 2008

I think I now have the array length optimisation from the last post working properly and safely, as well as being slightly more general. Rather than relying on variables being declared final, I just check whether the variable in question is ever assigned to within the loop by iterating through all the statements in the loop body. This seems to be working.

A first simple optimisation

25 April 2008

As a very simple but safe code-movement optimisation, I have made my stage move calls to Array.length where the Array in question is stored in a final local variable out of loops. This is safe because once an array has been created its length cannot change, even though its contents may. As the reference is in a final local variable we are also guaranteed that the same array is indeed being referred to each time around the loop.

I have just realised that this is still not quite safe, as there could be a final local variable declared inside the scope of the loop, in which case it would be re-initialised (possibly to a different value) each time through the loop.

For example:

int j = 0;
while (j < 5) {
final int[] innerArray = new int[j];
System.out.println("innerArray length " + innerArray.length);
++j;
}

Here the value of innerArray.length will change on each iteration of the loop, even though innerArray is final.

Code movement working

23 April 2008

I now have the basic mechanism of code movement working in JKit: I can move method calls out of loops. The method invocation is replaced with local variable access to a freshly generated variable, and a new statement is added before the start of the loop to assign the result of the method call to that variable. Currently I do not have any way to detect which method calls can safely be moved, so I am explicitly annotating them as @moveable in the source.

To demonstrate this, I used the following program:

public class Loop {
public static @moveable void testMethod() {
System.out.println("testMethod()");
}

public static @moveable int one() {
System.out.println("one()");
return 1;
}

public static void main(String[] args) {
int j = 0;
while (j < 5) {
System.out.println(j);
testMethod();
System.out.println("one = " + one());
++j;
}
}
}

Without the code movement applied, the output is:

0
testMethod()
one()
one = 1
1
testMethod()
one()
one = 1
2
testMethod()
one()
one = 1
3
testMethod()
one()
one = 1
4
testMethod()
one()
one = 1

With the code movement optimisation stage applied, the output is:

one()
0
one = 1
1
one = 1
2
one = 1
3
one = 1
4
one = 1

This demonstrates that both marked method calls are indeed being removed from the loop body. The void method testMethod() is never called at all because it has no return value to be used, which the one() method is called once at the start of the loop and its return value is then stored and reused.

To see what is happening to the flow graph, compare the flow graph before optimisation to the flow graph after the code movement optimisation has been applied. You can see here that the call to testMethod() is replaced with a no-operation, while the call to one() is replaced with the fresh local variable 9999invariant0. This name is generated to be unlikely to clash with any real variable, though it could if there are more than 9999 scopes in the program and scope 9999 contains a variable called invariant0. A better scheme to generate fresh variable names should be found at some point.

Constant folding etc.

17 April 2008

To make a start with writing optimisation stages in JKit I have written a simple constant folding optimisation stage. This will do some simple compile-time evaluation of arithmetic expressions, for example turning int x = 1 + 2; into int x = 3;.

After talking to Lindsay yesterday we have decided that the next step is for me to write some code to traverse the flow graph to find loops, in particular to identify which vertices are the tops of loops. I still need to work out where this and other information discovered during the traversal should be stored: perhaps in the flow graph somehow, perhaps in separate data structures kept locally, or maybe somewhere else.

As a first step to doing the actual code movement of loop invariants I will probably add an annotation for certain method calls to say that they certainly can be moved (e.g. if they just return a constant value and have no side-effects). The next step will probably be more annotations for objects which the programmer guarantees will have no aliases (somewhat like ownership, I think), or perhaps some sort of immutability.

Lindsay suggested that another possible direction is to do some sort of constant folding / propagation for immutable objects like Strings (for the length() method, for example), or partially immutable objects like arrays where the length cannot change. The length could be found once and cached in a local variable, or it might be possible to infer the length from how the array is constructed. For example, it would be nice if the following code:

int[] a = new int[n];
System.out.println(a.length);

could be transformed to

int[] a = new int[n];
System.out.println(n);

SSA; loop optimisations

11 April 2008

Unfortunately it seems that JKit does not currently implement any SSA (Single Static Assignment) transformation. This is annoying, because SSA would make the code movement optimisations I am looking at easier.

On another note, I read [1] this week. This paper looks at what benefit there would be in doing traditional loop transformations in a Java compiler, given that JVM typically do some optimisation in there JIT compiler anyway. The authors find that these optimisations are generally worthwhile, which suggests that my optimisations are also likely to be worthwhile. Traditional loop transformations like this will probably be my first step in working towards moving pure function calls out of loops. I am working with IR (JKIL) though, rather than bytecode.

  1. Simon Hammond and David Lacey, “Loop Transformations in the Ahead-of-Time Optimization of Java Bytecode,” in Proceedings of the 15th International Conference on Compiler Construction, ed. Alan Mycroft and Andreas Zeller (Springer Verlag, 2006)

Annotations

3 April 2008

The original annotation that started this project is @Pure. This would be applied to methods, and I think the proper definition is that a method is pure only if it does not change any state at all (either in the target object or anywhere else) and its return value depends only on the state of the parameters to the method call and the target object. This requires that the method not call any other methods that are not also marked as @Pure.

To determine when the state of objects might change, it will also be necessary to mark appropriate methods as @Const. This will mean that the method does not change the state of the target object, just like const functions in C++.

This is still not enough, however, as there could be aliases to the object in question which could be used to change it in a way that cannot easily be tracked at compile-time. As an initial way to work around this problem (in very limited circumstances), classes will be marked as @Immutable to mean that they can never be changed at all. This will allow me to make a start on making some optimisations and think further about better solutions.