Periklis Ntanasis:
Master's Touch

fade out

Overflowing the stack for fun

In Java it is very common to fine tune the JVM heap size. However, it is not so common to specifically set the stack size.

In this article we are going to see how the JVM stack works and how we may manipulate the stack size. Then, we are going to understand why the StackOverflowException occurs and what is the relation between the stack size and the thread count.

The included code examples use Java 11. The operating system I used to run them is Linux and the mentioned results may vary depending the system available resources.

JVM Stack 101

When a new Java thread is created some space is allocated for the stack.

When a method call is performed then some information is pushed into the stack. The information stored into the stack is called stack frame and contains 3 kind of data. The Local Variable Array, the Operand Stack and the Frame Data.

The Local Variable Array contains entries (called words) for the arguments and the local variables of the method. The array slots have a size of 4 bytes. The types of int, float, byte, short, char and object reference take over 1 slot (4 bytes). Double and long occupy 2 slots (8 bytes). Boolean usual takes over 1 slot but this may vary depending the JVM implementation.

The Operand Stack is used to store and retrieve values between various byte-code operations. The Operand Stack is used by the JVM like the CPU registers which stores values between the various operations.

The Frame Data is used to store data required for the constant pool resolution, the normal method return and the exception dispatch.

The memory for the stack is allocated when a new thread is created. If there is not enough space available then an OutOfMemoryError will occur. Afterwards, the stack is populated by stack frames when method call happens. If there is not enough space available when a stack frame is about to be pushed into the stack then a StackOverflowException is thrown.

The JVM -Xss argument

The size of the stack is configurable through the JVM -Xss argument (stack size).

This value is about the stack of each created thread. So, if for example we set the stack size to 2 megabytes (-Xss2M) then each created thread will allocate 2 megabytes of memory.

Have in mind that the stack used by thread may be explicitly set at the creation point of the specific thread and that it’s up to the specific JVM implementation used to ignore it or not.

Recursion and StackOverflowException

The smaller the stack size the sooner the stack will be overflowed.

In the start of the article I have mentioned that it is rare to adjust the stack size. This is why most of the times the default stack size is sufficient for the operations we perform.

However, if we perform too many method calls then the stack’s available free space will be reduced significantly. This is a common case when recursion is involved.

Recursion is a programming technique where a method calls itself for a number of times until the desired outcome is achieved.

We may write any code which uses a loop with recursion.

For example here it is a loop that prints some sequential numbers:

ArrayTraversal.java
int[] arr = new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
for (int i=0; i< arr.length; i++) {
  System.out.println(i);
}

Here is an equivalent recursive method:

RecursiveArrayTraversal.java
public static void printArray(int[] arr, int idx) {
  if (idx < arr.length) {
    System.out.println(arr[idx]);
    printArray(arr, ++idx);
  }
}

Which may be used like that:

printArray(arr, 0);

While the above method works as expected the number of method calls is increased in a linear way depending the length of the provided array. Each time the method is called then some space is used in the stack.

So, the above example for a very big array it could eventually cause a StackOverflowException.

RecursiveArrayTraversalStackOverflow.java
int[] arr = IntStream.range(1, 19122).toArray();
printArray(arr, 0);

This code if it is executed with the -Xss2M -Xint it will cause a StackOverflowException.

Below is an example which demonstrates the exhaustion of the stack regardless the stack size:

StackExhaustionRecursion.java
public class StackExhaustionRecursion {

    public static final AtomicInteger counter = new AtomicInteger(0);

    public static void recur() {
        counter.incrementAndGet();
        recur();
    }

    public static void main(String[] args) {
        try {
            recur();
        } catch (StackOverflowError ex) {
            ex.printStackTrace();
            System.out.println("Depth: " + counter.get());
        }
    }

}

The recur() method performs infinite recursion. This means that it calls itself infinitely because there isn’t any terminating condition.

Execute this by doing java -Xint -Xss2M StackExhaustionRecursion.java. The -Xint turns off the just in time (JIT) compilation which performs various optimizations on the spot resulting in variable stack depth to be printed between the executions of the code. So, with the -Xint you’ll get consistent results each time you run the above code (considering that there are enough resources such as physical memory available between the executions).

The above example in my machine will print 22612 with the default configuration.

Changing the stack size to 1M will print 10696 while changing the stack size to 4M will print 46443.

The recur() method do not have any local variable or parameter. If we add some then the stack overflow will happen sooner because the stack frames and more specifically the local variables array part would be bigger.

See the below example:

StackExhaustionRecursionWithLocalParameters.java
public class StackExhaustionRecursionWithLocalParameters {

    public static final AtomicInteger counter = new AtomicInteger(0);

    public static void recur() {
        int a = 5;
        int b = 10;
        int c = 15;
        int d = 20;
        int e = 25;
        int f = 30;
        int g = 35;
        int h = 40;
        counter.incrementAndGet();
        recur();
    }

    public static void main(String[] args) {
        try {
            recur();
        } catch (StackOverflowError ex) {
            ex.printStackTrace();
            System.out.println("Depth: " + counter.get());
        }
    }
}

This prints 13091 in my machine which is considerably smaller compared to 22612 (all these examples use a stack size of 2M except if stated otherwise).

Similarly the method arguments populate the Local Variable Array and exhausts the available stack space sooner.

StackExhaustionRecursionWithPrimitiveMethodArguments.java
public class StackExhaustionRecursionWithPrimitiveMethodArguments {

    public static final AtomicInteger counter = new AtomicInteger(0);

    public static void recur(int a, int b, int c, int d, int e, int f, int g, int h) {
        counter.incrementAndGet();
        recur(a, b, c, d, e, f, g, h);
    }

    public static void main(String[] args) {
        try {
            recur(0, 1, 2, 3, 4, 5, 6, 7);
        } catch (StackOverflowError ex) {
            ex.printStackTrace();
            System.out.println("Depth: " + counter.get());
        }
    }

}

This will print in my machine 13091. This is as expected similar to having the same arguments as local variables.

Based on the above, have in mind that in case you have valid reasons to have many method calls and you want to reduce the allocated stack space you may save some of it by replacing the individual method arguments by object references. Consider the following example:

StackExhaustionRecursionWithObjectRef.java
public class StackExhaustionRecursionWithObjectRef {

    public static final AtomicInteger counter = new AtomicInteger(0);

    static class ParamObject {
        int a = 5;
        int b = 10;
        int c = 15;
        int d = 20;
        int e = 25;
        int f = 30;
        int g = 35;
        int h = 40;
    }

    public static void recur(ParamObject a) {
        counter.incrementAndGet();
        recur(a);
    }

    public static void main(String[] args) {
        ParamObject po = new ParamObject();
        try {
            recur(po);
        } catch (StackOverflowError ex) {
            ex.printStackTrace();
            System.out.println("Depth: " + counter.get());
        }
    }

}

In my machine this will crash after 20727 method calls. This is much better than having all the values as method arguments.

However, have in mind that this is not the same. This example creates once the ParamObject in the heap and copies the object reference into the Local Variables Array each time a stack frame is pushed into the stack.

In the case of the method arguments the values are copied into the stack. So, the two examples are not equivalent.

The equivalent would be to pass a clone copy of the object as a parameter to the method call.

public static void recur(ParamObject a) {
        counter.incrementAndGet();
        recur(a.clone());
}

Of course, this will exhaust the heap size sooner instead of the stack. So, a new trade-off is introduced.

Please, keep in mind all the above and make your choices wisely!

Indirect Recursion

There are times where recursion may happen in an indirect way. For example assume that the method A calls method B, method B calls method C and method C calls method A again.

These kind of scenarios may be less obvious to debug and may happen unknowingly if the flow of the program is somewhat dynamic.

For example if there are things like handlers or interceptors that calls one the other and they are configured externally this thing may happen.

Tail Call Optimization

Because we have mentioned extensively the recursion I would like to briefly mention the tail call optimization too.

In a few words this is a technique used by the compiler to optimize recursive method calls to use less stack space.

Unfortunately, it is not implemented in Java but it is an interesting topic. If this was available in Java the previous examples could be written in a recursive way that wouldn’t throw StackOverflowError.

Stack size and thread count

As mentioned before the stack size is allocated in the memory when a new thread is created. Considering that the memory space is finite then this means that a bigger stack size will cause the memory to be exhausted sooner while a smaller stack size will be exhausted later resulting to more threads to be created.

Let’s see an example. Consider the following code:

ThreadCountVsMaxSize.java
public class ThreadCountVsMaxSize {

    static AtomicInteger threadCount = new AtomicInteger(1);

    public static void main(String[] args) {
        try {
            for (;;) {
                new Thread(() -> {
                    for (;;) {
                        try {
                            Thread.sleep(1000);
                        } catch (Throwable ex) {
                            ex.printStackTrace();
                            System.err.println("This is a sleep exception");
                        }
                    }
                }).start();
                threadCount.incrementAndGet();
            }
        } catch (Throwable e) {
            System.out.println("Thread Count: " + threadCount.incrementAndGet());
            e.printStackTrace();
            System.exit(1);
        }
    }

This code creates new threads which are kept alive. When the new threads won’t be able to be created anymore it exits printing the number of the created threads.

Now, let’s see it in action. However, before doing so let’s explicitly set the amount of available virtual memory in Linux. We have to do this because virtual memory which utilizes both physical memory and pagination is usually too big and it would be harder to get meaningful results in our experiment. Also, there are other limits (i.e. max user processes etc.) which may affect the maximum number of created threads. By setting a relative small virtual memory size we will avoid to hit these other limits and will get more meaningful results.

The command ulimit -v will print the current virtual memory limit which should be set to unlimited. Let’s set it to 10 MB.

ulimit -v 10485760

Then, execute the code like that:

java -Xint -Xss2M ./ThreadCountVsMaxSize.java
[2.514s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for
attributes: stacksize: 2048k, guardsize: 0k, detached.
Thread Count: 1411
java.lang.OutOfMemoryError: unable to create native thread: possibly out of
memory or process/resource limits reached
        at java.base/java.lang.Thread.start0(Native Method)
        at java.base/java.lang.Thread.start(Thread.java:798)
        at com.github.masterex.stackdemo.ThreadCountVsMaxSize
                                 .main(ThreadCountVsMaxSize.java:41)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl
                                 .invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl
                                 .invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl
                                 .invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at jdk.compiler/com.sun.tools.javac.launcher.Main.execute(Main.java:404)
        at jdk.compiler/com.sun.tools.javac.launcher.Main.run(Main.java:179)
        at jdk.compiler/com.sun.tools.javac.launcher.Main.main(Main.java:119)

As you can see a total number of 1411 threads could be created before the memory is exhausted.

In the above execution the stack size is explicitly set to 2M. Running the same code with stack size 1M it will create 2827 threads on my machine, while running it with 4M will create 704.

It is clear that changing the stack size affects the number of the created threads. This means that an application which needs to create many threads then it should better use a smaller stack size while an application which needs to use more method calls (see recursion) will need a bigger stack size.

Epilogue

During the October’s session of JHUG, Dr. Heinz Kabutz talked about project Loom. During the talk he said that he never had StackOverflowError because his stack was too small but because there was an error on his code which caused infinite recursion (you should check the very interesting talk here, the specific part is around the 39th minute of the video).

I thought that this was interesting. Usually, the stack is not something that troubles us, the defaults are OK. However, I had once a case where I needed to change the stack size to avoid the Stack Overflow error. So, this was my motivation for exploring a little bit further how the JVM stack works and eventually write this article.

In my past experience I was working on a desktop application with OGG audio file playback capabilities. We have developed a custom audio player and we wanted to implement the seek functionality which would permit the user to jump to a specific point in the audio file. The OGG files have pages which contain some part of the total audio data. Roughly speaking, we needed to find the correct page which contained the data we wanted to jump to. The pages are sequential but do not contain the same amount of compressed audio data so it is not easy to find the specific page in constant time. Of course our very initial implementation was pretty naive and was just searching linearly for the point from the start to the end. The pages however were too many and for many cases there was a significant delay. For this reason we decided to perform a binary search which would achieve performance of O(logn) instead of O(n). We went with the binary search implementation which seemed simpler to us but as I have mentioned the files were too big and for some cases we were getting StackOverFlow error.

At the end we were fine by setting the stack size to 12M which was actually more than enough but was set to this in order to be safe for unexpected cases where the pages would be unusually big. As I have said our application was a desktop application. The number of created threads was limited and there were not throughput issues to consider. Now, this was a few years back and I cannot recall all the details but I think that this was a valid case for changing the stack size. There wasn’t infinite recursion but we needed to have too many method calls.

While I find that there are valid cases to use recursion as the one mentioned above my suggestion is to always avoid it. This may cause less straight forward code in some cases but it will save you from the trouble of having to provision what stack size you are going to use in the future or having limitations regarding the maximum number of created threads. Also, in general the non recursive code should perform better.

To be honest, personally I don’t face the kind of problems where a recursive solution would be the first thing to come in mind, in my day to day job. It is way more common though to see it in competitive code competitions and stuff like that. In that setting, it is perfectly fine to use it considering that the physical memory limitations may become an insurmountable obstacle when trying to solve certain problems.

I hope you enjoyed this article. At the end you may find all the code used in the examples and also many resources to further read and better understand how the JVM stack works.

If you have examples were you had to adjust your stack size I would love to read them in the comments! Take care!

Further reading…

JVM Stack Explained
Tail Call Optimization in Java

Comments

fade out