Type Erasure and Heap Pollution in Java

Overview

1. Introduction

In this article, we'll look at the concept of type erasure in Java for classes and methods and what heap pollution is.

2. What is Type Erasure in Java?

Type erasure is the process the Java compiler uses to implement generics. The compiler uses generic type information to enforce type checks at compile time and then removes it, so the type parameters are not available at runtime.
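To see what "removed at runtime" means in practice, here is a minimal sketch. The class name ErasureAtRuntimeDemo and its helper method are made up for illustration. It compares the runtime classes of two lists that differ only in their type argument:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the article's code.
class ErasureAtRuntimeDemo {

    // Returns true when both lists report the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // The type arguments exist only at compile time, so both
        // objects are plain ArrayList instances at runtime.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

Both lists share one runtime class, which is why the type argument cannot be recovered from the object itself.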

As part of type erasure, the compiler:

  • Replaces all type parameters with their bounds, or with Object if they are unbounded.
  • Inserts type casts where necessary to preserve type safety.
  • Generates bridge methods to preserve polymorphism in classes that extend generic types.

We'll look at each one of them in the following sections.

2.1. Class Type Erasure

When we define a generic type at the class level, the compiler replaces each type parameter with its bound, or with Object if there's no bound. For instance, if we define the following Vertex class with an unbounded generic type:

public class Vertex<T> {

    T data;

    public Vertex(T data) {
        this.data = data;
    }

    public T getData() {
        return data;
    }

    public void setData(T data) {
        this.data = data;
    }
}

The compiler erases T to Object, producing a class equivalent to the following:

public class Vertex {

    Object data;

    public Vertex(Object data) {
        this.data = data;
    }

    public Object getData() {
        return data;
    }

    public void setData(Object data) {
        this.data = data;
    }
}
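Since getData() returns Object after erasure, the compiler also has to insert a cast wherever the result is used as a more specific type. The sketch below illustrates that idea with a hand-written stand-in for the erased class. ErasedVertex and the two helper methods are hypothetical names used only for this comparison:

```java
// Hypothetical demo class, not part of the article's code.
class CallSiteCastDemo {

    // Trimmed copy of the generic class from the article.
    static class Vertex<T> {
        T data;
        Vertex(T data) { this.data = data; }
        T getData() { return data; }
    }

    // Hand-written stand-in for what the class looks like after erasure.
    static class ErasedVertex {
        Object data;
        ErasedVertex(Object data) { this.data = data; }
        Object getData() { return data; }
    }

    static String readGeneric() {
        Vertex<String> vertex = new Vertex<>("data");
        // No cast in the source; the compiler adds one in the bytecode.
        return vertex.getData();
    }

    static String readErased() {
        ErasedVertex vertex = new ErasedVertex("data");
        // The explicit cast we would have to write without generics.
        return (String) vertex.getData();
    }

    public static void main(String[] args) {
        System.out.println(readGeneric().equals(readErased())); // prints "true"
    }
}
```

Both methods behave the same way; generics simply move the cast from our source code into the compiler's output.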

On the other hand, if we bound the generic type like the example below:

public class Vertex<T extends Comparable<T>> {

    T data;

    public Vertex(T data) {
        this.data = data;
    }

    public T getData() {
        return data;
    }

    public void setData(T data) {
        this.data = data;
    }
}

The Java compiler erases <T> and substitutes the bound type, Comparable, as in the code below:

public class Vertex {

    Comparable data;

    public Vertex(Comparable data) {
        this.data = data;
    }

    public Comparable getData() {
        return data;
    }

    public void setData(Comparable data) {
        this.data = data;
    }
}
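One way to check both erasure results is to ask reflection for the declared type of the data field. This is only a sketch: UnboundedVertex, BoundedVertex, and erasedFieldType are hypothetical names, and the two classes are reduced to the single field we care about:

```java
import java.lang.reflect.Field;

// Hypothetical demo class, not part of the article's code.
class ErasedFieldTypeDemo {

    // Reduced versions of the two Vertex variants from the article.
    static class UnboundedVertex<T> { T data; }
    static class BoundedVertex<T extends Comparable<T>> { T data; }

    // Returns the erased (runtime) type of the "data" field.
    static Class<?> erasedFieldType(Class<?> type) {
        try {
            Field field = type.getDeclaredField("data");
            return field.getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException("no field named data", e);
        }
    }

    public static void main(String[] args) {
        // prints "java.lang.Object"
        System.out.println(erasedFieldType(UnboundedVertex.class).getName());
        // prints "java.lang.Comparable"
        System.out.println(erasedFieldType(BoundedVertex.class).getName());
    }
}
```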

2.2. Method Type Erasure

The Java compiler also erases the generic types defined in methods. To illustrate that, suppose the following generic version of bubble sort belongs to a non-generic class:

public <T extends Comparable<T>> void genericBubbleSort(T[] arr) {
    for (int i = 0; i < arr.length - 1; i++) {
        for (int j = 0; j < arr.length - i - 1; j++) {
            if (arr[j].compareTo(arr[j + 1]) > 0) {
                T temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
}

The method above declares the bounded type parameter <T extends Comparable<T>>. Thus, the compiler replaces every reference to T with Comparable:

public void genericBubbleSort(Comparable[] arr) {
    for (int i = 0; i < arr.length - 1; i++) {
        for (int j = 0; j < arr.length - i - 1; j++) {
            if (arr[j].compareTo(arr[j + 1]) > 0) {
                Comparable temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
}

If T is unbounded, the compiler uses an Object reference instead of Comparable, similar to the generic class example.
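The following sketch puts the generic method to work and then inspects its erased parameter type through reflection. It assumes the method is declared static for convenience. The enclosing class MethodErasureDemo and the helper erasedParameterType are hypothetical names:

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical demo class, not part of the article's code.
class MethodErasureDemo {

    // Same algorithm as in the article, made static for this sketch.
    static <T extends Comparable<T>> void genericBubbleSort(T[] arr) {
        for (int i = 0; i < arr.length - 1; i++) {
            for (int j = 0; j < arr.length - i - 1; j++) {
                if (arr[j].compareTo(arr[j + 1]) > 0) {
                    T temp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = temp;
                }
            }
        }
    }

    // Looks up the compiled method and returns its runtime parameter type.
    static Class<?> erasedParameterType() {
        for (Method method : MethodErasureDemo.class.getDeclaredMethods()) {
            if (method.getName().equals("genericBubbleSort")) {
                return method.getParameterTypes()[0];
            }
        }
        throw new IllegalStateException("genericBubbleSort not found");
    }

    public static void main(String[] args) {
        Integer[] numbers = {3, 1, 2};
        genericBubbleSort(numbers);
        System.out.println(Arrays.toString(numbers)); // prints "[1, 2, 3]"
        System.out.println(erasedParameterType().getSimpleName()); // prints "Comparable[]"
    }
}
```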

2.3. Bridge Methods

For subtypes of a generic class, the compiler sometimes needs to generate bridge methods to preserve polymorphism. This happens when a subclass fixes the type argument and overrides a method that uses it. For example, let's define a subclass of the unbounded Vertex class as follows:

public class SubVertex extends Vertex<Integer> {

    public SubVertex(Integer data) {
        super(data);
    }

    @Override
    public void setData(Integer data) {
        super.setData(data);
    }
}

Here, SubVertex fixes T to Integer. After type erasure, the Vertex.setData(T) method becomes Vertex.setData(Object), while the subclass declares setData(Integer). Since the two signatures differ, SubVertex.setData(Integer) would not override Vertex.setData(Object) in the compiled code, and a call made through a Vertex reference would bypass the subclass method.

To preserve the override, the compiler generates a bridge method in the subclass with the same erased signature as the superclass method, using Object as the argument. The bridge casts its argument to Integer and delegates to setData(Integer). Conceptually, the compiled subclass looks as follows:

public class SubVertex extends Vertex {

    // Bridge method generated by the compiler
    public void setData(Object data) {
        setData((Integer) data);
    }

    public SubVertex(Integer data) {
        super(data);
    }

    public void setData(Integer data) {
        super.setData(data);
    }
}
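Bridge methods never appear in our source, but reflection can reveal them. The sketch below assumes a SubVertex that extends Vertex<Integer>, so that the erased signatures of the two setData methods differ. BridgeMethodDemo and its helper methods are hypothetical names:

```java
import java.lang.reflect.Method;

// Hypothetical demo class, not part of the article's code.
class BridgeMethodDemo {

    static class Vertex<T> {
        T data;
        Vertex(T data) { this.data = data; }
        public void setData(T data) { this.data = data; }
    }

    // Assumption: the subclass fixes T to Integer.
    static class SubVertex extends Vertex<Integer> {
        SubVertex(Integer data) { super(data); }

        @Override
        public void setData(Integer data) { super.setData(data); }
    }

    // Counts the compiler-generated bridge methods declared in SubVertex.
    static int countBridgeMethods() {
        int count = 0;
        for (Method method : SubVertex.class.getDeclaredMethods()) {
            if (method.isBridge()) {
                count++;
            }
        }
        return count;
    }

    // Returns the parameter type of the bridge method, if one exists.
    static Class<?> bridgeParameterType() {
        for (Method method : SubVertex.class.getDeclaredMethods()) {
            if (method.isBridge()) {
                return method.getParameterTypes()[0];
            }
        }
        throw new IllegalStateException("no bridge method found");
    }

    public static void main(String[] args) {
        System.out.println(countBridgeMethods()); // prints "1"
        System.out.println(bridgeParameterType().getName()); // prints "java.lang.Object"
    }
}
```

SubVertex declares only setData(Integer) in the source, yet reflection reports a second setData that takes Object and is flagged as a bridge.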

3. What is Heap Pollution in Java?

Type erasure is needed to implement generics in Java, as one of the generics authors argues in this article. Although necessary, type erasure is not free of problems.

Heap pollution occurs when a variable of a parameterized type refers to an object that is not of that parameterized type. It typically arises from unchecked casts or from mixing generic and raw (non-generic) types. Let's illustrate that with an example that tries to assign a vertex to a reference with an incompatible type argument:

Vertex<String> vtx1 = new Vertex<>("data");
Vertex<?> vtx2 = vtx1;
Vertex<Integer> vtx3 = vtx2;

The code above doesn't compile because a Vertex<?> may hold any type argument, so the compiler cannot prove that vtx2 is a Vertex<Integer>. To make the code compile, we can cast the vtx2 variable to Vertex<Integer>, as follows:

Vertex<String> vtx1 = new Vertex<>("data");
Vertex<?> vtx2 = vtx1;
Vertex<Integer> vtx3 = (Vertex<Integer>) vtx2; // unchecked cast warning (possible heap pollution)
var data = vtx3.getData(); // ClassCastException

This is problematic: the code compiles, the cast on the third line only raises an unchecked warning, and the fourth line throws a ClassCastException at runtime with the message below:

Exception in thread "main" java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Integer

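That happens because we "polluted" the heap by forcing a Vertex<String> object (reached through vtx2) into a Vertex<Integer> reference (vtx3). Because of type erasure, the cast itself cannot be checked at runtime, so the failure only surfaces when the data is read as an Integer.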
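The scenario can be reproduced in a self-contained sketch. HeapPollutionDemo and pollutedReadFails are hypothetical names, and the unchecked warning is suppressed here only so the sketch compiles quietly. Without the annotation, javac reports an unchecked cast on the marked line:

```java
// Hypothetical demo class, not part of the article's code.
class HeapPollutionDemo {

    // Trimmed copy of the unbounded Vertex class from the article.
    static class Vertex<T> {
        private final T data;
        Vertex(T data) { this.data = data; }
        T getData() { return data; }
    }

    // Returns true if reading through the polluted reference fails.
    @SuppressWarnings("unchecked") // suppressed for the demo only
    static boolean pollutedReadFails() {
        Vertex<String> vtx1 = new Vertex<>("data");
        Vertex<?> vtx2 = vtx1;
        // Unchecked cast: it cannot be verified at runtime, so it succeeds.
        Vertex<Integer> vtx3 = (Vertex<Integer>) vtx2;
        try {
            // The compiler-inserted cast to Integer fails on this line.
            Integer data = vtx3.getData();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pollutedReadFails()); // prints "true"
    }
}
```

Note that the exception appears where the value is read, not where the bad cast was written, which is what makes heap pollution hard to trace.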

We can avoid heap pollution by heeding the compiler's unchecked warnings instead of ignoring or suppressing them. If the code compiles without any such warning, we can use generics without this risk.
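One possible way to act on such a warning, shown here only as a sketch and not as the single right approach, is to skip the cast of the container altogether. We read the value through the wildcard as an Object and check its runtime type. SafeWildcardReadDemo and describe are hypothetical names:

```java
// Hypothetical demo class, not part of the article's code.
class SafeWildcardReadDemo {

    // Trimmed copy of the unbounded Vertex class from the article.
    static class Vertex<T> {
        private final T data;
        Vertex(T data) { this.data = data; }
        T getData() { return data; }
    }

    // Inspects the payload instead of casting the Vertex itself,
    // so no unchecked cast (and no warning) is involved.
    static String describe(Vertex<?> vertex) {
        Object data = vertex.getData();
        if (data instanceof Integer) {
            return "Integer: " + data;
        }
        if (data instanceof String) {
            return "String: " + data;
        }
        return "Other: " + data;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Vertex<>("data"))); // prints "String: data"
        System.out.println(describe(new Vertex<>(42)));     // prints "Integer: 42"
    }
}
```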

4. Conclusion

In this post, we've looked at what type erasure is and how the Java compiler uses it to implement generic types in methods and classes.

We've also examined how type erasure might lead to heap pollution.

The key takeaway is to heed the compiler's warnings about potential heap pollution when working with Java generics.