Java 8 Stream findFirst vs. findAny: Performance and Functional Differences

Overview

1. Introduction

Searching for a specific element in a stream of elements is common in Java applications.

The two terminal operations Java provides to search elements are the Stream.findFirst() and Stream.findAny() methods.

In this tutorial, we'll explore how to use the findAny() and findFirst() methods, their differences, and which is faster.

2. findAny() vs. findFirst()

Let's first look at the findAny() and findFirst() method signatures in the Stream class:

Optional<T> findAny();
Optional<T> findFirst();

The findAny() method is a short-circuiting terminal operation that returns an Optional describing some element of the stream, or an empty Optional if the stream is empty.

The findFirst() method does the same, except that it returns the first element in the stream's encounter order, as we'll see shortly.
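
As a minimal sketch of this shared contract, the snippet below calls both methods on a non-empty and an empty stream. The class name FindContractSketch, its helper methods, and the sample values are assumptions made only for illustration:

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical demo class; it is not part of the tutorial's test suite.
class FindContractSketch {

    // Some element of the given values, or an empty Optional if there are none.
    static Optional<String> anyOf(String... values) {
        return Arrays.stream(values).findAny();
    }

    // The first element in encounter order, or an empty Optional if there are none.
    static Optional<String> firstOf(String... values) {
        return Arrays.stream(values).findFirst();
    }

    public static void main(String[] args) {
        System.out.println(firstOf("a", "b", "c").get());      // a
        System.out.println(anyOf("a", "b", "c").isPresent());  // true
        System.out.println(firstOf().isPresent());             // false (empty stream)
        System.out.println(anyOf().isPresent());               // false (empty stream)
    }
}
```

Note that the sketch only checks isPresent() for findAny(), because which element it picks is not part of its contract.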

2.1. Test Structure

To illustrate how the findAny() and findFirst() methods work, it's useful to have an object whose identity we can compare with the element a search returns. For that, let's define a Person class that doesn't override equals() or hashCode():

public class Person {
    public String name;

    public Person(String name) {
        this.name = name;
    }
}

Now, let's create the main structure of our unit tests:

import java.util.function.Predicate;
import java.util.stream.Stream;

import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;

public class FindFirstFindAnyUnitTest {
    private Stream<Person> testStream;
    // The first "John" in the stream; the searches should find this exact instance
    private final Person target = new Person("John");
    private final Predicate<Person> johnFilter = j -> "John".equals(j.name);

    @BeforeEach
    public void setUp() {
        testStream = Stream.of(new Person("Maria"), new Person("Jane"), target, new Person("John"));
    }

    @AfterEach
    public void tearDown() {
        testStream = Stream.empty();
    }
}

2.2. How to Use findFirst()

The findFirst() method always returns the first element in the stream's encounter order, whether the stream is parallel or non-parallel.

For instance, if we search for a Person named "John", it should always return the first one added to the testStream:

@Test
public void givenJohnFilterFindFirst_whenSearchingAnyStream_thenReturnFirstJohn() {
    var parallelMatch = testStream
            .parallel()
            .filter(johnFilter)
            .findFirst()
            .get();

    // testStream is already consumed, so we build an identical stream for the second search
    var nonParallelMatch = Stream.of(new Person("Maria"), new Person("Jane"), target, new Person("John"))
            .filter(johnFilter)
            .findFirst()
            .get();

    // assertSame checks object identity, not just equal hash codes
    assertSame(target, parallelMatch);
    assertSame(target, nonParallelMatch);
}

Both findFirst() calls return the target object, in the parallel and the non-parallel stream alike. The assertSame checks confirm that each result is the very same instance as target, not the other Person named John.

Even though the stream contains two Person objects named John, findFirst() always returns the first in the order they appear in the stream.
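
To see the same guarantee on a larger input, here is a minimal standalone sketch. The class name FindFirstOrderSketch, the range size, and the filter are assumptions chosen for illustration. Because IntStream.range() produces an ordered stream, findFirst() must return the first match even when the work is split across threads:

```java
import java.util.stream.IntStream;

// Hypothetical demo class; names and numbers are illustrative assumptions.
class FindFirstOrderSketch {

    // Searches 0..999,999 in parallel for values ending in 999.
    // Many values match, but 999 is the first one in encounter order.
    static int firstMatchInParallel() {
        return IntStream.range(0, 1_000_000)
                .parallel()
                .filter(i -> i % 1_000 == 999)
                .findFirst()
                .getAsInt();
    }

    public static void main(String[] args) {
        // Repeat the search to show that the result does not vary between runs.
        for (int run = 0; run < 5; run++) {
            System.out.println(firstMatchInParallel()); // 999
        }
    }
}
```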

2.3. How to Use findAny()

The findAny() method returns any element in a non-deterministic way. In other words, it might return any element regardless of the order in which it appears in the stream.

Let's try an example of searching in a non-parallel stream using findAny():

@Test
public void givenJohnFilterFindAny_whenSearchingNonParallelStream_thenReturnAnyJohn() {
    var match = testStream
            .filter(johnFilter)
            .findAny();

    assertTrue(match.isPresent());
}

findAny() guarantees only that, if the stream contains a Person named John, one of them is returned; it makes no promise about which one. In non-parallel streams, findAny() is likely to return the first match, but that's not guaranteed.
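
The effect is easier to observe in a parallel stream. The sketch below is only an illustration: the class name FindAnyParallelSketch, the range, and the divisor are assumptions. It repeats a parallel findAny() search and collects the distinct results. Every result satisfies the filter, but which match comes back may differ from run to run:

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.IntStream;

// Hypothetical demo class; names and numbers are illustrative assumptions.
class FindAnyParallelSketch {

    // Returns some multiple of the divisor from 1..1,000,000, found in parallel.
    static int anyMultipleOf(int divisor) {
        return IntStream.rangeClosed(1, 1_000_000)
                .parallel()
                .filter(i -> i % divisor == 0)
                .findAny()
                .getAsInt();
    }

    // Repeats the search and collects the distinct values that were returned.
    static Set<Integer> distinctResults(int divisor, int runs) {
        Set<Integer> seen = new TreeSet<>();
        for (int run = 0; run < runs; run++) {
            seen.add(anyMultipleOf(divisor));
        }
        return seen;
    }

    public static void main(String[] args) {
        // The set may contain one value or several; the contract allows both.
        System.out.println(distinctResults(1_000, 50));
    }
}
```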

2.4. Functional Differences

As we've seen in previous sections, both findFirst() and findAny() return an Optional containing the element found in a stream.

The only difference between the two methods is which element they are allowed to return: findAny() might retrieve any stream element in a non-deterministic way. In contrast, findFirst() will always return the first element in the stream's encounter order.

2.5. Which is Faster: findFirst vs. findAny?

Let's compare the performance of non-parallel and parallel versions of findAny() and findFirst() methods using the configuration below:

  • The stream consists of 5 million elements that differ from the target.
  • Two target objects are inserted at random positions in the stream.
  • We measure elapsed time in milliseconds using the Instant.now() and Duration.between() methods from the java.time package.
  • We run each search 20 times on the same input and record the time of every individual run.
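
The benchmark code itself isn't shown in this post, so the following is only a sketch of how such a setup could look. Everything named here is an assumption: the class FindTimingSketch, its helpers, the use of an int array instead of objects, the target value -1, and the fixed random seed. Only Instant.now() and Duration.between() come from the configuration above:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.OptionalInt;
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical harness; class, method, and constant names are assumptions.
class FindTimingSketch {
    static final int TARGET = -1; // assumed target value; all other elements are >= 0

    // Builds `size` non-target elements plus two targets at random, distinct positions.
    static int[] buildInput(int size, long seed) {
        int[] data = new int[size + 2];
        Arrays.setAll(data, i -> i);
        Random random = new Random(seed);
        int first = random.nextInt(data.length);
        int second = random.nextInt(data.length);
        while (second == first) {
            second = random.nextInt(data.length);
        }
        data[first] = TARGET;
        data[second] = TARGET;
        return data;
    }

    // Runs one search and returns its wall-clock duration in milliseconds.
    static long timeMillis(Supplier<OptionalInt> search) {
        Instant start = Instant.now();
        OptionalInt result = search.get();
        Instant end = Instant.now();
        if (!result.isPresent()) {
            throw new IllegalStateException("target not found");
        }
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        int[] data = buildInput(5_000_000, 42L);
        for (int run = 1; run <= 20; run++) {
            long seqFirst = timeMillis(() -> Arrays.stream(data).filter(x -> x == TARGET).findFirst());
            long seqAny = timeMillis(() -> Arrays.stream(data).filter(x -> x == TARGET).findAny());
            long parFirst = timeMillis(() -> Arrays.stream(data).parallel().filter(x -> x == TARGET).findFirst());
            long parAny = timeMillis(() -> Arrays.stream(data).parallel().filter(x -> x == TARGET).findAny());
            System.out.printf("run %2d: findFirst=%dms findAny=%dms parallel findFirst=%dms parallel findAny=%dms%n",
                    run, seqFirst, seqAny, parFirst, parAny);
        }
    }
}
```

The input is built once and reused, so all 20 runs search the same data with the targets at the same positions.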

The table below shows the time of each run and the resulting median, mean, and min, all in milliseconds:

          non-parallel stream       parallel stream
Run #     findFirst    findAny      findFirst    findAny
median    79.5         33.5         22.5         17.5
mean      80.65        49.65        22.5         21.4
min       5            5            13           5
1         208          103          30           105
2         77           80           26           8
3         82           25           16           27
4         162          127          30           13
5         23           18           15           18
6         85           5            20           29
7         19           98           23           9
8         5            5            21           30
9         179          18           28           7
10        59           188          28           14
11        35           59           15           13
12        83           58           22           19
13        18           5            20           5
14        86           22           18           23
15        142          26           26           20
16        37           6            13           7
17        56           39           16           16
18        119          44           30           17
19        6            31           29           24
20        132          36           24           24
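
The summary values follow directly from the 20 run times. As a worked example, the sketch below recomputes them for the parallel findAny() row of the table. The class name RunStatsSketch and its helper methods are assumptions made for illustration:

```java
import java.util.Arrays;

// Hypothetical helper class for recomputing the summary statistics.
class RunStatsSketch {
    // Run times in milliseconds for findAny() on the parallel stream, runs 1 to 20.
    static final long[] PARALLEL_FIND_ANY = {
        105, 8, 27, 13, 18, 29, 9, 30, 7, 14,
        13, 19, 5, 23, 20, 7, 16, 17, 24, 24
    };

    static double mean(long[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    // Median of a non-empty array: middle value, or average of the two middle values.
    static double median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 0
                ? (sorted[mid - 1] + sorted[mid]) / 2.0
                : sorted[mid];
    }

    static long min(long[] values) {
        return Arrays.stream(values).min().getAsLong();
    }

    public static void main(String[] args) {
        System.out.println("median = " + median(PARALLEL_FIND_ANY)); // median = 17.5
        System.out.println("mean   = " + mean(PARALLEL_FIND_ANY));   // mean   = 21.4
        System.out.println("min    = " + min(PARALLEL_FIND_ANY));    // min    = 5
    }
}
```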

So, which is faster: findFirst() or findAny()?

Based on the measurements above, we can draw the following conclusions about findAny() vs. findFirst() performance in Java streams:

  • In these measurements, findAny() was faster on average than findFirst() in both the non-parallel and the parallel stream, although the gap in the parallel case is small (mean of 21.4ms vs. 22.5ms). Thus, if there's no requirement to get the first element of the stream, opt for the findAny() method.
  • For big datasets, the parallel streams were faster here, so prefer them when the workload allows it.
  • Evaluate whether the overhead of parallel streams is worth paying. For small datasets, it might not be.

3. Conclusion

In this post, we've investigated the differences between the findFirst() and findAny() methods to search for elements in streams.

findAny() returns any element in the stream, whereas findFirst() always picks the first.

If the requirement is to get any element from the stream, not specifically the first one, prefer the findAny() method, since it was faster in our measurements. For big datasets, consider using parallel streams.