Functional or Imperative – Which Is Better for Processing Collections in Java?

4 min read

Let’s compare the performance of using functional and imperative programming styles to process collections in Java.

Here’s our benchmark: Given a collection of random numbers, get the distinct evens in sorted order.

Both approaches will have the same time complexity.

We’ll split this into two parts. First, we’ll build the test client. Then, we’ll run the benchmarks and examine the results.

Building the Test Client

Let’s create a wrapper containing a collection of random numbers.

RandomNumberWrapper.java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

public class RandomNumberWrapper {

  private final List<Integer> values;

  public RandomNumberWrapper(long n) {
    this.values = new Random()
        .ints(n)
        .boxed()
        .collect(Collectors.toList());
  }

  public List<Integer> getValues() {
    return this.values;
  }
}

Next we’ll define a class where we’ll run our benchmarks.

StreamVsLoop.java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class StreamVsLoop {}

From here on out, the Java snippets will be part of the StreamVsLoop class.

We’ll use 10 RandomNumberWrappers in each test, so for a given n, the test will process n * 10 numbers.

StreamVsLoop.java
private static final int N_WRAPPERS = 10;

private static List<RandomNumberWrapper> getRandomNumbers(final long n) {
  final List<RandomNumberWrapper> wrappers = new ArrayList<>();
  for (long i = 0; i < N_WRAPPERS; i++) {
    wrappers.add(new RandomNumberWrapper(n));
  }
  return wrappers;
}

The benchmark needs to output even numbers. Let’s create a utility that.

StreamVsLoop.java
private static boolean isEven(final int n) {
  return n % 2 == 0;
}

Now we’re ready to start the functional and imperative implementations.

We’ll leverage Java’s Stream API for the functional approach.

StreamVsLoop.java
private static long timeStreamImpl(final List<RandomNumberWrapper> payloads) {
  final long start = System.currentTimeMillis();
  payloads.stream()
    .map(RandomNumberWrapper::getValues)
    .flatMap(List::stream)
    .filter(StreamVsLoop::isEven)
    .distinct()
    .sorted()
    .collect(Collectors.toList());
  return System.currentTimeMillis() - start;
}

We’ll do the same for the imperative approach.

StreamVsLoop.java
private static long timeLoopImpl(final List<RandomNumberWrapper> payloads) {
  final long start = System.currentTimeMillis();
  List<Integer> randomNumbers = new ArrayList<>();

  // map() / flatMap()
  for (final RandomNumberWrapper payload : payloads) {
    randomNumbers.addAll(payload.getValues());
  }

  // distinct()
  randomNumbers = new ArrayList<>(new HashSet<>(randomNumbers));

  // filter()
  randomNumbers.removeIf(n -> !isEven(n));

  // sorted()
  randomNumbers.sort(Integer::compareTo);

  return System.currentTimeMillis() - start;
}

Each test method returns a long: the number of milliseconds to process the data.

We’ll create a new method to take the average from 100 test runs for a given implementation.

StreamVsLoop.java
private static final int N_TEST_RUNS = 100;

private static double getAverage(final Supplier<Long> testCase) {
  return Collections.nCopies(N_TEST_RUNS, testCase)
      .stream()
      .map(Supplier::get)
      .mapToLong(Long::valueOf)
      .average()
      .orElse(-1);
}

Finally, we want to be able to write our benchmarks to a CSV file, so we’ll write a utility for that.

StreamVsLoop.java
private static void writeToCSV(final List<String> lines) {
  final File csvOutputFile = new File("stream_vs_loop.csv");
  try (final PrintWriter pw = new PrintWriter(csvOutputFile)) {
    lines.forEach(pw::println);
  } catch (FileNotFoundException e) {
    System.err.println("failed to write csv file: " + e.getMessage());
    System.exit(1);
  }
  assert csvOutputFile.exists();
}

Now we have the pieces in place to build the test client.

The client gets the average time to process the collection using the functional and imperative approaches for each n.

StreamVsLoop.java
private static final NumberFormat NUM_FORMAT = new DecimalFormat("0E0");

public static void main(final String[] args) {
  final List<String> lines = new ArrayList<>();
  final String headers = "n,stream,loop";
  System.out.println(headers);
  lines.add(headers);
  for (long n = 1; n <= 1000000; n *= 10) {
    final List<RandomNumberWrapper> payloads = getRandomNumbers(n);
    final String line = NUM_FORMAT.format(n * N_WRAPPERS).toLowerCase(Locale.ROOT) + ','
        + getAverage(() -> timeStreamImpl(payloads)) + ','
        + getAverage(() -> timeLoopImpl(payloads));
    System.out.println(line);
    lines.add(line);
  }
  writeToCSV(lines);
}

Comparing Performance of Functional and Imperative Implementations

The imperative approach slightly outperforms the functional approach while n < 1e5. As n increases, streams are the clear winner.

nstream (ms)loop (ms)
1e10.180.05
1e20.180.21
1e30.360.25
1e41.531.07
1e511.5715.22
1e6213.31222.61
1e73343.5912392.42

In my opinion, the declarative Stream API is more readable and easier to maintain. When n is small, the performance difference is probably negligible for most apps.

For that reason, I prefer the functional approach unless I know the dataset is relatively small, and I need to squeeze out as much performance as possible.