interview-preparation-day-002

What is a Stream in Java 8? How is it different from a Collection?
What are the characteristics of a Stream (lazy, no storage, consumable once)?
What are intermediate and terminal operations in Stream? Give examples.
How would you optimize a Stream pipeline for performance in a large dataset?
What is the difference between stream() and parallelStream()?
When should you use parallelStream() and what are its pitfalls?
How do you handle exceptions in Stream operations?
Explain short-circuiting operations (e.g., findFirst, anyMatch) vs non-short-circuiting.
What is the Stream pipeline? How does lazy evaluation work?
What happens if you call a terminal operation twice on the same Stream?
How do you create a Stream (from collection, array, Stream.of, infinite stream)?
Explain map vs flatMap with example.
What are Collectors? Common Collectors (toList, toSet, joining, groupingBy, partitioningBy).
Explain reduce operation with identity, accumulator, combiner.
What is the difference between forEach and forEachOrdered in parallel streams?
How to debug a Stream pipeline (use peek)?
Best practices for using Streams (immutability, side-effects, performance).

What is a Stream in Java 8? How is it different from a Collection?

=> The Stream API is designed for processing data in a pipeline-based manner.

=> The Stream API offers two types of operations:

Intermediate Operations : Operations like filter, map, sorted, etc., that transform the stream and are lazy (not executed until a terminal operation is called).
Terminal Operations : Operations like collect, forEach, reduce, or count that trigger the stream processing pipeline and produce a result.

Stream API vs Traditional Collections

Stream API	Traditional Collections
Designed for processing data in a pipeline-based manner	Designed for storing and managing data
Does not store data; acts as a pipeline that processes data from a source	Stores data in memory (e.g., ArrayList)
Does not modify the source data	Can be modified (e.g., adding or removing elements)
Handles iteration internally	Requires explicit iteration (e.g., for or while loops)
Can be consumed only once; attempting to reuse it throws an IllegalStateException	Can be reused multiple times
Suited to bulk data processing and parallel execution	Better for small data sets or direct data manipulation

Best Practice: Use Streams for processing, Collections for storage.
Interview Tip: Emphasize "Streams are for computation, Collections for storage".
Example: Stream API vs. Traditional Collections
Task: Filter even numbers from a list and double them.

Using Traditional Collections (Imperative Approach):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        List<Integer> result = new ArrayList<>();

        // External iteration with a loop
        for (Integer num : numbers) {
            if (num % 2 == 0) {          // Filter even numbers
                result.add(num * 2);     // Double the number
            }
        }

        System.out.println(result); // Output: [4, 8, 12]
    }
}

Using Stream API (Declarative Approach):

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);

        // Stream-based processing
        List<Integer> result = numbers.stream()
                .filter(n -> n % 2 == 0)   // Keep even numbers
                .map(n -> n * 2)           // Double each one
                .collect(Collectors.toList());

        System.out.println(result); // Output: [4, 8, 12]
    }
}

What are the characteristics of a Stream (lazy, no storage, consumable once)?

=> A Stream is not a data structure. It is designed for processing data in a pipeline-based manner and does not store data.

=> Lazy evaluation: intermediate operations are not executed until a terminal operation is called.

=> After a terminal operation, the stream is considered consumed. Calling another operation on it throws IllegalStateException.

=> A Stream can therefore be consumed only once; create a new stream from the source to traverse the data again.

What are intermediate and terminal operations in Stream? Give examples.

=> Intermediate Operations : Operations like filter, map, sorted, etc., that transform the stream and are lazy (not executed until a terminal operation is called).

=> Terminal Operations : Operations like collect, forEach, reduce, or count that trigger the stream processing pipeline and produce a result.

=> Best Practice: Chain intermediate operations and end with a terminal operation.
=> Interview Tip: "Intermediate operations are lazy, terminal operations are eager".

=> Code Example :

list.stream()
.filter(s -> s.startsWith("A")) // Intermediate
.map(String::toUpperCase) // Intermediate
.collect(Collectors.toList()); // Terminal — triggers execution

How would you optimize a Stream pipeline for performance in a large dataset?

=> Use parallelStream() for CPU-bound tasks on large datasets.

=> Minimize and combine intermediate operations.

=> Filter early to reduce dataset size.

=> Leverage short-circuiting operations.

=> Choose efficient collectors and data structures.

=> Avoid stateful operations like sorted or distinct unless necessary.

=> Use primitive streams (IntStream, etc.) for numeric data.

=> Profile and test with realistic data (measure and record the time taken).

=> Avoid multiple stream creations.

=> Monitor memory usage and GC impact.

=> Consider custom collectors for complex reductions.
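
A few of the tips above (filter early, use primitive streams, short-circuit) can be sketched as follows. This is only an illustration: the class and method names are made up for the example, and any real speed-up should be confirmed by measuring with realistic data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class PipelineTuningDemo {

    // Filter first so the mapping step runs on fewer elements, and use
    // mapToInt so the squares are handled as primitive ints (no boxing).
    static int sumOfSquaresOfEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // shrink the data set early
                .mapToInt(n -> n * n)      // IntStream instead of Stream<Integer>
                .sum();
    }

    // findFirst short-circuits: elements after the first match are not visited.
    static Optional<Integer> firstGreaterThan(List<Integer> numbers, int threshold) {
        return numbers.stream()
                .filter(n -> n > threshold)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfSquaresOfEvens(numbers));   // 56 (4 + 16 + 36)
        System.out.println(firstGreaterThan(numbers, 3));   // Optional[4]
    }
}
```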

What is the difference between stream() and parallelStream()?

Serial Stream	Parallel Stream
Sequential execution: processes elements one after another on a single thread	Parallel execution: processes elements concurrently across multiple threads using the common ForkJoinPool, whose default parallelism is Runtime.getRuntime().availableProcessors() - 1
All operations (e.g., filter, map) are executed in the order defined, one element at a time	Splits the stream into chunks, processes them in parallel, and combines the results
Better for small datasets	Better for large datasets
No concurrency issues	Requires thread-safe operations
Method: stream()	Method: parallelStream() or stream().parallel()

=> Best Practice: Use parallel streams for CPU-intensive work on large data; avoid them when operations have side effects, because processing order is not guaranteed.

Code Example :

list.stream().forEach(System.out::println); // Sequential: prints in encounter order
list.parallelStream().forEach(System.out::println); // Parallel: order not guaranteed (can be faster for large data)
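
To make the comparison concrete, here is a small sketch that computes the same sum sequentially and in parallel and prints the parallelism of the common pool. The class and method names are illustrative; whether the parallel version is actually faster depends on the machine and the workload.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class ParallelSumDemo {

    static long sequentialSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    static long parallelSum(long n) {
        // parallel() makes the pipeline run on the common ForkJoinPool.
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println(sequentialSum(1_000_000L));  // 500000500000
        System.out.println(parallelSum(1_000_000L));    // 500000500000
    }
}
```

Both calls return the same value because addition is associative and the pipeline has no side effects.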

When should you use parallelStream() and what are its pitfalls?

Use parallelStream for:
Large datasets with CPU-intensive operations (e.g., complex calculations).
Independent operations (no shared state).

Pitfalls :

=> Overhead for small datasets

=> Non-deterministic behavior

=> Thread safety issues

=> Ineffective for stateful operations (operations such as sorted are stateful because they must hold elements in memory before producing output)

=> Debugging complexity
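
The thread-safety pitfall usually shows up when a parallel pipeline writes into a shared collection. A minimal sketch of the safe alternative, using illustrative names, is to let collect() build the result instead:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelCollectDemo {

    // Safe: collect() lets each worker fill its own container and then merges them.
    static List<Integer> doubledSafely(int n) {
        return IntStream.range(0, n)
                .parallel()
                .map(i -> i * 2)
                .boxed()
                .collect(Collectors.toList());
    }

    // Unsafe pattern, shown only as a comment:
    //   List<Integer> out = new ArrayList<>();
    //   IntStream.range(0, n).parallel().forEach(i -> out.add(i));
    // ArrayList is not thread-safe, so elements may be lost or an exception thrown.

    public static void main(String[] args) {
        List<Integer> result = doubledSafely(10_000);
        System.out.println(result.size());   // 10000
        System.out.println(result.get(3));   // 6 (collect keeps encounter order here)
    }
}
```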

How do you handle exceptions in Stream operations?

=> Checked Exceptions : Wrap in try-catch within lambdas or use utility functions to convert to unchecked exceptions.

=> Unchecked Exceptions : Handle them locally inside the lambda so that one bad element does not terminate the pipeline; the catch block can return null, a default value, or a wrapper object.

=> Parallel Streams : Ensure thread-safe handling and consider custom ForkJoinPool for better control.
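
For the checked-exception case, one possible utility is a small adapter that turns a throwing function into a plain Function. The sketch below is an assumption about how such a helper could look: IOFunction, unchecked, and load are hypothetical names, not JDK APIs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class CheckedWrapperDemo {

    // Hypothetical functional interface: like Function, but may throw IOException.
    @FunctionalInterface
    interface IOFunction<T, R> {
        R apply(T t) throws IOException;
    }

    // Adapts an IOFunction to a Function by rethrowing as an unchecked exception.
    static <T, R> Function<T, R> unchecked(IOFunction<T, R> f) {
        return t -> {
            try {
                return f.apply(t);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // Stand-in for a real I/O call; fails for empty names.
    static String load(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("empty name");
        }
        return "content of " + name;
    }

    static List<String> loadAll(List<String> names) {
        return names.stream()
                .map(unchecked(CheckedWrapperDemo::load))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("a.txt", "b.txt")));
        // [content of a.txt, content of b.txt]
    }
}
```

With this approach the first failing element still aborts the pipeline, only now with an UncheckedIOException that the caller can catch outside the stream.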

Example : Combining approaches

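// Result is assumed to be a user-defined wrapper type with success/failure factory methods.
List<String> strings = Arrays.asList("123", "abc", "456");

List<Result<Integer>> results = strings.stream()
        .map(s -> {
            try {
                return Result.success(Integer.parseInt(s));
            } catch (NumberFormatException e) {
                // Explicit type argument so both branches yield Result<Integer>
                return Result.<Integer>failure(e);
            }
        })
        .collect(Collectors.toList());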
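
The Result type in the example above is not part of the JDK. A minimal sketch of what such a wrapper might look like is given below; the class layout and the method names isSuccess, getValue, and getError are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ResultDemo {

    // Hypothetical wrapper: holds either a value or the exception that occurred.
    static final class Result<T> {
        private final T value;
        private final Exception error;

        private Result(T value, Exception error) {
            this.value = value;
            this.error = error;
        }

        static <T> Result<T> success(T value) { return new Result<>(value, null); }
        static <T> Result<T> failure(Exception error) { return new Result<>(null, error); }

        boolean isSuccess() { return error == null; }
        T getValue() { return value; }
        Exception getError() { return error; }
    }

    // Parsing lives in a named method so the return type is explicit.
    static Result<Integer> parse(String s) {
        try {
            return Result.success(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Result.failure(e);
        }
    }

    static List<Result<Integer>> parseAll(List<String> strings) {
        return strings.stream()
                .map(ResultDemo::parse)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Result<Integer> r : parseAll(Arrays.asList("123", "abc", "456"))) {
            System.out.println(r.isSuccess()
                    ? "ok: " + r.getValue()
                    : "failed: " + r.getError().getMessage());
        }
    }
}
```

The pipeline processes every element, and the caller decides afterwards what to do with the failures.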

What is ForkJoinPool?
ForkJoinPool is an implementation of the ExecutorService interface that manages a pool of worker threads to execute tasks that are split into smaller subtasks (forked) and later combined (joined) to produce a final result.
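
The fork/join idea described above can be sketched with a RecursiveTask that sums an array. SumTask and the threshold of 1,000 elements are illustrative choices for this example, not values taken from the JDK or from how streams split work internally.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSumDemo {

    // Sums a slice of the array, splitting it in half until it is small enough.
    static final class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // arbitrary cut-off for this sketch
        private final int[] data;
        private final int from;
        private final int to; // exclusive

        SumTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                       // run the left half asynchronously
            long rightSum = right.compute();   // compute the right half in this thread
            return left.join() + rightSum;     // wait for the left half and combine
        }
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // 50005000
    }
}
```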

Explain short-circuiting operations (e.g., findFirst, anyMatch) vs non-short-circuiting.

=> Short-circuiting: Can stop processing as soon as the result is known, without visiting every element. These can be terminal operations or intermediate ones such as limit.
Examples: anyMatch, allMatch, noneMatch, findFirst, findAny, limit.
=> Non-short-circuiting: Process all elements (collect, reduce, sorted).

=> Code example :

list.stream().anyMatch(s -> s.equals("Target")); // Stops at first match
list.stream().collect(Collectors.toList()); // Processes all

=> Best Practice: Use short-circuiting operations to terminate infinite streams and to avoid unnecessary work.
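
The effect is easiest to see on an infinite stream, where only a short-circuiting operation lets the pipeline finish. The sketch below uses illustrative names and a peek counter purely to show how many elements are visited.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ShortCircuitDemo {

    // Counts how many elements of the infinite stream 1, 2, 3, ... are inspected
    // before anyMatch finds the first multiple of 7.
    static int elementsInspectedUntilMultipleOfSeven() {
        AtomicInteger inspected = new AtomicInteger();
        boolean found = Stream.iterate(1, n -> n + 1)
                .peek(n -> inspected.incrementAndGet())
                .anyMatch(n -> n % 7 == 0);
        return found ? inspected.get() : -1;
    }

    // limit is a short-circuiting intermediate operation.
    static List<Integer> firstFiveSquares() {
        return Stream.iterate(1, n -> n + 1)
                .map(n -> n * n)
                .limit(5)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(elementsInspectedUntilMultipleOfSeven()); // 7
        System.out.println(firstFiveSquares());                      // [1, 4, 9, 16, 25]
    }
}
```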

What is the Stream pipeline? How does lazy evaluation work?

=> Stream pipeline = source → intermediate operations → terminal operation.
=> Lazy evaluation: Intermediate operations are not executed until a terminal operation is called (chained as a plan).

=> Code Example :

Stream<String> stream = list.stream().filter(s -> s.length() > 5); // Nothing executed yet
stream.forEach(System.out::println); // Now filter + forEach run
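
A slightly longer sketch makes the laziness visible by recording the order in which the stages run. Writing to a log list from inside the lambdas is done here only for demonstration on a sequential stream; the names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class LazyEvaluationDemo {

    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream<String> pipeline = Arrays.asList("a", "bb", "ccc").stream()
                .filter(s -> { log.add("filter " + s); return s.length() > 1; })
                .map(s -> { log.add("map " + s); return s.toUpperCase(); });
        log.add("pipeline built");   // no filter or map call has happened yet
        pipeline.forEach(s -> log.add("forEach " + s));
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        // pipeline built
        // filter a
        // filter bb
        // map bb
        // forEach BB
        // filter ccc
        // map ccc
        // forEach CCC
    }
}
```

Each element travels through the whole chain before the next one starts, which is why "map bb" appears before "filter ccc".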

What happens if you call a terminal operation twice on the same Stream?

=> IllegalStateException: "stream has already been operated upon or closed".

=> Code Example :

Stream<String> stream = list.stream();
stream.forEach(System.out::println);
stream.forEach(System.out::println); // Exception

=> Best Practice: Create a new Stream from the source for each terminal operation.
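
One common way to follow this advice is to keep a Supplier that produces a fresh stream on each call. The sketch below, with illustrative names, also shows the exception being caught.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamReuseDemo {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOperationFails() {
        Stream<String> stream = Stream.of("A", "B", "C");
        stream.count();          // first terminal operation consumes the stream
        try {
            stream.count();      // second one on the same instance
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // A Supplier hands out a new stream every time get() is called.
    static long[] countTwice(List<String> source) {
        Supplier<Stream<String>> streams = source::stream;
        long all = streams.get().count();
        long startingWithA = streams.get().filter(s -> s.startsWith("A")).count();
        return new long[] {all, startingWithA};
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOperationFails());   // true
        long[] counts = countTwice(Arrays.asList("Apple", "Avocado", "Banana"));
        System.out.println(counts[0] + " " + counts[1]);      // 3 2
    }
}
```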

How do you create a Stream (from collection, array, Stream.of, infinite stream)?

=> Collection: list.stream()
=> Array: Arrays.stream(array)
=> Values: Stream.of("A", "B")
=> Infinite: Stream.generate(() -> "Hello").limit(10)
=> Files: Files.lines(Path)

=> Code Example :

Stream<Integer> fromList = list.stream();
Stream<String> fromArray = Arrays.stream(new String[]{"A", "B"});
Stream<Integer> infinite = Stream.iterate(1, n -> n + 1).limit(5);

=> Best Practice: Use limit on infinite streams.

Explain map vs flatMap with example.

=> map: 1-to-1 transformation (Stream<R>).
=> flatMap: 1-to-many, flattens nested streams (Stream<Stream<R>> → Stream<R>).

=> Code Example :

List<List<String>> nested = Arrays.asList(
Arrays.asList("A", "B"),
Arrays.asList("C", "D")
);
nested.stream().map(List::stream).collect(Collectors.toList()); // List<Stream<String>>: the inner streams stay nested
nested.stream().flatMap(List::stream).collect(Collectors.toList()); // ["A", "B", "C", "D"]

=> Best Practice: Use flatMap for nested collections or Optional flattening.
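
Two further uses of flatMap are sketched below: splitting sentences into words (one element becomes many) and flattening an Optional. The class and helper names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class FlatMapDemo {

    // One sentence maps to many words; flatMap merges them into a single stream.
    static List<String> words(List<String> sentences) {
        return sentences.stream()
                .flatMap(sentence -> Arrays.stream(sentence.split(" ")))
                .collect(Collectors.toList());
    }

    // Helper that itself returns an Optional.
    static Optional<Integer> toPositiveInt(String text) {
        try {
            int value = Integer.parseInt(text);
            return value > 0 ? Optional.of(value) : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Optional.flatMap avoids ending up with Optional<Optional<Integer>>.
    static Optional<Integer> parsePositive(Optional<String> text) {
        return text.flatMap(FlatMapDemo::toPositiveInt);
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("hello world", "java streams")));
        // [hello, world, java, streams]
        System.out.println(parsePositive(Optional.of("42")));   // Optional[42]
        System.out.println(parsePositive(Optional.of("-1")));   // Optional.empty
    }
}
```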

What are Collectors? Common Collectors (toList, toSet, joining, groupingBy, partitioningBy).

=> Collectors is a utility class whose static methods create Collector instances; these are passed to the terminal operation collect() to accumulate elements into a result.

Common:
=> toList(), toSet()
=> joining(", ") — concatenate strings
=> groupingBy(classifier) — Map<K, List<T>>
=> partitioningBy(predicate) — Map<Boolean, List<T>>

=> Code Example :

Map<Integer, List<String>> grouped = list.stream()
.collect(Collectors.groupingBy(String::length));
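
The other collectors listed above can be sketched in the same way. The example also shows groupingBy with a downstream collector (counting); the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectorsDemo {

    static String joined(List<String> words) {
        return words.stream().collect(Collectors.joining(", "));
    }

    static Map<Boolean, List<Integer>> evenOdd(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // groupingBy with a downstream collector: count the words per length
    // instead of listing them.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    static Set<String> unique(List<String> words) {
        return words.stream().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "kiwi", "plum", "apple");
        System.out.println(joined(words));                        // apple, kiwi, plum, apple
        System.out.println(evenOdd(Arrays.asList(1, 2, 3, 4)));   // {false=[1, 3], true=[2, 4]}
        System.out.println(countByLength(words));                 // two words of length 4, two of length 5
        System.out.println(unique(words).size());                 // 3
    }
}
```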

Explain reduce operation with identity, accumulator, combiner.

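reduce combines the elements of a stream into a single result.

Identity: Starting value, which is also the result for an empty stream.
Accumulator: Function that folds the next element into the partial result (a BinaryOperator in the two-argument form).
Combiner: Function that merges two partial results; it is used by parallel streams.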

Code Example:

int sum = list.stream().reduce(0, (a, b) -> a + b); // Identity 0, accumulator add
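
The combiner only appears in the three-argument form of reduce, which is needed when the result type differs from the element type. A minimal sketch with illustrative names:

```java
import java.util.Arrays;
import java.util.List;

class ReduceDemo {

    // Two-argument form: identity and accumulator share the element type.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Three-argument form: elements are Strings, the result is an Integer.
    static int totalLength(List<String> words) {
        return words.parallelStream().reduce(
                0,                                           // identity
                (partial, word) -> partial + word.length(),  // accumulator
                Integer::sum);                               // combiner merges partial sums
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));                      // 10
        System.out.println(totalLength(Arrays.asList("map", "filter", "reduce"))); // 15
    }
}
```

For the parallel result to be correct, the identity must be neutral for the combiner and both functions must be associative, which holds for addition with 0.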

What is the difference between forEach and forEachOrdered in parallel streams?

=> forEach: Does not guarantee encounter order in parallel streams (usually faster).

=> forEachOrdered: Maintains encounter order even in parallel (usually slower).

=> Best Practice: Use forEachOrdered if order matters.
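
The difference can be observed by recording the order in which elements reach the action. In this sketch (illustrative names) a synchronized list is used because forEach may call the action from several threads at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ForEachOrderedDemo {

    // forEachOrdered hands elements to the action in encounter order.
    static List<Integer> collectWithForEachOrdered(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    // forEach may deliver the same elements in any order.
    static List<Integer> collectWithForEach(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(collectWithForEachOrdered(source)); // [1, 2, 3, 4, 5, 6, 7, 8]
        System.out.println(collectWithForEach(source));        // same elements, order may vary
    }
}
```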

How to debug a Stream pipeline (use peek)?

=> Use the peek method to log values between intermediate operations. This is called intermediate inspection.

=> peek is an intermediate operation intended mainly for debugging (its action is a side effect).

Example :

import java.util.Arrays;
import java.util.List;

public class DebugLambda {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

        List<Integer> intermediate = numbers.stream()
                .filter(n -> n % 2 == 0)                          // Step 1: Filter even numbers
                .peek(n -> System.out.println("Filtered: " + n))  // Debug output
                .map(n -> n * 2)                                  // Step 2: Double the numbers
                .peek(n -> System.out.println("Mapped: " + n))    // Debug output
                .toList();                                        // Stream.toList() requires Java 16+

        System.out.println("Result: " + intermediate);
    }
}

Output :

Filtered: 2
Mapped: 4
Filtered: 4
Mapped: 8
Result: [4, 8]

Best practices for using Streams (immutability, side-effects, performance).

=> Prefer immutable sources.

=> Avoid side-effects in intermediate operations.

=> Use parallel only for large, CPU-intensive tasks.

=> Short-circuit when possible.

=> Don't use streams for simple loops (overhead).

_______________________________________________________________

Practice the following coding exercises (easy-medium)

Create Stream from list and print all elements using forEach.
Filter even numbers from list and print using forEach.
Map list of strings to uppercase and collect to List.
Sort list of integers ascending using sorted.
Sort descending using sorted(Comparator.reverseOrder()).
Count elements in list using count.
Find first element using findFirst (or Optional).
Check if any number is even using anyMatch.
Check if all numbers are positive using allMatch.
Check if no number is negative using noneMatch.
Reduce to sum of list using reduce.
Reduce to product of list using reduce.
FlatMap: Flatten List<List<Integer>> to List<Integer>.
Remove duplicates using distinct.
Limit to first 5 elements.
Skip first 3 elements.
Group strings by length using Collectors.groupingBy.
Partition numbers into even/odd using Collectors.partitioningBy.
Join strings with comma using Collectors.joining(", ").
Find max/min in list using max/min + Comparator.
