If we look at the documentation for java.util.stream.Collectors, we’ll see that some of the collectors are quite sophisticated.
For example:
// streams/TreeSetOfWords.java
// (c)2017 MindView LLC: see Copyright.txt
// We make no guarantees that this code is fit for any purpose.
// Visit http://OnJava8.com for more book information.
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

public class TreeSetOfWords {
  public static void main(String[] args) throws Exception {
    Set<String> words2 =
      Files.lines(Paths.get("TreeSetOfWords.java"))
        .flatMap(
          s -> Arrays.stream(s.split("\\W+"))) // W+ means one or more non-word characters
        .filter(s -> !s.matches("\\d+")) // No numbers
        .map(String::trim)
        .filter(s -> s.length() > 2)
        .limit(100)
        .collect(Collectors.toCollection(TreeSet::new));
    System.out.println(words2);
  }
}
/* My Output:
[Arrays, Collectors, Copyright, Exception, Files, LLC, MindView, OnJava8, Output, Paths, Set, String, System, TreeSet, TreeSetOfWords, Visit, any, args, book, characters, class, code, collect, com, file, filter, fit, flatMap, for, get, guarantees, http, import, information, java, length, limit, lines, main, make, map, matches, means, more, new, nio, non, numbers, one, out, println, public, purpose, see, split, static, stream, streams, that, this, throws, toCollection, trim, txt, util, void, word, words2]
*/
public static Stream<String> lines(Path path) throws IOException
Read all lines from a file as a Stream. Bytes from the file are decoded into characters using the UTF-8 charset.
This method works as if invoking it were equivalent to evaluating the expression:
    Files.lines(path, StandardCharsets.UTF_8)
Parameters:
    path - the path to the file
Returns:
    the lines from the file as a Stream
Throws:
    IOException - if an I/O error occurs opening the file
    SecurityException - In the case of the default provider, and a security manager is installed, the checkRead method is invoked to check read access to the file.
Since:
    1.8
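The stream returned by Files.lines keeps the underlying file open until the stream is closed, so a try-with-resources block is a common way to use it. The following is a minimal sketch, not the book's code: the class name LinesDemo, the helper methods, and the temporary sample file are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class LinesDemo {
  // Counts the non-blank lines of a file. The try-with-resources
  // block closes the stream (and the file behind it) when done.
  static long countNonBlank(Path path) throws IOException {
    try (Stream<String> lines = Files.lines(path)) {
      return lines.filter(s -> !s.trim().isEmpty()).count();
    }
  }

  // Writes a hypothetical three-line sample file, counts its
  // non-blank lines, and deletes the file again.
  static long demo() {
    try {
      Path tmp = Files.createTempFile("lines-demo", ".txt");
      try {
        Files.write(tmp, Arrays.asList("first", "", "third"));
        return countNonBlank(tmp);
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints 2
  }
}
```

TreeSetOfWords above calls Files.lines without closing the stream, which is harmless in a short-lived program but worth avoiding in long-running code.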
<R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper)
Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. Each mapped stream is closed after its contents have been placed into this stream. (If a mapped stream is null an empty stream is used, instead.)
This is an intermediate operation.
API Note:
The flatMap() operation has the effect of applying a one-to-many transformation to the elements of the stream, and then flattening the resulting elements into a new stream.
Examples.
If orders is a stream of purchase orders, and each purchase order contains a collection of line items, then the following produces a stream containing all the line items in all the orders:
    orders.flatMap(order -> order.getLineItems().stream())...
If path is the path to a file, then the following produces a stream of the words contained in that file:
    Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8);
    Stream<String> words = lines.flatMap(line -> Stream.of(line.split(" +")));
The mapper function passed to flatMap splits a line, using a simple regular expression, into an array of words, and then creates a stream of words from that array.
Type Parameters:
    R - The element type of the new stream
Parameters:
    mapper - a non-interfering, stateless function to apply to each element which produces a stream of new values
Returns:
    the new stream
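To see the flattening without touching the file system, here is a small sketch that applies the same split("\\W+") mapper as TreeSetOfWords to an in-memory list of lines. The class name FlatMapWords and the sample lines are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapWords {
  // Each line maps to a stream of its words; flatMap merges those
  // per-line streams into one stream of words.
  static List<String> words(List<String> lines) {
    return lines.stream()
      .flatMap(line -> Arrays.stream(line.split("\\W+")))
      // An empty line (or a line starting with a non-word character)
      // yields an empty string from split, so drop those.
      .filter(w -> !w.isEmpty())
      .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> lines =
      Arrays.asList("import java.util.*;", "", "public class Demo {");
    System.out.println(words(lines));
    // prints [import, java, util, public, class, Demo]
  }
}
```

The three input lines contribute three, zero, and three words respectively, yet the result is a single flat list with no trace of the line boundaries.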
<R> Stream<R> map(Function<? super T,? extends R> mapper)
Returns a stream consisting of the results of applying the given function to the elements of this stream.
This is an intermediate operation.
Type Parameters:
    R - The element type of the new stream
Parameters:
    mapper - a non-interfering, stateless function to apply to each element
Returns:
    the new stream
Note the difference between map and flatMap:
Both map and flatMap can be applied to a Stream<T> and they both return a Stream<R>. The difference is that the map operation produces one output value for each input value, whereas the flatMap operation produces an arbitrary number (zero or more) of values for each input value.
This is reflected in the arguments to each operation.
The map operation takes a Function, which is called for each value in the input stream and produces one result value, which is sent to the output stream.
The flatMap operation takes a function that conceptually wants to consume one value and produce an arbitrary number of values. However, in Java, it's cumbersome for a method to return an arbitrary number of values, since methods can return only zero or one value. One could imagine an API where the mapper function for flatMap takes a value and returns an array or a List of values, which are then sent to the output. Given that this is the streams library, a particularly apt way to represent an arbitrary number of return values is for the mapper function itself to return a stream! The values from the stream returned by the mapper are drained from the stream and are passed to the output stream. The "clumps" of values returned by each call to the mapper function are not distinguished at all in the output stream, thus the output is said to have been "flattened."
Typical use is for the mapper function of flatMap to return Stream.empty() if it wants to send zero values, or something like Stream.of(a, b, c) if it wants to return several values. But of course any stream can be returned.
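The contrast can be sketched in a few lines. In this illustrative example (the class MapVsFlatMap and its methods are made up for the purpose), map turns each word into exactly one length, while the flatMap mapper returns Stream.empty() for odd numbers and a two-element stream for even ones.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapVsFlatMap {
  // map: exactly one output value per input value.
  static List<Integer> lengths(List<String> words) {
    return words.stream()
      .map(String::length)
      .collect(Collectors.toList());
  }

  // A flatMap mapper: zero values for odd n, two values for even n.
  static Stream<Integer> twiceIfEven(int n) {
    return n % 2 == 0 ? Stream.of(n, n) : Stream.empty();
  }

  // flatMap: the per-element streams are drained into one output stream.
  static List<Integer> expandEvens(List<Integer> numbers) {
    return numbers.stream()
      .flatMap(MapVsFlatMap::twiceIfEven)
      .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(lengths(Arrays.asList("a", "bb", "ccc")));
    // prints [1, 2, 3]
    System.out.println(expandEvens(Arrays.asList(1, 2, 3, 4)));
    // prints [2, 2, 4, 4]
  }
}
```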
Stream<T> filter(Predicate<? super T> predicate)
Returns a stream consisting of the elements of this stream that match the given predicate.
This is an intermediate operation.
Parameters:
    predicate - a non-interfering, stateless predicate to apply to each element to determine if it should be included
Returns:
    the new stream
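The two predicates used in TreeSetOfWords can be tried in isolation. The sketch below uses an assumed class name (FilterDemo) and made-up tokens; note that String.matches tests the whole string, so "OnJava8" is kept while "2017" is dropped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterDemo {
  // Keeps tokens that are not purely digits and are longer than
  // two characters, mirroring the two filters in TreeSetOfWords.
  static List<String> keepWords(List<String> tokens) {
    return tokens.stream()
      .filter(s -> !s.matches("\\d+")) // drop all-digit tokens
      .filter(s -> s.length() > 2)     // drop very short tokens
      .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> tokens =
      Arrays.asList("2017", "MindView", "c", "see", "OnJava8", "to");
    System.out.println(keepWords(tokens));
    // prints [MindView, see, OnJava8]
  }
}
```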
public static <T,C extends Collection<T>> Collector<T,?,C> toCollection(Supplier<C> collectionFactory)
Returns a Collector that accumulates the input elements into a new Collection, in encounter order. The Collection is created by the provided factory.
Type Parameters:
    T - the type of the input elements
    C - the type of the resulting Collection
Parameters:
    collectionFactory - a Supplier which returns a new, empty Collection of the appropriate type
Returns:
    a Collector which collects all the input elements into a Collection, in encounter order
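Elements are added in encounter order, but the collection supplied by the factory decides how they end up being stored. As a rough sketch (the class ToCollectionDemo and the sample words are assumptions), a TreeSet sorts and de-duplicates, which is why the word list above prints alphabetically with capitalized words first, while a LinkedHashSet de-duplicates but keeps the order in which elements arrived.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class ToCollectionDemo {
  // TreeSet: duplicates removed, natural (sorted) order.
  static Set<String> sorted(List<String> words) {
    return words.stream()
      .collect(Collectors.toCollection(TreeSet::new));
  }

  // LinkedHashSet: duplicates removed, encounter order preserved.
  static Set<String> inEncounterOrder(List<String> words) {
    return words.stream()
      .collect(Collectors.toCollection(LinkedHashSet::new));
  }

  public static void main(String[] args) {
    List<String> words =
      Arrays.asList("stream", "Files", "map", "stream", "Arrays");
    System.out.println(sorted(words));
    // prints [Arrays, Files, map, stream]
    System.out.println(inEncounterOrder(words));
    // prints [stream, Files, map, Arrays]
  }
}
```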
We can also produce a Map from a stream:
// streams/MapCollector.java
// (c)2017 MindView LLC: see Copyright.txt
// We make no guarantees that this code is fit for any purpose.
// Visit http://OnJava8.com for more book information.
import java.util.*;
import java.util.stream.*;

class Pair {
  public final Character c;
  public final Integer i;
  Pair(Character c, Integer i) {
    this.c = c;
    this.i = i;
  }
  public Character getC() {
    return c;
  }
  public Integer getI() {
    return i;
  }
  @Override
  public String toString() {
    return "Pair(" + c + ", " + i + ")";
  }
}

class RandomPair {
  Random rand = new Random(47);
  // An infinite iterator of random capital letters:
  Iterator<Character> capChars =
    rand.ints(65, 91).mapToObj(i -> (char) i).iterator();
  public Stream<Pair> stream() {
    return rand.ints(100, 1000).distinct()
      .mapToObj(i -> new Pair(capChars.next(), i));
  }
}

public class MapCollector {
  public static void main(String[] args) {
    Map<Integer, Character> map =
      new RandomPair().stream()
        .limit(8)
        .collect(Collectors.toMap(Pair::getI, Pair::getC));
    System.out.println(map);
  }
}
/* Output:
{688=W, 309=C, 293=B, 761=N, 858=N, 668=G, 622=F, 751=N}
*/
public static <T,K,U> Collector<T,?,Map<K,U>> toMap(Function<? super T,? extends K> keyMapper, Function<? super T,? extends U> valueMapper)
Returns a Collector that accumulates elements into a Map whose keys and values are the result of applying the provided mapping functions to the input elements.
If the mapped keys contain duplicates (according to Object.equals(Object)), an IllegalStateException is thrown when the collection operation is performed. If the mapped keys may have duplicates, use toMap(Function, Function, BinaryOperator) instead.
API Note:
It is common for either the key or the value to be the input elements. In this case, the utility method Function.identity() may be helpful. For example, the following produces a Map mapping students to their grade point average:
    Map<Student, Double> studentToGPA =
        students.stream().collect(toMap(Function.identity(),
                                        student -> computeGPA(student)));
And the following produces a Map mapping a unique identifier to students:
    Map<String, Student> studentIdToStudent =
        students.stream().collect(toMap(Student::getId,
                                        Function.identity()));
Implementation Note:
The returned Collector is not concurrent. For parallel stream pipelines, the combiner function operates by merging the keys from one map into another, which can be an expensive operation. If it is not required that results are inserted into the Map in encounter order, using toConcurrentMap(Function, Function) may offer better parallel performance.
Type Parameters:
    T - the type of the input elements
    K - the output type of the key mapping function
    U - the output type of the value mapping function
Parameters:
    keyMapper - a mapping function to produce keys
    valueMapper - a mapping function to produce values
Returns:
    a Collector which collects elements into a Map whose keys and values are the result of applying mapping functions to the input elements
See Also:
    toMap(Function, Function, BinaryOperator), toMap(Function, Function, BinaryOperator, Supplier), toConcurrentMap(Function, Function)
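MapCollector avoids duplicate keys because its keys come from a distinct() stream. When duplicates are possible, the two-argument toMap throws, and the three-argument overload lets us say how to combine colliding values. Here is a minimal sketch; the class ToMapDemo, the choice of word length as the key, and the comma-joining merge function are all assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDemo {
  // Two-argument toMap: throws IllegalStateException if two words
  // have the same length, because that produces a duplicate key.
  static Map<Integer, String> byLength(List<String> words) {
    return words.stream()
      .collect(Collectors.toMap(String::length, Function.identity()));
  }

  // Three-argument toMap: the merge function combines the values
  // of colliding keys instead of throwing.
  static Map<Integer, String> byLengthMerged(List<String> words) {
    return words.stream()
      .collect(Collectors.toMap(
        String::length, Function.identity(), (a, b) -> a + "," + b));
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("map", "filter", "set");
    // "map" and "set" both have length 3, so their values are merged.
    System.out.println(byLengthMerged(words).get(3)); // prints map,set
    try {
      byLength(words);
    } catch (IllegalStateException e) {
      System.out.println("duplicate key rejected");
    }
  }
}
```

Function.identity() plays the same role here as in the Javadoc's student examples: the stream element itself is used as the map value.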
References:
1. On Java 8 - Bruce Eckel
2. https://github.com/wangbingfeng/OnJava8-Examples/blob/master/streams/TreeSetOfWords.java
3. https://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#lines-java.nio.file.Path-