Read on to learn how to create a Sequence
, how they work under the hood & the difference between intermediate and terminal operations, and how laziness and infiniteness arise from a more fundamental property.
Creating a Sequence
There are three main ways you can create a sequence:
- By directly specifying its elements, via
sequenceOf
- From an
Iterable
, viaasSequence
- By using a lambda that calculates each element, via
generateSequence
- By using what are essentially generators, i.e. functions that can return multiple times using
yield
andyieldAll
, viasequence
. The way this works is sufficiently involved to warrant its own separate article.
sequenceOf
//sampleStart fun <T> sequenceOf(vararg elements: T): Sequence<T> //sampleEnd
Creates a Sequence
from the passed in elements
. Functionally equivalent to listOf
, setOf
etc., but for Sequence
.
Example
//sampleStart fun main() { sequenceOf(2, 4, 6, 8).also(::println) } //sampleEnd
asSequence
//sampleStart fun <T> Iterable<T>.asSequence(): Sequence<T> //sampleEnd
Creates a Sequence
from the contents of the Iterable
.
Example
//sampleStart fun main() { listOf(2, 4, 6, 8).asSequence().also(::println) } //sampleEnd
generateSequence
//sampleStart fun <T : Any> generateSequence(nextFunction: () -> T?): Sequence<T> //sampleEnd
Generates a Sequence
by repeatedly calling nextFunction
, until it returns null
.
There are a few variants of this method that we will discuss shortly, however be aware that the Sequence
returned by the above variant is constrained to only run once! Attempting to run it twice will result in an exception.
I tried to find out why this is so and couldn’t find any official material referencing the reason. I had a hunch this had something to do with Sequences
that cause side effects that you wouldn’t want repeated, e.g. writing something to the database. After asking on reddit, my suspicions were confirmed, and I’ll refer you to the excellent answer there instead of reproducing it.
It should be noted that you can constrain any Sequence
to run only once using the aptly named method constrainOnce
.
Example
//sampleStart import java.math.BigDecimal import java.time.Instant enum class Direction { INCOMING, OUTGOING } interface Transaction { val instant: Instant val amount: BigDecimal val direction: Direction } interface TransactionAPIClient { fun firstAfter(instant: Instant): Transaction? } interface BankAccountService { fun executeTransaction(transaction: Transaction) } fun syncTransactionsAfter( after: Instant, transactionAPIClient: TransactionAPIClient, bankAccountService: BankAccountService ): Sequence<Transaction> { var lastCall = after return generateSequence { transactionAPIClient .firstAfter(lastCall) ?.also { bankAccountService.executeTransaction(it) lastCall = it.instant } } } //sampleEnd fun main() { val poem = """ When you're sailing in the sea of code, Kotlin's syntax is the compass, the road. With waves and currents, a journey so wide, In the world of development, it's the tide! """.trimIndent() println(poem) }
There are two variants of generateSequence
which allow you to make the process dependent on some value (and specify an initial value to use). Both these variants are not constrained to run only once.
//sampleStart fun <T : Any> generateSequence( seed: T?, nextFunction: (T) -> T? ): Sequence<T> fun <T : Any> generateSequence( seedFunction: () -> T?, nextFunction: (T) -> T? ): Sequence<T> //sampleEnd
In both cases, the return value of nextFunction
is used as the input in the following iteration, assuming its non-null. If it’s null
, the Sequence
is terminated.
How sequences work & Terminal operations
When you apply operations to a Sequence
, you’re basically creating a hierarchy of objects that describe the operations you want to perform on its elements. In other words, instances of Sequence
are sets of instructions that describe how to create and transform their elements. The operations you have at your disposal are identical to the counterparts defined on Iterable
— map
, filter
, take
, fold
, groupBy
, you name it.
Whenever possible, calling an operation simply adds a new instruction, i.e. creates a new instance of Sequence
with the new operation wrapping the previous ones (Dave Leeds has a great article that talks about this in more detail). For example, both map
and filter
do this. Such operations are called intermediate operations.
However, some operations that require traversing all the elements to give the correct result. For example, fold
, count
, groupBy
, forEach
and others are all functions that need to traverse all the elements to compute their result. Such operations are called terminal operations, and the moment they are called is the moment all the operations in the sequence definition actually get executed.
//sampleStart fun main() { val list = listOf(1, 2, 3, 4, 5, 6) .filter { it % 2 == 0} .map { it * it } val sequence = sequenceOf(1, 2, 3, 4, 5, 6) .filter { it % 2 == 0} .map { it * it } println(list) // List<Int> [4, 16, 36] println(sequence) // kotlin.sequences.TransformingSequence@<memory address> println( list.groupBy { it.toString().length } ) // Map<Int, List<Int>> {1->[4], 2->[16, 36]} println( sequence.groupBy { it.toString().length } ) // Map<Int, List<Int>> {1->[4], 2->[16, 36]} } //sampleEnd
Eager vs. lazy, finite vs. infinite
You will often encounter articles that describe the difference between Sequence
and Iterable
in terms of eagerness and laziness — Iterable
is eager (runs immediately), while Sequence
is lazy (only builds a set of instructions and does not run until you apply a terminal operation). You will also often encounter descriptions in terms of finiteness — Sequences
can be infinite, as opposed to Iterable
, which is always finite.
While both statements are technically correct, I dislike this approach because I think that they address consequences, and not core properties. Framing the difference between Iterable
and Sequence
in this manner distracts from what’s fundamentally going on. If you’ve ever felt like laziness vs. eagerness and finite vs. infinite is an awfully ad-hoc way to discriminate between collection hierarchies, this is probably why — it’s not at the core of what’s actually going on.
The laziness of Sequence
is a consequence of the fact that it traverses operations first, and therefore, cannot execute immediately when an instance constructed. If it did, you couldn’t specify multiple operations! By definition, we require all the operations to be applied to the first element before it moves on to the second element, but that can’t happen if it executes immediately, before we’ve even specified which operations should be run. Therefore, by necessity, Sequence
cannot start immediately. On the other hand, since Iterable
traverses elements first, it is able to start immediately — no need to wait for all the operations to be specified, it traverses the elements, spits out the result, and we’re free to chain another operation if we want to.
But it also works the other way around — Iterable
traverses elements first, and so it can’t work unless all elements have been specified and are loaded into memory. By necessity, this means that you cannot have an infinite amount of these elements — Iterable
applies the first operation to all elements before moving on to the next operation, which would never finish executing. On the other hand, Sequence
traverses operations first, and so it doesn’t need all the elements to be loaded in memory — it suffices to fetch/compute the next one once it’s finished applying all the operations to the current one. Therefore, as a consequence, Sequences
can be infinite and still be used to give meaningful results.
The use-cases for infinite sequences are essentially the same as for infinite loops. Think of reading a sensor that is continuously producing data, which you want to transform and render to the screen. You could never model this using Iterable
— the stream of data never ends, so Iterable
would never get around to even running the first operation — and therefore, you couldn’t use map
etc.
With Sequence
, it’s no problem at all:
//sampleStart generateSequence { // Read sensor }.map { // Map to data class }.forEach { // Render to screen } //sampleEnd
You saw another example of an infinite sequence using generateSequence
:
//sampleStart fun main() { val sequence = generateSequence(1) { it * 2 } // An infinite sequence sequence.forEach(::println) // Throws OutOfMemoryError } //sampleEnd
To sum up, eagerness/laziness and finiteness/infiniteness are both a consequence of the fact that Sequence
requires all operations to be specified before it starts, while Iterable
requires all elements to be specified before it starts. And in turn, both of these are themselves consequences of the fact that Sequence
traverses operations first, while Iterable
traverses elements first. This is the fundamental property, from which the others follow.