Read on for a discussion about the performance of Sequence vs. Iterable, when to use which, and the differences between Sequence and Java Streams.
When to use Iterable vs. when to use Sequence
One of the main differences between Iterable and Sequence, and most often the reason one is chosen over the other, is performance. However, before we get into this subject, let me remind you that there is only one reliable way to determine which of the two performs better in a given situation: benchmarks. As you will see, multiple factors affect the performance of Iterable and Sequence, and which of them is better really depends on the actual, real-world conditions in which they are used.
That being said, there are certain guidelines that you can keep in mind when trying to estimate which choice is more appropriate when you’re first writing the code (i.e. before you have the ability to benchmark), and all of them can be deduced from looking at the imperative equivalents of what Iterable and Sequence do, as we did at the beginning of this series.
```kotlin
import java.time.Instant

data class Post(val author: String, val contents: String, val instantPosted: Instant)

fun List<Post>.byAuthorAfterInstant_likeIterable(
    author: String,
    after: Instant,
    limit: Int = -1
): List<Post> {
    // First pass: filter by author into an intermediate list.
    val authorFilterResult = mutableListOf<Post>()
    for (post in this) {
        if (post.author == author) {
            authorFilterResult.add(post)
        }
    }

    // Second pass: filter by instant into another intermediate list.
    val instantFilterResult = mutableListOf<Post>()
    for (post in authorFilterResult) {
        if (post.instantPosted > after) {
            instantFilterResult.add(post)
        }
    }

    // Third pass: apply the limit. A negative limit means "no limit".
    return if (limit < 0 || limit >= instantFilterResult.size) {
        instantFilterResult
    } else {
        val limitResult = mutableListOf<Post>()
        for (i in 0 until limit) {
            limitResult.add(instantFilterResult[i])
        }
        limitResult
    }
}

fun List<Post>.byAuthorAfterInstant_likeSequence(
    author: String,
    after: Instant,
    limit: Int = -1
): List<Post> {
    val result = mutableListOf<Post>()
    for (post in this) {
        // Stop as soon as the limit is reached. A negative limit means "no limit".
        if (limit >= 0 && result.size >= limit) break
        if (post.author == author && post.instantPosted > after) {
            result.add(post)
        }
    }
    return result
}
```
Right off the bat, we notice several large differences that affect performance.
The likeIterable version iterates over an entire collection for each operation and creates an intermediate List instance for each one. Even worse, it inserts the elements one by one, and since the default backing implementation of MutableList is ArrayList, which is backed by a fixed-size array, once the capacity of this array is exceeded, ArrayList needs to internally allocate a larger array and copy all existing elements over to it before it can insert the new element. So a lot of processing power is wasted on copying things over, and the memory (and therefore GC) footprint can be large.
In contrast, the likeSequence version iterates over the collection only once and does not create any intermediate List instances. In the presence of a positive limit, the difference is even more pronounced, because the likeSequence version stops iterating the moment it has found the required number of elements.
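For reference, the two imperative versions correspond roughly to the following calls on the real API. This is a minimal sketch: the function names byAuthorAfterInstantEager and byAuthorAfterInstantLazy are made up for illustration, Post is redeclared only so the block is self-contained, and limit is assumed to be non-negative because take rejects negative arguments.

```kotlin
import java.time.Instant

data class Post(val author: String, val contents: String, val instantPosted: Instant)

// Eager (Iterable): every step builds a complete intermediate List.
fun List<Post>.byAuthorAfterInstantEager(author: String, after: Instant, limit: Int): List<Post> =
    filter { it.author == author }
        .filter { it.instantPosted > after }
        .take(limit)

// Lazy (Sequence): each element flows through the whole chain,
// and iteration stops once `limit` matches have been found.
fun List<Post>.byAuthorAfterInstantLazy(author: String, after: Instant, limit: Int): List<Post> =
    asSequence()
        .filter { it.author == author }
        .filter { it.instantPosted > after }
        .take(limit)
        .toList()

fun main() {
    val epoch = Instant.EPOCH
    // Ten sample posts, alternating authors, posted one second apart.
    val posts = List(10) { i ->
        Post(if (i % 2 == 0) "ann" else "bob", "post $i", epoch.plusSeconds(i.toLong()))
    }
    val eagerResult = posts.byAuthorAfterInstantEager("ann", epoch.plusSeconds(3), limit = 2)
    val lazyResult = posts.byAuthorAfterInstantLazy("ann", epoch.plusSeconds(3), limit = 2)
    println(eagerResult.map { it.contents }) // [post 4, post 6]
    println(eagerResult == lazyResult)       // true
}
```

Both versions return the same result; they differ only in how much work is done along the way.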
So, does that mean that we should always use Sequence? No, it does not. For one thing, keep in mind that the above is not actually the code that gets executed when using Iterable or Sequence. In reality, we know that applying an intermediate (i.e. non-terminal) operation to a Sequence creates a new Sequence that wraps the previous one, and those objects are not trivial. For smaller collections, it can be much faster to just create a few intermediate lists containing 10 elements each than to build up a hierarchy of nested objects that recursively call each other once a terminal operation is applied. That’s why it’s important to always benchmark whatever you write, and not take anything for granted.
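The wrapping behaviour can be observed by counting how often a predicate runs. The following is a small sketch (firstEvensCounting is a made-up helper, not part of any library): building the pipeline evaluates nothing, and the terminal operation pulls only as many elements through the wrappers as it needs.

```kotlin
// Returns the first `n` even numbers from 1..1000 together with
// the number of times the filter predicate was invoked.
fun firstEvensCounting(n: Int): Pair<List<Int>, Int> {
    var calls = 0
    val evens = (1..1_000).asSequence().filter { calls++; it % 2 == 0 }

    // Building the pipeline only wraps the source; nothing has run yet.
    check(calls == 0)

    // The terminal operation pulls elements through the wrappers one at a time.
    val result = evens.take(n).toList()
    return result to calls
}

fun main() {
    val (result, calls) = firstEvensCounting(3)
    println(result) // [2, 4, 6]
    println(calls)  // 6
}
```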
Therefore, the general guideline is that the larger the number of operations and the larger the collection of elements we want to apply them to, the larger the likelihood that Sequence will be more performant than Iterable. Whenever you’re dealing with a collection that’s large enough that fitting it all into memory starts to be a concern, Sequence should be your go-to solution.
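To get a first impression on your own machine, a rough timing comparison could look like the sketch below. The pipelines and sizes are arbitrary choices for illustration, and a single measureNanoTime call is not a proper benchmark: JIT warm-up and GC pauses can easily distort the numbers, so a dedicated tool such as JMH or kotlinx-benchmark is the better choice for real decisions.

```kotlin
import kotlin.system.measureNanoTime

// The same pipeline, once eager and once lazy.
fun eagerPipeline(data: List<Int>): List<Int> =
    data.map { it * 2 }.filter { it % 3 == 0 }.take(10)

fun lazyPipeline(data: List<Int>): List<Int> =
    data.asSequence().map { it * 2 }.filter { it % 3 == 0 }.take(10).toList()

fun main() {
    for (size in listOf(10, 1_000_000)) {
        val data = List(size) { it }

        // Crude warm-up so the JIT has a chance to compile both paths.
        repeat(5) { eagerPipeline(data); lazyPipeline(data) }

        val eagerNs = measureNanoTime { eagerPipeline(data) }
        val lazyNs = measureNanoTime { lazyPipeline(data) }
        println("size=$size eager=${eagerNs}ns lazy=${lazyNs}ns")
    }
}
```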
Another benefit of the likeSequence approach is that a single complicated call to a given intermediate operation can be broken up into a chain of simpler calls without having to traverse the collection once per call.
```kotlin
fun main() {
    val list = (1..10).toList()

    // Iterates once
    list.filterNot { it % 2 == 0 || it % 3 == 0 }

    // Iterates twice: once per filterNot call
    list.filterNot { it % 2 == 0 }
        .filterNot { it % 3 == 0 }

    val sequence = (1..10).asSequence()

    // Iterates once (when a terminal operation is applied)
    sequence.filterNot { it % 2 == 0 || it % 3 == 0 }

    // Also iterates once
    sequence.filterNot { it % 2 == 0 }
        .filterNot { it % 3 == 0 }
}
```
Therefore, Sequence can, in theory, be used to improve readability while limiting the impact on performance.
Sequences can also be preferred in situations where the pipeline of operations is not specified in a single place, but rather built up by various different services/methods.
```kotlin
import java.nio.file.Path

interface File

interface DiskAccessService {
    fun Path.files(): Sequence<File>
}

interface CensorService {
    fun Sequence<File>.filterAllowed(): Sequence<File>
}

interface BusinessService {
    val diskAccessService: DiskAccessService
    val censorService: CensorService

    fun Path.filesPaged(skip: Int, limit: Int): List<File> =
        with(diskAccessService) {
            with(censorService) {
                files()
                    .filterAllowed()
                    .drop(skip)
                    .take(limit)
                    .toList()
            }
        }
}
```
If we did the above using Iterable, we’d probably have processed a large portion of the files on disk in DiskAccessService and CensorService, only to throw almost all of them away in BusinessService.
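The effect can be made visible with in-memory stand-ins for the services. Everything in this sketch is hypothetical: file names replace real files, allFiles plays the role of DiskAccessService, and the filesRead counter exists only to show how many files were actually touched.

```kotlin
// Counts how many "files" were actually produced by the source.
var filesRead = 0

// Stand-in for DiskAccessService: lazily "reads" 1000 files.
fun allFiles(): Sequence<String> =
    (1..1_000).asSequence().map { filesRead++; "file$it.txt" }

// Stand-in for CensorService: an arbitrary rule that hides every tenth file.
fun Sequence<String>.filterAllowed(): Sequence<String> =
    filter { !it.endsWith("0.txt") }

// Stand-in for BusinessService: paging is applied at the very end.
fun filesPaged(skip: Int, limit: Int): List<String> =
    allFiles().filterAllowed().drop(skip).take(limit).toList()

fun main() {
    val page = filesPaged(skip = 5, limit = 3)
    println(page)      // [file6.txt, file7.txt, file8.txt]
    println(filesRead) // 8
}
```

Only 8 of the 1000 files are ever produced, because the terminal toList call in the last stage drives the whole pipeline.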
Finally, there are certain situations where you simply can’t use Iterable, notably situations where you want to work with infinite sequences of elements. Because the Iterable operations eagerly build a complete List at every step, there is no practical way to model this using Iterable, so your choice is between Sequence and coding in an imperative style. Of those two, always choose the former if you can.
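As a small illustration, the standard library function generateSequence can describe an unbounded sequence, and elements are only computed as far as a terminal operation asks for them. The powersOfTwo helper is just an example name.

```kotlin
// An unbounded sequence: 1, 2, 4, 8, ...
// Elements are computed on demand, so defining it costs nothing.
fun powersOfTwo(): Sequence<Long> = generateSequence(1L) { it * 2 }

fun main() {
    println(powersOfTwo().take(5).toList())     // [1, 2, 4, 8, 16]
    println(powersOfTwo().first { it > 1_000 }) // 1024
}
```

Calling toList directly on such a sequence, without take or a similar bounding operation, would never finish.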
There are some other things that you can take into consideration, which you can read about here and here.
Kotlin Sequences vs. Java Streams
You might have already noticed that Kotlin Sequences are very similar to Java Streams, and may be wondering why the Kotlin team decided to reimplement something that was already present in Java.
The reason is that designing Sequences for Kotlin first allows them to take advantage of all the features available in Kotlin that are not available in Java. An obvious one is null safety: when using Stream, all types in all lambdas will be platform types. Another is extension functions, which allow Sequences to have a (much) richer, and arguably simpler, API. You can read about further advantages here.
It should be mentioned that one thing Streams have that Sequences don’t is the parallel() method. However, this is something you’re actually discouraged from using, and the same thing can be achieved in a safer and much more powerful manner using Kotlin Flows.
In any case, if you ever need to move between these two worlds, the following functions are at your disposal:
```kotlin
fun <T> Sequence<T>.asStream(): Stream<T>
fun <T> Stream<T>.asSequence(): Sequence<T>
```
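A short usage sketch follows, assuming a JVM target, where both functions live in the kotlin.streams package. The helper names lengthsViaSequence and upperViaStream are made up for this example.

```kotlin
import java.util.stream.Collectors
import java.util.stream.Stream
import kotlin.streams.asSequence
import kotlin.streams.asStream

// Stream -> Sequence: continue with the Kotlin API.
fun lengthsViaSequence(words: Stream<String>): List<Int> =
    words.asSequence().map { it.length }.toList()

// Sequence -> Stream: hand the data over to a Java-style pipeline.
fun upperViaStream(words: Sequence<String>): List<String> =
    words.asStream().map { it.uppercase() }.collect(Collectors.toList())

fun main() {
    println(lengthsViaSequence(Stream.of("a", "bb", "ccc"))) // [1, 2, 3]
    println(upperViaStream(sequenceOf("x", "y")))            // [X, Y]
}
```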