I am currently debugging a performance issue (see here) which I could reduce to a code snippet that uses multiple async{}
calls in a coroutine. I am delaying by 50 milliseconds but the printed duration (on Windows 10, AMD 5800X3D) is always 60ms or higher:
import kotlinx.coroutines.*
import java.util.concurrent.TimeUnit
import kotlin.system.measureNanoTime
fun main(args: Array<String>) {
while (true) {
runBlocking {
Example().test()
}
}
}
class Example {
suspend fun test(): Test = withContext(Dispatchers.IO) {
var result: Test? = null
val time = measureNanoTime {
val a = async { doDelay() }
val b = async { doDelay() }
val c = async { doDelay() }
result = Test(ID("test"))
a.await()
b.await()
c.await()
}
println("TIMING: ${TimeUnit.NANOSECONDS.toMillis(time)}")
result!!
}
suspend fun doDelay() = coroutineScope {
delay(50)
"dummy"
}
data class Test(val id: String)
}
Searching for coroutines in regards to "high latency" or "long delay" yields no results in this area and the profiling data is hard to analyze as its spread across multiple threads.
I am looking for an explanation how 3 async{}
calls that delay for 50 ms actually take 60ms or more to complete. I don't mind 1-2 ms of overhead but 20% (10ms) is just way too much. Reducing the delay to 10 ms reduces the overhead to 5ms, which is worse percentage wise but lower in the absolute sense.
Additional notes:
- the
delay
calls are used to emulate IO processes - this is not about Android but server-side programming on a normal JVM
- this is a doubly reduced example, I have double checked this implementation in a simple REST API and checked the network tab of Google Chrome as well - it matches the printed >= 60ms
Dispatchers.IO
is chosen for simplicity, in reality its another ThreadPool which is similiar toDispatchers.IO
- but all multithreaded dispatchers seems to "suffer", as outlined- I am using nanotime to avoid the low resolution of Windows
System.currentTimeMillis()
and some googling confirmed thatdelay
is backed by high resolution timers