Question
I am using tf.Tensor and tf.concat() to handle large training data, and I found that calling tf.concat() repeatedly gets slow. What is the best way to load large data from a file into a tf.Tensor?
Background
I think handling data in an array is a common approach in JavaScript. The rough steps for that are as follows.
Steps to load data from a file into an Array
- read a line from the file
- parse the line into a JavaScript object
- add that object to the array with Array.push()
- after reading the last line, use the array in a for loop
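As a minimal sketch of the array steps above: the file format is not specified here, so this assumes one JSON object per line, and the helper name parseLinesToArray and the fields x and y are hypothetical. It parses from a string instead of a real file so that it runs on its own.

```javascript
// Sketch only: assumes newline-delimited JSON (one object per line).
// `parseLinesToArray`, `x`, and `y` are illustrative names, not from my real code.
function parseLinesToArray(text) {
  const records = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") continue; // skip blank lines
    records.push(JSON.parse(line));   // parse the line and append the object
  }
  return records;
}

// Stand-in for the file contents.
const sample = '{"x": 1, "y": 2}\n{"x": 3, "y": 4}\n';
const records = parseLinesToArray(sample);

// After reading to the end, the array can be used in a for loop.
for (const record of records) {
  console.log(record.x, record.y);
}
```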
So I thought I could use tf.concat() in a similar way.
Steps to load data from a file into a tf.Tensor
- read a line from the file
- parse the line into a JavaScript object
- convert the object to a tf.Tensor
- append that tensor to the original tensor with tf.concat()
- after reading the last line, use the resulting tf.Tensor
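The tensor steps above could be sketched roughly like this. It is an illustration under assumptions, not my exact loader: each line is assumed to be a JSON object with a numeric value field, the function name loadLinesToTensor is hypothetical, and the tf module is passed in as a parameter (for example the one imported from "@tensorflow/tfjs") so the function itself has no import.

```javascript
// Sketch only: `loadLinesToTensor` and the `value` field are hypothetical.
// `tf` is the TensorFlow.js module, passed in by the caller.
function loadLinesToTensor(tf, lines) {
  let result = null;
  for (const line of lines) {
    if (line.trim() === "") continue;      // skip blank lines
    const obj = JSON.parse(line);          // parse the line into an object
    const row = tf.tensor1d([obj.value]);  // convert the object to a tensor
    if (result === null) {
      result = row;
    } else {
      // Append by building a new, longer tensor from the old one plus the row.
      const next = tf.concat([result, row]);
      result.dispose(); // free the previous tensor
      row.dispose();    // free the single-row tensor
      result = next;
    }
  }
  return result; // null if there were no non-blank lines
}
```

It would be called as loadLinesToTensor(tf, lines), where lines is an array of strings read from the file.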
Some code
Here is some code to measure the speed of both Array.push() and tf.concat()
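import * as tf from "@tensorflow/tfjs"

// Benchmark 1: grow a tensor one element at a time with concat().
let t = tf.tensor1d([1])
let addT = tf.tensor1d([2])
console.time()
for (let idx = 0; idx < 50000; idx++) {
  // Every 1000 iterations, print the elapsed time and restart the timer.
  if (idx % 1000 === 0) {
    console.timeEnd()
    console.time()
    console.log(idx)
  }
  // tf.tidy() frees intermediates created inside the callback; the previous
  // `t` was created outside it, so it is not disposed here.
  t = tf.tidy(() => t.concat(addT))
}
console.timeEnd() // stop the last timer before the next benchmark starts

// Benchmark 2: grow a plain array one element at a time with push().
let arr = []
let addA = 1
console.time()
for (let idx = 0; idx < 50000; idx++) {
  if (idx % 1000 === 0) {
    console.timeEnd()
    console.time()
    console.log(idx)
  }
  arr.push(addA)
}
console.timeEnd() // stop the last timer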
Measurement
Array.push() runs at a stable speed, but tf.concat() gets slow. Each timing line below covers the 1000 iterations before the index printed after it.
For tf.concat()
default: 0.150ms
0
default: 68.725ms
1000
default: 62.922ms
2000
default: 23.199ms
3000
default: 21.093ms
4000
default: 27.808ms
5000
default: 39.689ms
6000
default: 34.798ms
7000
default: 45.502ms
8000
default: 94.526ms
9000
default: 51.996ms
10000
default: 76.529ms
11000
default: 83.662ms
12000
default: 45.730ms
13000
default: 89.119ms
14000
default: 49.171ms
15000
default: 48.555ms
16000
default: 55.686ms
17000
default: 54.857ms
18000
default: 54.801ms
19000
default: 55.312ms
20000
default: 65.760ms
For Array.push()
default: 0.009ms
0
default: 0.388ms
1000
default: 0.340ms
2000
default: 0.333ms
3000
default: 0.317ms
4000
default: 0.330ms
5000
default: 0.289ms
6000
default: 0.299ms
7000
default: 0.291ms
8000
default: 0.320ms
9000
default: 0.284ms
10000
default: 0.343ms
11000
default: 0.327ms
12000
default: 0.317ms
13000
default: 0.329ms
14000
default: 0.307ms
15000
default: 0.218ms
16000
default: 0.193ms
17000
default: 0.234ms
18000
default: 1.943ms
19000
default: 0.164ms
20000
default: 0.148ms