
I am trying to append a new row to a tf.data.Dataset in TensorFlow.js. After searching, I figured that the way to do this is to turn the new row, which is originally a JSON object, into a dataset and then concatenate it with the previous one, but I ended up facing this error:

 "this.lastRead.then is not a function" 

To debug it, I tried concatenating the same dataset with itself and hit the same problem:

const csvUrl = 'https://storage.googleapis.com/tfjs-examples/multivariate-linear-regression/data/boston-housing-train.csv';
const a = tf.data.csv(
  csvUrl, {
    columnConfigs: {
      medv: {
        isLabel: true
      }
    }
  });

const b = a.concatenate(a);
await b.forEachAsync(e => console.log(e));

This produces the same error message. Can you help me out?

bibs2091

2 Answers


There is currently a bug that prevents datasets from being concatenated. Until a fix ships in a new version, the dataset iterators can be used to do the concatenation. Here is an example:

const csvUrl =
'https://storage.googleapis.com/tfjs-examples/multivariate-linear-regression/data/boston-housing-train.csv';

async function run() {

  const csvDataset = tf.data.csv(
    csvUrl, {
      columnConfigs: {
        medv: {
          isLabel: true
        }
      }
    });

  const numOfFeatures = (await csvDataset.columnNames()).length - 1;

  // Prepare the Dataset for training.
  const flattenedDataset =
    csvDataset
    .map(({xs, ys}) =>
      {
        // Convert xs(features) and ys(labels) from object form (keyed by
        // column name) to array form.
        return {xs:Object.values(xs), ys:Object.values(ys)};
      })
    //.batch(10);

  const it = await flattenedDataset.iterator();
  const it2 = await flattenedDataset.iterator();
  const xs = [];
  const ys = [];
  // Read only the first 5 rows; reading the whole dataset at once
  // would consume a lot of memory.
  for (let i = 0; i < 5; i++) {
    const e = await it.next();
    const f = await it2.next();
    // Join the values read from the two iterators for this row.
    xs.push(e.value.xs.concat(f.value.xs));
    ys.push(e.value.ys.concat(f.value.ys));
  }
  const features = tf.tensor(xs)
  const labels = tf.tensor(ys)

  console.log(features.shape)
  console.log(labels.shape)

}

await run();

The only thing to keep in mind is that the above loads all the tensors into memory, which may not be ideal depending on the size of the dataset. A data generator can be used to reduce the memory footprint. Here is a very detailed answer that shows how to do it.
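
As a rough illustration of the generator idea, here is a minimal sketch in plain JavaScript. The helper `chainRows`, the factory `makeRows`, and the sample rows are hypothetical names and data made up for this example; they are not part of TensorFlow.js. The generator yields the rows of each source one at a time, so nothing has to be materialized up front.

```javascript
// Hypothetical helper: lazily yields every row of each source, in order.
function* chainRows(...sources) {
  for (const source of sources) {
    yield* source;
  }
}

// Made-up in-memory rows standing in for the existing data and the new row.
const existingRows = [
  {xs: [0.1, 0.2], ys: [1]},
  {xs: [0.3, 0.4], ys: [2]},
];
const newRows = [{xs: [0.5, 0.6], ys: [3]}];

// A factory that returns a fresh iterator each time it is called.
const makeRows = () => chainRows(existingRows, newRows);

// Assuming TensorFlow.js is loaded as `tf`, the factory could be handed to
// tf.data.generator, which expects a function returning an iterator:
//   const ds = tf.data.generator(makeRows);
//   await ds.forEachAsync(row => console.log(row));

// Plain-JavaScript check that the rows come out in order.
const combined = [...makeRows()];
console.log(combined.length);
```

Because `makeRows` builds a new iterator on every call, the resulting dataset could be iterated more than once, for example across several training epochs.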

edkeveked

Apparently this was a bug in the TensorFlow.js library. It has been fixed in this pull request: https://github.com/tensorflow/tfjs/pull/5444

Thank you.

Edit

The pull request has been merged, and the fix is now available in version 3.9.0.

bibs2091