I'm trying to use the nodejs-polars library, but I've run into a problem converting a JavaScript array of objects to a polars DataFrame.

Consider the following data, for example:

const myData = [
    {
        "id": "a",
        "name": "fred",
        "country": "france",
        "age": 30,
        "city": "paris" // there's no "city" property elsewhere in `myData`
    },
    {
        "id": "b",
        "name": "alexandra",
        "country": "usa",
        "age": 40
    },
    {
        "id": "c",
        "name": "george",
        "country": "argentina",
        "age": 50
    }
]

If we then do:

const pl = require("nodejs-polars")

const output = pl.DataFrame(myData)

We get the error:

Error: Lengths don't match: Could not create a new DataFrame from Series. The Series have different lengths

Is there a way to create a polars DataFrame from an array of objects so that missing values are automatically populated with null?

Emman

1 Answer

While this is not currently supported by pl.DataFrame, the desired behavior is available via pl.readJSON. If you don't mind a small amount of pre-processing (serializing the rows as newline-delimited JSON), this is easy to achieve:

const ndJSONData = myData
  .map(row => JSON.stringify(row))
  .join("\n")

// -1 does a full scan of the rows to infer the schema; set it to a length that best fits your use case.
const df = pl.readJSON(ndJSONData, {"inferSchemaLength": -1})
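
To make the pre-processing step concrete, here is a small sketch in plain JavaScript (no polars calls) that shows what the newline-delimited JSON string looks like. The `rows` array is a shortened, hypothetical stand-in for `myData`.

```javascript
// Shortened stand-in for `myData`: only the first row has a "city" key.
const rows = [
  { id: "a", age: 30, city: "paris" },
  { id: "b", age: 40 },
];

// One JSON document per line (NDJSON); missing keys are simply absent.
const ndjson = rows.map((row) => JSON.stringify(row)).join("\n");

console.log(ndjson);
// {"id":"a","age":30,"city":"paris"}
// {"id":"b","age":40}
```

Because each line is parsed as its own record, a key that is missing from a line can be treated as null for that row, rather than producing columns of different lengths.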

Edit: This is now supported in newer versions of polars via pl.readRecords. Note that nested dtypes have only limited support.

const df = pl.readRecords(myData, {inferSchemaLength: 10})
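
If neither option is available in your version, another approach is to normalize the records yourself so that every row has the same keys, filling the gaps with null. This is only a sketch: `fillMissingKeys` is a hypothetical helper (not part of nodejs-polars), and I'm assuming the constructor you pass the result to accepts null values.

```javascript
// Hypothetical helper: give every row the union of all keys,
// using null wherever a row has no value for a key.
function fillMissingKeys(rows) {
  // Collect every key that appears in any row, in first-seen order.
  const keys = [...new Set(rows.flatMap((row) => Object.keys(row)))];
  return rows.map((row) =>
    Object.fromEntries(keys.map((key) => [key, row[key] ?? null]))
  );
}

const sample = [
  { id: "a", age: 30, city: "paris" },
  { id: "b", age: 40 },
];

const normalized = fillMissingKeys(sample);
console.log(normalized[1]);
// { id: 'b', age: 40, city: null }
```

Every row in `normalized` now has the same set of keys, so the columns built from it all have the same length.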
Cory Grinstead