Hello,

I'm just getting started with Dataiku. I imported a dataset that has many empty columns, and I'm looking for code or a method to delete all of the empty columns at once.

shosho88

2 Answers

Add a Python recipe to your dataset in Dataiku and use the code below to load the data into a pandas DataFrame:

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
dataset_obj = dataiku.Dataset("<Dataset_name>")
my_df = dataset_obj.get_dataframe()

After that, see the following link for ways to remove empty columns from a pandas DataFrame: https://www.geeksforgeeks.org/drop-empty-columns-in-pandas/
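
For reference, one common pandas way to do this is `DataFrame.dropna` with `axis=1` and `how="all"`. Here is a minimal sketch; `sample_df` and its column names are made up for illustration and stand in for `my_df`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the DataFrame loaded from Dataiku.
sample_df = pd.DataFrame({
    "id": [1, 2, 3],
    "empty_a": [np.nan, np.nan, np.nan],
    "name": ["a", None, "c"],
    "empty_b": [None, None, None],
})

# how="all" drops a column only when every value in it is missing;
# "name" survives because it has some non-missing values.
cleaned_df = sample_df.dropna(axis=1, how="all")

print(list(cleaned_df.columns))
```

Columns that are only partly empty are kept, so this should be safe to run on a dataset where you just want to get rid of the fully empty ones.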

Gaur

@Gaur is right: you can do it with a Python recipe. The following code removes all columns that are completely empty:

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
example = dataiku.Dataset("example")
example_df = example.get_dataframe()

print("Original data shape:", example_df.shape)

# Find the columns in which every value is missing, then drop them in one call
empty_columns = example_df.columns[example_df.isna().all()]
example_df = example_df.drop(columns=empty_columns)

print("New data shape:", example_df.shape)

# Write recipe outputs
no_empty_columns = dataiku.Dataset("no_empty_columns")
no_empty_columns.write_with_schema(example_df)
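
Note that the recipe above only removes columns whose values are all missing (NaN). If some columns look empty but actually hold empty or whitespace-only strings, a variation along these lines might help. This is only a sketch on made-up sample data, and `drop_blank_columns` is a hypothetical helper name, not something provided by Dataiku or pandas:

```python
import numpy as np
import pandas as pd


def drop_blank_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns whose values are all missing or whitespace-only strings."""
    # A cell counts as blank if it is missing or if its text form is empty
    # once surrounding whitespace is stripped.
    blank = df.apply(lambda col: col.isna() | col.astype(str).str.strip().eq(""))
    # Keep a column unless every one of its cells is blank.
    keep = ~blank.all()
    return df.loc[:, keep.to_numpy()]


# Hypothetical sample data for illustration.
sample_df = pd.DataFrame({
    "id": [1, 2],
    "blank": ["", "  "],
    "missing": [np.nan, np.nan],
    "name": ["a", ""],
})

result = drop_blank_columns(sample_df)
print(list(result.columns))
```

In the recipe you would call it on `example_df` before writing the output. Whether you need it depends on how the empty cells are actually stored in your dataset.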
edo