Hello,
I'm just starting to use Dataiku. I imported a dataset with many empty columns. I'm looking for code or a method to delete all the empty columns at once.
Add a Python recipe to your dataset in Dataiku and use the code below to load the data into a pandas DataFrame:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
dataset_obj = dataiku.Dataset("<Dataset_name>")
my_df = dataset_obj.get_dataframe()
After that, you can refer to the link below for removing empty columns from a pandas DataFrame: https://www.geeksforgeeks.org/drop-empty-columns-in-pandas/
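For reference, here is a minimal sketch of the usual pandas approach, assuming "empty" means that every value in the column is missing (NaN/None). The toy DataFrame and its column names are made up for illustration; in a recipe you would apply the same `dropna` call to `my_df` loaded above.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the DataFrame loaded from Dataiku
# (hypothetical column names, for illustration only)
my_df = pd.DataFrame({
    "id": [1, 2, 3],
    "empty_a": [np.nan, np.nan, np.nan],
    "name": ["a", None, "c"],
    "empty_b": [None, None, None],
})

# Drop every column in which all values are missing;
# columns with only some missing values are kept
cleaned_df = my_df.dropna(axis=1, how="all")

print(list(cleaned_df.columns))
```

`how="all"` matters here: with the default `how="any"`, a column would be dropped as soon as it contained a single missing value.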
@Gaur is right, you can do it with a Python recipe. The following code removes all columns that are completely empty:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
example = dataiku.Dataset("example")
example_df = example.get_dataframe()
print("Original data shape:", example_df.shape)
for column, rows in example_df.items():
    if rows.isna().all():
        example_df = example_df.drop(column, axis=1)
print("New data shape:", example_df.shape)
# Write recipe outputs
no_empty_columns = dataiku.Dataset("no_empty_columns")
no_empty_columns.write_with_schema(example_df)
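One caveat: `isna()` only catches missing values. If some of your columns look empty but actually hold empty strings or whitespace, they would survive the loop above. Whether that happens depends on how your data was imported, so treat the following as a sketch under that assumption rather than something you necessarily need. The helper names `is_blank` and `drop_blank_columns` are hypothetical, and the sample data is made up.

```python
import numpy as np
import pandas as pd


def is_blank(series: pd.Series) -> pd.Series:
    """Flag values that are missing or are empty/whitespace-only strings."""
    # Missing values are caught by isna(); everything else is compared
    # as a stripped string, so "" and "   " count as blank
    return series.isna() | series.astype(str).str.strip().eq("")


def drop_blank_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return df without the columns whose values are all blank."""
    keep = [col for col in df.columns if not is_blank(df[col]).all()]
    return df[keep]


# Made-up sample data for illustration
sample_df = pd.DataFrame({
    "id": [1, 2],
    "blank": ["", "  "],
    "missing": [None, np.nan],
    "text": ["x", ""],
})

result_df = drop_blank_columns(sample_df)
print(list(result_df.columns))
```

In the recipe, you could call `drop_blank_columns(example_df)` in place of the loop before writing the output dataset.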