remove empty dataframe from list and drop corresponding name in second list

Question

I have two lists, where the first one is a list of strings called names and has been generated by using the name of the corresponding csv files.

names = ['ID1','ID2','ID3']

I have loaded the csv files into individual pandas dataframes and then done some preprocessing which leaves me with a list of lists, where each element is the data of each dataframe:

dfs = [['car','fast','blue'],[],['red','bike','slow']]

As you can see, a dataframe can end up empty after preprocessing, which leads to an empty list in dfs.

I would like to remove that element from the list and get its index. So far I have tried this, but I get no index when printing k.

k = [i for i,x in enumerate(dfs) if not x]

I need this index so that I can then remove the element at the same position in the list names.

The end result would look like this:

names = ['ID1','ID3'] 
dfs = [['car','fast','blue'],['red','bike','slow']]

This way I can then save each individual dataframe as a csv file:

for df, name in zip(dfs, names):
    df.to_csv(name + '_.csv', index=False)

EDIT: I MADE A MISTAKE: The list of lists called dfs needs changing from [''] to []

`if not x` will not result in True as the list has a value `['']`. The condition `if not x` will get triggered only if the list was `[]` without anything inside. — Joe Ferndz, Mar 30 '21 at 12:58
your for loop `for df, name in zip(dfs,names)` is just a loop of lists. So `df` is not really a dataframe. The line inside the for loop `df.to_csv(name +...)` will NOT work. Are you sure it will work? — Joe Ferndz, Mar 30 '21 at 13:17
But then your `k = [i for i,x in enumerate(dfs) if not x]` should work, right? — Red, Mar 30 '21 at 13:25
@AnnZen no, it still doesn't - I tried all the methods that have been suggested and it is driving me insane... the output is still just an empty list and no index — msa, Mar 30 '21 at 13:27
if list of lists ends up with `[]` instead of `['']`, then your code `k = [i for i,x in enumerate(dfs) if not x]` should work. What is not working? — Joe Ferndz, Mar 30 '21 at 13:31
Please change it to `k = [i for i,x in enumerate(dfs) if x]`. That's the mistake you have — Joe Ferndz, Mar 30 '21 at 13:35

Red · Accepted Answer · 2021-03-30T13:28:50.187

You can use the built-in any() function:

k = [i for i, x in enumerate(dfs) if not any(x)]

The reason your

k = [i for i, x in enumerate(dfs) if not x]

doesn't work is that a list is truthy whenever it is not empty, regardless of what it contains. A list such as [''] has one element, so not x is False for it.

The any() function takes an iterable and returns True if at least one of its elements is truthy. If no element is truthy (or the iterable is empty), it returns False. An empty string, '', is falsy.
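As a small illustration of the difference between the two conditions, here is a sketch that runs both on the same data. The sample list and the variable names are made up for this example; it contains both an empty list and a list holding only an empty string.

```python
dfs = [['car', 'fast', 'blue'], [], [''], ['red', 'bike', 'slow']]

# `not x` is True only for a list with no elements at all.
empty_by_truthiness = [i for i, x in enumerate(dfs) if not x]

# `not any(x)` is True when no element is truthy, so it matches [] and [''].
empty_by_any = [i for i, x in enumerate(dfs) if not any(x)]

print(empty_by_truthiness)  # [1]
print(empty_by_any)         # [1, 2]
```

Which condition is appropriate depends on whether the preprocessing leaves [] or [''] behind.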

EDIT: The question got edited, here is my updated answer:

You can try creating new lists:

names = ['ID1','ID2','ID3'] 
dfs = [['car','fast','blue'],[],['red','bike','slow']]

new_names = list()
new_dfs = list()

for i, x in enumerate(dfs):
    if x:
        new_names.append(names[i])
        new_dfs.append(x)

print(new_names)
print(new_dfs)

Output:

['ID1', 'ID3']
[['car', 'fast', 'blue'], ['red', 'bike', 'slow']]

If it doesn't work, try adding a print(x) to the loop to see what is going on:

names = ['ID1','ID2','ID3'] 
dfs = [['car','fast','blue'],[],['red','bike','slow']]

new_names = list()
new_dfs = list()

for i, x in enumerate(dfs):
    print(x)
    if x:
        new_names.append(names[i])
        new_dfs.append(x)
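If the indices of the removed entries are also needed, as the question asks, the same idea could be wrapped in a small helper. This is only a sketch: drop_empty is a hypothetical name, and it assumes that names and dfs have the same length and that "empty" means an empty list.

```python
def drop_empty(names, dfs):
    """Return (new_names, new_dfs, removed), leaving the input lists untouched."""
    # Indices of the entries whose data is an empty list.
    removed = [i for i, data in enumerate(dfs) if not data]
    skip = set(removed)
    # Rebuild both lists without the skipped positions.
    new_names = [name for i, name in enumerate(names) if i not in skip]
    new_dfs = [data for i, data in enumerate(dfs) if i not in skip]
    return new_names, new_dfs, removed


names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [], ['red', 'bike', 'slow']]

new_names, new_dfs, removed = drop_empty(names, dfs)
print(removed)    # [1]
print(new_names)  # ['ID1', 'ID3']
print(new_dfs)    # [['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
```

Building new lists keeps the originals intact, which makes it easy to compare before and after while debugging.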

Neeraj · Answer 2 · 2021-03-30T13:06:04.347

Since you are already using enumerate, you do not have to loop again.

Hope this solves your problem:

names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [''], ['red', 'bike', 'slow']]


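# Iterate over a reversed snapshot so that deleting an entry does not
# shift the positions of the entries that are still to be checked.
for index, i in reversed(list(enumerate(dfs))):
    if len(i) == 1 and '' in i:
        del dfs[index]
        del names[index]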

print(names)
print(dfs)

# Output
# ['ID1', 'ID3']
# [['car', 'fast', 'blue'], ['red', 'bike', 'slow']]

edited Mar 30 '21 at 13:06

answered Mar 30 '21 at 12:57

Neeraj


be careful, if there are many values and one of it happens to be `''`, it might delete the index – Joe Ferndz Mar 30 '21 at 12:59
I figured that might be a problem. I haven't used pandas much, i assumed it would be '' always if its empty. – Neeraj Mar 30 '21 at 13:02
Also, try not to manipulate a list while you are iterating through it using for loop – Joe Ferndz Mar 30 '21 at 13:03
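To illustrate the warning about changing a list while looping over it, here is a sketch with two consecutive [''] entries. The data and variable names are invented for this example. Deleting during a forward loop skips the entry that slides into the freed position, while building new lists does not.

```python
names = ['ID1', 'ID2', 'ID3', 'ID4']
dfs = [['car'], [''], [''], ['bike']]

# Forward loop with deletion: once index 1 is deleted, the second ['']
# moves into index 1, but the loop continues at index 2 and never sees it.
buggy_names, buggy_dfs = list(names), list(dfs)
for index, item in enumerate(buggy_dfs):
    if item == ['']:
        del buggy_dfs[index]
        del buggy_names[index]

# Building new lists from the pairs avoids the problem.
pairs = [(name, data) for name, data in zip(names, dfs) if data != ['']]
safe_names = [name for name, _ in pairs]
safe_dfs = [data for _, data in pairs]

print(buggy_names, buggy_dfs)  # ['ID1', 'ID3', 'ID4'] [['car'], [''], ['bike']]
print(safe_names, safe_dfs)    # ['ID1', 'ID4'] [['car'], ['bike']]
```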

score 0 · Answer 3 · edited Mar 30 '21 at 13:18

I think the issue is because of [''].

l = ['']
len(l)

Gives output as 1. Hence,

not l

Gives False

If you are sure it will be [''] only, then try

dfs = [['car','fast','blue'],[''],['red','bike','slow']]

k = [i for i,x in enumerate(dfs) if len(x)==1 and x[0]=='']

This gives [1] as output.

Or you can try with any(x)
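One thing to keep in mind when choosing between the two checks: any(x) looks at the truthiness of every element, so it also flags lists that hold only falsy values such as 0. The sketch below uses invented sample data (rows) to show the difference from the exact [''] comparison.

```python
rows = [['car', 'fast'], [''], [], [0, 0], ['', 'bike']]

# Flags every list in which no element is truthy.
flagged_by_any = [i for i, x in enumerate(rows) if not any(x)]

# Flags only lists that are exactly [''].
flagged_by_exact = [i for i, x in enumerate(rows) if x == ['']]

print(flagged_by_any)    # [1, 2, 3]
print(flagged_by_exact)  # [1]
```

If the data can legitimately contain zeros or other falsy values, the exact comparison is probably the safer choice.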

edited Mar 30 '21 at 13:18

Red

answered Mar 30 '21 at 13:05

Doof


Joe Ferndz · Answer 4 · 2021-03-30T13:39:28.267

Looking at the data presented, I would do the following:

Step 1: Check whether each list has any values. If it does, the condition if df evaluates to True.

Step 2: Once you have the list, create a dataframe and write to csv.

The code is as shown below:

import pandas as pd

names = ['ID1','ID2','ID3']
dfs = [['car','fast','blue'],[],['red','bike','slow']]

# Map each name to its list, skipping the empty lists
dfx = {names[i]:df for i,df in enumerate(dfs) if df}

# Build a one-column dataframe per name and write it to csv
for name,val in dfx.items():
    df = pd.DataFrame({name:val})
    df.to_csv(name + '_.csv', index=False)
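For checking the two steps without pandas, here is a rough sketch that swaps in the standard-library csv module and writes into a temporary directory. This is a substitution for illustration, not the approach of the answer above, and the one-value-per-row layout is an assumption about the desired file format.

```python
import csv
import tempfile
from pathlib import Path

names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [], ['red', 'bike', 'slow']]

# Step 1: keep only the (name, values) pairs whose list is non-empty.
kept = {name: values for name, values in zip(names, dfs) if values}

# Step 2: write one file per kept name. A temporary directory is used
# so that the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as out_dir:
    for name, values in kept.items():
        path = Path(out_dir) / (name + '_.csv')
        with open(path, 'w', newline='') as handle:
            writer = csv.writer(handle)
            writer.writerow([name])                        # header row
            writer.writerows([value] for value in values)  # one value per row
    written = sorted(p.name for p in Path(out_dir).iterdir())

print(written)  # ['ID1_.csv', 'ID3_.csv']
```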
