I have two lists, where the first one is a list of strings called names and has been generated by using the name of the corresponding csv files.

names = ['ID1','ID2','ID3'] 

I have loaded the csv files into individual pandas dataframes and then done some preprocessing which leaves me with a list of lists, where each element is the data of each dataframe:

dfs = [['car','fast','blue'],[],['red','bike','slow']]

As you can see, a dataframe can end up empty after preprocessing, which leads to an empty list in dfs.

I would like to remove the empty element from this list and get its index. So far I have tried the following, but I get no index when printing k:

k = [i for i,x in enumerate(dfs) if not x]

I need this index so that I can then remove the element at the same position in the list names.

The end result would look like this:

names = ['ID1','ID3'] 
dfs = [['car','fast','blue'],['red','bike','slow']]

This way I can then save each individual dataframe as a csv file:

for df, name in zip(dfs, names):
    df.to_csv(name + '_.csv', index=False)

EDIT: I made a mistake. The empty entry in the list of lists called dfs needed changing from [''] to [].

msa
  • `if not x` will not result in True as the list has a value `['']`. The condition `if not x` will get triggered only if the list was `[]` without anything inside. – Joe Ferndz Mar 30 '21 at 12:58
  • Removed the pandas tag since nothing here is related to pandas – BENY Mar 30 '21 at 13:06
  • your for loop `for df, name in zip(dfs,names)` is just a loop of lists. So `df` is not really a dataframe. The line inside the for loop `df.to_csv(name +...)` will NOT work. Are you sure it will work? – Joe Ferndz Mar 30 '21 at 13:17
  • @JoeFerndz yes this part of the code works. – msa Mar 30 '21 at 13:18
  • But then your `k = [i for i,x in enumerate(dfs) if not x]` should work, right? – Red Mar 30 '21 at 13:25
  • @AnnZen no, it still doesn't. I tried all the methods that have been suggested and it is driving me insane... the output is still just an empty list and no index – msa Mar 30 '21 at 13:27
  • @msa Does my edit help? – Red Mar 30 '21 at 13:29
  • if list of lists ends up with `[]` instead of `['']`, then your code `k = [i for i,x in enumerate(dfs) if not x]` should work. What is not working? – Joe Ferndz Mar 30 '21 at 13:31
  • please change it to `k = [i for i,x in enumerate(dfs) if x]`. That's the mistake you have – Joe Ferndz Mar 30 '21 at 13:35

4 Answers

You can use the built-in any() function:

k = [i for i, x in enumerate(dfs) if not any(x)]

The reason your

k = [i for i, x in enumerate(dfs) if not x]

doesn't work is that any non-empty list is truthy, regardless of what it contains. The list [''] holds one element (an empty string), so not x evaluates to False for it.

The any() function takes an iterable and returns True if at least one of its elements is truthy. If no element is truthy, or the iterable is empty, it returns False. An empty string, '', is falsy.
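To make these truthiness rules concrete, here is a small sketch comparing `not x` with `not any(x)`. The names `empty`, `blank`, `k_not`, and `k_any` are illustrative and do not come from the question:

```python
# Illustrative values: an empty list and a list holding one empty string
empty = []
blank = ['']

print(not empty)       # True: an empty list is falsy
print(not blank)       # False: the list has one element, so it is truthy
print(not any(blank))  # True: the only element, '', is falsy
print(not any(empty))  # True: any() of an empty iterable is False

dfs = [['car', 'fast', 'blue'], [''], ['red', 'bike', 'slow']]

k_not = [i for i, x in enumerate(dfs) if not x]
k_any = [i for i, x in enumerate(dfs) if not any(x)]

print(k_not)  # []
print(k_any)  # [1]
```

Note that `not any(x)` also matches lists whose elements are all falsy, such as `[0, '']`, so it is only suitable when such lists should be treated as empty too.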

EDIT: The question got edited, here is my updated answer:

You can try creating new lists:

names = ['ID1','ID2','ID3'] 
dfs = [['car','fast','blue'],[],['red','bike','slow']]

new_names = list()
new_dfs = list()

for i, x in enumerate(dfs):
    if x:
        new_names.append(names[i])
        new_dfs.append(x)

print(new_names)
print(new_dfs)

Output:

['ID1', 'ID3']
[['car', 'fast', 'blue'], ['red', 'bike', 'slow']]

If it doesn't work, try adding a print(x) to the loop to see what is going on:

names = ['ID1','ID2','ID3'] 
dfs = [['car','fast','blue'],[],['red','bike','slow']]

new_names = list()
new_dfs = list()

for i, x in enumerate(dfs):
    print(x)
    if x:
        new_names.append(names[i])
        new_dfs.append(x)
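The same filtering can also be written without indexing, by pairing each name with its data using `zip`. This is only a sketch of an alternative; `kept` is an illustrative name:

```python
names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [], ['red', 'bike', 'slow']]

# Keep only the (name, data) pairs whose data is non-empty
kept = [(name, data) for name, data in zip(names, dfs) if data]

# Split the pairs back into two parallel lists
new_names = [name for name, _ in kept]
new_dfs = [data for _, data in kept]

print(new_names)  # ['ID1', 'ID3']
print(new_dfs)    # [['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
```

Because each name travels with its data, the two lists cannot drift out of step.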
Red

Since you are already using enumerate, you do not have to loop again.

Hope this solves your problem:

names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [''], ['red', 'bike', 'slow']]


# Walk the list backwards so a deletion does not shift items not yet visited
for index, i in reversed(list(enumerate(dfs))):
    if len(i) == 1 and '' in i:
        del dfs[index]
        del names[index]

print(names)
print(dfs)

# Output
# ['ID1', 'ID3']
# [['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
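One caveat when deleting inside a loop: removing items from a list while iterating over it forwards shifts the remaining items, so the element right after a deleted one is never examined. The sketch below uses made-up data with two consecutive `['']` entries to show the effect, next to a reverse-order loop that avoids it. The function names are hypothetical:

```python
def drop_blank_forward(names, dfs):
    """Delete while iterating forwards; can skip the item after a deletion."""
    names, dfs = list(names), list(dfs)  # work on copies
    for index, item in enumerate(dfs):
        if item == ['']:
            del dfs[index]
            del names[index]
    return names, dfs


def drop_blank_reverse(names, dfs):
    """Delete while iterating backwards; earlier indices stay valid."""
    names, dfs = list(names), list(dfs)  # work on copies
    for index in range(len(dfs) - 1, -1, -1):
        if dfs[index] == ['']:
            del dfs[index]
            del names[index]
    return names, dfs


# Made-up data with two blank entries in a row
names = ['ID1', 'ID2', 'ID3', 'ID4']
dfs = [['car'], [''], [''], ['bike']]

print(drop_blank_forward(names, dfs))
# (['ID1', 'ID3', 'ID4'], [['car'], [''], ['bike']])  one blank entry survives

print(drop_blank_reverse(names, dfs))
# (['ID1', 'ID4'], [['car'], ['bike']])
```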
Neeraj

I think the issue is caused by [''].

l = ['']
len(l)

Gives output as 1. Hence,

not l

Gives False

If you are sure it will be [''] only, then try

dfs = [['car','fast','blue'],[''],['red','bike','slow']]

k = [i for i,x in enumerate(dfs) if len(x)==1 and x[0]=='']

This gives [1] as the output.

Alternatively, you can use not any(x) as the condition.
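Once `k` holds the indices to drop, both lists can be filtered against it. A minimal sketch, assuming `names` is the list from the question and using `drop` as an illustrative name:

```python
names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [''], ['red', 'bike', 'slow']]

# Indices of entries that contain only an empty string
k = [i for i, x in enumerate(dfs) if len(x) == 1 and x[0] == '']

# A set makes the membership checks below cheap
drop = set(k)

# Rebuild both lists, skipping the positions listed in k
names = [name for i, name in enumerate(names) if i not in drop]
dfs = [x for i, x in enumerate(dfs) if i not in drop]

print(k)      # [1]
print(names)  # ['ID1', 'ID3']
print(dfs)    # [['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
```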

Red
Doof

Looking at the data presented, I would do the following:

Step 1: Check whether each list has any values. If it does, the condition if df evaluates to True and the list is kept.

Step 2: Once you have the list, create a dataframe and write to csv.

The code is as shown below:

names = ['ID1','ID2','ID3'] 
dfs = [['car','fast','blue'],[],['red','bike','slow']]

dfx = {names[i]:df for i,df in enumerate(dfs) if df}

import pandas as pd
for name,val in dfx.items():
    df = pd.DataFrame({name:val})
    df.to_csv(name + '_.csv', index=False)
Joe Ferndz