I have a list of dataframes like this one:

arr = [df1, df2, df3]

I want to get the position of an element in this list:

position_of_df2 = arr.index(df2)

But Python raises an error on this line:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What am I doing wrong? PS: What is the best way to get the index of a dataframe in a list? Only iteration?

  • Not answering your question, but you could perhaps solve your problem by storing your dataframes in a dict keyed by name, which would let you access them easily. – tmcnicol May 15 '18 at 09:01

2 Answers

list.index compares your input with each element of the list using == and takes the truth value of the result. It returns the index of the first match.

Testing two dataframes for equality with == returns a dataframe of element-wise results, not a single boolean:

import pandas as pd

df1 = pd.DataFrame([[1, 2]])
df2 = pd.DataFrame([[1, 2]])

print(df1 == df2)

      0     1
0  True  True

The truth value of the result is ambiguous:

print(bool(df1 == df2))

# ValueError: The truth value of a DataFrame is ambiguous.
# Use a.empty, a.bool(), a.item(), a.any() or a.all().
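
To tie this back to the question, here is a minimal sketch of the failure. It relies on a CPython detail: list.index treats an element that is the very same object as the target as a match without calling ==, so looking up the first element succeeds, while looking up a later one has to evaluate df1 == df2 first and fails.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2]])
df2 = pd.DataFrame([[1, 2]])
arr = [df1, df2]

# The first element is the same object as the target, so the identity
# shortcut matches it and == is never evaluated.
first = arr.index(df1)

# For df2 the list first compares df1 == df2, which yields a DataFrame;
# taking its truth value raises ValueError.
try:
    arr.index(df2)
    raised = False
except ValueError:
    raised = True

print(first, raised)
```

This is why the error can seem to depend on which dataframe you look up.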

Option 1

In my opinion, the best way to check if you have the correct dataframe is to use an ordered dictionary and define keys (preferably, use descriptive names as keys):

from collections import OrderedDict

o = OrderedDict([(1, df1), (2, df2), (3, df3)])

print(list(o.keys()).index(2))  # 1
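
As a sketch of the "descriptive names" suggestion: since Python 3.7 a plain dict preserves insertion order, so it can stand in for OrderedDict here. The key names below are hypothetical placeholders, not something from the question.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2]])
df2 = pd.DataFrame([[3, 4]])
df3 = pd.DataFrame([[5, 6]])

# Hypothetical descriptive keys; insertion order is preserved.
frames = {"first": df1, "second": df2, "third": df3}

# Position of a key in insertion order.
position = list(frames).index("second")

# Direct access by name, no search needed.
selected = frames["second"]

print(position, selected is df2)
```

With named keys, the position is often not needed at all, because the dataframe can be fetched directly by its key.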

Option 2

Use a generator expression with is, which returns True only if two variables point to the same object:

lst = [df1, df2, df3]

res = next(i for i, j in enumerate(lst) if j is df2)  # 1
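
Note that next raises StopIteration when nothing matches. A minimal sketch of the same idea wrapped in a helper with a default follows; the name index_by_identity and the -1 sentinel are assumptions chosen for illustration.

```python
import pandas as pd

def index_by_identity(items, target):
    """Return the position of the first element that is the same object
    as target, or -1 if there is none (hypothetical helper)."""
    return next((i for i, item in enumerate(items) if item is target), -1)

df1 = pd.DataFrame([[1, 2]])
df2 = pd.DataFrame([[1, 2]])  # equal values, but a distinct object
df3 = pd.DataFrame([[5, 6]])
lst = [df1, df2, df3]

found = index_by_identity(lst, df2)
# A copy holds the same values but is a different object, so it is not found.
missing = index_by_identity(lst, df2.copy())

print(found, missing)
```

Because the comparison uses is, df1 is not matched even though it holds the same values as df2.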
jpp

Based on @jpp's answer above and on https://stackoverflow.com/a/19918849/423725, I improvised a solution.

import pandas

df1 = pandas.DataFrame([1, 2])
df2 = pandas.DataFrame([3, 4])
df3 = pandas.DataFrame([5, 6])

arr = [df1, df2, df3]

def isEqual(df1, df2):
    # pandas.util.testing is deprecated; pandas.testing is the public module
    from pandas.testing import assert_frame_equal

    try:
        assert_frame_equal(df1, df2)
        return True
    except Exception:  # apparently AssertionError doesn't catch everything
        return False

def indexDF(df, arr):
    for index, dataframe in enumerate(arr):
        if isEqual(df, dataframe):
            return index

indexDF(df2, arr)
# 1
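
A shorter variant, sketched here as an assumption rather than taken from the answer above, uses DataFrame.equals, which returns a plain bool and so needs no try/except. The helper name index_by_value is hypothetical. Note that equals can be stricter or looser than assert_frame_equal in some cases (for example, it requires matching dtypes), so the two are not guaranteed to agree on every input.

```python
import pandas as pd

def index_by_value(frames, target):
    """Return the position of the first dataframe whose contents equal
    target, or None if there is none (hypothetical helper)."""
    for i, frame in enumerate(frames):
        if frame.equals(target):  # plain True/False, never a DataFrame
            return i
    return None

arr = [pd.DataFrame([1, 2]), pd.DataFrame([3, 4]), pd.DataFrame([5, 6])]

# A freshly built dataframe with the same contents as the second element.
by_value = index_by_value(arr, pd.DataFrame([3, 4]))
not_found = index_by_value(arr, pd.DataFrame([7, 8]))

print(by_value, not_found)
```

Like the function above, this finds the first dataframe with matching contents, not necessarily the same object.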
BcK
  • The only issue is if `df1 = df2 = pd.DataFrame([[1, 2]])`, then both dataframes will be considered the same for indexing purposes. I'm not *sure* that's what OP wants, but it should be noted. – jpp May 15 '18 at 09:01
  • @jpp That's correct. It's confusing to have identical dataframes with different ids. The question is whether I'm looking for the closest one or the one that exactly matches the specified df. – BcK May 15 '18 at 09:07
  • Yeah, it's about the first occurrence in the array. Thank you @BcK for this answer. – Scacal Mardan May 15 '18 at 09:14