
I have a pandas dataframe which I have collected from a MongoDB.

The column names are a series of dates, e.g. 4/7/20, 4/8/20, etc.

What I want to do is find the most recent date for which a column exists, because I want to delete all the other date columns before writing the dataframe to a PostgreSQL database.

I was intending:

  1. Set a variable with today's date
  2. Loop through the column names comparing with today's date
  3. If a matching column exists, keep that date as the name of the column to retain
  4. If it does not, move the date back by one day and check again until I get a match.
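
The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the column labels are unpadded M/D/YY strings such as 4/7/20, the helper names `format_column_date` and `most_recent_date_column` are hypothetical, the `max_days_back` limit is an arbitrary safeguard against looping forever, and the sample dataframe is made up to stand in for the MongoDB data.

```python
from datetime import date, timedelta

import pandas as pd


def format_column_date(d):
    # Build an unpadded M/D/YY label such as "4/7/20" (assumed column format).
    return f"{d.month}/{d.day}/{d:%y}"


def most_recent_date_column(columns, start, max_days_back=366):
    # Walk backwards one day at a time from `start` until a label matches.
    available = set(columns)
    for offset in range(max_days_back + 1):
        label = format_column_date(start - timedelta(days=offset))
        if label in available:
            return label
    return None  # no matching column within the search window


# Made-up data standing in for the MongoDB collection.
data_pandas = pd.DataFrame(
    {"country": ["A", "B"], "4/7/20": [1, 2], "4/8/20": [3, 4]}
)

# A fixed start date keeps the example reproducible; date.today() would
# be used in practice.
latest = most_recent_date_column(data_pandas.columns, date(2020, 4, 10))
print(latest)
```

Building the label from `d.month` and `d.day` avoids the zero-padding that `strftime('%m/%d')` produces, so no string replacement tricks are needed.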

I am trying to get a list of column names from the dataframe, but when I run


    collection = client['DB_Name']['DB_Collection']
    
    df = collection.find()
    
    data_pandas = pd.DataFrame(list(df))
    
    index_list = list(data_pandas.index.values.tolist()) 
    
    today = date.today()
    
    today = today.strftime('X%m/X%d/%Y').replace('X0','X').replace('X','')
    
    print(df.columns)

I get an error:

'Cursor' object has no attribute 'columns'

The data frame looks fine from the IDE. What can I do to resolve this?

  • I've removed "Mods - feel free to delete question" from the self-answer. I guess this means there was just a trivial error stopping this working. – halfer Jul 06 '20 at 18:36

2 Answers


MongoDB's find() returns a cursor object, which you'll need to convert into a list of documents (for example with list(...)) before passing it to pandas. Take a look here:

How can I load data from mongodb collection into pandas' DataFrame?

  • Thanks for your help - that is what I have done though? (Edited the source code in my original question to add more detail) –  Apr 17 '20 at 14:49

I was calling print(df.columns), where df is the Cursor returned by find(), instead of print(data_pandas.columns), where data_pandas is the actual dataframe.
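
A minimal sketch of the fix, assuming a plain iterator of dictionaries as a stand-in for the pymongo Cursor so that it runs without a database (the sample documents are made up):

```python
import pandas as pd

# Stand-in for `collection.find()`: an iterator of documents (dicts).
# Like a pymongo Cursor, it has no `.columns` attribute.
cursor = iter([
    {"_id": 1, "country": "A", "4/7/20": 1, "4/8/20": 3},
    {"_id": 2, "country": "B", "4/7/20": 2, "4/8/20": 4},
])

# Materialise the documents into a list, then build the dataframe.
data_pandas = pd.DataFrame(list(cursor))

# Ask the dataframe, not the cursor, for its column names.
column_names = data_pandas.columns.tolist()
print(column_names)
```

Giving the cursor a name such as `cursor` rather than `df` makes this kind of mix-up less likely.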
