I am new to Python and am trying to get a better understanding of working with dictionaries. I get an error when I try to compute the correlation of the dictionary values for each year, from 2010 up to 2021.

data = {}
for year in names:
    tableName = category+str(year)
    n = ', '.join(f'"{name}"' for name in names[year])
    sqlQ = f'select "Date", {n} from {tableName};'
    cur.execute(sqlQ)
    res = cur.fetchall()
    data[year] = {}
    for row in res:
        if row[0] in dateRange and row[0].year == year:
            date = row[0].strftime("%d/%m/%Y")
            data[year][date] = []
            for i in range(1, len(row)):
                data[year][date].append(float(row[i]))
data

{2010: {'01/01/2010': [18.16, 59.73, 218.41, 101.14, 44.15, 10.0, 134.52, 40.15, 7.53, 18.19, 24.2, 10.38, 25.6, 48.17, 18.44, 32.1, 9.11, 39.73, 88.16, 23.97, 14.19, 15.34, 45.86, 24.25, 39.83, 43.46, 8.07, 9.8, 7.87, 32.73, 13.57, 39.04, 18.68, 11.55, 22.67, 52.97, 106.15, 47.49, 34.16, 26.67, 61.96, 54.09, 7.24, 14.94, 78.03, 19.31, 34.36, 10.9, 10.73, 35.27, 16.1, 37.61, 16.61, 48.65, 38.12, 26.78, 104.99, 26.73, 22.46, 30.68, 30.7, 65.09, 50.28, 80.31, 19.91, 32.27, 62.5, 6.03, 9.14, 16.16, 24.53, 33.29, 53.25, 168.84, 10.68, 130.9, 30.12, 23.01, 23.49, 80.66, 58.73, 9.15, 16.52, 96.83, 35.17, 15.11, 21.64, 23.39, 24.24, 26.93, 42.85, 24.18, 61.75, 80.98, 30.48, 40.68, 26.06, 3.81, 30.5, 13.9], '04/01/2010': [18.37, 61.71, 223.96, 102.92, 45.26, 10.28, 133.9, 41.73, 7.64, 18.89, 24.28, 10.38, 25.68, ...

import pandas as pd
df = data
for key, value in df.items():
    #print(key,value)
    df.corr(method ='pearson')

AttributeError: 'dict' object has no attribute 'corr'

Ailurophile

1 Answer

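A plain dict has no corr method, which is what the AttributeError is telling you. Build a pandas DataFrame from each year's dictionary first, e.g. df = pd.DataFrame(value), and call corr on that: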

import pandas as pd
for key, value in data.items():
    print(key)
    df = pd.DataFrame(value)
    print(df.corr(method ='pearson'))
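
Note that pd.DataFrame(value) on a dict of lists makes each date a column, so corr compares dates with each other. If the goal is instead the correlation between the named series within each year (an assumption, the question does not say), a minimal sketch could build the frame with orient="index" so that dates become rows. The sample values and the helper name yearly_correlations below are hypothetical, chosen only to mirror the shape of data in the question; skipping empty years is also an assumption, not something confirmed by the asker.

```python
import pandas as pd

# Hypothetical sample shaped like `data` in the question:
# {year: {"dd/mm/YYYY": [one value per name, in column order]}}
data = {
    2010: {
        "01/01/2010": [1.0, 2.0, 10.0],
        "04/01/2010": [2.0, 4.0, 8.0],
        "05/01/2010": [3.0, 6.0, 7.0],
    },
    2011: {
        "03/01/2011": [5.0, 1.0, 2.0],
        "04/01/2011": [6.0, 3.0, 1.0],
        "05/01/2011": [7.0, 2.0, 0.0],
    },
}


def yearly_correlations(data, method="pearson"):
    """Return {year: correlation matrix between the value columns}."""
    result = {}
    for year, by_date in data.items():
        if not by_date:
            # No rows matched for this year; skip instead of building an empty frame.
            continue
        # orient="index": each date becomes a row, each list position a column.
        df = pd.DataFrame.from_dict(by_date, orient="index")
        result[year] = df.corr(method=method)
    return result


corrs = yearly_correlations(data)
for year, matrix in corrs.items():
    print(year)
    print(matrix)
```

The columns here are labelled 0, 1, 2, ... by position. If the list of names for a year is available, it could be passed as columns= to from_dict to label them.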
imdevskp
  • Then it only gets the first key, 2010, but not the rest. I can see that the values for the rest of the data are NaN - TypeError: corr() missing 1 required positional argument: 'other' – fallen mcmullan May 01 '21 at 08:20
  • @fallenmcmullan, does putting the DataFrame inside the loop solve the problem? Please see the updated code – imdevskp May 01 '21 at 08:24
  • I still get a lot of empty datasets then @imdevskp – fallen mcmullan May 01 '21 at 08:26
  • do you mind adding the output of `data.keys()` and `data[2010].keys()` in the question – imdevskp May 01 '21 at 08:31
  • It prints dict_keys([2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]) dict_keys(['01/01/2010', '04/01/2010', '05/01/2010', '06/01/2010', '07/01/2010', '08/01/2010', '11/01/2010', '12/01/2010', '13/01/2010', '14/01/2010', '15/01/2010', '18/01/2010', '19/01/2010', '20/01/2010', '21/01/2010', '22/01/2010', '25/01/2010', but then it ends at end of year 2010 – fallen mcmullan May 01 '21 at 08:35
  • I have updated the code. Can you include the output of that too if it's not working? – imdevskp May 01 '21 at 08:39
  • When I did it with an Excel file and not from the database I got this result. I don't understand why it is not the same now https://prnt.sc/12co77f ... I get this https://prnt.sc/12co8mo – fallen mcmullan May 01 '21 at 08:47