I'm struggling with getting a simple correlation done. I've tried all that was suggested under similar questions.

Here are the relevant parts of the code, the various attempts I've made and their results.

import numpy as np
import pandas as pd

try01 = data[['ESA Index_close_px', 'CCMP Index_close_px' ]].corr(method='pearson')

print (try01) 

Out:

Empty DataFrame
Columns: []
Index: []

try04 = data['ESA Index_close_px'][5:50].corr(data['CCMP Index_close_px'][5:50])
print (try04)

Out:

**AttributeError: 'float' object has no attribute 'sqrt'**

Using numpy:

try05 = np.corrcoef(data['ESA Index_close_px'],data['CCMP Index_close_px'])
print (try05)

Out:

AttributeError: 'float' object has no attribute 'sqrt'

Converting the columns to lists:

ESA_Index_close_px_list = list()
start_value = 1
end_value = len (data['ESA Index_close_px']) +1
for items in data['ESA Index_close_px']:
    ESA_Index_close_px_list.append(items)
    start_value = start_value+1    
    if start_value == end_value:
        break
    else:
        continue

CCMP_Index_close_px_list = list()
start_value = 1
end_value = len (data['CCMP Index_close_px']) +1
for items in data['CCMP Index_close_px']:
    CCMP_Index_close_px_list.append(items)
    start_value = start_value+1    
    if start_value == end_value:
        break
    else:
        continue

try06 = np.corrcoef(['ESA_Index_close_px_list','CCMP_Index_close_px_list'])
print (try06)

Out:

**TypeError: cannot perform reduce with flexible type**

I also tried .astype, but it made no difference.

data['ESA Index_close_px'].astype(float)

data['CCMP Index_close_px'].astype(float)

Using Python 3.5, pandas 0.18.1 and numpy 1.11.1

Would really appreciate any suggestion.

**edit1:** The data comes from an Excel spreadsheet:

data = pd.read_excel('C:\\Users\\Ako\\Desktop\\ako_files\\for_corr_tool.xlsx')

Prior to the correlation attempts, there are only column renames and

data = data.drop(data.index[0]) 

to get rid of a line

regarding the types:

print (type (data['ESA Index_close_px']))



print (type (data['ESA Index_close_px'][1]))

Out:

**edit2:** Parts of the data:

print (data['ESA Index_close_px'][1:10])

print (data['CCMP Index_close_px'][1:10])

Out:

2        2137
3        2138
4        2132
5        2123
6        2127
7     2126.25
8      2131.5
9      2134.5
10       2159
Name: ESA Index_close_px, dtype: object
2     5241.83
3     5246.41
4     5243.84
5     5199.82
6     5214.16
7     5213.33
8     5239.02
9     5246.79
10    5328.67
Name: CCMP Index_close_px, dtype: object
DanielBarbarian
a_ko
  • Can you post some of your input data? – Maximilian Peters Nov 06 '16 at 20:31
  • We need to see how you created the DataFrame `data`. At the least, we need to know more about it, such `data.dtypes`. I was not able to reproduce what you show in your first three examples. – Warren Weckesser Nov 06 '16 at 20:50
  • Certainly: `data = pd.read_excel('C:\\Users\\Ako\\Desktop\\ako_files\\for_corr_tool.xlsx')` It is coming from an excel spreadsheet prior to the correlation attempts, there are only column renames and `data = data.drop(data.index[0])` to get rid of a line regarding the types: `print (type (data['ESA Index_close_px'])) ` `print (type (data['ESA Index_close_px'][1]))` out: – a_ko Nov 07 '16 at 07:27
  • We need you to edit your question and copy some data there directly. Please check [How to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – IanS Nov 07 '16 at 10:24
  • Thanks Ian I edited the Q with the additional info. Also added to my response to Warren. When I did: `print (type (data['ESA Index_close_px']))` and `print (type (data['ESA Index_close_px'][1]))` I got float however with `print (data['ESA Index_close_px'][1:10])` it says dtype object – a_ko Nov 07 '16 at 11:02

2 Answers

I ran into the same problem today. Try using .astype('float64') to convert both columns to a numeric type before computing the correlation:
data['ESA Index_close_px'][5:50].astype('float64').corr(data['CCMP Index_close_px'][5:50].astype('float64'))

This works well for me. Hope it can help you as well.
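
As a minimal sketch of the same idea, assuming a small hypothetical DataFrame built from the sample values printed in the question (the real data comes from the Excel file, and the variable names `cols`, `numeric`, `r` and `r_np` are made up here): the columns have dtype object, so they are converted to float64 first. Note that .astype returns a new object rather than changing the column in place, so the result has to be assigned to something, which may be why calling it on its own line appeared to make no difference.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet: values copied from the question,
# deliberately stored with dtype object to mimic the reported situation.
data = pd.DataFrame({
    'ESA Index_close_px': [2137, 2138, 2132, 2123, 2127,
                           2126.25, 2131.5, 2134.5, 2159],
    'CCMP Index_close_px': [5241.83, 5246.41, 5243.84, 5199.82, 5214.16,
                            5213.33, 5239.02, 5246.79, 5328.67],
}, dtype=object)

print(data.dtypes)  # both columns report dtype object

cols = ['ESA Index_close_px', 'CCMP Index_close_px']

# astype returns a converted copy; keep the result instead of discarding it.
numeric = data[cols].astype('float64')

# Full correlation matrix on the converted columns.
corr_matrix = numeric.corr(method='pearson')

# Pairwise correlation between the two Series.
r = numeric['ESA Index_close_px'].corr(numeric['CCMP Index_close_px'])

# The same coefficient via numpy, taken from the off-diagonal entry.
r_np = np.corrcoef(numeric['ESA Index_close_px'],
                   numeric['CCMP Index_close_px'])[0, 1]

print(corr_matrix)
print(r, r_np)
```

To make the conversion stick in the original DataFrame, assign it back, for example `data[cols] = data[cols].astype('float64')`.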

Yuan Tao

You can try the following:

Top15['Citable docs per capita']=(Top15['Citable docs per capita']*100000)
Top15['Citable docs per capita'].astype('int').corr(Top15['Energy Supply per Capita'].astype('int'))

It worked for me.
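
One caveat worth checking with this approach: converting to 'int' drops the fractional part, which is presumably why the column is scaled up first. A minimal sketch, assuming a hypothetical Series `prices` holding a few of the values from the question:

```python
import pandas as pd

# Hypothetical sample: a few ESA values taken from the question's output.
prices = pd.Series([2127, 2126.25, 2131.5, 2134.5])

as_int = prices.astype('int')        # fractional part is truncated
as_float = prices.astype('float64')  # decimals are preserved

print(as_int.tolist())    # [2127, 2126, 2131, 2134]
print(as_float.tolist())  # [2127.0, 2126.25, 2131.5, 2134.5]
```

For price data like that in the question, converting to float64 avoids losing the decimals.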

ErTR