This is the code I'm using. I have also tried converting my columns from object to float, but I got this error:

import pandas as pd

df = pd.read_csv('DDOSping.csv')
pearsoncorr = df.corr(method='pearson')

ValueError: could not convert string to float: '172.27.224.251-172.27.224.250-56003-502-6'

Michael Hall

1 Answer

Somewhere in your CSV this string value exists '172.27.224.251-172.27.224.250-56003-502-6'. Do you know why it's there? What does it represent? It looks to me like it shouldn't be in the data you include in your correlation matrix calculation.

The df.corr method is trying to convert that string value to a float, which fails because the string is several values joined by hyphens (it looks like two IP addresses followed by three numbers), not a single number.
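To see the failure in isolation, here is a minimal sketch that tries the same conversion with Python's built-in float(). The helper name is_float is made up for illustration; it is not part of pandas.

```python
# The value from the error message.
flow_id = "172.27.224.251-172.27.224.250-56003-502-6"

def is_float(text):
    """Return True if float() can parse text, False otherwise (hypothetical helper)."""
    try:
        float(text)
        return True
    except ValueError:
        return False

print(is_float("56003"))  # a plain number parses
print(is_float(flow_id))  # the hyphen-joined identifier does not
```

Any column holding values like the second one cannot take part in a Pearson correlation as-is.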

You should clean your CSV of unnecessary data (or make a copy and clean that so you don't lose anything important). Remove anything, like metadata, that isn't the exact data that df.corr needs, including the string in the error message.

If it's just a few values you need to clean, open the file in Excel or a text editor and do the cleaning there. If it's a lot, and all the irrelevant data sits in specific rows and/or columns, you could remove those from your DataFrame before calling df.corr instead of cleaning the file itself.
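As a rough sketch of the second option, the snippet below keeps only the numeric columns before computing the correlation. The small DataFrame stands in for DDOSping.csv, and its column names and values are assumptions made up for illustration, since the real file's columns aren't shown.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_csv('DDOSping.csv'); columns and values are invented.
df = pd.DataFrame({
    "Flow ID": [
        "172.27.224.251-172.27.224.250-56003-502-6",
        "172.27.224.250-172.27.224.251-502-56003-6",
        "172.27.224.251-172.27.224.250-56004-502-6",
    ],
    "Src IP": ["172.27.224.251", "172.27.224.250", "172.27.224.251"],
    "Flow Duration": [120.0, 340.0, 95.0],
    "Tot Fwd Pkts": [3, 7, 2],
})

# Keep only columns with a numeric dtype; identifier and address
# columns are stored as object (string) and are left out.
numeric_df = df.select_dtypes(include="number")

# Pearson correlation over the remaining numeric columns.
pearsoncorr = numeric_df.corr(method="pearson")
print(pearsoncorr)
```

The original df is left untouched, so the dropped columns remain available if they are needed later.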

JWyndham
  • Hey, it's network data like IP addresses and Flow IDs; all the data is of a similar sort. There are 84 features and I need to find the relationships between them, via any method. Could you suggest a method? – Joveria Mar 27 '22 at 12:37