0

I have some options chain data:

Contract Name,Last Trade Date,Strike,Last Price,Bid,Ask,Change

AMZN200605P03320000,2020-05-28 3:24PM EDT,3320.0,900.65,876.0,893.5,+900.65

AMZN200605P03500000,2020-05-28 3:51PM EDT,3500.0,1099.55,1055.5,1073.5,"+1,099.55"

The entry - "+1,099.55" seems to be a bad entry in the data, since there are not any other records like that and I need to sanitize it before inserting into a SQLDB. I have tried a couple different things but none have worked. Any insight would be greatly appreciated:

optionsChainPuts['Change'] = optionsChainPuts['Change'].map(lambda x: x.lstrip('\"+').rstrip('\"'))
optionsChainPuts['Change'] = optionsChainPuts['Change'].astype(str).str.replace('\D', '')
optionsChainPuts['Change'] = optionsChainPuts['Change'].astype(str).map(lambda x: x.replace('"', ''))

Thank you

Brown Bear
  • 19,655
  • 10
  • 58
  • 76
  • Refer to: https://stackoverflow.com/questions/51359010/pandas-data-with-double-quote/51359137 and https://stackoverflow.com/questions/36598608/column-in-data-frame-with-some-numbers-in-double-quotes-trying-to-change-to-flo – Arun May 31 '20 at 19:29

2 Answers2

0

It's the comma that is causing issues. One option is to split it at the comma and join the values

>>> val = "+1,099.55"
>>> val = val.split(",")
>>> num = float(val[0] + val[1])
>>> num
1099.55

Hope that helps!

Teejay Bruno
  • 1,716
  • 1
  • 4
  • 11
0

Issues is numbers with commas and quotes.

Use locale to convert from European to English

Code

from io import StringIO
import pandas as pd
import locale

s = '''Contract Name,Last Trade Date,Strike,Last Price,Bid,Ask,Change
AMZN200605P03320000,2020-05-28 3:24PM EDT,3320.0,900.65,876.0,893.5,+900.65
AMZN200605P03500000,2020-05-28 3:51PM EDT,3500.0,1099.55,1055.5,1073.5,"+1,099.55"'''

df = pd.read_csv(StringIO(s))

# set local to English
locale.setlocale( locale.LC_ALL, 'en_US.UTF-8' ) 

# Convert column to float
df['Change'] = df['Change'].apply(lambda x: locale.atof(x))

print(df['Change'])

Output

Name: Change, dtype: object
0     900.65
1    1099.55
Name: Change, dtype: float64
DarrylG
  • 16,732
  • 2
  • 17
  • 23