-2
data = Index(['borough', 'neighborhood', 'building_class_category',
       'tax_class_at_present', 'block', 'lot', 'ease_ment',
       'building_class_at_present', 'address', 'apart_ment_number', 'zip_code',
       'residential_units', 'commercial_units', 'total_units',
       'land_square_feet', 'gross_square_feet', 'year_built',
       'tax_class_at_time_of_sale', 'building_class_at_time_of_sale',
       'sale_price', 'sale_date'],
      dtype='object')

Convert the field “sale_price” to numeric. It is currently formatted as currency, which cannot be used in calculations. (Hint: you will need to remove the commas and dollar signs.)

I tried:

df=data
df['sale_price']=df['sale_price'].replace('$','')

and also:

df['sale_price'] = df['sale_price'].str.replace(',', '').str.replace('$', '').astype(int)
webprogrammer
  • 3
    What is the issue? post some sample data? – vb_rises Feb 20 '20 at 00:18
  • sale_price is the column, with values like $3,343. I need to remove the comma and the $. – Mohan Babu Feb 20 '20 at 00:28
  • believe you need to add quotes here: .astype('float64'). If you're getting a specific error message that would be nice to have. – born_naked Feb 20 '20 at 00:29
  • I tried that; it did not work. – Mohan Babu Feb 20 '20 at 00:33
  • IndexError Traceback (most recent call last) in 1 df=brooklyn_sales.columns ----> 2 df['sale_price']=df['sale_price'].replace('$','') 3 df ~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in __getitem__(self, key) 4278 if is_scalar(key): 4279 key = com.cast_scalar_indexer(key) -> 4280 return getitem(key) 4281 4282 if isinstance(key, slice): IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices – Mohan Babu Feb 20 '20 at 00:35
  • Hmmm something is either off about your df or it's because you need to cast as a float not int – born_naked Feb 20 '20 at 00:41
  • 1
    **Please share a [mcve].** – AMC Feb 20 '20 at 01:41
  • 1
    @born_naked It’s something off about the df, although not in the way you might think! Indeed, the issue appears to be with the `df` variable, not the actual DataFrame. I know it’s difficult to read, but look at the error message: It looks like OP is defining `df = brooklyn_sales.columns`. – AMC Feb 20 '20 at 03:31
  • I have assigned brooklyn_sales.columns to df, but df['sale_price'] = df['sale_price'].str.replace(r'[$,]','').astype(int) still fails. Even when I use brooklyn_sales.columns I'm getting the same error. – Mohan Babu Feb 20 '20 at 04:58
  • @MohanBabu when you set df=brooklyn_sales.columns you are redefining the df to be just the column headers of brooklyn_sales and nothing else. If you want the df to have the same content as brooklyn_sales you should use df=brooklyn_sales. You don't seem to be posting all your code, and it looks like your error messages are result of intermediary steps. You should post all your code. – born_naked Feb 20 '20 at 05:09
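
The distinction made in the comments above can be reproduced in a few lines. This is only a sketch: the small brooklyn_sales frame below is a made-up stand-in, since the real data is not shown in the thread, and the exact exception type raised by the Index lookup may vary between pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for brooklyn_sales; the real data is not in the thread.
brooklyn_sales = pd.DataFrame({
    "address": ["1 Main St", "2 Oak Ave"],
    "sale_price": ["$3,343", "$1,250,000"],
})

cols = brooklyn_sales.columns   # an Index holding only the column labels
df = brooklyn_sales             # the DataFrame itself, labels plus data

print(type(cols).__name__)
print(type(df).__name__)

# Looking up a label on the Index fails, as in the traceback quoted above.
# Several exception types are caught because the exact one may vary by version.
try:
    cols["sale_price"]
    index_lookup_failed = False
except (IndexError, KeyError, TypeError):
    index_lookup_failed = True

# The same lookup on the DataFrame returns the column as a Series.
prices = df["sale_price"]
print(index_lookup_failed, list(prices))
```

With df bound to the DataFrame rather than to its .columns, both df['sale_price'] and df.sale_price resolve to the column, so the cleaning code in the answers has something to operate on.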

2 Answers

1

If I understand your problem correctly, this answer should help:

converting currency with $ to numbers in Python pandas

James
  • AttributeError Traceback (most recent call last) in 1 df=brooklyn_sales.columns ----> 2 df[df.sale_price[1:]] = df[df.sale_price[1:]].apply(lambda x: x.str.replace('$','')).apply(lambda x: x.str.replace(',','')).astype(np.int64) 3 df AttributeError: 'Index' object has no attribute 'sale_price' – Mohan Babu Feb 20 '20 at 00:32
  • @MohanBabu the error says that it has no column "sale_price". Check again. There could be some spelling mistake. – vb_rises Feb 20 '20 at 00:34
  • When I print the whole list, it is displayed. There is a sale_price column. – Mohan Babu Feb 20 '20 at 00:37
  • Something like this? df['sale_price'] = df['sale_price'].replace(r'[\$,]', '', regex=True).astype(float) – James Feb 20 '20 at 00:40
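
For reference, here is a rough sketch of why the plain .replace('$','') from the question leaves the values untouched, while the regex=True form suggested in the last comment does strip them. The sample values are invented for illustration.

```python
import pandas as pd

s = pd.Series(["$3,343", "$", "$1,250,000"])  # made-up sample values

# Series.replace without regex=True only swaps cells that equal "$" exactly,
# so "$3,343" is left as it is.
whole_value = s.replace("$", "")

# With regex=True the pattern is applied inside each string.
# Inside a character class "$" is literal, so no backslash is required.
by_regex = s.replace(r"[$,]", "", regex=True)

# Series.str.replace always works on substrings; regex=False makes "$" literal
# instead of the end-of-string anchor.
by_str = s.str.replace("$", "", regex=False).str.replace(",", "", regex=False)

print(whole_value.tolist())
print(by_regex.tolist())
print(by_str.tolist())
```

Passing regex explicitly avoids relying on the default of Series.str.replace, which changed between pandas releases.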
0

You can check this one, replace int with any type you need:

df['sale_price'] = df['sale_price'].str.replace(r'[$,]', '', regex=True).astype(int)

Or, once the dollar signs and commas are stripped, pd.to_numeric also works:

df['sale_price'] = pd.to_numeric(df['sale_price'].str.replace(r'[$,]', '', regex=True))
loginmind
  • AttributeError Traceback (most recent call last) in 1 df=brooklyn_sales.columns ----> 2 df['sale_price'] = pd.to_numeric(df.sale_price) 3 AttributeError: 'Index' object has no attribute 'sale_price' – Mohan Babu Feb 20 '20 at 00:56
  • I tested this with my own data and it works. Your error shows that your DataFrame has no column `sale_price`. Can you check it again? – loginmind Feb 20 '20 at 01:47
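
Putting the pieces of this thread together, a minimal end-to-end sketch might look like the following. The sample frame is invented, and errors="coerce" is an optional choice made here (not something the answers propose) so that any unparseable cell becomes NaN instead of raising.

```python
import pandas as pd

# Made-up sample; in the question this would be the brooklyn_sales DataFrame
# itself, not brooklyn_sales.columns.
df = pd.DataFrame({"sale_price": ["$3,343", "$1,250,000", "$0"]})

# Strip dollar signs and commas, then convert the cleaned strings to numbers.
cleaned = df["sale_price"].str.replace(r"[$,]", "", regex=True)
df["sale_price"] = pd.to_numeric(cleaned, errors="coerce")

print(df["sale_price"].tolist())
print(df["sale_price"].sum())
```

After the conversion the column can be used directly in calculations such as sum() or mean().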