1

I have a Pandas DataFrame with item details. One of the columns is Weight, and some of its values are stored as strings such as 200kgs, 120kgs, etc.

I want to strip the 'kgs' suffix so that I can use the values in calculations. I tried to do this with a for loop:

import pandas as pd

item = pd.read_csv('item_data.csv')

for x in item.Weight:  # item.Weight shows the weights of the items
    if type(x) == str:
        item.Weight = x.strip('kgs')
    else:
        item.Weight =  x

The above code strips the 'kgs' but displays the first value for all the rows!

item.Weight = [x.strip('kgs') if type(x)==str else x for x in item.Weight]

However, when I use a list comprehension, as shown above, it works! Can you please explain why the for loop does not work while the list comprehension with the same logic does?

saspy
  • Tried Series.str.strip() without a for loop and it worked, thanks for the suggestion. However, I wanted to know why the for loop did not work, and thanks to SubhashR for explaining that to me. – saspy Feb 06 '20 at 13:54

4 Answers

1

Use:

item['Weight']=item.Weight.str.strip('kgs')
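One caveat worth keeping in mind: like Python's built-in str.strip, Series.str.strip treats its argument as a set of characters to remove from both ends, not as a literal suffix. The sketch below uses made-up sample values (the 'sugar 5kgs' entry is purely illustrative) to show the difference, with an anchored regex replacement as one possible alternative when only the exact trailing 'kgs' should go.

```python
import pandas as pd

# Hypothetical sample data, not taken from item_data.csv.
weights = pd.Series(['200kgs', '120kgs', 'sugar 5kgs'])

# strip('kgs') removes any leading or trailing 'k', 'g' or 's' characters.
stripped = weights.str.strip('kgs')

# An anchored regex removes only the literal suffix 'kgs'.
suffix_removed = weights.str.replace(r'kgs$', '', regex=True)

print(stripped.tolist())        # ['200', '120', 'ugar 5']
print(suffix_removed.tolist())  # ['200', '120', 'sugar 5']
```

For purely numeric values such as 200kgs both approaches give the same result, so the difference only matters if the column can contain other text.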
Binyamin Even
0

There is a built-in method, Series.str.rstrip(). Try:

item['Weight'] = item.Weight.str.rstrip('kgs')
Mark
  • I used it within a for loop on a series that has text and NaNs, but the operation wasn't performed in place. Any suggestions? – rahul-ahuja Feb 05 '21 at 01:37
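Regarding the comment above: the .str methods return a new Series and leave the original untouched, so the result has to be assigned back. A minimal sketch with made-up values (the names original and cleaned are only for illustration); missing values pass through as NaN, so no loop or type check is needed.

```python
import numpy as np
import pandas as pd

# Hypothetical series mixing text and NaN.
original = pd.Series(['200kgs', np.nan, '120kgs'])

# The result is discarded here, so `original` is unchanged.
original.str.rstrip('kgs')

# Assigning the result keeps the cleaned values; NaN stays NaN.
cleaned = original.str.rstrip('kgs')

print(original.tolist())
print(cleaned.tolist())
```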
0

Use Series.str.rstrip to remove the trailing kgs from the values:

item['Weight']=item.Weight.str.rstrip('kgs')

Then we can use Series.astype to convert to float or int:

item['Weight']=item.Weight.str.rstrip('kgs').astype(float)
#item['Weight']=item.Weight.str.rstrip('kgs').astype(int)

Alternatively, use pd.to_numeric with errors='coerce', which turns every value that cannot be parsed into NaN. You can then check whether there are any NaN values and trace where they came from.

item['Weight'] = pd.to_numeric(item.Weight.str.rstrip('kgs'), errors='coerce')
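As a rough sketch of that NaN check, assuming a small made-up frame in place of item_data.csv: comparing the coerced result with the original column separates values that were already missing from values that failed to parse.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two valid weights, one unparseable entry, one missing value.
item = pd.DataFrame({'Weight': ['200kgs', '120kgs', 'unknown', np.nan]})

numeric = pd.to_numeric(item['Weight'].str.rstrip('kgs'), errors='coerce')

# Rows that are NaN after coercion but were not missing in the source
# are the ones that could not be converted.
failed = item.loc[numeric.isna() & item['Weight'].notna(), 'Weight']

print(numeric.tolist())
print(failed.tolist())  # ['unknown']
```

Inspecting failed before overwriting the column makes it easier to decide whether those entries need extra cleaning or should stay as NaN.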
ansev
0

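The list comprehension builds the complete list of stripped values first and then assigns it to the Weight column in a single step, so each row gets its own value and it works as expected (although the .str methods mentioned in the other answers are more efficient).

The for loop does not work because each iteration assigns a single weight to the whole column rather than to one row. Pandas broadcasts that single value to every row, so the column ends up holding one repeated value.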
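The difference can be reproduced with a small made-up frame (the values and the names looped, fixed and collected are only for illustration). Which single value the looped column ends up holding may depend on the pandas version, so the sketch only checks that every row is identical.

```python
import pandas as pd

# Hypothetical data standing in for item_data.csv.
item = pd.DataFrame({'Weight': ['200kgs', '120kgs', '75kgs']})

# Loop from the question: each pass assigns ONE scalar to the whole column.
looped = item.copy()
for x in looped['Weight']:
    looped['Weight'] = x.strip('kgs') if isinstance(x, str) else x
print(looped['Weight'].nunique())  # 1, every row holds the same value

# List comprehension: build all values first, then assign them in one step.
fixed = item.copy()
fixed['Weight'] = [x.strip('kgs') if isinstance(x, str) else x
                   for x in fixed['Weight']]
print(fixed['Weight'].tolist())  # ['200', '120', '75']

# A loop works too, as long as it collects values and assigns once at the end.
collected = []
for x in item['Weight']:
    collected.append(x.strip('kgs') if isinstance(x, str) else x)
```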

SubhashR