Trying to loop through a pandas df changing all the data type and presentation type

Question

I'm looking for a way to loop through the columns of a pandas DataFrame and convert values that are currently stored as objects (strings), for example '1.15m', to numbers such as 1150000, and then change the column dtype to integer.

This is what I have so far, but it doesn't seem to pick up the 'm' in the values.

int_cols = ['Avg. Likes', 'Posts', 'New Post Avg. Likes','Total Likes' ]

for c in int_cols:
    if 'm' in db[c]:
        db[c] = db[c].apply(lambda x: float(x.strip('m'))*1000000)
        db[c] = db[c].astype('int')
    elif 'k' in db[c]: 
        db[c] = db[c].apply(lambda x: float(x.strip('k'))*1000)
        db[c] = db[c].astype('int')
    elif 'b' in db[c]: 
        db[c] = db[c].apply(lambda x: float(x.strip('b'))*1000000000)
        db[c] = db[c].astype('int')
    else:
        continue

Edit: adding sample data

db.head(3)

|Rank | Channel Info | Influence Score  | Followers | Avg. Likes | Posts  |60-Day Eng Rate  | New Post Avg. Likes | Total Likes  | Country Or Region|
|:---:|:------------:|:----------------:|:---------:|:----------:|:------:|:---------------:|:-------------------:|:------------:|:----------------:|                  
|1    | cristiano    |92                |485200000.0|8.7m        | 3.4k   |0.013            |6.3m                 |29.1b         |Spain             |
|2    | kyliejenner  |91                |370700000.0|8.2m        | 7.0k   |0.014            |5.0m                 |57.4b         |United States     |
|3    | leomessi     |90                |363900000.0|6.7m        | 915    |0.010            |3.5m                 |6.1b          |NaN               |

Try putting the `if 'm' in ...` test inside the lambda, so it tests each object, not the Series. — sj95126, Oct 22 '22 at 14:00
Your question needs a minimal reproducible example consisting of sample input, expected output, actual output, and only the relevant code necessary to reproduce the problem. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. — itprorh66, Oct 22 '22 at 14:03
**DO NOT** post images of code, links to code, data, error messages, etc. - copy or type the text into the question. — itprorh66, Oct 22 '22 at 15:53
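
The first comment's suggestion can be sketched roughly as follows. `'m' in db[c]` tests the Series index labels rather than its values, so the suffix check has to happen per element. The helper `to_number` and the `SUFFIXES` dict are hypothetical names introduced for illustration, and the small frame only mirrors three columns of the sample data. This assumes the columns hold no missing values; with NaN present, the final cast would need the nullable `"Int64"` dtype instead.

```python
import pandas as pd

# Multipliers for the suffixes that appear in the sample data.
SUFFIXES = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def to_number(value):
    """Convert strings such as '8.7m' or '915' to an int; pass NaN through."""
    if pd.isna(value):
        return value
    text = str(value).strip().lower()
    if text and text[-1] in SUFFIXES:
        # round() guards against float error such as 8699999.999999999
        return int(round(float(text[:-1]) * SUFFIXES[text[-1]]))
    return int(round(float(text)))


# Small frame mirroring three columns of the sample data.
db = pd.DataFrame({
    "Avg. Likes": ["8.7m", "8.2m", "6.7m"],
    "Posts": ["3.4k", "7.0k", "915"],
    "Total Likes": ["29.1b", "57.4b", "6.1b"],
})

for c in db.columns:
    # The suffix test runs on each value inside to_number,
    # not on the Series as a whole.
    db[c] = db[c].apply(to_number).astype("int64")
```

`int64` is spelled out because values such as 57.4b do not fit in a 32-bit integer.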

Naveed · Accepted Answer · 2022-10-22T21:40:03.263

Here is one way to do it:

int_cols = ['Avg. Likes', 'Posts', 'New Post Avg. Likes', 'Total Likes']

# create a mapping of the suffixes to multipliers
m = {'m': 1000000.0, 'k': 1000.0, 'b': 1000000000.0}

# strip digits and dots to leave only the suffix, then map it to its multiplier
# (values with no suffix map to NaN, which is filled with 1.0),
# then multiply by the numeric part
df[int_cols] = (df[int_cols].apply(lambda x:
                                   (x.replace(r'[\d.]', '', regex=True).map(m).fillna(1.).mul(
                                    x.replace(r'[mbk]', '', regex=True).fillna(1.).astype(float)))
                                   , axis=1))
df

Avg. Likes  Posts   New Post Avg. Likes     Total Likes
0   8.7     3.4     6.3     29.1
1   8.2     7.0     5.0     57.4
2   6.7     915.0   3.5     6.1
3   6.1     1.9     1.7     11.4
4   1.8     6.8     932.0   12.6
...     ...     ...     ...     ...
195     680.6   4.6     305.7   3.1
196     2.2     1.4     2.1     3.0
197     227.8   4.2     103.2   955.9
198     193.3   865.0   82.6    167.2
199     382.5   3.8     128.2   1.5
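
The question also asked for an integer dtype, which the approach above does not produce on its own. A minimal sketch of a column-wise variant that ends with an integer cast is shown here. It swaps in `str.extract` in place of the two `replace` calls, so it is an alternative rather than the answer's exact method. The small frame is made up to mirror the sample data, and the code assumes every value is either a plain number or a number followed by `k`, `m` or `b`, with no missing values.

```python
import pandas as pd

multipliers = {"k": 1e3, "m": 1e6, "b": 1e9}

# Illustrative frame mirroring two columns of the sample data.
df = pd.DataFrame({
    "Avg. Likes": ["8.7m", "8.2m", "6.7m"],
    "Posts": ["3.4k", "7.0k", "915"],
})

for col in df.columns:
    # Split each value into its numeric part and an optional suffix.
    parts = df[col].astype(str).str.extract(r"^\s*([\d.]+)\s*([kmb]?)\s*$")
    number = parts[0].astype(float)
    # Values without a suffix yield '' here, which maps to NaN, then 1.0.
    factor = parts[1].map(multipliers).fillna(1.0)
    # Round before casting so float error cannot truncate the result.
    df[col] = (number * factor).round().astype("int64")
```

Rounding before the cast matters because products such as 8.7 * 1e6 are not guaranteed to be exact in floating point.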

Hi Naveed, thank you for your answer. I apologize, as my previous sample data didn't show this, but some of the values have no "m", "k" or "b" suffix. Using this code returns those as "NaN". Is there a way to change that so it leaves those unchanged? — LeveragedDev, Oct 22 '22 at 20:26
@LeveragedDev, solution updated. null result from replace is filled with 1.0. Hope it helps — Naveed, Oct 22 '22 at 21:42
