I found this thread: how to make a variable change from the text "1m" into "1000000" in Python.

My string values are in a column of a pandas DataFrame. The string/object values look like 18M, 345K, 12.9K, 0, etc.

values = df5['Values']

multipliers = { 'k': 1e3,
                'm': 1e6,
                'b': 1e9,
              }

pattern = r'([0-9.]+)([bkm])'

for number, suffix in re.findall(pattern, values):
    number = float(number)
    print(number * multipliers[suffix])

Running the code gives this error:

Traceback (most recent call last):
  File "c:/Users/thebu/Documents/Python Projects/trading/screen.py", line 19, in <module>
    for number, suffix in re.findall(pattern, values):
  File "C:\Users\thebu\Anaconda3\envs\trading\lib\re.py", line 223, in findall
    return _compile(pattern, flags).findall(string)
TypeError: expected string or bytes-like object

Thanks

Aaron M
  • Try `for value in values: for number, suffix in re.findall(pattern, value): number = float(number) print(number * multipliers[suffix])` – moys Jun 27 '20 at 03:19
  • This may be more appropriate for your needs (there was an even better answer, but I can't find it right now): https://stackoverflow.com/questions/39684548/convert-the-string-2-90k-to-2900-or-5-2m-to-5200000-in-pandas-dataframe – moys Jun 27 '20 at 03:25
  • Thanks to @moys for sharing https://stackoverflow.com/questions/39684548/convert-the-string-2-90k-to-2900-or-5-2m-to-5200000-in-pandas-dataframe. The checked solution solved my issue. I did not even see this option in the search. – Aaron M Jun 27 '20 at 04:16

1 Answer

Here's another way using regex:

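import re

# Suffix multipliers (copied from the question so the snippet is self-contained)
multipliers = {'k': 1e3, 'm': 1e6, 'b': 1e9}

def get_word(s):
    # find the suffix letter (lower-cased, so 'K' and 'k' both work)
    r = re.findall(r'[a-z]', s.lower())
    # find the number, keeping any decimal point
    w = re.findall(r'[0-9.]+', s)
    if not w:
        return None
    w = float(w[0])
    # apply the multiplier only when a known suffix is present
    v = multipliers.get(r[0]) if r else None
    if v:
        w *= v
    return round(w)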

df['col2'] = df['col'].apply(get_word)

print(df)

   col      col2
0  10k     10000
1  20m  20000000

Sample Data

df = pd.DataFrame({'col': ['10k', '20m']})
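
For the values in the question (18M, 345K, 12.9K, 0), a vectorized version may also work without `apply`. This is only a sketch: `expand_suffixes` and `MULTIPLIERS` are names made up for illustration, and it assumes every value is a plain number optionally followed by a single k/m/b suffix in either case. Anything that does not fit that shape comes back as NaN.

```python
import pandas as pd

# Assumed suffix table; suffixes are lower-cased before lookup.
MULTIPLIERS = {"k": 1e3, "m": 1e6, "b": 1e9}

def expand_suffixes(series: pd.Series) -> pd.Series:
    """Convert strings such as '18M' or '12.9K' to floats (hypothetical helper)."""
    # Split each value into a numeric part (column 0) and an optional suffix (column 1).
    parts = series.astype(str).str.strip().str.extract(r"^([0-9.]+)\s*([kKmMbB]?)$")
    numbers = pd.to_numeric(parts[0], errors="coerce")
    # A missing or empty suffix means "multiply by 1".
    factors = parts[1].str.lower().map(MULTIPLIERS).fillna(1.0)
    return numbers * factors

df = pd.DataFrame({"Values": ["18M", "345K", "12.9K", "0"]})
df["Numeric"] = expand_suffixes(df["Values"])
print(df)
```

Passing the whole column to the `.str` methods also avoids the `TypeError` from the question, which comes from handing a Series to `re.findall` where it expects a single string.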
YOLO