-1

I am struggling with mixed data type provided to me in Excel by my team members. The column contains text, whole number and decimals. This is pandas dataframe and contains more columns too. Below is the example.

Column
Does not apply
2
5
0.07
0.45
7% offset

I want to create a logic where if a row is whole number then add $ sign before it. If a row is less than 1 then multiply it by 100 and add %, if a cell has digits then also multiply by 100 and add %, and else if it has text then do nothing. I tried a solution which kind of helped writing the basic regular expression to grab the positive integers but it does not work completely.

This is how it should look like.

Column
Does not apply
$2
$5
70%
45%
7% offset

Additional Comment I apologize but I had to add another logic here. If the cell contains text anywhere either in the beginning or middle then do nothing.

hkay
  • 159
  • 1
  • 2
  • 12
  • `if type(data) is float: f'{data * 100}%' elif type(data) is int: f'${data}'` Something like that. Sorry for the edits, super tired. – Anthony L Jul 09 '22 at 03:40
  • check out [RegExr](https://regexr.com/) and [Regex101](https://regex101.com/) for a fast way to build regular expressions. I think only one of them does python, make sure the option is toggled on – Anthony L Jul 09 '22 at 03:47

2 Answers2

1

There are other ways than regex to do this, but this is a regex-y way.

import re

def modify_value(x):
    # Doesn't start with a number
    if re.match('\D', x): 
        return x
    # Starts with number leading with %
    elif re.match('\d+[%]',x):
        return x
    # Starts wtih 0 then add %
    elif re.match('[0]',x):
        return x + '%'
    # Starts with number and has decimal
    elif re.match('\d+\.', x):
        return f'{float(x) * 100:.0f}%'
    # Starts with number
    elif re.match('\d', x):
        return '$' + x

df = df.applymap(modify_value)
print(df)

# Output:

           Column
0  Does not apply
1              $2
2              $5
3              7%
4             45%
5              0%
6       6%-offset
BeRT2me
  • 12,699
  • 2
  • 13
  • 31
  • This one almost worked but I had to add another logic. Basically if a row contains text anywhere then it should not do anything. For example, I added 7% offset. In this case it already has what I want and it should return x. I am still very new to Regex so thanks for your help. – hkay Jul 09 '22 at 04:06
  • Thanks to your answer. It worked with some minor modifications which I edited in your code. Once it is approved then I will mark it as answer and close the question. Once again thank you very much – hkay Jul 09 '22 at 04:31
0

my regex suggestion for separation numbers is:

^[0-9]$ for int numbers

and

[0-9]+[.][0-9]+ for decimal numbers.

Hossein Asadi
  • 768
  • 1
  • 6
  • 15