-1

I have a column like this:

ID
Step 1
Step 2
Step 4B
Step 5
Step 6
Config 1
Config 2
Config 3
Config 4
Config 5
Config 6
Config 7
Config 8
Config 9
Config 10
Step 3(Option11)
Step 5(Option11)
Step 3(Option12)
Step 5(Option12)
Step 4A
Config 6(Option11)
Config 6(Option12)
Config 6(Option13)

Here I want to add .0 after the first occurrence of a number in each value. The expected output is:

ID
Step 1.0
Step 2.0
Step 4.0B
Step 5.0
Step 6.0
Config 1.0
Config 2.0
Config 3.0
Config 4.0
Config 5.0
Config 6.0
Config 7.0
Config 8.0
Config 9.0
Config 10.0
Step 3.0(Option11)
Step 5.0(Option11)
Step 3.0(Option12)
Step 5.0(Option12)
Step 4.0A
Config 6.0(Option11)
Config 6.0(Option12)
Config 6.0(Option13)

I tried the pattern r'^(\D*)(\d+)(\D*)$', but it does not give the desired output.

How can I get the output shown above?

wjandrea
  • That Regex does not replace anything. And a Regex itself does not output anything. What's your code? – Thomas Weller Jul 01 '23 at 15:03
  • That regex looks good. What does the replacement look like? What's your code exactly? Please provide a [mre]. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). For more tips, check out [ask]. – wjandrea Jul 01 '23 at 15:06
  • Oops, after testing it, the regex doesn't work for the lines that contain a second number. – wjandrea Jul 01 '23 at 15:17

2 Answers

1

You can use a regex to find the end of the first number, then split the string there and insert ".0":

import re
import pandas as pd

df = pd.DataFrame({'ID': ['Step 1', 'Step 2', 'Step 4B', 'Step 5', 'Step 6', 'Config 1', 'Config 2', 'Config 3', 'Config 4', 'Config 5', 'Config 6', 'Config 7', 'Config 8', 'Config 9', 'Config 10', 'Step 3(Option11)', 'Step 5(Option11)', 'Step 3(Option12)', 'Step 5(Option12)', 'Step 4A', 'Config 6(Option11)', 'Config 6(Option12)', 'Config 6(Option13)']})


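def add_point_zero(val: str) -> str:
    # Find the first run of digits anywhere in the value.
    match = re.search(r'\d+', val)
    if match:
        # Insert ".0" right after the last digit of that first number.
        end_idx = match.end()
        return f'{val[:end_idx]}.0{val[end_idx:]}'
    # No digits (e.g. a value such as "ID"): return it unchanged.
    return val


# apply() returns a new Series, so assign it back to keep the result.
df['ID'] = df['ID'].apply(add_point_zero)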
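
As a possible shorter variant of the same idea, the search-and-slice step can be folded into a single re.sub call limited to one substitution. This is only a sketch: the name add_point_zero_sub and the sample list are made up for illustration and are not part of the answer above.

```python
import re


def add_point_zero_sub(val: str) -> str:
    """Hypothetical helper: append '.0' to the first run of digits in val."""
    # count=1 stops after the first match; \g<0> stands for the whole match,
    # so no capture group is needed.
    return re.sub(r'\d+', r'\g<0>.0', val, count=1)


# Illustrative sample taken from the values in the question.
samples = ['ID', 'Step 4B', 'Config 10', 'Step 3(Option11)']
results = [add_point_zero_sub(s) for s in samples]
print(results)
```

It should be usable with df['ID'].apply(...) in the same way as add_point_zero.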
BushMinusZero
1

The second \D* in your regex cannot match the digits in (Option11), for example. [Regex101]

The match therefore stops after the n of Option. Since $ has to match the end of the line (which it cannot do there), the line is not selected as a match and stays unchanged.
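
A small check can make this failure visible. The sketch below assumes a replacement string of r'\1\2.0\3', since the question does not show the one actually used; the variable names are illustrative only.

```python
import re

# Pattern from the question; the replacement string is an assumption.
pattern = r'^(\D*)(\d+)(\D*)$'
replacement = r'\1\2.0\3'

outcome = {}
for text in ['Step 4B', 'Step 3(Option11)']:
    matched = re.match(pattern, text) is not None
    # re.sub returns the input unchanged when the pattern does not match.
    outcome[text] = (matched, re.sub(pattern, replacement, text))

for text, (matched, replaced) in outcome.items():
    print(f'{text!r}: matched={matched}, result={replaced!r}')
```

The value with a second number does not match at all, so re.sub hands it back untouched rather than raising an error.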

So, simply change the second (\D*) to (.*). Here is an example with a reduced amount of data to make it an MRE (minimal reproducible example):

data="""ID
Step 1
Config 6(Option11)"""

import re
changed = re.sub(r'^(\D*)(\d+)(.*)$', r'\1\2.0\3', data, flags=re.M)
print(changed)
Thomas Weller