I'm working on a company project. The team collected data and put it in an Excel sheet, and they want me to separate the letters from the digits in the Barcode_Number column using regex. Is there a way I can do that for all the values in the Barcode_Number column?

import pandas as pd
import numpy as np
import re

data = pd.read_excel(r'C:\Users\yanga\Gaussian\SEC - 6. Yanga Deliverables\Transmission\Raw\3000_2- processed.xlsx')
data.head()

# Extract the column you want to work with
df = pd.DataFrame(data, columns= ['Barcode_Number'])

# Identify the null values
df.isnull().sum()

# remove all the null values
df.dropna(how = 'all', inplace = True)

# Select cells that start with a non-digit character
# (raw string so that \D reaches the regex engine unchanged)
df1 = df[df['Barcode_Number'].str.contains(r'^\D', na=False)]

For example, if I have this list of values in the Barcode_Number column:

Barcode_Number
'VQA435'
'KSR436'
'LAR437'
'ARB438'

and I want the output to look like this:

'VQA', '435'
'KSR', '436'
'LAR', '437'
'ARB', '438'

1 Answer

import pandas as pd

# filename is a placeholder for the path to your data; for an .xlsx file use pd.read_excel instead
df = pd.read_csv(filename)
df[["Code", "Number"]] = df["Barcode_Number"].str.extract(r"([A-Z]+)([0-9]+)")
print(df)

Output:

  Barcode_Number Code Number
0         VQA435  VQA    435
1         KSR436  KSR    436
2         LAR437  LAR    437
3         ARB438  ARB    438
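
If some cells might not follow the letters-then-digits shape, a slightly stricter variant may help. The sketch below is only an illustration: it builds the frame inline from the sample values in the question, adds two hypothetical edge cases ("12345" and a missing value), anchors the pattern with ^ and $, and uses named groups so that str.extract names the new columns itself. Rows that do not match come back as NaN instead of raising an error. Accepting lowercase letters with [A-Za-z] is an assumption, not something the question states.

```python
import pandas as pd

# Sample values from the question, plus two hypothetical edge cases
# ("12345" has no letters, None stands in for an empty cell).
df = pd.DataFrame(
    {"Barcode_Number": ["VQA435", "KSR436", "LAR437", "ARB438", "12345", None]}
)

# Named groups become the column names of the extracted frame.
# ^ and $ require the whole value to be letters followed by digits.
pattern = r"^(?P<Code>[A-Za-z]+)(?P<Number>\d+)$"
parts = df["Barcode_Number"].str.extract(pattern)

# Attach the extracted columns to the original frame (aligned on the index).
result = df.join(parts)
print(result)

# Rows that did not match the pattern have NaN in both new columns.
unmatched = result[result["Code"].isna()]
print(unmatched)
```

The Number column holds strings after extraction; if integers are needed, pd.to_numeric(result["Number"]) can convert it once the unmatched rows have been dealt with.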
Rakesh