I have a DataFrame with 6 columns. Each value in "Tags" is supposed to have only 6 characters, but the API I am pulling from sometimes returns longer or malformed values. An example looks like:
import pandas as pd
df = pd.DataFrame({'user': ['Ticket ID', 'Closed Time', 'Tags'],
                   'income': [1, 2, 3],
                   'Closed Time': ['08/19/20', '08/18/20', '08/17/20'],
                   'Tags': [270201, 284912, 123456789101]})
Currently my code is:
`df['Tags'] = df['Tags'].astype(str).str.replace(r'[^0-9]+', '', regex=True)
df['Tags'] = df['Tags'].str.zfill(6)`
That just strips out the non-digit garbage that sometimes comes into the column and left-pads short values with zeros. I am not sure where to start with the next part: if a value in 'Tags' is longer than 6 characters, I need to split that value and duplicate the rest of the row for each piece.
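
To make the goal concrete, here is a rough sketch of the kind of result described above. It assumes that an over-long value is simply several 6-character tags run together, so it can be cut into consecutive 6-character chunks; that splitting rule is an assumption, not something the API guarantees. The names `tags` and `result` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'user': ['Ticket ID', 'Closed Time', 'Tags'],
    'income': [1, 2, 3],
    'Closed Time': ['08/19/20', '08/18/20', '08/17/20'],
    'Tags': [270201, 284912, 123456789101],
})

# Normalise to digit-only strings, left-padded to at least 6 characters.
tags = (
    df['Tags']
    .astype(str)
    .str.replace(r'[^0-9]+', '', regex=True)
    .str.zfill(6)
)

# Assumption: a long value is several 6-character tags concatenated,
# so cut it into consecutive chunks of up to 6 characters each.
df['Tags'] = tags.str.findall(r'.{1,6}')

# explode() produces one row per list element and repeats the other columns.
result = df.explode('Tags', ignore_index=True)
print(result)
```

With the sample data, the third row becomes two rows that share the same 'user', 'income' and 'Closed Time' values. If a value's length is not a multiple of 6, the last chunk comes out shorter than 6 characters and would need separate handling.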