I'm trying to use pandas to insert a blank row based on the values in the first column. Whenever the first six characters in the first column differ from those of the previous row, I want an empty row between the two rows (effectively creating groups). Here is an example of what the output could look like:

002446

002447-01
002447-02

002448

This is what I was able to put together thus far.

readie=pd.read_csv('title.csv')
i=0
for row in readie:
    readie.append(row)
    i+=1
    if readie['column title'][i][0:5]!=readie['column title'][i+1][0:5]:
        readie.append([])

When running this code, I get the following error message:

TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid

I believe there are other ways to do this, but I would like to use pandas if at all possible.

  • related: [Python Pandas insert empty rows after each row](https://stackoverflow.com/q/66466080/14627505), with links to other related posts – Vladimir Fokow Aug 17 '22 at 18:00

1 Answer

I'm using the approach from this answer.

Assuming that a shorter string such as '123' is considered not to match '123456':

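import pandas as pd

df_orig = pd.DataFrame(
    {'col': ['002446', '002447-01', '002447-02', '002448', '00244', '002448']}
)

df = df_orig.reset_index(drop=True)  # make sure the index is 0, 1, 2, ...
first_6 = df['col'].str.slice(stop=6)  # first six characters of each value
# True where a row's prefix differs from the previous row's prefix
mask = first_6 != first_6.shift(fill_value=first_6[0])
# shift each row down by the number of prefix changes seen so far
df.index = df.index + mask.cumsum()
# fill the gaps left in the index with NaN rows
df = df.reindex(range(df.index[-1] + 1))
print(df)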

         col
0     002446
1        NaN
2  002447-01
3  002447-02
4        NaN
5     002448
6        NaN
7      00244
8        NaN
9     002448
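
If you want the separator rows to be truly blank rather than NaN, the same steps can be wrapped in a small helper and followed by `fillna('')`. This is only a sketch: the function name `insert_blank_rows` and its parameters `column` and `prefix_len` are made up for illustration, and it assumes the DataFrame is non-empty and the column holds strings.

```python
import pandas as pd


def insert_blank_rows(df, column, prefix_len=6):
    """Return a copy of df with an all-NaN row between consecutive rows
    whose first `prefix_len` characters in `column` differ.

    Hypothetical helper; assumes df is non-empty and `column` holds strings.
    """
    out = df.reset_index(drop=True)
    prefix = out[column].str.slice(stop=prefix_len)
    # True wherever a row starts a new group (the first row never does).
    starts_group = prefix != prefix.shift(fill_value=prefix.iloc[0])
    # Move each row down by the number of group boundaries seen so far.
    out.index = out.index + starts_group.cumsum()
    # Reindexing over the full range creates the missing labels as NaN rows.
    return out.reindex(range(out.index[-1] + 1))


df_orig = pd.DataFrame({'col': ['002446', '002447-01', '002447-02', '002448']})
result = insert_blank_rows(df_orig, 'col')

# Replace NaN with empty strings so the separator rows look blank.
blank = result.fillna('')
print(blank)
```

From there, something like `blank.to_csv(path, index=False)` should write the grouped rows back out, with `path` being whatever output file you choose.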
Vladimir Fokow