I have multiple CSV files that I merged together. To identify which rows came from which CSV file in the merged data, I want to create a new column in pandas called Serial.

The Serial column should be numbered on the basis of the data in the Sequence Number column, with a new number for every CSV file (for example 1 1 1 1 1, 2 2, 3 3 3 3: the number goes up each time a new file starts). I have also attached a snapshot of the CSV file.

Sequence Number
1
2
3
4
5
1
2
1
2
3
4

I want output like this:

Serial  Sequence Number
1   1
1   2
1   3
1   4
1   5
2   1
2   2
3   1
3   2
3   3
3   4
jezrael
Ani
  • How did you merge the data from the different csv files? You can use the file name as an identifier. Place the outcome of the read function in a dictionary with file names as keys and data frames as values. Then `df = pandas.concat(dict_of_df, sort=True)` will give you a data frame with file names in index. – Paul Rougieux Aug 22 '19 at 07:24
  • See also answers to the question [Import multiple csv files into pandas and concatenate into one DataFrame](https://stackoverflow.com/a/57608977/2641825) – Paul Rougieux Aug 22 '19 at 12:08
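
The file-name approach from the comments above could look roughly like the sketch below. The file names and contents are made up for illustration, and `io.StringIO` stands in for real files on disk; with actual files you would pass the paths to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for real files on disk
csv_texts = {
    "a.csv": "Sequence Number\n1\n2\n3\n",
    "b.csv": "Sequence Number\n1\n2\n",
}

# One data frame per file, keyed by file name
dict_of_df = {
    name: pd.read_csv(io.StringIO(text)) for name, text in csv_texts.items()
}

# The dictionary keys become the outer level of the resulting index
merged = pd.concat(dict_of_df, sort=True)

# Name the index levels and move the file name into an ordinary column
merged = merged.rename_axis(["file", "row"]).reset_index(level="file")

# Number the files in order of first appearance, starting at 1
merged["Serial"] = pd.factorize(merged["file"])[0] + 1

print(merged)
```

This does not depend on the Sequence Number column at all, so it should still work if a file's numbering does not start at 1.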

1 Answer

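Use DataFrame.insert to add the column in the first position. Fill it by comparing Sequence Number with 1 using Series.eq (==) to get a boolean mask, then take the cumulative sum with Series.cumsum. Each 1 marks the start of a new file, so the running total goes up by one at every restart: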

df.insert(0, 'Serial', df['Sequence Number'].eq(1).cumsum())
print(df)
    Serial  Sequence Number
0        1                1
1        1                2
2        1                3
3        1                4
4        1                5
5        2                1
6        2                2
7        3                1
8        3                2
9        3                3
10       3                4
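
For anyone who wants to run this from scratch, here is a self-contained sketch that rebuilds the sample data from the question and splits the one-liner into its intermediate steps. The variable names `starts` and `serial` are only for illustration, and it assumes every file's Sequence Number begins at 1.

```python
import pandas as pd

# Sample data from the question: three merged files, each restarting at 1
df = pd.DataFrame({"Sequence Number": [1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4]})

# True wherever a new file starts (the sequence is back at 1)
starts = df["Sequence Number"].eq(1)

# Cumulative sum of the boolean mask: each True bumps the counter by one
serial = starts.cumsum()

# Put the result in the first column position
df.insert(0, "Serial", serial)

print(df)
```

If a file could contain the value 1 more than once, or not start at 1, this mask would miscount and a file-name based identifier would be safer.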
jezrael