Here is my dataset.

I am creating a new pandas DataFrame (ptocol) from an existing DataFrame (data) using the .groupby and .size methods, as shown below. The counts come out as expected, but the resulting CSV file has no column headers.

I tried the solution discussed here and spent a long time checking it, but it doesn't work for me. Below is my code.

import pandas as pd
import numpy

data = pd.read_csv('first.csv')
ptocol = data.groupby(["Protocol"], as_index=False).size().rename(columns={0:'NumOfPackets'})  # doesn't work
#ptocol = data.groupby(["Protocol"], as_index=False).count()  # doesn't work
print(ptocol)
ptocol.to_csv('protocol.csv')

Actual result (protocol.csv):

0x200e,26
ARP,100746
ATMTCP,48
BOOTP,123
BZR,4
...
...

Expected result (protocol.csv):

Protocol,NumOfPackets
0x200e,26
ARP,100746
ATMTCP,48
BOOTP,123
BZR,4
...
...

Any ideas or suggestions are welcome.

user2532296

1 Answer

.size() returns a Series object. You can use reset_index() to transform it into a DataFrame; try this instead:

ptocol = data.groupby("Protocol").size().rename('NumOfPackets').reset_index()
ptocol.to_csv('protocol.csv', index=False)

This gives output like the following. The data is not the same as yours, but the format is what you are looking for:

Symbol,NUM
A,5
AA,5
AAAP,5
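
For anyone without first.csv at hand, the same chain can be tried on a small in-memory frame. This is only a sketch: the packet rows below are made up for illustration and stand in for the real capture, and csv_text is a name introduced here, not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for first.csv: one row per captured packet.
data = pd.DataFrame({"Protocol": ["ARP", "ARP", "BOOTP", "ARP", "BZR", "BOOTP"]})

# size() on a default groupby gives a Series indexed by Protocol.
# rename() names the values; reset_index() turns the index into a column.
ptocol = data.groupby("Protocol").size().rename("NumOfPackets").reset_index()

# With no path argument, to_csv returns the CSV text instead of writing a file.
# index=False leaves out the integer row labels.
csv_text = ptocol.to_csv(index=False)
print(csv_text)
```

The first line of the printed text should be the header row Protocol,NumOfPackets, followed by one line per protocol.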
Psidom