12

I have some pcap files that I want to filter by protocol; e.g., if I filter by HTTP, everything except the HTTP packets should remain in the pcap file.

There is a tool called openDPI that is perfect for what I need, but it has no Python wrapper.

Does anyone know of any Python modules that can do what I need?

Thanks

Edit 1:

HTTP filtering was just an example; there are many protocols that I want to filter.

Edit 2:

I tried Scapy, but I can't figure out how to filter correctly. The filter only accepts Berkeley Packet Filter expressions, i.e., I can't apply an MSN, HTTP, or other upper-layer protocol filter. Can anyone help me?

coelhudo
  • 4,710
  • 7
  • 38
  • 57

8 Answers

20

A quick example using Scapy, since I just wrote one:

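from scapy.all import rdpcap, wrpcap, TCP

pkts = rdpcap('packets.pcap')
ports = [80, 25]
# Keep only TCP packets whose source or destination port is in `ports`.
filtered = (pkt for pkt in pkts if
    TCP in pkt and
    (pkt[TCP].sport in ports or pkt[TCP].dport in ports))
wrpcap('filtered.pcap', filtered)

That will filter out packets that are neither HTTP nor SMTP, judged by port number (80 or 25). If you want all the packets but HTTP and SMTP, the `filtered` expression should be: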

filtered = (pkt for pkt in pkts if
    not (TCP in pkt and
    (pkt[TCP].sport in ports or pkt[TCP].dport in ports)))
wrpcap('filtered.pcap', filtered)
nmichaels
  • 49,466
  • 12
  • 107
  • 135
14

I know this is a super-old question, but I just ran across it and thought I'd provide my answer. This is a problem I've encountered several times over the years, and I keep finding myself falling back to dpkt. Originally from the very capable dugsong, dpkt is primarily a packet creation/parsing library. I get the sense the pcap parsing was an afterthought, but it turns out to be a very useful one, because parsing pcaps and IP and TCP headers is straightforward. It's parsing all the higher-level protocols that becomes the time sink! (I wrote my own Python pcap parsing library before finding dpkt.)

The documentation on using the pcap parsing functionality is a little thin. Here's an example from my files:

import socket
import sys

import dpkt

# Read the capture file named on the command line.
with open(sys.argv[1], "rb") as f:
    for ts, data in dpkt.pcap.Reader(f):
        ether = dpkt.ethernet.Ethernet(data)
        # Skip frames that do not carry IPv4.
        if ether.type != dpkt.ethernet.ETH_TYPE_IP:
            continue
        ip = ether.data
        src = socket.inet_ntoa(ip.src)
        dst = socket.inet_ntoa(ip.dst)
        print("%s -> %s" % (src, dst))

Hope this helps the next guy to run across this post!

J.J.
  • 5,019
  • 2
  • 28
  • 26
  • Looks like dpkt is not maintained anymore. http://code.google.com/p/dpkt/issues/list Any other suggestions for parsing pcap files that are not a pita to install on Mac and Linux? – zengr Jul 28 '13 at 06:18
  • a package like dpkt is never "complete" - the environment is too dynamic. you've got to be prepared to dig in when you need to. I've never had a problem with the install on either Mac or Linux, even within the last couple months: just `python setup.py install`. Double-check your assumptions, something else is likely wonky somewhere. – J.J. Aug 09 '13 at 13:13
6

Scapy's sniff supports an offline option that takes a pcap file as input. This way you can use the filtering features of the sniff command on a pcap file.

>>> packets = sniff(offline='mypackets.pcap')
>>>
>>> packets
<Sniffed: TCP:17 UDP:0 ICMP:0 Other:0>

Hope that helps !

Yasser Arafat
  • 61
  • 1
  • 2
4

Something along the lines of

from pcapy import open_offline
from impacket.ImpactDecoder import EthDecoder
from impacket.ImpactPacket import IP, TCP, UDP, ICMP

decoder = EthDecoder()

def callback(hdr, data):
    # Decode the Ethernet frame, then walk down to the IP and TCP layers.
    packet = decoder.decode(data)
    child = packet.child()
    if isinstance(child, IP):
        child = child.child()
        if isinstance(child, TCP):
            if child.get_th_dport() == 80:
                print('HTTP')

pcap = open_offline('net.cap')
pcap.loop(0, callback)

using

http://oss.coresecurity.com/projects/impacket.html

fraca7
  • 1,178
  • 5
  • 11
3

Try pylibpcap.

Dave Bacher
  • 15,652
  • 3
  • 63
  • 86
  • But I don't want to parse each packet to check for the protocol that I want; I want a simple solution (like openDPI). Also, I don't want to worry about the "magic numbers" of all the protocols that exist. If there is no such solution, then I will have to do that. Thanks – coelhudo Feb 11 '10 at 20:09
  • A couple thoughts: 1. most of the python pcap libraries allow you to set a BPF filter on the captured packets. HTTP is an easy filter `tcp port 80`. 2. You could use Wireshark or a similar GUI to isolate the packets that you want, save those to a dumpfile and use pylibpcap or another of these libraries to operate on them. – Dave Bacher Feb 11 '10 at 23:04
  • There is no way besides "parsing each packet". You can have a program which does it behind the scenes for you, that's all you can hope. – bortzmeyer Feb 14 '10 at 17:21
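To make "parsing each packet" concrete, here is a rough standard-library sketch of what a port filter such as `tcp port 80` amounts to when applied to a capture file. It is only an illustration under several assumptions: classic little-endian pcap (not pcapng), Ethernet link type, IPv4 without VLAN tags, and no handling of IP fragments. The names `tcp_ports` and `filter_pcap` are made up for this example and do not come from any of the libraries mentioned in this thread.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap with microsecond timestamps


def tcp_ports(frame):
    """Return (sport, dport) for an Ethernet/IPv4/TCP frame, else None."""
    # Bytes 12-13 hold the EtherType; 0x0800 means IPv4.
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":
        return None
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    if frame[23] != 6:            # IP protocol number 6 is TCP
        return None
    start = 14 + ihl              # first byte of the TCP header
    if len(frame) < start + 4:
        return None
    return struct.unpack("!HH", frame[start:start + 4])


def filter_pcap(data, ports):
    """Return pcap bytes holding only TCP packets to or from `ports`."""
    (magic,) = struct.unpack("<I", data[:4])
    if magic != PCAP_MAGIC:
        raise ValueError("only little-endian classic pcap is handled here")
    out = [data[:24]]             # copy the 24-byte global header unchanged
    offset = 24
    while offset + 16 <= len(data):
        # Record header: ts_sec, ts_usec, incl_len, orig_len (4 bytes each).
        (incl_len,) = struct.unpack("<I", data[offset + 8:offset + 12])
        record = data[offset:offset + 16 + incl_len]
        found = tcp_ports(record[16:])
        if found and (found[0] in ports or found[1] in ports):
            out.append(record)
        offset += 16 + incl_len
    return b"".join(out)
```

Something like `open('filtered.pcap', 'wb').write(filter_pcap(open('packets.pcap', 'rb').read(), {80, 25}))` would then keep only the port 80 and port 25 traffic. The libraries in the other answers do this same per-packet work for you, with far better coverage of link types and protocols.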
2

I tried the same thing using @nmichaels's method, but it becomes cumbersome when I want to repeat it for multiple protocols. I looked for ways to read the .pcap file and then filter it, but found nothing helpful. Basically, when you read a .pcap file, there is no function in Scapy that lets you filter the packets. On the other hand, a command like

a=sniff(filter="tcp and ( port 25 or port 110 )",prn=lambda x: x.sprintf("%IP.src%:%TCP.sport% -> %IP.dst%:%TCP.dport%  %2s,TCP.flags% : %TCP.payload%"))

helps to filter, but only while sniffing.

Does anyone know of another method that lets us use BPF syntax instead of the for statement?

Abhinav
  • 992
  • 2
  • 11
  • 26
  • You can generalize my method to use an actual generator instead of a generator expression. That should make for relatively clear code. – nmichaels Jan 14 '14 at 21:56
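One way to read that suggestion is a small generator function that takes a list of predicates, so adding a protocol means adding one predicate rather than another loop. This is only a sketch: `filter_packets` and its `keep_matching` parameter are names invented here, and the function is deliberately independent of any packet library.

```python
def filter_packets(packets, predicates, keep_matching=True):
    """Yield packets that match any predicate.

    With keep_matching=False, yield the packets that match none of them.
    """
    for pkt in packets:
        matched = any(pred(pkt) for pred in predicates)
        if matched == keep_matching:
            yield pkt
```

With Scapy, a predicate could be something like `lambda p: TCP in p and 25 in (p[TCP].sport, p[TCP].dport)`, and the resulting generator can be passed to `wrpcap` in the same way as the generator expression in the accepted approach.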
2

To filter a specific protocol in or out, you have to analyze each packet; otherwise you could miss HTTP traffic that is flowing over a non-conventional port in your network. Of course, if a loose filter is good enough, you could check just the source and destination port numbers, but that won't give you exact results. For exact results you have to look for features specific to the protocol, such as the GET, POST and HEAD keywords for HTTP (and the equivalents for other protocols), and check each TCP packet.

vinit
  • 21
  • 1
  • Yeah, it is not as magical and easy as I initially thought. Scapy solved my specific problem, as far as I remember. Thanks – coelhudo May 24 '12 at 20:54
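The keyword check described above could look roughly like this. It is a heuristic sketch, not a full HTTP detector: the helper name `looks_like_http` and the method list are assumptions made for illustration, and it only recognises the first segment of a request or response, so later segments of the same message would need some form of flow tracking.

```python
# Method tokens to look for; the list is illustrative, not exhaustive.
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")


def looks_like_http(payload: bytes) -> bool:
    """Return True if a TCP payload starts like an HTTP/1.x request or response."""
    # Responses begin with the version string, e.g. b"HTTP/1.1 200 OK".
    if payload.startswith(b"HTTP/1."):
        return True
    # Requests begin with a method token followed by a space.
    return payload.startswith(HTTP_METHODS)
```

With Scapy, the payload of a TCP packet can be obtained as `bytes(pkt[TCP].payload)` and passed to this function, which catches HTTP on any port rather than only on port 80.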
0

Here is my example of pcap parsing using scapy. It also has some relevant code for performance testing and some other stuff.

InvisibleWolf
  • 917
  • 1
  • 9
  • 22