I wrote a script that logs MAC addresses from pcapy into MySQL through SQLAlchemy. I initially used plain sqlite3 but soon realized that something better was required, so this past weekend I rewrote all the database code to use SQLAlchemy. Everything works fine: data goes in and comes out again. I thought sessionmaker() would be very useful for managing all the sessions to the DB for me.
I see a strange occurrence with regard to memory consumption. I start the script, it collects data and writes it all to the DB, but every 2-4 seconds memory consumption increases by about a megabyte. At the moment I'm talking about very few records, fewer than 100 rows.
Script sequence:
- Script starts.
- SQLAlchemy reads the mac_addr column into maclist[].
- scapy gets a packet and checks whether new_mac is in maclist[]:
  - If true: only write the timestamp to the timestamp column where mac = new_mac, then go back to step 2.
  - If false: write the new MAC to the DB, clear maclist[] and run step 2 again.
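The sequence above can be sketched roughly as follows. This is a simplified stand-in rather than the actual script: it uses the standard library's sqlite3 with an in-memory database instead of MySQL through SQLAlchemy, and the table name `clients`, the `ssid` column, and the `make_db` and `handle_packet` helpers are hypothetical names chosen for illustration.

```python
import sqlite3
import time


def make_db():
    # Assumption: in-memory SQLite stands in for the MySQL database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE clients ("
        "mac_addr TEXT PRIMARY KEY, ssid TEXT, timestamp REAL)"
    )
    return conn


def populate_observed_list(conn):
    # Step 2: read the mac_addr column into a list.
    return [row[0] for row in conn.execute("SELECT mac_addr FROM clients")]


def db_add(conn, mac, timestamp, ssid):
    # New MAC: insert a full row.
    conn.execute(
        "INSERT INTO clients (mac_addr, ssid, timestamp) VALUES (?, ?, ?)",
        (mac, ssid, timestamp),
    )
    conn.commit()


def add_time(conn, mac, timestamp):
    # Known MAC: only refresh the timestamp column.
    conn.execute(
        "UPDATE clients SET timestamp = ? WHERE mac_addr = ?",
        (timestamp, mac),
    )
    conn.commit()


def handle_packet(conn, maclist, mac, ssid):
    # Step 3: branch on whether the MAC has been seen before.
    now = time.time()
    if mac in maclist:
        add_time(conn, mac, now)
        return maclist
    db_add(conn, mac, now, ssid)
    # The list is discarded and re-read from the database (step 2 again).
    return populate_observed_list(conn)


if __name__ == "__main__":
    conn = make_db()
    maclist = populate_observed_list(conn)
    for mac, ssid in [
        ("aa:bb:cc:00:00:01", "home"),
        ("aa:bb:cc:00:00:01", "home"),
        ("aa:bb:cc:00:00:02", "hidden"),
    ]:
        maclist = handle_packet(conn, maclist, mac, ssid)
    print(maclist)
```

In this sketch the list is only re-read after an insert, which matches the step description above; the profiled function below re-reads it at the start of every call as well.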
After 1h30m I have a memory footprint of 1027 MB (RES) and 1198 MB (VIRT) with 124 rows in the single-table database (MySQL).
Q: Could this be attributed to maclist[] being cleared and repopulated from the DB every time?
Q: What is going to happen when it reaches the system's maximum memory?
Any ideas or advice would be great, thanks.
memory_profiler output for the segment in question, where the list gets populated from the mac_addr column of the database:
    Line #    Mem usage    Increment   Line Contents
    ================================================
       123 1025.434 MiB    0.000 MiB   @profile
       124                             def sniffmgmt(p):
       125                                 global __mac_reel
       126                                 global _blacklist
       127 1025.434 MiB    0.000 MiB       stamgmtstypes = (0, 2, 4)
       128 1025.434 MiB    0.000 MiB       tmplist = []
       129 1025.434 MiB    0.000 MiB       matching = []
       130 1025.434 MiB    0.000 MiB       observedclients = []
       131 1025.434 MiB    0.000 MiB       tmplist = populate_observed_list()
       132 1025.477 MiB    0.043 MiB       for i in tmplist:
       133 1025.477 MiB    0.000 MiB           observedclients.append(i[0])
       134 1025.477 MiB    0.000 MiB       _mac_address = str(p.addr2)
       135 1025.477 MiB    0.000 MiB       if p.haslayer(Dot11):
       136 1025.477 MiB    0.000 MiB           if p.type == 0 and p.subtype in stamgmtstypes:
       137 1024.309 MiB   -1.168 MiB               _timestamp = atimer()
       138 1024.309 MiB    0.000 MiB               if p.info == "":
       139 1021.520 MiB   -2.789 MiB                   _SSID = "hidden"
       140                                         else:
       141 1024.309 MiB    2.789 MiB                   _SSID = p.info
       142
       143 1024.309 MiB    0.000 MiB               if p.addr2 not in observedclients:
       144 1018.184 MiB   -6.125 MiB                   db_add(_mac_address, _timestamp, _SSID)
       145 1018.184 MiB    0.000 MiB                   greetings()
       146                                         else:
       147 1024.309 MiB    6.125 MiB                   add_time(_mac_address, _timestamp)
       148 1024.309 MiB    0.000 MiB                   observedclients = [] #clear the list
       149 1024.309 MiB    0.000 MiB                   observedclients = populate_observed_list() #repopulate the list
       150 1024.309 MiB    0.000 MiB                   greetings()
You will see observedclients is the list in question.
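One way to test the list hypothesis in isolation is to clear and repopulate a plain Python list many times and watch the traced allocation size with the standard library's tracemalloc. The sketch below is only an assumption-laden experiment: `fake_populate` is a hypothetical stand-in for populate_observed_list() that builds strings in memory, so it exercises the list handling only and says nothing about what SQLAlchemy sessions or query results hold on to.

```python
import tracemalloc


def fake_populate(n=100):
    # Hypothetical stand-in for populate_observed_list():
    # builds n fresh MAC-like strings without touching a database.
    return ["00:11:22:33:%02x:%02x" % (i // 256, i % 256) for i in range(n)]


def measure_rebinding(rounds=1000):
    # Returns the change in traced memory (bytes) after repeatedly
    # clearing and repopulating the list.
    tracemalloc.start()
    observedclients = fake_populate()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(rounds):
        observedclients = []               # clear the list
        observedclients = fake_populate()  # repopulate the list
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    print("growth in bytes:", measure_rebinding())
```

If the reported growth stays close to zero, rebinding the list is probably not what keeps the memory, and the database layer behind populate_observed_list() would be the next place to look.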