
I am trying to find IP addresses in read.log that are listed more than 3 times. Once found, I want to print the IP address once and write it to writelist.log.

I have been trying this using a set but I am not sure how I can print and write only the IP address.

For example, if read.log contains...

10.1.89.11
255.255.255.255
255.255.255.255
10.5.5.5
10.5.5.5
10.5.5.5
10.5.5.5
255.255.255.255
255.255.255.255

I just want to print and save the below to writelist.log

255.255.255.255
10.5.5.5

With my current code, I am printing and saving this...

set([])
set([])
set([])
set([])
set([])
set([])
set(['10.5.5.5'])
set(['10.5.5.5'])
set(['10.5.5.5', '255.255.255.255'])

I do not want to print set([]) or duplicate IP addresses.

I know I could use the string.replace() method to get rid of some of that but is there a better way to do this? Possibly without a set?

Here is my code...

import re

login_attempts = 3

def run():

    try:
        with open("read.log", "r+") as log:
            ip_list = []
            for line in log:
                address = "^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}$"
                match = re.match(address, line)

                if (match):
                    match = match.group()
                    ip_list.append(match.strip())
                    s = set([i for i in ip_list if ip_list.count(i) > login_attempts])

                    strs = repr(s)  # use repr to convert to string
                    with open("writelist.log", "a") as f:
                        f.write(strs)

                else:
                    continue
                log.close
    except OSError as e:
        print (e)

run()
aydow
Kade Williams
  • This seems like a good use case for [`collections.Counter`](https://docs.python.org/3/library/collections.html#collections.Counter). – John B Jun 19 '18 at 00:44
  • @JohnB Yes and for some reason I can't find a good answer to link here. If you find one let me know and I'll dupe it. – idjaw Jun 19 '18 at 00:44
  • I did see an example using that earlier today. I'll read up on it. – Kade Williams Jun 19 '18 at 00:45
  • @KadeWilliams you can read the doc [here](https://docs.python.org/3/library/collections.html#collections.Counter) – idjaw Jun 19 '18 at 00:45

1 Answer


Use a Counter

import collections

with open('read.log') as f:
    # Count each line; splitlines() drops the trailing newline characters
    count = collections.Counter(f.read().splitlines())

For the sample input in the question, this gives

Counter({'255.255.255.255': 4, '10.5.5.5': 4, '10.1.89.11': 1})

You can then iterate over the Counter

for ip, n in count.items():
    print(ip, n)
    # Process the IP
    ...
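Following the suggestion in the comments, here is a minimal sketch of the conditional output step. It assumes the file names and the threshold of 3 from the question; the helper names `frequent_ips` and `write_frequent_ips` are hypothetical and only for illustration.

```python
import collections

LOGIN_ATTEMPTS = 3  # threshold from the question


def frequent_ips(lines, threshold=LOGIN_ATTEMPTS):
    """Return entries that occur more than `threshold` times.

    On Python 3.7+ the result is in first-seen order.
    """
    # Strip the trailing newline and skip blank lines before counting.
    count = collections.Counter(line.strip() for line in lines if line.strip())
    return [ip for ip, n in count.items() if n > threshold]


def write_frequent_ips(src="read.log", dst="writelist.log"):
    """Print each frequent IP once and write it to `dst`, one per line."""
    with open(src) as log:
        ips = frequent_ips(log)
    with open(dst, "w") as out:
        for ip in ips:
            print(ip)
            out.write(ip + "\n")  # re-add the newline that was stripped
    return ips
```

Calling `write_frequent_ips()` on the sample log should print and save only `255.255.255.255` and `10.5.5.5`. The sketch opens the output file with `"w"` rather than `"a"` so that re-running it does not append the same addresses again; switch back to `"a"` if the file is meant to accumulate results.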

This assumes that the input is clean, i.e. every line contains exactly one IP address and nothing else. If it is not, you will have to sanitise the data before you count it.
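If the log does contain other text, one option is to extract the addresses with a regular expression first and feed only those to the `Counter`. The sketch below is an assumption about what such sanitising could look like, not part of the original answer; `IP_PATTERN` and `count_ips` are hypothetical names.

```python
import collections
import re

# Dots are escaped so they match a literal "."; this checks only the shape
# of a dotted quad, not that each octet is in the range 0-255.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_ips(lines):
    """Count every IPv4-looking token found anywhere in `lines`."""
    count = collections.Counter()
    for line in lines:
        # findall returns all matches in the line, or an empty list if none.
        count.update(IP_PATTERN.findall(line))
    return count
```

Passing an open file object, for example `count_ips(open_log)`, works because iterating over a file yields its lines. Note that the pattern in the question leaves the dots unescaped, so there they match any character.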

aydow
  • And to conditionally output, you could iterate over `count.items()` and only write if greater than 3. – John B Jun 19 '18 at 01:01
  • You could also avoid that inner loop and do `f.read().splitlines()`. It should be noted though that by stripping the new line, remember to add it when writing back each line you want to the new file too. – idjaw Jun 19 '18 at 01:02
  • For completeness' sake you can show the write part. Per the comment, the trick here is that you still need to call `items` on `count` just like @JohnB mentioned. – idjaw Jun 19 '18 at 01:06
  • Would this still work well if there was other data in the log? There may be other text other than IP addresses. I'm running 2.7 and 3.6 but getting errors in https://stackoverflow.com/questions/13311094/counter-in-collections-module-python – Kade Williams Jun 19 '18 at 03:23
  • AFAIK, anything < 2.7 isn't supported. You'd have to show your new code and what errors you're getting. – aydow Jun 19 '18 at 03:27
  • @KadeWilliams mentioning that there might be other data in the logs changes the entire nature of the question and invalidates what is a good answer based on the information provided. Also, if you are using 2.7 then you should not be receiving that error (you also tagged your question as python-3.x). Are you sure you are not on 2.6? This is also information that would be good to add to the original question. Since you need to extract the IP as well since you have other data, you will most likely need your regex that you have to create your list, then use your counter method. – idjaw Jun 19 '18 at 04:02
  • You can also look at [this](https://stackoverflow.com/questions/3496518/python-using-a-dictionary-to-count-the-items-in-a-list?noredirect=1&lq=1) answer for more methods on how to count occurrences. – idjaw Jun 19 '18 at 04:02