import os
import re 
from collections import Counter 
from collections import OrderedDict 
from datetime import datetime

currentDirectoryPath = os.getcwd()
print(currentDirectoryPath)


regexp = re.compile(
    r'(?P<clientIP>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).+\['
    r'(?P<timestamp>\d{2}/[A-Z][a-z]{2}/\d\d\d\d).+"'
    r'(?P<action>[A-Z]{3,4}).+"'
    r'\s*(?P<statuscode>[1-5][0-9][0-9])'
    )



os.chdir("/content/drive/My Drive/IT 170/log")
currentDirectoryPath = os.getcwd()
listOfFileNames = os.listdir(currentDirectoryPath)
for files in listOfFileNames :
  print(files) 

nLogUser = input ("What file do you want to read? (digit) ")
print('access_'+ nLogUser  +'.log')
f = open('access_'+ nLogUser  +'.log', 'r')
matched = 0
failed = 0
cnt_clientIPs = Counter()
cnt_clientIPsP1 = Counter()
cnt_clientIPsP2 = Counter()


def TopClientIp():
  startdateInput = input("What date would you like to start at? (dd/Mmm/yyyy) ")
  enddateInput= input("What date would you like to end at? (dd/Mmm/yyyy) ")
  f = open('access_'+ nLogUser  +'.log', 'r')
  allLines=f.readlines()
  for line in allLines:
    m = re.match(regexp,line)
    if m:
      if  m.group('timestamp') >= startdateInput and  m.group('timestamp') <= enddateInput:
        cnt_clientIPsP1.update([m.group('clientIP')])
    else:
      continue



userChoice=input(" \n Welcome to the Log Analyzer program! Here we have some choices on what you would like to see. \n If you would like to see the Top IP addresses enter 1. \n")
if userChoice == "1":
  userInputIP = input("Enter how many of the top clients you want to see. ")
  userInput=input("Would you like to see all clients from a certain date? (Yes or no)")
  if userInput.lower() == "yes":
    TopClientIp()
    #After creating the counter for the specific time range this new counter will print the Clients IP in the time range
    for clientIP, count in cnt_clientIPsP1.most_common(int(userInputIP)):
      print('[*] %30s: %d' % (clientIP, count)) 
    print('[*] ============================================')
  else:
#This one prints from all time. 
    print('[*] ============================================')
    print('[*] '+ userInputIP +' Most Frequently Occurring Clients Queried')
    print('[*] ============================================')
    for clientIP, count in cnt_clientIPs.most_common(int(userInputIP)):
      print('[*] %30s: %d' % (clientIP, count))
    print('[*] ============================================')


Enter how many of the top clients you want to see. 10
[*] ============================================
[*] 10 Most Frequently Occurring Clients Queried
[*] ============================================
[*]                 205.167.170.15: 15695
[*]                  79.142.95.122: 3207
[*]                  52.22.118.215: 734
[*]                  84.112.161.41: 712
[*]                   37.1.206.196: 371
[*]                   91.200.12.22: 287
[*]                178.191.155.244: 284
[*]                 198.50.160.104: 249
[*]                   84.115.10.14: 234
[*]                  93.83.250.186: 219
[*] ============================================

 Welcome to the Log Analyzer program! Here we have some choices on what you would like to see. 
 If you would like to see the Top IP addresses enter 1. 
 If you would like to see the top actions enter 2. 
 If you would like to see the top clients of a certain status code enter 3. 
 If you would like to see the top client with a specific action and status code enter 4.1
Enter how many of the top clients you want to see. 10
Would you like to see all clients from a certain date? (Yes or no)yes
What date would you like to start at? (dd/Mmm/yyyy) 18/Feb/2016
What date would you like to end at? (dd/Mmm/yyyy) 01/Mar/2016
[*] ============================================
[*] 10 Most Frequently Occurring Clients Queried
[*] ============================================
[*] ============================================

I'm reading a log file and want to keep only the entries that fall between two dates. This works when the start and end days are in the same month and increasing, for example 10/Feb/2018 as the start and 13/Feb/2018 as the end. But how can I do it when the start date is 10/Feb/2018 and the end date is 01/Mar/2018? Here is the relevant function from the code above:

def TopClientIp():
  startdateInput = input("What date would you like to start at? (dd/Mmm/yyyy) ")
  enddateInput= input("What date would you like to end at? (dd/Mmm/yyyy) ")
  f = open('access_'+ nLogUser  +'.log', 'r')
  allLines=f.readlines()
  for line in allLines:
    m = re.match(regexp,line)
    if m:
      if  m.group('timestamp') >= startdateInput and  m.group('timestamp') <= enddateInput:
        cnt_clientIPsP1.update([m.group('clientIP')])
    else:
      continue

I believe the `if` statement that compares m.group('timestamp') with the two dates is written wrong.

Thebul500
  • Parse them into dates (via the `datetime` module) and compare those. – Scott Hunter May 05 '20 at 00:23
  • You should probably scrub the IP addresses for the sake of privacy. Also, it would be very helpful if you put together a minimal reproducible example: https://stackoverflow.com/help/minimal-reproducible-example. It's more helpful not just to the people answering your questions but also for anyone arriving here through a Google search result. – linqo May 05 '20 at 00:28
  • @linqo The program is intended to get the IP addresses. This is not work-related, if that is what you are getting at. I will look into the minimal reproducible example. I think my title is what you're getting at. Thanks for the advice. – Thebul500 May 05 '20 at 01:16
  • Just use three if statements comparing year, month and day. To compare months, you could keep a dictionary with the month names as keys and their numbers as values: "january":1, "february":2, etc. First you check the year; if the years are the same you check the month; if the months are the same too you check the day. Make a function that compares the dates like that. Simple if statements would work. – Murtuza Vadharia May 05 '20 at 01:35
  • @MurtuzaVadharia Honestly, that might be it. Is there a way I could reuse my existing regular expression, or do I have to create a new one for this case? – Thebul500 May 05 '20 at 01:40
  • https://stackoverflow.com/q/8142364/13357958 – SergioR May 05 '20 at 01:41
  • @Thebul500 Just design a function that takes two strings A and B as input, compares them using three if statements, and returns True if A is earlier, else False. Use that function in whichever if condition you want. – Murtuza Vadharia May 05 '20 at 01:49
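
The manual comparison suggested in the comments above could be sketched roughly as follows. This is only an illustration: `MONTHS`, `date_key` and `is_on_or_before` are hypothetical names, and the only thing taken from the question is the dd/Mmm/yyyy format. Instead of three nested if statements it builds a (year, month, day) tuple, which Python compares element by element in that same order.

```python
# Hypothetical helper names; only the dd/Mmm/yyyy format comes from the question.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def date_key(text):
    """Turn 'dd/Mmm/yyyy' into a (year, month, day) tuple that sorts chronologically."""
    day, month, year = text.strip().split("/")
    # capitalize() lets user input such as 'feb' match the 'Feb' key
    return int(year), MONTHS[month.capitalize()], int(day)


def is_on_or_before(a, b):
    """Return True if date string a is the same day as, or earlier than, date string b."""
    # Tuples compare year first, then month, then day
    return date_key(a) <= date_key(b)


print(is_on_or_before("18/Feb/2016", "01/Mar/2016"))
print(is_on_or_before("01/Mar/2016", "18/Feb/2016"))
```

The existing regular expression would not need to change for this: the string returned by `m.group('timestamp')` can be passed straight to `date_key`.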

1 Answer

You may try this:

>>> import datetime
>>> d1 = datetime.datetime.strptime('18/Feb/2016', '%d/%b/%Y')
>>> d1
datetime.datetime(2016, 2, 18, 0, 0)
>>> d2 = datetime.datetime.strptime('01/Mar/2016', '%d/%b/%Y')
>>> d2
datetime.datetime(2016, 3, 1, 0, 0)
>>> d1 < d2
True
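
For context, here is a small sketch of why the comparison in the question returns nothing for a range that crosses a month boundary. The dates are the ones entered in the sample session; the variable names are only for illustration. It assumes an English locale, since `%b` matches locale-dependent month abbreviations.

```python
from datetime import datetime

start, end = "18/Feb/2016", "01/Mar/2016"

# Plain strings are compared character by character, so the first
# characters "1" and "0" decide the result and the month is never considered.
print(start <= end)        # False

# Parsed into datetime objects, the same two values compare chronologically.
fmt = "%d/%b/%Y"
start_dt = datetime.strptime(start, fmt)
end_dt = datetime.strptime(end, fmt)
print(start_dt <= end_dt)  # True
```

Because the string test `start <= timestamp <= end` can never be true when `start` sorts after `end`, the counter stays empty, which matches the empty result in the question's second run.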
lenik
  • That compares two dates, yes. But what about comparing against a date that is a string? The program reads the dates from the file and compares each one to the start date and end date, which is why you see m.group('timestamp'); that would be a string date, for example 18/Feb/2016. If I compare it to a datetime object, this leads to an error, as you can't compare a string to a class. – Thebul500 May 05 '20 at 17:32
  • @Thebul500 use `datetime.strptime()` to convert a string to the time value to compare – lenik May 05 '20 at 17:58
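
Putting that last comment together with the loop from the question, a minimal sketch might look like the following. It is not the asker's final code: `count_clients_in_range`, `LINE_RE` and the sample lines are made up for illustration (the IPs are from the 192.0.2.0/24 documentation range), and the pattern is a shortened version of the question's regex that captures only the client IP and the date. The point is that all three values, including the string from `m.group('timestamp')`, go through `datetime.strptime()` before they are compared.

```python
import re
from collections import Counter
from datetime import datetime

DATE_FORMAT = "%d/%b/%Y"  # dd/Mmm/yyyy, as in the question (assumes an English locale)

# Simplified, hypothetical pattern: client IP, then the date inside the brackets
LINE_RE = re.compile(
    r'(?P<clientIP>\d{1,3}(?:\.\d{1,3}){3}).+?\['
    r'(?P<timestamp>\d{2}/[A-Z][a-z]{2}/\d{4})'
)


def count_clients_in_range(lines, start_text, end_text):
    """Count client IPs on lines whose date lies between the two dates, inclusive."""
    start = datetime.strptime(start_text, DATE_FORMAT)
    end = datetime.strptime(end_text, DATE_FORMAT)
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        # Convert the matched string too, so datetime is compared with datetime
        stamp = datetime.strptime(m.group("timestamp"), DATE_FORMAT)
        if start <= stamp <= end:
            counts[m.group("clientIP")] += 1
    return counts


# Made-up sample lines in a common access-log layout
sample = [
    '192.0.2.1 - - [17/Feb/2016:10:00:00 +0100] "GET / HTTP/1.1" 200 512',
    '192.0.2.1 - - [20/Feb/2016:10:00:00 +0100] "GET / HTTP/1.1" 200 512',
    '192.0.2.2 - - [01/Mar/2016:09:30:00 +0100] "POST /login HTTP/1.1" 302 0',
    '192.0.2.1 - - [02/Mar/2016:09:30:00 +0100] "GET / HTTP/1.1" 200 512',
]
counts = count_clients_in_range(sample, "18/Feb/2016", "01/Mar/2016")
print(counts.most_common())
```

Passing an open file object as `lines` should work the same way, since iterating over a file yields one line at a time.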