
I have a log file:

192.168.10.1 - - [12/aug/20:23:30:41] "PUT /img.jpg" 200 - 
192.168.10.2 - - [10/aug/20:01:20:30] "PUT /img.jpg " 404 - 
192.168.10.2 - - [10/aug/20:12:10:15] "PUT /img.jpg " 200 2114 
192.168.10.3 - - [09/aug/20:06:20:12] "GET / img.jpg" 200 377 
192.168.10.1 - - [07/aug/20:12:40:20] "GET /img.jpg" 200 2114
192.168.10.1 - - [01/aug/20:06:45:50] "GET /img.jpg" 404 - 

I want to count, per IP address, every line in the file whose second-to-last field (the status code) begins with 2. For example, 192.168.10.1 appears 3 times in the file, but only 2 of those lines have status 200 and 1 has 404, so I want to count only 2 lines for it. Expected output:

192.168.10.1  2
192.168.10.2  1
192.168.10.3  1
Zeeshan
  • 4
    Our format works best when you actually _tried something yourself_ and have a question about why it didn't work. If you haven't tried writing your own code, encountered a specific problem, and searched for other questions about that problem, it's typically premature to ask a question here. – Charles Duffy Sep 12 '20 at 15:27
  • 3
    That said, as an existing, answered instance: [sort uniq ip addresses in from apache log](https://stackoverflow.com/questions/18682308/sort-uniq-ip-address-in-from-apache-log) – Charles Duffy Sep 12 '20 at 15:28
  • How do you define the *connection attempt*? – M. Nejat Aydin Sep 12 '20 at 16:26
  • @M.NejatAydin A "connection attempt" is defined by the status code (beginning with 2). The status code is the second-to-last field of every line. For example, one line may have status code 404 and another 200. – Zeeshan Sep 12 '20 at 16:41
  • @CharlesDuffy I want only those IP addresses whose status code begins with 2 (the status code is the second-to-last field of every line). For example, one line may have status code 404 and another 200. – Zeeshan Sep 12 '20 at 16:55
  • ..Great that that's what you want. How have you tried to accomplish it, and what _specific, narrow problem_ did you encounter in the process? – Charles Duffy Sep 12 '20 at 17:08
  • BTW, if you were using [`asql`](https://steve.fi/software/asql/), your restriction on status code would just be a `WHERE STATUS>=200 AND STATUS<300` clause in your query. – Charles Duffy Sep 12 '20 at 17:09
  • @CharlesDuffy I tried this command: `cat thttpd.log | awk '{print $1}' | sort | uniq -c`, but the problem is that it counts all IP addresses. I want to apply a filter that checks whether the status code begins with 2 and only then counts the line. I don't know how to do this in a shell script. – Zeeshan Sep 12 '20 at 17:26
  • 1
    Please do not add extra requirements on comments. Answers are good, but obviously you will break them, again and again, every time you reveal new secret requirements. You have to edit your question. You have to define strictly what "attempt" means. Any request is an attempt to me, but you don't want this. Do you want to include response status code or type of the request? You have to provide the **exact** expected output for your sample input. You have to include into the input representative cases for your requirements. – thanasisp Sep 12 '20 at 19:07
  • Do you want to print all lines with occurrences or only the first one? Update the question with that: either write that you want only the first one, or add the 2 other lines to the output, according to your input. – thanasisp Sep 13 '20 at 04:47
  • @thanasisp updated question, i want all the lines – Zeeshan Sep 13 '20 at 06:16
  • Very well @Zeeshan, please wait for one more person to reopen this question. – thanasisp Sep 13 '20 at 06:18
  • @thanasisp Also, sorry for adding extra requirements in the comments. I am new here and don't know the rules yet. – Zeeshan Sep 13 '20 at 06:36

2 Answers


This can be done in many ways. One is to use a combination of the awk, sort and uniq commands:

 awk -F ' ' '$(NF-1) ~ /^2/ {print $1}' log_file.txt | sort | uniq -c | sort 

Explanation:

  • awk -F ' ' '$(NF-1) ~ /^2/ {print $1}' --> Checks whether the second-to-last field begins with "2"; if it does, prints the first field, i.e., the IP address
  • sort --> Sorts the IP addresses so that identical ones end up on adjacent lines, which uniq requires
  • uniq -c --> Collapses each run of identical lines into one line, prefixed with the number of times it occurred
  • sort --> The final sort orders the counted lines
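As a variation, the counting can also be done inside awk itself, which makes the `sort | uniq -c` stage unnecessary. This is only a minimal sketch, assuming the same layout as the sample in the question (status code in the second-to-last field). The function name `count_2xx` and the temporary sample file are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample file holding the six log lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
192.168.10.1 - - [12/aug/20:23:30:41] "PUT /img.jpg" 200 -
192.168.10.2 - - [10/aug/20:01:20:30] "PUT /img.jpg " 404 -
192.168.10.2 - - [10/aug/20:12:10:15] "PUT /img.jpg " 200 2114
192.168.10.3 - - [09/aug/20:06:20:12] "GET / img.jpg" 200 377
192.168.10.1 - - [07/aug/20:12:40:20] "GET /img.jpg" 200 2114
192.168.10.1 - - [01/aug/20:06:45:50] "GET /img.jpg" 404 -
EOF

# count_2xx FILE: print "IP count" for lines whose second-to-last
# field starts with 2. NF > 1 skips blank or one-field lines.
count_2xx() {
    awk 'NF > 1 && $(NF-1) ~ /^2/ { count[$1]++ }
         END { for (ip in count) print ip, count[ip] }' "$1" | LC_ALL=C sort
}

count_2xx "$sample"
# Output:
# 192.168.10.1 2
# 192.168.10.2 1
# 192.168.10.3 1
```

The trailing `sort` is there only because awk's `for (ip in count)` loop visits keys in no guaranteed order.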
Homer
  • 2
    You don't need to `cat filename | command`, but just `command – Léa Gris Sep 12 '20 at 16:13
  • @Homer I want only the IP addresses whose status code begins with 2 (the status code is the second-to-last field of every line). For example, one line may have status code 404 and another 200. – Zeeshan Sep 12 '20 at 16:46
  • To double down on what LéaGris was saying earlier -- `cat filename | somecommand` is sometimes orders-of-magnitude less efficient than `somecommand – Charles Duffy Sep 12 '20 at 17:02
  • LéaGris/Charles , agreed, we can remove cat. Thanks. @Zeeshan, we can put a condition in awk, to check if second last value begins with a 2, like: awk -F ' ' '$(NF-1) ~ /^2/ {print $1}' log_file.txt | sort | uniq -c | sort – Homer Sep 12 '20 at 17:34

Using command line utilities:

grep '^[^"]*"[^"]*" 2' logfile |
cut -d' ' -f1 | sort | uniq -c | sort -nr | head -n 10

This lists the top ten IP addresses ordered by attempt count, highest first.
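The grep pattern relies on the status code being the first thing after the first quoted string, so it does not depend on how many fields follow. The same idea can be sketched in awk by splitting each line on double quotes, which also lets the IP address be printed before the count. This is a rough sketch, not a tested drop-in for any particular server. The function name `top_2xx` and the sample file are hypothetical, and the last sample line is the longer format quoted in the comments:

```shell
#!/bin/sh
# Hypothetical sample: the question's lines plus one line in the
# longer format mentioned in the comments.
sample=$(mktemp)
cat > "$sample" <<'EOF'
192.168.10.1 - - [12/aug/20:23:30:41] "PUT /img.jpg" 200 -
192.168.10.2 - - [10/aug/20:01:20:30] "PUT /img.jpg " 404 -
192.168.10.2 - - [10/aug/20:12:10:15] "PUT /img.jpg " 200 2114
192.168.10.3 - - [09/aug/20:06:20:12] "GET / img.jpg" 200 377
192.168.10.1 - - [07/aug/20:12:40:20] "GET /img.jpg" 200 2114
192.168.10.1 - - [01/aug/20:06:45:50] "GET /img.jpg" 404 -
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET / HTTP/1.1" 200 123 "" "Mozilla/5.0 (compatible; Konqueror/2.2.2-2; Linux)"
EOF

# top_2xx FILE: print "IP count" for the ten IPs with the most 2xx lines.
# With '"' as separator: $1 = text before the request,
# $2 = the request itself, $3 = " status size " right after it.
top_2xx() {
    awk -F'"' 'NF >= 3 {
        split($1, pre, " ")    # pre[1] is the client IP
        split($3, post, " ")   # post[1] is the status code
        if (post[1] ~ /^2/) count[pre[1]]++
    }
    END { for (ip in count) print ip, count[ip] }' "$1" |
    LC_ALL=C sort -k2,2nr -k1,1 | head -n 10
}

top_2xx "$sample"
# Output:
# 192.168.10.1 2
# 172.16.0.3 1
# 192.168.10.2 1
# 192.168.10.3 1
```

Sorting on the second column (`-k2,2nr`) keeps the highest counts first, and ties are broken by IP address so the order is stable.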

M. Nejat Aydin
  • this is not working – Zeeshan Sep 12 '20 at 17:07
  • 1
    "Not working" is basically useless as a description -- it doesn't provide any information someone could use to generate a solution that _does_ work. What _specific issue_ do you have when deploying this code? – Charles Duffy Sep 12 '20 at 17:10
  • @Zeeshan What output do you get? I've tried it on your sample and it works. – M. Nejat Aydin Sep 12 '20 at 17:11
  • @M.NejatAydin nothing – Zeeshan Sep 12 '20 at 17:14
  • @Zeeshan Then either you feed it the wrong file or the format in your sample isn't correct. – M. Nejat Aydin Sep 12 '20 at 17:19
  • @M.NejatAydin Yes, you are right, it works on the log file above, but my actual log file has this format: `172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET / HTTP/1.1" 200 123 "" "Mozilla/5.0 (compatible; Konqueror/2.2.2-2; Linux)"`. When I try it on this format it shows nothing. Can you tell me what the problem is? – Zeeshan Sep 12 '20 at 17:40
  • @M.NejatAydin You hardcoded "GET" here: `'"GET [^"]*" 2'`. Can we do this without hardcoding it? In the future we may have some lines with GET and some with PUT, so how can we handle that? – Zeeshan Sep 12 '20 at 18:26
  • @Zeeshan Yes. I've edited it again. – M. Nejat Aydin Sep 12 '20 at 18:48
  • @M.NejatAydin Thanks. Can you give me an explanation of the code? – Zeeshan Sep 12 '20 at 18:54
  • @M.NejatAydin Also, how can we change the output format? The code above shows "attempt count" then "IP address". What if we want the IP first and then the count, i.e. "IP address" then "attempt count"? – Zeeshan Sep 12 '20 at 19:04