import json
import requests
import re

tag = "tag=mirai"
hd = {"Content-Type": "application/x-www-form-urlencoded"}
r = requests.post ('https://urlhaus-api.abuse.ch/v1/tag/', data=tag, headers=hd)
c = ['urls', 'url']
data = json.loads (r.text)

for i in range (0, 998):
    fff = data['urls'][i]['url']
    pattern = r"((([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])[ (\[]?(\.|dot)[ )\]]?){3}([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5]))"
    ips = [match[0] for match in re.findall (pattern, fff)]
    list2 = []
    for item in ips:
        if item != []:
            list2.append (item)
            print(list2)

I wrote this code, and it prints a one-element list on every line, with the same IP address repeated across lines. I have been searching for 5 days and have not found an answer to this problem.

Example output

['51.15.64.60']
['51.15.64.60']
['149.3.170.181']
['149.3.170.181']
['89.248.166.183']
['185.132.53.30']
['185.132.53.30']
['185.132.53.30']

Desired output

['51.15.64.60']
['149.3.170.181']
['89.248.166.183']
['185.132.53.30']

  • Read about [`set`](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset). – Tom Wojcik Sep 10 '20 at 15:00
  • Welcome to Stack Overflow! I'm interested to know what you searched for five days that you couldn't find a good solution -- sometimes knowing what to look for is half the battle. For example, what you really want to do here is to remove duplicates from the _list_ in which you store the IP addresses, not from _loop output_. – Pranav Hosangadi Sep 10 '20 at 15:07
  • I tried set(), but it had no effect: {'51.15.64.60'} {'51.15.64.60'} {'149.3.170.181'} {'149.3.170.181'} {'89.248.166.183'} {'185.132.53.30'} {'185.132.53.30'} {'185.132.53.30'} {'45.13.58.4'} {'89.34.27.168'} {'89.34.27.168'} {'89.34.27.168'} – Turkwarrior Sep 10 '20 at 15:32

2 Answers


Just check that the item (an IP address string) is not already in list2 before adding it.

This answer assumes that you want the list2 to have no duplicate objects but to keep the order of insertion.

(If what you want is for list2 to have no sequential duplicate objects, or if the order of insertion is unimportant, there would be different answers.)

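list2 = []  # Move this line above the outer "for i in range ..." loop, so it is not reset for every URL
...
if item != []:
    if item not in list2:  # Add this line
        list2.append(item)
        print([item])  # Print only the newly added address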
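For the two other cases mentioned above (no sequential duplicates, or order of insertion unimportant), a minimal sketch might look like the following. The sample list is copied from the example output in the question; the variable names are illustrative and not part of the original code.

```python
from itertools import groupby

# Sample data copied from the question's example output.
ips = [
    "51.15.64.60", "51.15.64.60",
    "149.3.170.181", "149.3.170.181",
    "89.248.166.183",
    "185.132.53.30", "185.132.53.30", "185.132.53.30",
]

# Order unimportant: a set keeps one copy of each address, in arbitrary order.
unique_unordered = set(ips)

# Only sequential duplicates removed: groupby collapses each run of equal neighbours.
no_sequential_dupes = [key for key, _group in groupby(ips)]

print(no_sequential_dupes)
```

Note that groupby only merges adjacent equal items, so an address that reappears later in the list, after a different address, would be kept a second time.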
Joshua Fox

Use an if statement to check whether the item already exists in list2. list2 must also be created once, before the outer loop; otherwise it is emptied for every URL and the check never sees the addresses found earlier.

Below is what should work for you:

import json
import re

import requests

tag = "tag=mirai"
hd = {"Content-Type": "application/x-www-form-urlencoded"}
r = requests.post('https://urlhaus-api.abuse.ch/v1/tag/', data=tag, headers=hd)
data = json.loads(r.text)

# Matches dotted IPv4 addresses, including forms such as 1[.]2[.]3[.]4 or "1 dot 2 dot 3 dot 4"
pattern = r"((([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])[ (\[]?(\.|dot)[ )\]]?){3}([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5]))"

list2 = []  # created once, so it remembers addresses from earlier URLs
for entry in data['urls']:  # iterate over the entries instead of a fixed range(0, 998)
    fff = entry['url']
    ips = [match[0] for match in re.findall(pattern, fff)]
    for item in ips:
        if item not in list2:
            list2.append(item)
            print([item])  # print each address only the first time it is seen
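The same idea can be tried without calling the API. The sketch below wraps the de-duplication in a function and runs it on a few sample URLs. The function name `unique_ips` and the URL paths are hypothetical; only the regular expression and the IP addresses come from the question. A set is used next to the list here as an assumption about what is convenient: membership tests on a set stay fast as the number of addresses grows.

```python
import re

# Same IPv4 pattern as in the question, compiled once.
IP_PATTERN = re.compile(
    r"((([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])[ (\[]?(\.|dot)[ )\]]?){3}"
    r"([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5]))"
)


def unique_ips(urls):
    """Return the IP addresses found in urls, without duplicates, in order of first appearance."""
    seen = set()   # fast membership test
    ordered = []   # keeps first-seen order
    for url in urls:
        for match in IP_PATTERN.findall(url):
            ip = match[0]  # group 1 holds the full address
            if ip not in seen:
                seen.add(ip)
                ordered.append(ip)
    return ordered


# Hypothetical URLs; only the addresses are taken from the question's example output.
sample_urls = [
    "http://51.15.64.60/a",
    "http://51.15.64.60/b",
    "http://149.3.170.181/a",
    "http://89.248.166.183/a",
    "http://185.132.53.30/a",
    "http://185.132.53.30/b",
]

for ip in unique_ips(sample_urls):
    print([ip])
```

With the real data, the call would be something like `unique_ips(entry['url'] for entry in data['urls'])`.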
Assad Ali