
I'm building a model in Python and applying it to the Bot-IoT dataset: https://research.unsw.edu.au/projects/bot-iot-dataset

I am trying to extract location information from the IP addresses in the CSV files of this dataset.
The features [saddr, daddr] are the source and destination addresses of the IoT devices (the data are simulated).

Using Python, I installed maxminddb-geolite2 and ip2geotools, but the results are ambiguous.

Here is the code, run after reading the data:

import time
import numpy as np
from geolite2 import geolite2

geo = geolite2.reader()
# dt is the DataFrame already read from the dataset CSV
df_1 = dt.loc[:50, ['saddr']].copy()   # copy, so the new column is not set on a slice of dt

def IP_info_1(ip):
    try:
        x = geo.get(ip)
    except ValueError:   # faulty IP value
        return np.nan
    try:
        return x['country']['names']['en'] if x is not None else np.nan
    except KeyError:   # faulty key value
        return np.nan

s_time = time.time()
# map IP --> country
# apply(fn) applies fn to every element of the pd.Series
df_1['country'] = df_1.loc[:, 'saddr'].apply(IP_info_1)
print(df_1.head(), '\n')
print('Time:', str(time.time() - s_time) + 's \n')

print(type(geo.get('48.151.136.76')))

and the results are as follows:

             saddr country
0  192.168.100.147     NaN
1  192.168.100.147     NaN
2  192.168.100.147     NaN
3  192.168.100.147     NaN
4  192.168.100.147     NaN

Time: 0.00870203971862793s 

<class 'dict'>

I then tested another snippet:

import time
import numpy as np
from ip2geotools.databases.noncommercial import DbIpCity

s_time = time.time()
# dt is the DataFrame already read from the dataset CSV
df_2 = dt.loc[:50, ['saddr']].copy()

def IP_info_2(ip):
    try:
        return DbIpCity.get(ip, api_key='free').country
    except Exception:   # any lookup failure is recorded as missing
        return np.nan

df_2['country'] = df_2.loc[:, 'saddr'].apply(IP_info_2)
print(df_2.head())
print('Time:', str(time.time() - s_time) + 's')

print(type(DbIpCity.get('48.151.136.76', api_key='free')))

the results are:

             saddr country
0  192.168.100.147      ZZ
1  192.168.100.147      ZZ
2  192.168.100.147      ZZ
3  192.168.100.147      ZZ
4  192.168.100.147      ZZ
Time: 25.913161039352417s
<class 'ip2geotools.models.IpLocation'>

The code comes from the question "Identifying country by IP address".

How can I fix this?

Another question: the addresses in these features come in two formats. What is the difference?

The first format:
    fe80::250:56ff:febe:254
    fe80::250:56ff:febe:26db

and the second looks like:
    192.168.100.46

Any suggestions for making use of these features, other than finding the location?

  • first format is `IP version 6` (`IPv6`), second is `IP version 4` (`IPv4`) – furas Jun 10 '22 at 16:50
  • addresses `192.168.x.x` are for private use in local network and they don't have global locations – furas Jun 10 '22 at 16:51
  • Wikipedia: [IP address](https://en.wikipedia.org/wiki/IP_address) – furas Jun 10 '22 at 16:54
  • Does this mean that it is not possible to extract the location from these addresses? So what can be extracted from private IP addresses? @furas – LAT Jun 12 '22 at 03:06
  • private IPs are used in local networks, so everyone can use them and they don't have one global location on the internet. You may have computers with `192.168.x.x`, I have all my computers with `192.168.x.x`, someone else may have computers with `192.168.x.x` - so they may have millions of locations. And databases like `maxminddb` know that `192.168.x.x` can be used in millions of locations - so they don't keep information about these addresses. – furas Jun 12 '22 at 11:09
  • you can use these addresses with other tools to check if they exist in your local network, and check if they have open ports. You can convert an IP to a MAC address, and some tools may recognize who made the hardware (the leading numbers of a MAC address are assigned to different manufacturers). Some tools may also use this connection to guess the operating system on the hardware. See the tool [nmap](https://nmap.org/book/man.html). There is even a Python module to work with this tool – furas Jun 12 '22 at 11:17
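
To make the points in the comments concrete, here is a minimal sketch using only Python's standard `ipaddress` module. It labels each address with its IP version and whether it is private, link-local or global, which may be more useful as model features than a country lookup. The helper names `describe_ip` and `eui64_mac` are hypothetical, not part of any library. The MAC recovery is an assumption: it only applies when the IPv6 interface identifier was built with the modified EUI-64 scheme (recognisable by `ff:fe` in the middle), and a randomly generated identifier could match that pattern by coincidence.

```python
import ipaddress


def eui64_mac(addr):
    """Return the MAC embedded in an IPv6 address, or None.

    Assumes the interface ID uses modified EUI-64 (bytes ff:fe in the middle).
    """
    iid = addr.packed[8:]                 # last 64 bits = interface identifier
    if iid[3:5] != b"\xff\xfe":
        return None
    # Undo EUI-64: flip the universal/local bit and drop the ff:fe filler.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)


def describe_ip(text):
    """Return a dict of simple address properties, or None for an invalid IP."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return None
    return {
        "version": addr.version,              # 4 or 6
        "is_private": addr.is_private,        # e.g. 192.168.x.x
        "is_link_local": addr.is_link_local,  # e.g. fe80::/10
        "is_global": addr.is_global,          # publicly routable
        "mac": eui64_mac(addr) if addr.version == 6 else None,
    }


for ip in ["192.168.100.147", "fe80::250:56ff:febe:254", "48.151.136.76"]:
    print(ip, describe_ip(ip))
```

With a DataFrame such as `dt`, something like `dt['saddr'].apply(describe_ip)` could produce these values per row. The first three bytes of a recovered MAC are the manufacturer prefix mentioned in the comment above, which can then be checked against a vendor list.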
