1

MaxMind's GeoLite2 is a wonderful database and is very useful if you want to map IP addresses to countries.

To do this efficiently, I want to import it into a MySQL database with a scheme like this:

Model

I remember writing an import script for the CSV database long time ago, but the CSV as you can download it today has a very difficult to understand format, at least to me:

network,geoname_id,registered_country_geoname_id,represented_country_geoname_id,is_anonymous_proxy,is_satellite_provider 1.0.0.0/24,2077456,2077456,,0,0 1.0.1.0/24,1814991,1814991,,0,0 1.0.2.0/23,1814991,1814991,,0,0 1.0.4.0/22,2077456,2077456,,0,0 1.0.8.0/21,1814991,1814991,,0,0 1.0.16.0/20,1861060,1861060,,0,0 1.0.32.0/19,1814991,1814991,,0,0 1.0.64.0/18,1861060,1861060,,0,0 1.0.128.0/17,1605651,1605651,,0,0

I'm really stuck at the basics here. What is the most efficient and easiest way to import the database from its CSV representation into MySQL?

bytecode77
  • 14,163
  • 30
  • 110
  • 141

4 Answers4

5
network,geoname_id,registered_country_geoname_id,represented_country_geoname_id,is_anonymous_proxy,is_satellite_provider
1.0.0.0/24,2077456,2077456,,0,0
1.0.1.0/24,1814991,1814991,,0,0
1.0.2.0/23,1814991,1814991,,0,0
1.0.4.0/22,2077456,2077456,,0,0

create table thing1
(   network varchar(20) not null,
    geoname_id varchar(20) not null,
    registered_country_geoname_id varchar(20) not null,
    represented_country_geoname_id varchar(20) not null,
    is_anonymous_proxy varchar(20) not null,
    is_satellite_provider varchar(20) not null
);

LOAD DATA INFILE 'c:\\dev\\ipaddr.txt' 
INTO TABLE thing1
FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@v1,@v2,@v3,@v4,@v5,@v6)
set network=ifnull(@v1,''),
geoname_id=ifnull(@v2,''),
registered_country_geoname_id=ifnull(@v3,''),
represented_country_geoname_id=ifnull(@v4,''),
is_anonymous_proxy=ifnull(@v5,''),
is_satellite_provider=ifnull(@v6,'');

The above came in fine for me.

Begin of Edits below

For the below, attempting to improve this answer after points made by stijn-de-witt in comments.

Note however that the referenced blog in the comments had an error in the update statement for going into ints. So for the time being until I sort that out, I found a varchar modification as seen below.

Edit1 (more to follow, see comments below answer):

Alter the table to get a "from ip to ip range"

alter table thing1 add column from_ip varchar(20), add column to_ip varchar(20);
-- note that those two are nullable at the moment. You can always change that later

Update table for the varchar's just added

update thing1
set from_ip=  INET_NTOA(INET_ATON( SUBSTRING_INDEX(network, '/', 1)) 
   & 0xffffffff ^ ((0x1 << ( 32 - SUBSTRING_INDEX(network, '/', -1))  ) -1 )),
to_ip=   INET_NTOA(INET_ATON( SUBSTRING_INDEX(network, '/', 1)) 
   | ((0x100000000 >> SUBSTRING_INDEX(network, '/', -1) ) -1 ))
select * from thing1;

(For the above update statement, Credit to Bernd Buffen in this answer )

Results of the above update statement:

mysql> select network,from_ip,to_ip from thing1;
+------------+---------+-----------+
| network    | from_ip | to_ip     |
+------------+---------+-----------+
| 1.0.1.0/24 | 1.0.1.0 | 1.0.1.255 |
| 1.0.2.0/23 | 1.0.2.0 | 1.0.3.255 |
| 1.0.4.0/22 | 1.0.4.0 | 1.0.7.255 |
+------------+---------+-----------+

From here, check out the MySQL Manual Page Miscellaneous Functions for INET_ATON(expr).

Edit2 (thanks to stijn-de-witt again):

alter table thing1 add column uint_from_ip int unsigned, add column uint_to_ip int unsigned;

UPDATE thing1 SET uint_from_ip = inet_aton(SUBSTRING(network, 1, LOCATE('/', network) - 1)), 
uint_to_ip = (inet_aton(SUBSTRING(network, 1, LOCATE('/', network) - 1)) + (pow(2, (32-CONVERT(SUBSTRING(network, LOCATE('/', network) + 1), UNSIGNED INT)))-1));

Results:

select network,from_ip,to_ip,uint_from_ip,uint_to_ip from thing1;
+------------+---------+-----------+--------------+------------+
| network    | from_ip | to_ip     | uint_from_ip | uint_to_ip |
+------------+---------+-----------+--------------+------------+
| 1.0.1.0/24 | 1.0.1.0 | 1.0.1.255 |     16777472 |   16777727 |
| 1.0.2.0/23 | 1.0.2.0 | 1.0.3.255 |     16777728 |   16778239 |
| 1.0.4.0/22 | 1.0.4.0 | 1.0.7.255 |     16778240 |   16779263 |
+------------+---------+-----------+--------------+------------+

(the above credit to this S0BEIT blog after a few fixes mentioned)

Community
  • 1
  • 1
Drew
  • 24,851
  • 10
  • 43
  • 78
  • Not bad so far, but it doesn't match my schematics at all. Besides, where is the IP from-to? If we encode an IP address as a 32-bit integer, then we should have two int columns named "from" and "to", right? – bytecode77 Aug 15 '15 at 00:06
  • just trying to share something byte, not wrap a bow around it. please forgive :> – Drew Aug 15 '15 at 00:07
  • I only know so much about IPv4. So for me, it would be helpful to know, how to convert a representation like "223.252.161.128/25" to an IP from-to range. – bytecode77 Aug 15 '15 at 18:35
  • [s0beit's blog](https://s0beit.me/mysql/using-maxminds-geoip-csvs-to-create-a-country-geoip-lookup/) describes how to convert the `network` field to `range_begin` and `range_end` fields using a simple SQL query. I haven't used it yet, but it looks legit: `UPDATE \`GeoLite2-Country-Blocks-IPv4\` SET range_begin = inet_aton(SUBSTRING(network, 1, LOCATE('/', network) - 1)), range_end = (inet_aton(SUBSTRING(network, 1, LOCATE('/', network) - 1)) + (pow(2, (32-CONVERT(SUBSTRING(network, LOCATE('/', network) + 1), INTEGER)))-1))` – Stijn de Witt Jun 21 '16 at 12:42
  • @StijndeWitt thanks for sharing that. It would certainly be better and I will attempt to improve this with that information. – Drew Jun 21 '16 at 12:49
  • @StijndeWitt edit was performed, I will continue to see if the int solution can be solved. Note that it had an error in the blog. Yet to be sorted out. Let me know if ints are industry best practice for searching. I imagine they might be. – Drew Jun 21 '16 at 13:23
  • This is probably solved now for ints with my comment at the bottom suggesting one look into `INET_ATON()`. Tweak your `alter table` and updates accordingly, even if it is a 2 step process. – Drew Jun 21 '16 at 13:39
  • @Drew What was your error? I just tried it and I also had a (minor) error, the cast `CONVERT(SUBSTR...), INTEGER)` should be to `UNSIGNED INTEGER` i.s.o. just `INTEGER`. Apart from that it seems to work fine. – Stijn de Witt Jun 21 '16 at 13:55
  • @StijndeWitt It was an error 1064 of course. near the end. I spent a minute counting parentheses. Cursory glance they didn't close out – Drew Jun 21 '16 at 13:56
  • @Drew I think you encountered the same error. Use `from_ip INT UNSIGNED` i.s.o `VARCHAR(20)`, then change the `CONVERT(..., INTEGER)` to `CONVERT(..., UNSIGNED INTEGER)` and it'l work. – Stijn de Witt Jun 21 '16 at 13:57
  • @StijndeWitt thx, Edit2 at bottom, seems to have worked boss. Now we have cidr, IP, and int. – Drew Jun 21 '16 at 14:11
  • How do you do this for IPv6 addresses? I tried using INET6_ATON instead of INET_ATON in the above answer (Edit2 part), but the query fails with message " Numeric value out of range: 1690 BIGINT UNSIGNED value is out of range" – Ahmed Shefeer Apr 03 '21 at 10:30
2

I'd suggest using MaxMind's GeoIP2 CSV Converter to put it in the format you want. There is an -include-integer-range option that will provide from and to integer columns that I think you are looking for. Binaries are available for Windows, Linux (amd64), and OS X.

Greg Oschwald
  • 1,716
  • 2
  • 11
  • 14
0

It really doesn't seem possible to do this with a simple SQL script, so I've written one in C#. And since importing MySQL databases that are so big is not that simple, I implemented a direct INSERT INTO into the script itself.

A table structure like the one on the sketch in the question is required for it to work.

using MySql.Data.MySqlClient;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace GeoIPConvert
{
    public static class Program
    {
        public static void Main(string[] args)
        {
            // https://dev.maxmind.com/geoip/geoip2/geolite2/

            List<Country> countries = File.ReadAllLines("Countries.csv")
                .Select(line => line.Split(','))
                .Where(line => line[4] != "" && line[5] != "")
                .Select((line, index) => new Country
                {
                    ID = Convert.ToInt32(line[0]),
                    DatabaseID = index + 1,
                    Flag = line[4].ToLower(),
                    Name = line[5].Replace("\"", "")
                })
                .ToList();

            List<IPRange> ipRanges = File.ReadAllLines("GeoIP.csv")
                .Select(line => line.Split(','))
                .Where(line => line[2] != "")
                .Select(line => new IPRange
                {
                    Country = countries.First(country => country.ID == Convert.ToInt32(line[2])),
                    From = ConvertCidrToRange(line[0]).Item1,
                    To = ConvertCidrToRange(line[0]).Item2,
                })
                .ToList();

            //string sql =
            //  "INSERT INTO geoip_countries(Flag, Name) VALUES\r\n" +
            //  string.Join(",\r\n", countries.Select(country => "(\"" + country.Flag + "\", \"" + country.Name + "\")").ToArray()) + "\r\n" +
            //  "INSERT INTO geoip_ipranges(CountryID, `From`, `To`) VALUES\r\n" +
            //  string.Join(",\r\n", ipRanges.Select(iprange => "(\"" + iprange.Country.DatabaseID + "\", \"" + iprange.From + "\", \"" + iprange.To + "\")").ToArray());

            //File.WriteAllText("Import.sql", sql);

            using (MySqlConnection sql = new MySqlConnection("Server=localhost;Database=test_db;Uid=root;"))
            {
                sql.Open();

                foreach (Country country in countries)
                {
                    new MySqlCommand("INSERT INTO geoip_countries(Flag, Name) VALUES(\"" + country.Flag + "\", \"" + country.Name + "\")", sql).ExecuteNonQuery();
                }
                foreach (IPRange ipRange in ipRanges)
                {
                    new MySqlCommand("INSERT INTO geoip_ipranges(CountryID, `From`, `To`) VALUES(\"" + ipRange.Country.DatabaseID + "\", \"" + ipRange.From + "\", \"" + ipRange.To + "\")", sql).ExecuteNonQuery();
                    Console.WriteLine(ipRange.To);
                }

                sql.Close();
            }
        }

        private static Tuple<uint, uint> ConvertCidrToRange(string cidr)
        {
            string[] parts = cidr.Split('.', '/');
            uint ipnum = Convert.ToUInt32(parts[0]) << 24 | Convert.ToUInt32(parts[1]) << 16 | Convert.ToUInt32(parts[2]) << 8 | Convert.ToUInt32(parts[3]);
            uint mask = uint.MaxValue << (32 - Convert.ToInt32(parts[4]));
            return Tuple.Create(ipnum & mask, ipnum | (mask ^ uint.MaxValue));
        }
    }

    public class Country
    {
        public int ID { get; set; }
        public int DatabaseID { get; set; }
        public string Flag { get; set; }
        public string Name { get; set; }
    }

    public class IPRange
    {
        public Country Country { get; set; }
        public uint From { get; set; }
        public uint To { get; set; }
    }
}
bytecode77
  • 14,163
  • 30
  • 110
  • 141
0

You can use my tool for converting MaxMind GeoLite2 country/city CSV to MySQL/PostgreSQL/Microsoft SQL Server 2019 from GitHub https://github.com/mbto/maxmind-geoip2-csv2sql-converter

You can read examples, or follow this steps:

  1. Take a free license key to MaxMind API (if you don't have) at https://support.maxmind.com/account-faq/license-keys/how-do-i-generate-a-license-key/
  2. Install Java 11 (if not installed) at adoptopenjdk.net or github.com/raphw/raphw.github.io or oracle.com/java
  3. Download a tool from releases (.zip or .tar)
  4. Unpack to yours directory
  5. Copy/Paste .ini template bin/GeoLite2-Country-CSV.mysql.default.ini with your profile name, example bin/GeoLite2-Country-CSV.mysql.Your Project Name.ini or use default.
  6. Open .ini template with Notepad and change [windows_loader] or [unix_loader] section (set MySQL host:port, user and password).
  7. For unix: Execute chmod +x maxmind-geoip2-csv2sql-converter
  8. Run converting: maxmind-geoip2-csv2sql-converter.bat -c "GeoLite2-Country-CSV.mysql.Your Project Name.ini" -k Your_License_Key -i 4,6
  9. After converting, the scripts bin/converted/loader.bat and bin/converted/loader.sh will be available.
  10. For unix: Execute chmod +x loader.sh
  11. Execute loader.bat or loader.sh for importing schemas.

Done

schemas

mbto
  • 61
  • 5