0

I need to validate IP ranges in a file and correct them.

The file has these bad ranges:

192.168.1.2-192.168.1.1
10.0.0.10-10.0.0.8
172.16.0.9-172.16.0.5

The problem is that the ending address cannot come before the starting address, so these ranges should be corrected to:

192.168.1.1-192.168.1.2
10.0.0.8-10.0.0.10
172.16.0.5-172.16.0.9

My file has a lot of these bad ranges, so an automatic way to correct them would be great.

Demontager

2 Answers

1

Hi,

You have to do the following steps:

  1. Read each line.
  2. Split the current line into its two IP addresses.
  3. Sort the two IP addresses.
  4. Echo the sorted IP addresses.

The following script does this:

#!/bin/bash

filename="$1"
#Step1: read each line from file
#see http://stackoverflow.com/questions/10929453/bash-scripting-read-file-line-by-line
while read -r line
do
    #Step2: split each line in ips
    #see http://stackoverflow.com/questions/10586153/split-string-into-an-array-in-bash
    IFS='-' read -r -a array <<< "$line"

    #Step3: sort the ips
    #see http://stackoverflow.com/questions/7442417/how-to-sort-an-array-in-bash
    #for sorting ips see: https://www.madboa.com/geek/sort-addr/
    IFS=$'\n' sorted=($(sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 <<<"${array[*]}"))
    unset IFS
    #Step4: echo the results
    echo ${sorted[0]}"-"${sorted[1]}
done < "$filename"

The results for the following file:

192.168.1.2-192.168.1.1
10.0.0.10-10.0.0.8
172.16.0.5-172.16.0.9

are:

192.168.1.1-192.168.1.2
10.0.0.8-10.0.0.10
172.16.0.5-172.16.0.9
banuj
  • It almost works as expected, but when tested on my production list the end of the range goes onto a new line: https://gist.github.com/Demontager/c4d04c927d338d592f98 P.S. It works fine with my sample. – Demontager Feb 14 '16 at 00:27
  • 1
    @Demontager: If it "works fine with my sample" then you should accept the provided answer via http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235 . If you didn't specify your problem correctly, then you should ask for clarification by editing your Q rather than providing a link to an off-site set of data. Good luck. – shellter Feb 14 '16 at 05:24
  • @shellter, accepted. – Demontager Feb 14 '16 at 17:25
  • 2
    @Demontager: Good show! As a welcome and an FYI, realize that good questions A. show **small** sample data that covers all cases, including "records" that should be skipped and those that should be flagged as "error-in-input", B. show the required output from that sample data, C. include your best attempt at code to resolve the problem, D. show the current output from your code AND any error messages ("it's not working" is not evidence), E. give your best thoughts on your approach to the solution and why it may not be working yet. Good luck and keep posting ;-) – shellter Feb 14 '16 at 17:39
  • @shellter, I will keep my comments intact, and you keep yours; you are 100% right. Secondly, this is not a show: I tried to find the best sample to fit my real needs and did not expect any other result, which is why I clarified and showed how the code works with my data. What more evidence do you need? See my Q here http://stackoverflow.com/questions/35279738/grep-text-between-patterns-using-bash/35279949?noredirect=1#comment58304384_35279949 I asked the guy to adapt the code to one more criterion and he did. Why not ask? Someone may need this "extra" later on too. – Demontager Feb 14 '16 at 18:07
  • @Demontager : Sorry if I have offended you, that was not the intent. Good luck in your future endeavors! – shellter Feb 14 '16 at 19:22
  • @shellter, not offended, keep poking me if you like. As I said in my last comment, you are completely right, and I will always remember your notes when I need code to fit my "production data". – Demontager Feb 14 '16 at 19:31
  • 1
    Caveat: Unlike some other cases (ie. with assignment syntax prefixing a regular command), setting `IFS` on the same line as making use of the new value (`sorted=( ... )`) *does not* scope the IFS change to that one line. It might enhance clarity to move the IFS assignment elsewhere -- ie. outside the loop -- to avoid giving readers the mistaken impression that that assignment is in fact so scoped. – Charles Duffy Feb 15 '16 at 21:14
  • 1
    BTW, `echo ${sorted[0]}"-"${sorted[1]}` is *precisely the reverse* of good practices, quoting the content that doesn't need to be quoted (the string constant with no whitespace or shell-special characters) and leaving unquoted the content that *does* need to be quoted to prevent string-splitting and glob expansion (which is to say, the parameter expansions). Consider `echo "${sorted[0]}-${sorted[1]}"` instead. – Charles Duffy Feb 15 '16 at 21:16
  • 1
    You might also consider avoiding the array altogether: `{ read -r first; read -r second; } < <(sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 <<<"${array[*]}")` will read the first and second items into two separate variables, `first` and `second`, without the glob-expansion side effects of the original code (look at what happened if you somehow had a `*` instead of an IP address in your file: you would have a list of files in the current directory substituted into your array due to the unquoted expansion of `$(sort ...)`). – Charles Duffy Feb 15 '16 at 21:19
  • 1
    ...similarly, you could make your outer loop `while IFS=- read -r first second; do`, and not need to bother with `$line` or `$array` at all. – Charles Duffy Feb 15 '16 at 21:20
  • In addition to the many shell coding errors @CharlesDuffy already pointed out, even once all of those are fixed the end result will be immensely slow compared to, say, an awk solution. See http://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for one discussion on that topic. – Ed Morton Feb 15 '16 at 21:27
  • 1
    @EdMorton and I often disagree on the finer points around this, but shell does indeed have a lot of details it's easy to get wrong, and very few shells (only genuine David Korn ksh, really) are even close to as fast as awk. This particular implementation, using a separate sort invocation for each line processed, is going to be many orders of magnitude slower than its `awk` equivalent. – Charles Duffy Feb 15 '16 at 21:33
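
Pulling the review comments above together, a minimal sketch of the loop might look like the following. It is only an illustration under stated assumptions: `fix_ranges` is a hypothetical function name that does not appear in the answer, and it assumes bash (for process substitution) and well-formed `start-end` lines with Unix line endings.

```shell
#!/bin/bash
# Sketch only: fix_ranges is a hypothetical name, not part of the original answer.
# Reads "start-end" lines on stdin and prints each pair in ascending order.
fix_ranges() {
    local first second lo hi
    # IFS is set only for this read, so the rest of the script is unaffected.
    while IFS=- read -r first second; do
        # Sort the two addresses numerically, octet by octet, and read the
        # result into two plain variables (no array, no unquoted expansion).
        { read -r lo; read -r hi; } < <(
            printf '%s\n' "$first" "$second" |
                sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4
        )
        printf '%s-%s\n' "$lo" "$hi"
    done
}
```

It would be called as `fix_ranges < file`. It still runs one `sort` per input line, so the performance concern raised in the comments applies unchanged.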
1

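Given your sample input/output, all you need is this, using GNU awk for gensub(). The +0 forces a numeric comparison; without it the two gensub() results are compared as strings, and "10" sorts before "8":

$ awk -F- '{print (gensub(/.*\./,"",1,$1)+0 < gensub(/.*\./,"",1,$2)+0 ? $1 FS $2 : $2 FS $1)}' file
192.168.1.1-192.168.1.2
10.0.0.8-10.0.0.10
172.16.0.5-172.16.0.9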

With other awks just use a couple of local vars and sub().
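
A rough sketch of that sub()-based variant for non-GNU awks could look like this. The wrapper name `last_octet_fix` and the variables `a` and `b` are hypothetical, chosen here for illustration; the `+0` is added to force a numeric rather than string comparison of the last octets.

```shell
# Sketch only: last_octet_fix, a and b are hypothetical names.
# Compares only the final octet of each address, like the gensub() one-liner.
last_octet_fix() {
    awk -F- '{
        a = $1; b = $2
        sub(/.*\./, "", a)   # strip everything up to the last dot
        sub(/.*\./, "", b)
        # +0 makes awk compare numbers, so 8 sorts before 10
        print (a+0 <= b+0 ? $1 FS $2 : $2 FS $1)
    }' "$@"
}
```

It can be run as `last_octet_fix file`, or fed from a pipe. Like the one-liner, it is only suitable when the two addresses on a line differ in their last segment alone.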

If, however, you need a solution that works when some other part of the IP addrs than just the final segment can be different on a given line (e.g. 172.16.0.5-172.15.0.9), then this will work in any awk:

$ cat tst.awk
BEGIN { FS="-" }
{
    split($1,t,/\./)
    beg = sprintf("%03d%03d%03d%03d", t[1], t[2], t[3], t[4])

    split($2,t,/\./)
    end = sprintf("%03d%03d%03d%03d", t[1], t[2], t[3], t[4])

    print (beg < end ? $1 FS $2 : $2 FS $1)
}

$ awk -f tst.awk file
192.168.1.1-192.168.1.2
10.0.0.8-10.0.0.10
172.16.0.5-172.16.0.9

$ echo '172.16.0.5-172.15.0.9' | awk -f tst.awk     
172.15.0.9-172.16.0.5
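
To see why a plain `<` is enough in tst.awk, it can help to print the keys that sprintf() builds. The helper below is purely illustrative; `ip_key` is a hypothetical name and not part of the script above. Each octet is zero-padded to three digits, so every key is 12 characters long and string order matches numeric order.

```shell
# Sketch only: ip_key is a hypothetical helper for inspecting the sort key.
# Prints the zero-padded 12-digit key for one dotted-quad address.
ip_key() {
    awk -v ip="$1" 'BEGIN {
        split(ip, t, /\./)
        printf "%03d%03d%03d%03d\n", t[1], t[2], t[3], t[4]
    }'
}
```

Here `ip_key 172.16.0.5` prints 172016000005 and `ip_key 172.15.0.9` prints 172015000009, which is why 172.15.0.9 is placed first in the last example.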

If you're considering using a shell loop just to manipulate text then make sure you read and fully understand https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice first.

Ed Morton