
I haven't found a direct way of doing this, although I tried adapting solutions provided for similar scenarios (none of them do quite what I need). Given this data:

1118 1120
1121 1124
1122 1127
1125 1126
1128 1133
1130 1135
1136 1139
1137 1138
1140 1145

It is already sorted by column 1. Except for the first and last lines, all the other lines have intervals that overlap, in pairs. I want an output with just the overlapping ranges:

1122 1124
1125 1126
1130 1133
1137 1138

For me at least, this is harder than I expected at first glance.

one-liner
  • `tried adapting some provided solutions` please do add that to question... it will show your effort... else likely the question would be treated as asking free coding service.. – Sundeep Jul 18 '17 at 14:36
  • why should `1122 1127` not be in the output? – RomanPerekhrest Jul 18 '17 at 14:44
  • @Sundeep: https://stackoverflow.com/questions/12742484/identify-overlapping-ranges-in-awk; https://stackoverflow.com/questions/16638951/how-to-remove-overlap-in-numeric-ranges-awk; https://stackoverflow.com/questions/21488613/awk-to-find-overlaps; https://stackoverflow.com/questions/38482040/using-awk-to-print-pairs-of-records-having-overlapping-range-of-values-between-t; There was another one not found on stackoverflow but I can't find it again. – one-liner Jul 18 '17 at 14:55
  • you can link them in question + add the variation you tried not present in those links... – Sundeep Jul 18 '17 at 14:56
  • @RomanPerekhrest I looked at the data again but I don't see why it should, maybe I am missing something. In any case I actually missed one interval: `1125 1126` – one-liner Jul 18 '17 at 15:00

1 Answer


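Here's one way to do it in awk: remember every integer covered by an earlier range, then for each new range print the first and last integers that were already seen. There's likely a more efficient way.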

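awk '{
  b = e = 0                                    # overlap start/end for this line
  for (i = $1; i <= $2; i++) {
    if (exists[i]) { if (b == 0) b = i; e = i }  # i was covered by an earlier range
    exists[i] = i                              # mark i as covered
  }
  if (b) print b, e
}' input_file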
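As for a more efficient way: since the input is sorted by column 1, it may be enough to track the largest end value seen so far instead of storing every integer. The sketch below assumes inclusive integer ranges and sorted input, as in the question. The function name `overlaps` and the variable `maxend` are made up for illustration.

```shell
# Sketch only: assumes input sorted by column 1 and inclusive integer ranges.
overlaps() {
  awk '
    # Current range starts inside an earlier one: the overlap runs from its
    # start to whichever comes first, its own end or the furthest earlier end.
    NR > 1 && $1 <= maxend { print $1, ($2 < maxend ? $2 : maxend) }

    # Remember the furthest end seen so far.
    NR == 1 || $2 > maxend { maxend = $2 }
  ' "$@"
}
```

Call it as `overlaps input_file`. On the sample data in the question this prints the same four ranges, and it reads each line once without looping over every integer in the range.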
flu