
I have a list of numbers like below:

1  0/1
2  1/1
3  1/1
4  1/1
5  1/1
6  1/1
7  0/1
8  0/1

If column 2 is "1/1" for consecutive rows, I would like to report the start and end positions of that run. For the example above, the output should be: 2-6

How can I do this with some simple bash code, or Python if necessary?

Many thanks.

LookIntoEast

2 Answers


If you can code in Python, you can solve it as follows:

  1. Read your file.
  2. Use a regex to create a list that contains the first number only if the second is 1/1.
  3. Group the list in ranges. (hint)

So the code will look like:

import re

# step 1
with open('filename') as f:
    data = f.read()

# step 2
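# Named `positions` to avoid shadowing the built-in `list`; converted to int for step 3.
positions = [int(n) for n in re.findall(r'(\d+)\s+1/1', data)]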

# step 3
# Check the link in the description of the algorithm
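Step 3 is left open above, so here is a minimal sketch of one common way to do the grouping, using `itertools.groupby`. It is not necessarily the algorithm behind the hint. The function name `find_runs`, its `flag` parameter, and the inline `sample` string are illustrative assumptions; the sketch takes the text directly rather than reading a file, and it assumes the first column holds integers in increasing order.

```python
import re
from itertools import groupby


def find_runs(text, flag="1/1"):
    """Return (start, end) pairs for runs of consecutive positions
    whose second column equals `flag` (hypothetical helper)."""
    # Steps 1-2: keep the position from every line whose second column is the flag.
    pattern = r'^[ \t]*(\d+)[ \t]+' + re.escape(flag) + r'[ \t]*$'
    positions = [int(p) for p in re.findall(pattern, text, flags=re.MULTILINE)]

    # Step 3: within a run of consecutive integers, (value - index) is constant,
    # so grouping on that difference splits the list into runs.
    runs = []
    for _, group in groupby(enumerate(positions), key=lambda pair: pair[1] - pair[0]):
        members = [value for _, value in group]
        runs.append((members[0], members[-1]))
    return runs


# Sample data copied from the question.
sample = """\
1  0/1
2  1/1
3  1/1
4  1/1
5  1/1
6  1/1
7  0/1
8  0/1
"""

for start, end in find_runs(sample):
    print(f"{start}-{end}")
```

A run consisting of a single row is reported with equal start and end, and a run that reaches the last line of the input is still included.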
enrico.bacis

Bash solution:

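#! /bin/bash
# Reads "position flag" pairs from standard input, e.g.: ./script.sh < filename
unset in                                 # Flag: are we inside an interval?
unset last                               # Remember the last position.
while read p f ; do
    if [[ $f = 1/1 && ! $in ]] ; then    # Beginning of an interval.
        echo -n "$p-"
        in=1
        last=$p                          # Covers intervals of a single row.
    elif [[ $f = 1/1 && $in ]] ; then    # Inside of an interval.
        last=$p
    elif [[ $f != 1/1 && $in ]] ; then   # End of an interval.
        echo "$last"
        unset in
    fi
done
if [[ $in ]] ; then                      # Input ended inside an interval.
    echo "$last"
fi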
choroba