
In bash, suppose I have the input:

ATGTGSDTST

and I want to print:

AT
ATGT
ATGTGSDT
ATGTGSDTST

In other words, I need to find every substring that starts at the beginning of the input and ends with 'T', and print each one. I thought I should use sed inside a for loop, but I don't understand how to use sed correctly in this case. Any help? Thanks

choroba
Guy Ohayon

2 Answers


The following script uses sed:

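#!/usr/bin/env bash

pattern="ATGTGSDTST"
sub="T"
# Count the occurrences of T in $pattern (grep -o prints one match per line):
num=$(grep -o "T" <<< "$pattern" | wc -l)
i=1
# Shortest prefix ending in T: replace everything from the first T onward with T.
text=$(sed -n "s/T.*/T/p" <<< "$pattern")
echo "$text"

# Each pass extends the previous result up to the next T.
# [^T]* (rather than [^T]\+) also handles two adjacent Ts.
while (( i < num )); do
    text=$(sed -n "s/\($sub[^T]*T\).*/\1/p" <<< "$pattern")
    sub=$text
    echo "$text"
    ((i++))
done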

This gives the output:

AT
ATGT
ATGTGSDT
ATGTGSDTST
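
To see what each sed substitution contributes, the two forms can be run on their own. This is only a sketch: the variable names first and second are made up for illustration, the prefix AT is hard-coded instead of coming from the loop, and the character class is written with *, which is portable basic regex syntax.

```shell
pattern="ATGTGSDTST"

# s/T.*/T/ matches from the first T to the end of the line and
# replaces it with a single T, leaving the shortest prefix ending in T.
first=$(printf '%s\n' "$pattern" | sed -n 's/T.*/T/p')
echo "$first"    # AT

# With a known prefix (AT here), the group captures the prefix, any
# non-T characters after it and the next T; the trailing .* is dropped.
second=$(printf '%s\n' "$pattern" | sed -n 's/\(AT[^T]*T\).*/\1/p')
echo "$second"   # ATGT
```

Feeding each result back in as the next prefix is what lets a loop walk through the string one T at a time.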
builder-7000

No sed needed, just use parameter expansion:

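#!/bin/bash
string=ATGTGSDTST
prefix=''

# Loop while a T remains, so input that does not end in T still terminates.
while [[ $string == *T* ]]; do
    sub=${string%%T*}        # text before the first remaining T
    sub+=T
    echo "$prefix$sub"
    string=${string#"$sub"}  # drop the piece that was just printed
    prefix+=$sub
done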
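
The expansions used in that loop may be easier to follow one step at a time. The following is a small sketch on the same sample string; the variable names head and rest are made up for illustration and are not part of the script above.

```shell
string=ATGTGSDTST

# ${string%%T*} removes the longest suffix matching "T*",
# which leaves the text before the first T.
head=${string%%T*}
echo "$head"            # A

# ${string#...} removes a matching prefix: here the piece just found
# plus its T, leaving the remainder still to be processed.
rest=${string#"${head}T"}
echo "$rest"            # GTGSDTST

# For contrast, a single % removes the shortest matching suffix,
# so it cuts at the last T instead of the first.
echo "${string%T*}"     # ATGTGSDTS
```

Quoting the pattern inside ${string#"..."} makes the shell treat it as literal text, which matters if the input can contain glob characters such as * or ?.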
choroba