
I have a large file with 38000 rows and 5001 columns. The first column holds position information and the rest are signals. I also have another file that contains pairs of positions that also exist in the large file. I need to split the large file into multiple smaller files, where each file contains only the rows whose position falls within a certain range.

I know this is almost a duplicate question, and I have tried everything suggested in the earlier answers. None of it worked, which is why I'm posting my code here. Here's what I've tried with awk.

The file that contains the pairs of ranges is named after its lowest and highest values. For example, a range file can be named blah_blah_30000_40000.txt. It contains pairs of values spaced 500 apart, such as:

30000    30000
30000    30500
30000    31000
.
.
.
40000    30000
40000    30500
.
.
.
40000    40000

First I extracted the lowest and highest value from the file name.

IFS='_' read -a splittedName <<< "${fileName}"
startRange=${splittedName[2]}
endRange=${splittedName[3]}

Then I converted these two strings to numbers:

starting=$((startRange + 0))
ending=$((endRange + 0))

Then I used awk like so

awk -F, '{ if($1 >= "$startRange" && $1 <= "$endRange") { print >"test.txt"} }' $InputFile

Could anyone tell me what I'm doing wrong?

Inian
MD Abid Hasan
  • Is the data pre-sorted by the first column? – Rafael Apr 16 '19 at 06:19
  • *I also have another file that contains pairs of positions that also exists in the large files.* Can you elaborate on this? Is this one file or two files? – Rafael Apr 16 '19 at 06:21
  • You try to use shell variables in an awk script. That is not the way to do it. See https://stackoverflow.com/questions/19075671. However, be aware that, if we have a bit more information about your input files, we could probably write a single awk script that does it all in a single go. – kvantour Apr 16 '19 at 07:48
  • Possible duplicate of [How do I use shell variables in an awk script?](https://stackoverflow.com/questions/19075671/how-do-i-use-shell-variables-in-an-awk-script) – kvantour Apr 16 '19 at 12:40

1 Answer

You should rewrite your command this way:

awk -F, -v start="$startRange" -v end="$endRange" -v fname="$fileName" \
'$1 >= start && $1 <= end { print > (fname ".txt") }' "$InputFile"

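As mentioned in the comments, shell variables are not expanded inside a single-quoted awk script, so pass them in with -v. Inside awk, refer to them by name without a leading $, because $fname would be read as a field reference.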
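To see the pieces working together, here is a minimal end-to-end sketch. It assumes bash, a comma-separated input file, and a range file name ending in .txt. The sample rows, the temporary directory, and the names workDir, base and outFile are made up for illustration and do not come from the question. One thing to watch for: splitting blah_blah_30000_40000.txt on underscores leaves the last piece as 40000.txt, so the sketch strips the extension before splitting.

```shell
#!/usr/bin/env bash
# Hypothetical names and sample data, for illustration only.
workDir=$(mktemp -d)
fileName="blah_blah_30000_40000.txt"
InputFile="$workDir/signals.csv"
outFile="$workDir/rows_in_range.txt"

# Small stand-in for the large comma-separated file (position, then signals).
printf '%s\n' '29500,0.1,0.2' '30000,0.3,0.4' '35000,0.5,0.6' \
              '40000,0.7,0.8' '40500,0.9,1.0' > "$InputFile"

# Drop the extension first, otherwise the last piece would be "40000.txt".
base=${fileName%.txt}
IFS='_' read -r -a splittedName <<< "$base"
startRange=${splittedName[2]}
endRange=${splittedName[3]}

# -v copies the shell values into awk variables; the awk program itself
# stays in single quotes, so the shell never touches $1.
awk -F, -v start="$startRange" -v end="$endRange" -v out="$outFile" \
    '$1 >= start && $1 <= end { print > out }' "$InputFile"

cat "$outFile"
```

With the sample rows above, only the lines for positions 30000, 35000 and 40000 end up in the output file. Because the values passed with -v look like numbers, awk compares them numerically, so the separate $((startRange + 0)) conversion step should not be needed.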

Romeo Ninov