
I have multiple (1086) .dat files, and each file has 5 columns and 6384 lines. I also have a single file named "info.txt" which contains 2 columns and 6883 lines. The first column gives line numbers (to delete in the .dat files) and the 2nd column gives a number.

1 600
2 100
3 210
4 1200

etc... I need to read info.txt, find every line number whose value in the 2nd column is less than 300 (so lines 2 and 3 in the example above). Then I need to feed those line numbers to sed, awk, or grep and delete those lines from each .dat file. (So in the example above I would delete the 2nd and 3rd row of every .dat file.)

A more general form of the question would be (I suppose): how do I read numbers from a file and then use them as the row numbers to delete from multiple files?

I am using bash but ksh help is also fine.

Vijay
    Please come up with a more minimal example, showing the input files and your desired output. – glenn jackman Jun 04 '14 at 13:30
  • In your example, only the first row has a value greater than 300 in the 2nd column, so it looks like to me that you'd only delete line 1 from your data files, not lines 2 and 3. – SeeJayBee Jun 04 '14 at 14:15
  • Sorry, it should be values smaller than 300. – user2721844 Jun 04 '14 at 14:16
  • Can you edit it and clean that up then? Also, in your second sentence you say "2 rows and 6883 lines". I assume you actually mean "2 columns and 6883 lines". – SeeJayBee Jun 04 '14 at 14:17

4 Answers

sed -i "$(awk '$2 < 300 { print $1 "d" }' info.txt)" *.dat

The awk script creates a simple sed script to delete the selected lines; the resulting script is then run on all the *.dat files.
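With the four-line info.txt shown in the question, the generated sed script can be inspected on its own (sample data recreated here for illustration):

```shell
# recreate the sample info.txt from the question
printf '1 600\n2 100\n3 210\n4 1200\n' > info.txt

# show the sed script that awk generates: one "Nd" delete command
# per line whose 2nd column is below 300
awk '$2 < 300 { print $1 "d" }' info.txt
# prints:
# 2d
# 3d
```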

(If your sed lacks the -i option, you will need to write to a temporary file in a loop. On OSX and some *BSD you need -i "" with an empty argument.)
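For a sed without `-i`, the temporary-file loop might look like the following minimal sketch (the sample file names are hypothetical):

```shell
# hypothetical sample data for illustration
printf '1 600\n2 100\n3 210\n4 1200\n' > info.txt
printf 'r1\nr2\nr3\nr4\n' > sample.dat

# build the delete script once, then rewrite each .dat file via a temp copy
script=$(awk '$2 < 300 { print $1 "d" }' info.txt)
for f in *.dat; do
    sed "$script" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done
# sample.dat now contains only r1 and r4
```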

tripleee
  • I think there's a problem with this, but I'm not sure. After sed deletes line 1, won't line 2 now be line 1? So, if you delete line 2, aren't you now deleting line 3? I think you have to do a reverse sort of the line numbers to delete somewhere. – SeeJayBee Jun 04 '14 at 14:59
  • Also, you need to guarantee uniqueness. If you delete line 25, you don't want to delete line 25 again. – SeeJayBee Jun 04 '14 at 15:01
  • Why do you think so? Did you try? `sed` refers to line numbers in the original input file. – tripleee Jun 04 '14 at 15:02
  • Obviously, I didn't try, that's why I said "I'm not sure". But if it always refers to the original file, then there should be no problem. – SeeJayBee Jun 04 '14 at 15:02
  • Well to clarify if it works or not: This worked out GREAT!! I really appreciate it ! Thanks a lot tripleee! – user2721844 Jun 05 '14 at 11:56

This might work for you (GNU sed):

sed -rn 's/^(\S+)\s*([1-9]|[1-9][0-9]|[12][0-9][0-9])$/\1d/p' info.txt | 
sed -i -f - *.dat

This builds a script of the lines to delete from the info.txt file and then applies it to the .dat files.

N.B. the regexp matches numbers ranging from 1 to 299, as per the OP's request.
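To see what the first stage emits, it can be run alone on the sample info.txt from the question (GNU sed assumed, for `-r` and `\S`/`\s`):

```shell
printf '1 600\n2 100\n3 210\n4 1200\n' > info.txt

# rows with a 2nd column of 1-299 become "Nd" sed delete commands
sed -rn 's/^(\S+)\s*([1-9]|[1-9][0-9]|[12][0-9][0-9])$/\1d/p' info.txt
# prints:
# 2d
# 3d
```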

potong
# create action list (read via redirection, not a cat pipe,
# so ActionReq survives the loop)
while read LineRef Index
 do
   if [ "${Index}" -lt 300 ]
    then
      ActionReq="${ActionReq}${LineRef} b
"
    fi
 done < info.txt

# apply action on files
for EachFile in *.dat
 do
   sed -i -n -e "${ActionReq}p" "${EachFile}"
 done

(Not tested, no Linux here.) A limitation with sed is that the selection of lines whose second value is below 300 still has to happen in the shell loop; awk is more efficient at this kind of operation. I use sed in the second loop to avoid reading/writing each file once per line to delete. I think the second loop could also be avoided by giving sed the whole list of files instead of one file at a time.
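For comparison, the same filtering in a single awk pass per file might look like this sketch (hypothetical sample data; temp files stand in for `-i`):

```shell
# hypothetical sample data for illustration
printf '1 600\n2 100\n3 210\n4 1200\n' > info.txt
printf 'r1\nr2\nr3\nr4\n' > sample.dat

for f in *.dat; do
    # first pass (FNR==NR) records line numbers to delete;
    # second pass prints only lines whose number was not recorded
    awk 'FNR==NR { if ($2 < 300) del[$1]; next }
         !(FNR in del)' info.txt "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done
# sample.dat now contains only r1 and r4
```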

NeronLeVelu
  • The [useless use of `cat`](/questions/11710552/useless-use-of-cat) is an antipattern. The `for` loop is a syntax error. – tripleee Jun 19 '19 at 12:23

This should create new files named oldname.dat_new.dat, but I haven't tested it:

awk 'FNR==NR { if ($2 < 300) a[$1] = $1; next }
     !(FNR in a) { print > (FILENAME "_new.dat") }' info.txt *.dat
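A quick sanity check of this approach with the sample data from the question (file names hypothetical):

```shell
printf '1 600\n2 100\n3 210\n4 1200\n' > info.txt
printf 'r1\nr2\nr3\nr4\n' > sample.dat

# lines 2 and 3 (2nd column below 300) are dropped from the copy
awk 'FNR==NR { if ($2 < 300) a[$1] = $1; next }
     !(FNR in a) { print > (FILENAME "_new.dat") }' info.txt sample.dat

cat sample.dat_new.dat
# prints:
# r1
# r4
```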
Vijay