
I have a text file in the format below. The first column is a very high-resolution timestamp and the second is a sequence number. I want to plot the sequence number over time. To do that, I want to scale both columns: the timestamps by subtracting the first timestamp from each of the others, and the sequence numbers in the same way. Note that the scaled sequence numbers can be negative. How do I write a bash script using awk to achieve this? This file is named print_1010171.txt, but I have a number of files in the same format, so the script should be generic.

5698771509078629376     1133254688
5698771509371165696     1150031904
5698771510035551232     1150031904
5698771510036082688     4170258464
5698771510036715520     2895583264
5698771510037202176     1620908064
5698771510037665280     346232864
5698771510038193664     3366459424
5698771510332259072     2091784224
5698771510332816128     817109024
5698771510333344512     3837335584
5698771510339882240     2562660384
5698771510340411392     1287985184
5698771510340939776     13309984
5698771510348048896     3033536544
5698771510348577280     1758861344
5698771510349228800     484186144
5698771510632804864     3504412704
5698771510633441792     2229737504
5698771510634390272     955062304
5698771510638858496     3975288864
5698771510639347712     2700613664
5698771510642663168     1425938464
5698771510643387136     134486304
5698771510643808768     3154712864
5698771510648858368     1880037664
5698771510649410560     605362464
5698771510655600384     3625589024
5698771510656128768     2350913824
5698771510656657408     1076238624
liv2hak

3 Answers

awk 'NR == 1 {basets = $1; baseseq = $2} {print $1 - basets, $2 - baseseq}' inputfile

or, if you don't want to output the initial pair of zeros:

awk 'NR == 1 {basets = $1; baseseq = $2; next} {print $1 - basets, $2 - baseseq}' inputfile
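
Since the question asks for something generic, here is a minimal sketch of a wrapper that applies the first command to every file named on the command line. The function name `rebase_columns` and the `.scaled` output suffix are assumptions made for illustration, not part of the original answer. Note that awk does its arithmetic in double-precision floating point, so 19-digit timestamps can lose low-order digits; whether that matters depends on the resolution you need.

```shell
#!/bin/bash
# rebase_columns FILE (hypothetical helper name): print every row of FILE
# with the first row's timestamp and sequence number subtracted.
rebase_columns() {
    awk 'NR == 1 { basets = $1; baseseq = $2 }
         { print $1 - basets, $2 - baseseq }' "$1"
}

# Write the shifted data for each file argument to FILE.scaled
# (the suffix is an assumption made for this sketch).
for f in "$@"; do
    rebase_columns "$f" > "$f.scaled"
done
```

Saved under a name of your choice, it could be run with a glob such as `print_*.txt` as its arguments, and each resulting `.scaled` file can then be plotted directly.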
Dennis Williamson

Here is a bash wrapper script which should do what you want:

#!/bin/bash

gnuplot << EOF
set terminal png truecolor size 800,600
set output 'plot_$1.png'

firstx=0
offsetx=0
funcx(x)=(offsetx=(firstx==0)?x:offsetx,firstx=1,x-offsetx)
firsty=0
offsety=0
funcy(x)=(offsety=(firsty==0)?x:offsety,firsty=1,x-offsety)

plot '$1' u (funcx(\$1)):(funcy(\$2))
EOF

To use the script, give it the name of the file you want to plot as an argument:

$ myscript.sh print_1010171.txt

I modified the answer given here to accommodate two variables. See that answer also if you want to subtract the lowest value from all data rather than the first.

andyras
  • the `echo` is unnecessary here. – mgilson Jun 17 '12 at 22:02
  • Also, the semicolons don't matter and your output file will be named something like `plot_print_stuff.txt.png`. You could probably use the `strstr` function and string slicing to cut off the `.txt` extension (if you know the datafile has a `.txt` extension). (Otherwise, nice answer ;) +1 – mgilson Jun 17 '12 at 22:43
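
As an alternative to `strstr` inside gnuplot, the extension could be removed in the bash wrapper before the name reaches gnuplot. This is only a sketch: `plot_name` is a hypothetical helper, and it assumes the data files end in `.txt`, as in the question.

```shell
#!/bin/bash
# plot_name FILE (hypothetical helper): build the PNG name for FILE,
# dropping a trailing .txt if there is one.
plot_name() {
    local base="${1%.txt}"      # shell suffix removal; no-op without .txt
    printf 'plot_%s.png\n' "$base"
}

plot_name print_1010171.txt     # prints plot_print_1010171.png
```

In the wrapper script, the `set output` line would then use the result of `plot_name "$1"` in place of `plot_$1.png`.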

This is very similar to Dennis Williamson's solution. It should be slightly more efficient (though probably not noticeably so), and it silently ignores blank lines, whereas the other solution prints very large negative numbers for them.

#script coolscript.gp
if(!exists("DATAFILE")) DATAFILE='test.dat'
EXT_INDEX=strstr(DATAFILE,'.txt')  # position of '.txt' (assumes the data file has a .txt extension)
set term post enh color
set output DATAFILE[:EXT_INDEX-1] . '.ps'  # slice off '.txt', then append '.ps'
plot "< awk 'BEGIN{getline; header_col1=$1; header_col2=$2 }{if(NF){print $1-header_col1,$2-header_col2}}' ".DATAFILE using 1:2

You can definitely do this using an all-gnuplot solution. (See @andyras's nice solution and my answer that he linked to.) This alternative works by reading the first line in awk and storing the values of column 1 and column 2 in the variables header_col1 and header_col2. It then subtracts those values from the corresponding columns of every subsequent line, as long as the line isn't empty.
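
To check what the awk stage feeds to gnuplot, the same program can be run on its own. The sketch below wraps it in a function with the hypothetical name `shift_from_header` and uses a small made-up sample containing a blank line.

```shell
#!/bin/bash
# shift_from_header FILE (hypothetical name): the awk program from the
# plot command, runnable by itself for inspection.
shift_from_header() {
    awk 'BEGIN { getline; header_col1 = $1; header_col2 = $2 }
         NF { print $1 - header_col1, $2 - header_col2 }' "$1"
}

# Made-up sample data with a blank line in the middle.
sample=$(mktemp)
printf '100 50\n105 40\n\n110 70\n' > "$sample"
shift_from_header "$sample"
rm -f "$sample"
```

Because the first line is consumed in the `BEGIN` block, it is used only as the baseline and no `0 0` row is printed; the sample yields `5 -10` and `10 20`, and the blank line is skipped.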

Note that this solution can be called from the command line using:

gnuplot -e "DATAFILE='mydatafile.txt'" coolscript.gp

Unfortunately, the quotes are necessary because gnuplot needs them. If you use this in a shell loop, put the double quotes on the outside, as shown, so that the shell still expands the variable inside them.

for FILE in *.txt; do
   gnuplot -e "DATAFILE='${FILE}'" coolscript.gp
done
mgilson