
I need to use two files in a calculation over about 200000 time steps. The calculation requires that the files contain input data for all time steps, starting at zero. However, I don't have data for the first time steps (from 0 to 40000).

How can I add dummy data, formatted as in the two examples below, to the top of each of my two files, covering steps 0 to 40000?

Below are examples of what data from one time step look like in each file. What I need to do is copy/paste the data to create dummy data for the first 40000 time steps, only changing the time step and time values as indicated.

File 1.

  • The only things that need to change for each time step are the numbers in i = 39068, time = 19534.000.
  • I need the first "i" to be 0 and the first "time" to be 00000.000.
  • i/2 = time.
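
The rules in this list can be sketched as a small shell function. This is only a sketch: `header_line` is a hypothetical name, the 8-column width for `i` is an assumption read off the sample header line, the zero-padded time follows the requested `00000.000` form, and the energy is simply copied from the sample.

```shell
# Hypothetical helper: print the File 1 header line for step i.
# Assumed layout: i right-aligned in 8 columns, time = i/2 with the integer
# part zero-padded to 5 digits and 3 decimals, energy copied from the sample.
header_line() {
  local i=$1
  local whole=$(( i / 2 ))          # integer part of the time
  local frac=$(( (i % 2) * 500 ))   # 0 for even steps, 500 for odd steps
  printf ' i = %8d, time =    %05d.%03d, E =     -4062.1986631584\n' \
    "$i" "$whole" "$frac"
}
```

For example, `header_line 39068` reproduces the sample header line, and `header_line 1` gives a time of `00000.500`.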

.

     260
 i =    39068, time =    19534.000, E =     -4062.1986631584
 Mg       -13.3685893531       -3.8172224945       -6.5454328304
 Mg        -8.0288171797       -9.8589528145       -4.2951766641
 Mg       -13.5647837790        3.7714741638       11.0518209867
 Mg        -1.5795350637        4.3136091666       -6.2931048061
 Mg        -1.3612751052       -7.5574060036        0.1309284910
 Mg       -15.3370827391       -1.1830156923       -6.7188280399
 H       -25.9873248868        7.7856564757      -15.3088471263
 H        -7.7675250833        2.5977735010       28.4575972233
 H        -9.4734812532       -4.2213295429       14.2412611145
 H        -3.6844358917        2.2584052865        0.1049152363
 O       -18.0698975152        2.1776522700      -11.0397875030
 O        -3.9062250799        4.3450953228        6.0283195565
 O        -3.5714461764       12.4282336147      -11.6036514440

File 2 format.

  • Each row corresponds to one time step. So each row goes with a block of data like in the File 1 example above. E.g., 14.6405230712 14.6405230712 14.6405230712 would correspond to i=39068.

.

14.6405230712 14.6405230712 14.6405230712
14.6404784034 14.6404784034 14.6404784034
14.6404312212 14.6404312212 14.6404312212
14.6356002404 14.6356002404 14.6356002404
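
Because each row of File 2 pairs with one block of File 1, a quick count can confirm that the two files stay in step once dummy data has been added. The sketch below assumes every File 1 block has exactly one header line containing `time =`; `check_counts` is a hypothetical name.

```shell
# Hypothetical helper: compare the number of File 1 blocks with the number
# of File 2 rows.
# Assumption: each File 1 block has exactly one header line containing "time =".
check_counts() {
  blocks=$(grep -c 'time =' "$1")
  rows=$(( $(wc -l < "$2") ))   # arithmetic expansion strips any padding from wc
  echo "blocks=$blocks rows=$rows"
  [ "$blocks" -eq "$rows" ]     # exit status 0 only when the counts agree
}
```

The function prints both counts and returns a non-zero status when they differ, so it can be used directly in an `if` test.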

If anyone could please point me in the right direction, I would be grateful; I have little experience with Awk, for example. I can't modify the code I'm using to ignore the first time steps because it's pre-compiled.

My tags may not be appropriate, and any suggestions there would be welcome too. Also, apologies for the strange formatting with the large spaces and the stray "." lines. It's the only way I could find to preserve the formatting of the number columns.

  • Please include expected output for that sample too. And what you have tried so far would also be a nice addition. – oguz ismail Dec 30 '20 at 11:58
  • @oguzismail, I will try to clarify the question. The expected output will be identical to the two examples I provided for 40000 time steps. I just need dummy data for the first 40000 time steps of my files. I can't modify the code I'll use for the calculations because it's pre-compiled and there are too many time steps to create the dummy time steps manually, so I'm stuck. I think there may be a way to do this via a bash or Awk script but I'm asking this question because I don't know how. –  Dec 30 '20 at 12:47
  • I'm sure I can help but I need more work on your part to actually understand what you have and what you want. Don't talk about time steps, talk about lines and blocks of lines. Describe your input or give us a sample big enough so we can actually understand what it looks like. For now I only see one line in one file that you need to change, but I'm pretty sure that's not what you're looking for. How are the blocks linked together? Do you want to change the input files or have a continuous stream? Without understanding that, I don't know what to do with file2. Does it also need to be changed? – Camusensei Dec 30 '20 at 13:32
  • I have two files. File 1 contains hundreds of thousands of blocks of data. File 2 contains rows, where each row corresponds to a block of data in file 1. In both files, the first 40000 data blocks/rows are missing. So I need to add dummy data back in to the top part of each file, formatted as in the examples above. In File 1 (where there are blocks of data) only this part has to differ for each block: i = 0, time = 00000.000; it has to increase for each block/row from 0 to 40000. –  Dec 30 '20 at 13:40
  • @Camusensei, is this clear? If not, I can modify it. I have provided a sample in my question and don't understand what makes it unclear but would be happy to continue trying to explain. Thank you. –  Dec 30 '20 at 13:41
  • I have added an answer. As you can see I still had to make lots of assumptions about the data that you wanted in the dummy blocks: the format of the time for odd steps, the format of the `i` for values lower than 10k, how blocks are chained in the file. This is why we tell you to always give example inputs (not applicable here) and outputs. – Camusensei Dec 30 '20 at 14:04
  • Thank you, this is enough to set me on the right track. –  Dec 30 '20 at 14:13

1 Answer


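Here is one way to generate the dummy blocks for file 1. Note that the range 0 to 40000 is inclusive, so this writes 40001 blocks; lower the upper bound if your real data already starts at step 40000: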

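# Emit one dummy block per step; {0..40000} includes both end points.
for i in {0..40000}; do
  # time = i/2: integer part zero-padded to 5 digits, then .000 (even i) or .500 (odd i)
  printf -v time '%05d.%s00' "$((i/2))" "$((i%2 == 0 ? 0 : 5))"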
  cat << EOF
     260
 i =    $i, time =    $time, E =     -4062.1986631584
 Mg       -13.3685893531       -3.8172224945       -6.5454328304
 Mg        -8.0288171797       -9.8589528145       -4.2951766641
 Mg       -13.5647837790        3.7714741638       11.0518209867
 Mg        -1.5795350637        4.3136091666       -6.2931048061
 Mg        -1.3612751052       -7.5574060036        0.1309284910
 Mg       -15.3370827391       -1.1830156923       -6.7188280399
 H       -25.9873248868        7.7856564757      -15.3088471263
 H        -7.7675250833        2.5977735010       28.4575972233
 H        -9.4734812532       -4.2213295429       14.2412611145
 H        -3.6844358917        2.2584052865        0.1049152363
 O       -18.0698975152        2.1776522700      -11.0397875030
 O        -3.9062250799        4.3450953228        6.0283195565
 O        -3.5714461764       12.4282336147      -11.6036514440
EOF
done > dummy_40k
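
One caveat with the loop above: it hard-codes the 13 atom lines shown in the question, while the first line of each block reads `260`. If that number is an atom count, as in XYZ-style files (an assumption; the question does not say), each dummy block would need 260 atom lines. A hedged alternative is to copy the first block of the real file as the template. In the sketch below, `make_dummy_blocks` is a hypothetical name, and the header widths are assumptions taken from the sample.

```shell
# Hypothetical helper: print dummy blocks for steps 0..count-1, using the
# first block of an existing File 1 as the template.
# Assumption: line 1 of a block is the atom count N, line 2 is the header,
# and the next N lines are atoms (XYZ-like layout).
# Usage: make_dummy_blocks REAL_FILE1 COUNT
make_dummy_blocks() {
  awk -v count="$2" '
    NR == 1           { natoms = $1 + 0 }
    NR <= natoms + 2  { block[NR] = $0 }      # remember the template block
    NR >  natoms + 2  { exit }                # stop reading after one block
    END {
      tail = block[2]
      sub(/^.*, E =/, ", E =", tail)          # keep the template energy field
      for (i = 0; i < count; i++) {
        print block[1]
        printf " i = %8d, time =    %09.3f%s\n", i, i / 2, tail
        for (k = 3; k <= natoms + 2; k++) print block[k]
      }
    }
  ' "$1"
}
```

For instance, `make_dummy_blocks real_file1 40000 > dummy_40k` would write steps 0 through 39999, where `real_file1` is a placeholder for the actual file name.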

And for file 2:

for i in {0..40000}; do
  echo "14.6405230712 14.6405230712 14.6405230712"
done > dummy_40k_file2

It is then up to you to join the dummy data and the real data, with the dummy data first.
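
Joining the pieces can be as simple as concatenating with `cat`, dummy data first. A minimal sketch, where `prepend_dummy` and the real-data file names in the usage note are placeholders rather than anything from the question:

```shell
# Hypothetical helper: write the dummy data followed by the real data to a new file.
# Usage: prepend_dummy DUMMY REAL OUT   (OUT must differ from both inputs)
prepend_dummy() {
  cat "$1" "$2" > "$3"
}
```

For example, `prepend_dummy dummy_40k file1 file1_full` and `prepend_dummy dummy_40k_file2 file2 file2_full`, where `file1` and `file2` stand for the real data files. Writing to a new file leaves the originals untouched.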

EDIT: Assuming you don't hit the command-line length limit, the file 2 loop can be shortened, using the trick from https://stackoverflow.com/a/5349842/4486184, to:

printf '14.6405230712 14.6405230712 14.6405230712\n%.0s' {0..40000} > dummy_40k_file2
Camusensei
  • Thank you very much. I will figure out how to implement this and will let you know how it went. –  Dec 30 '20 at 13:59