I would like to write a BASH/ZSH script to create a series of input files to run calculations based on a series of molecular coordinate files which have been generated for me. Do do this, I would like to copy the coordinates into a template file and then place each resulting input file into a separate directory.
The coordinate files all take the form:
16
Coordinates from ORCA-job tma.h2s-2-vpt2-b97m-d4-qz_DH001
N -1.01856437662682 -0.06753190029287 -0.03381525476330
C -1.19660954482440 1.36662477704039 0.00584753001945
H -0.52879789886132 1.79811481102372 0.74381766556842
H -2.22460413600866 1.65299630904871 0.26354625406395
H -0.95890188582374 1.79386241616540 -0.96233649351010
C -1.28542658079723 -0.65492826277353 1.25977378002066
H -2.31817759691074 -0.48376502304456 1.59096903742337
H -0.61568119318365 -0.23019270431631 1.99961200111689
H -1.11650787130907 -1.72558477449900 1.21976548924500
C -1.84182993191365 -0.66637076429011 -1.05941450368454
H -1.59318377282657 -0.24161533539115 -2.02568837580639
H -2.91364004980661 -0.50657782170270 -0.88078168276917
H -1.66188430154905 -1.73508690928373 -1.09888083377706
H 1.05164190689465 -0.14079517174876 -0.18978887741952
S 2.39061532747941 0.04713819635625 -0.09264157716239
H 2.60455667714639 -1.21543465473706 0.28761295155857
where the first line is the number of atoms in the molecule, the second line is a comment line, and the remainder are cartesian coordinates for each atom. All of these files are named <prefix>_DH###.xyz
, where the ###
are numbers (in the above example, 001
), and the prefix is just the name of the file (in the above example, it's tma.h2s-2-vpt2-b97m-d4-qz
). I have ~80 of these coordinate files (so it's not really practical to do it manually), where each of these files contains coordinates for the same molecule, but with a slightly different geometry.
I would like to copy the coordinates in each of these files into a template file, which looks something like this:
#H2S-TMA, Displacement ###: B97M-D4/AVTZ
!B97M-D4 verytightscf verytightopt freq DefGrid3 NoRI Mass2016 UseSym
%method
functional mgga_xc_b97m_v
end
etc etc etc
%pal
nprocs 4
end
* xyz 0 1
$coords
*
where the coordinates would be copied into the space labelled $coords. I'd also like it for each of these files to be called <prefix>_D###.inp
, similar to the original coordinate files, and to be copied into directories called D###
, with all the numbers matching the original coordinate file's name. (So, a coordinate file tma_DH001.xyz
would produce ./D001/tma_D001.inp
- the naming and numbering of each resulting input file is very important)
So far, I have been able to generate each of the new directories, but I can't figure out how to copy the contents of all of the coordinates into the template. I have asked a similar question in the past (Script to copy all text from one file into template file), but that was for splitting the contents of one file into multiple new files; I'm not sure how to copy coordinates from multiple files into the template.
I imagine this will require an awk
or a sed
command, or even cat
, but I don't understand awk
(it's another language entirely!), sed seems to only copy individual lines, and I'm not sure how to direct the cat
output into a specific spot in the template file. What I have been able to come up with so far is:
#!/bin/zsh
input_file=$1
template=$2
#Determine number of atoms
n_atoms=$(sed -n 1p $1)
#determine number of displacements
n_displacements=$(ls -1 *.xyz | wc -l)
#Create directories of the form D000
start_index=1
end_index="$n_displacements"
for ((i=start_index; i<=end_index; i++)); do
# format a dirpath with the 3-digits index
printf -v dirpath 'D%03d' $i
mkdir -p -- "$dirpath"
done
#Copy coordinates into template file
for file in *${n}.xyz
do sed -i -e 1,2d $file >> $TEMPFILE
do cat $TEMPFILE >> ???
(I've added the $input_file
variable so that I can find out the number of atoms - because all of the coordinate files are of the same molecule, $n_atoms
will always be the same. I thought obtaining the number of atoms might be important to getting this script to work, but now I'm not so sure...)
It's the last part I can't figure out...
This will be the second script I've ever tried to write from scratch, so I am very inexperienced with all of this... Any help would be greatly appreciated!