
data.txt

height= 6'1" age= shoe-size=9.5 sex= M
height= 6'5" age= shoe-size=9.0 sex= M
height= 5'11" age= shoe-size=8.5 sex= F
height= 5'9" age= shoe-size=11.5 sex= M
height= 4'11" age= shoe-size=7.5 sex= F
height= 6'4" age= shoe-size=9.5 sex= M

age.txt

21
23
22
19
34
27

How do I take the values from age.txt and put them into data.txt, so that each number, in order, fills the age= field on the corresponding line of data.txt?

Is there a way to loop over the lines of my file, look for 'age', and each time I see it, replace it with the next number from age.txt?

Expected output

height= 6'1" age=21 shoe-size=9.5 sex= M
height= 6'5" age=23 shoe-size=9.0 sex= M
height= 5'11" age=22 shoe-size=8.5 sex= F
height= 5'9" age=19 shoe-size=11.5 sex= M
height= 4'11" age=34 shoe-size=7.5 sex= F
height= 6'4" age=27 shoe-size=9.5 sex= M
John Kugelman
yoyo31

3 Answers

3

Using Bash 4 indexed arrays:

mapfile -t data < data.txt
mapfile -t age <age.txt

for i in "${!data[@]}"; do printf '%s\n' "${data[i]//age=/age=${age[i]}}"; done

The output is:

height= 6'1" age=21 shoe-size=9.5 sex= M
height= 6'5" age=23 shoe-size=9.0 sex= M
height= 5'11" age=22 shoe-size=8.5 sex= F
height= 5'9" age=19 shoe-size=11.5 sex= M
height= 4'11" age=34 shoe-size=7.5 sex= F
height= 6'4" age=27 shoe-size=9.5 sex= M

mapfile (also known as readarray) is a Bash 4+ feature.

"${!data[@]}" expands to the list of indices of the array, while "${data[@]}" expands to its elements.
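To make the difference between the two expansions concrete, here is a minimal sketch using a made-up array (the values are illustrative, not from data.txt). One element is unset so the indices are no longer 0..N-1, which is the case where looping over "${!data[@]}" matters. It needs bash, not plain sh.

```bash
#!/usr/bin/env bash
# Hypothetical sample array; index 1 is removed to make the array sparse.
data=(zero one two three)
unset 'data[1]'

indices="${!data[*]}"    # the indices that are still set
elements="${data[*]}"    # the values stored at those indices

printf 'indices:  %s\n' "$indices"
printf 'elements: %s\n' "$elements"
printf 'count:    %s\n' "${#data[@]}"
```

This prints the indices 0 2 3, the elements zero two three, and a count of 3.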

Or use a while loop with two read calls, which simply reads the two files in parallel inside the loop.

while IFS= read -r line_in_data <&3
  read -r line_in_age; do
  printf '%s\n' "${line_in_data//age=/age=$line_in_age}"
done 3<data.txt <age.txt

This should print the same output:

height= 6'1" age=21 shoe-size=9.5 sex= M
height= 6'5" age=23 shoe-size=9.0 sex= M
height= 5'11" age=22 shoe-size=8.5 sex= F
height= 5'9" age=19 shoe-size=11.5 sex= M
height= 4'11" age=34 shoe-size=7.5 sex= F
height= 6'4" age=27 shoe-size=9.5 sex= M

POSIX sh solution.

#!/bin/sh

while read -r column1_in_data column2_in_data column3_in_data rest_of_columns_in_data <&3
  read -r line_in_age; do
  printf '%s\n' "$column1_in_data $column2_in_data $column3_in_data$line_in_age $rest_of_columns_in_data"
done 3<data.txt <age.txt

The <&3 is a redirection, so the first read reads from file descriptor (FD) 3, which is opened on data.txt; the second read uses standard input, which is redirected from age.txt.
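As a small illustration of the two-descriptor idea, here is a sketch with two throwaway files of different lengths (letters.txt, numbers.txt and the variable names are made up for the example). It joins the two read calls with &&, a slight variation on the loops above, so the loop stops as soon as either file runs out.

```sh
# Throwaway sample files; the names and contents are illustrative only.
tmpdir=$(mktemp -d)
printf '%s\n' a b c > "$tmpdir/letters.txt"
printf '%s\n' 1 2   > "$tmpdir/numbers.txt"

pairs=""
# FD 3 feeds the first read, standard input feeds the second.
while read -r letter <&3 && read -r number; do
  pairs="$pairs$letter$number "
done 3<"$tmpdir/letters.txt" <"$tmpdir/numbers.txt"

printf '%s\n' "$pairs"
rm -r "$tmpdir"
```

Only a1 and b2 are printed: on the third pass the first read gets c, but the second read hits end of file, so the loop ends.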

${var//search/rep} is a Bash-specific parameter expansion (P.E.); it is not part of POSIX sh.

I believe awk would do this better and faster on large files.

Jetchisel
  • Thanks! Can you explain what this means: "${!data[@]}" – yoyo31 Jan 27 '20 at 01:40
  • See: [Creating an array from a text file in Bash](https://stackoverflow.com/q/30988586/6862601) – codeforester Jan 27 '20 at 02:24
  • @yoyo31 *If name is an array variable, expands to the list of array indices (keys) assigned in name. If name is not an array, expands to 0 if name is set and null otherwise. When ‘@’ is used and the expansion appears within double quotes, each key expands to a separate word.* From [bash documentation](https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html#Shell-Parameter-Expansion). – Shawn Jan 27 '20 at 02:59
2

Using awk and getline provides a very simple and efficient solution, e.g.

awk '{getline age < "age.txt"; $3=$3 age}1' data.txt

The command above uses getline to read the next line of age.txt into the variable age, then appends age to the third field of each record from data.txt.

Another way, as long as "age=" appears only once per line, would be:

awk '{getline age < "age.txt"; sub(/age=/,"age=" age)}1' data.txt
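One practical difference between the two commands: assigning to $3 makes awk rebuild the record with single spaces between fields, while sub() edits the line in place. The sketch below shows this on a made-up line with extra spaces. The agefile variable and the check on getline's return value are additions for the example, not part of the commands above.

```bash
# Sample input with irregular spacing (illustrative, not the question's data).
tmpdir=$(mktemp -d)
printf '%s\n' 'height=  6ft   age=   shoe-size=9.5' > "$tmpdir/data.txt"
printf '%s\n' 21 > "$tmpdir/age.txt"

# Field assignment: the record is rebuilt, so runs of spaces collapse.
rebuilt=$(awk -v agefile="$tmpdir/age.txt" \
  '{ if ((getline age < agefile) > 0) $3 = $3 age } 1' "$tmpdir/data.txt")

# sub(): only the matched text changes, the original spacing is kept.
in_place=$(awk -v agefile="$tmpdir/age.txt" \
  '{ if ((getline age < agefile) > 0) sub(/age=/, "age=" age) } 1' "$tmpdir/data.txt")

printf '%s\n' "$rebuilt" "$in_place"
rm -r "$tmpdir"
```

With the question's data, which is already single-spaced, both forms give the same result.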

Example Use/Output

You can just copy and middle-mouse paste into an xterm in the directory where your files are located, e.g.

$ awk '{getline age < "age.txt"; $3=$3 age}1' data.txt
height= 6'1" age=21 shoe-size=9.5 sex= M
height= 6'5" age=23 shoe-size=9.0 sex= M
height= 5'11" age=22 shoe-size=8.5 sex= F
height= 5'9" age=19 shoe-size=11.5 sex= M
height= 4'11" age=34 shoe-size=7.5 sex= F
height= 6'4" age=27 shoe-size=9.5 sex= M
David C. Rankin
  • what does the '1' at the end do? – yoyo31 Jan 27 '20 at 02:05
  • `1` is just shorthand for the default rule `print` `:)` – David C. Rankin Jan 27 '20 at 02:10
  • Thanks so much David. If I wanted to grab it from a variable instead of a text file, how can I use awk for that? Do I just replace '$3=$3 age' with '$3=$3 $var'? – yoyo31 Jan 27 '20 at 02:13
  • You would have to pass `var` in as variable from the shell. You can do that at the beginning with `-v varname="$var"`. However, since `awk` processes all the records, using `getline` to read from the `age.txt` file is probably your best option. (your other option would be to process the `age.txt` file first and just fill an array with the values and then process `data.txt`) – David C. Rankin Jan 27 '20 at 02:29
  • Let's say I already had a variable that I grab from, how do I place it into awk? Would it be like this? `awk -v '{varname="$var"; $3=$3 varname}1'` data.txt and lastly, how do I save that awk line, would I just have to append it to a new text file or can I use the existing one? – yoyo31 Jan 27 '20 at 02:57
  • Yes, I see what you are getting at, but the problem is `awk` is running as a single process. While you can pass a shell variable into `awk`, what is missing is a mechanism for updating a shell variable for each record processed by `awk`. Now I'm not going to say it is "blanket" impossible, but I will say, other than pre-reading all variable values into an array, reading from a file with `getline`, or passing a single shell variable in at the beginning -- I don't know of any way to do what you are wanting to do. (bummer) – David C. Rankin Jan 27 '20 at 03:05
  • The problem being when you pass `-v varname="$var"` at the beginning, its value is fixed. If you then want to get a new value into `var` for each record processed -- there isn't a way to know when `awk` is done processing each record back in the shell to make the update, and no way for `awk` to update that `var` value from the shell (other than for `awk` doing something with it internally). So where I used `getline` to read a new value into `age` each time I processed a record -- there isn't a counterpart for doing that from a shell variable per-record. – David C. Rankin Jan 27 '20 at 03:09
  • Oh, I see, thank you for that valuable insight David. This helped a lot! – yoyo31 Jan 27 '20 at 03:19
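For reference, the two options discussed in these comments could be sketched roughly as follows: `-v` for a single shell value, and pre-reading age.txt into an awk array for one value per record. This is only a sketch under assumptions: it runs in a scratch directory on a two-line copy of the data, and the names `var`, `fixed` and `data.new` are made up for the example.

```bash
# Scratch directory with two-line stand-ins for data.txt and age.txt (sample data only).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' "height= 6'1\" age= shoe-size=9.5 sex= M" \
              "height= 6'5\" age= shoe-size=9.0 sex= M" > data.txt
printf '%s\n' 21 23 > age.txt

# 1) One shell value for every record: pass it in with -v (goes before the program).
var=30
fixed=$(awk -v age="$var" '{ sub(/age=/, "age=" age) } 1' data.txt)
printf '%s\n' "$fixed"

# 2) A different value per record: load age.txt into an array first.
#    NR == FNR is true only while the first file (age.txt) is being read.
#    Write to a new file and rename it; redirecting straight to data.txt
#    would truncate it before awk reads it.
awk 'NR == FNR { age[FNR] = $0; next }
     { sub(/age=/, "age=" age[FNR]) } 1' age.txt data.txt > data.new &&
  mv data.new data.txt

cat data.txt
```

The first command puts age=30 on every line; the second fills in 21 and 23 and saves the result back to data.txt.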
1

You could do it with a bunch of cut and paste and process substitution:

paste -d ' ' \
    <(paste -d'\0'  \
        <(cut -d' ' -f-3 data.txt) \
        age.txt) \
    <(cut -d' ' -f4- data.txt)

The outer paste command uses a blank as the delimiter to paste the output of the inner paste and the outer cut together.

The inner paste uses an empty delimiter (written '\0') so there is no blank between age= and the age value.
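A quick way to see what the delimiter choice does, using two tiny made-up files (left.txt and right.txt are illustrative names):

```bash
# Sample files standing in for the "age=" column and the age values.
tmpdir=$(mktemp -d)
printf '%s\n' 'age=' 'age=' > "$tmpdir/left.txt"
printf '%s\n' 21 23        > "$tmpdir/right.txt"

joined=$(paste -d '\0' "$tmpdir/left.txt" "$tmpdir/right.txt")   # no separator
spaced=$(paste -d ' '  "$tmpdir/left.txt" "$tmpdir/right.txt")   # one blank

printf '%s\n' "$joined" "$spaced"
rm -r "$tmpdir"
```

The first paste gives age=21 and age=23, the second gives age= 21 and age= 23.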

That being said, the input file format seems inconsistent regarding the spacing around =.

Benjamin W.