0

I am attempting to count the occurrences of a particular string, "PhD", within a file 'X', take that count (e.g. 3), and print/cat the following string that many times (here, 3 times) to the beginning of another file:

Graduate StudentID 1
Graduate StudentID 2
Graduate StudentID 3

The numbers after StudentID reflect the counting.

My hopeless attempt so far is below ($OUT was supposed to be the file written to); I am not sure how to resolve the (obvious) resulting errors.

find /home/college/Applications/Graduate -name "*.inp" -exec sed 's/[PhD]//g' input | uniq -c print >$OUT {$1} \;
Benjamin W.
jnorth
  • `[PhD]` is "any of the characters 'P', 'h' or 'D'"; you probably don't want the `[]`. Then, you can't have pipes in an `-exec` clause, you need `-exec bash -c 'command | command'` if you want it. – Benjamin W. Nov 15 '17 at 18:38
  • Thanks @BenjaminW. Is this closer to your suggestion? {find /home/college/Applications/Graduate -name "*.inp" -exec bash -c sed -i -r 's/PhD//g' 'input | uniq' echo {$1} \;} – jnorth Nov 15 '17 at 18:50
  • It's a bit confusing. Do you want to count occurrences in a single file, or in all `.inp` files contained in `/home/college/Applications/Graduate`? – Benjamin W. Nov 15 '17 at 19:08
  • right, a single file. The reason for the "find" is that eventually there will be several .inp files to iterate through. – jnorth Nov 15 '17 at 19:10
  • Is the file in `$OUT` empty, or should the new lines go on top of an existing file? – Benjamin W. Nov 15 '17 at 19:15
  • They should go on top of the existing file, yes. – jnorth Nov 15 '17 at 19:17
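
To illustrate the two points from the first comment, here is a minimal sketch. The sample string, the temporary directory, and the file name sample.inp are made up for the demonstration; only the `[PhD]` versus `PhD` distinction and the `-exec bash -c '... | ...'` form come from the comment itself.

```shell
#!/bin/bash

# Made-up sample text for the demonstration
sample='PhD student Phil'

# [PhD] is a bracket expression: it removes every single 'P', 'h' or 'D'
bracket_result=$(printf '%s\n' "$sample" | sed 's/[PhD]//g')

# PhD without brackets removes only the literal three-letter sequence
literal_result=$(printf '%s\n' "$sample" | sed 's/PhD//g')

printf '[%s]\n' "$bracket_result" "$literal_result"

# A pipe inside -exec has to be wrapped in its own shell.
# The found file name is passed to the inner shell as $1.
demo_dir=$(mktemp -d)
printf 'PhD PhD\nMSc\nPhD\n' > "$demo_dir/sample.inp"

pipe_count=$(( $(find "$demo_dir" -name '*.inp' \
    -exec bash -c 'grep -o "PhD" "$1" | wc -l' _ {} \;) ))

printf 'matches: %d\n' "$pipe_count"

rm -rf "$demo_dir"
```

The bracketed pattern also eats the "Ph" of "Phil", which is why the `[]` is probably not what was intended. Passing `{}` as an argument after `_` (instead of embedding it in the command string) keeps file names with spaces or quotes intact.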

1 Answer

1

Here is how I would do it:

#!/bin/bash

# Count number of occurrences
# Use -o | wc -l instead of -c to count multiple occurrences in same line
count=$(grep -ro 'PhD' --include='*.inp' /home/college/Applications/Graduate | wc -l)

# Intermediate file (mktemp replaces the deprecated, Debian-specific tempfile)
tmp=$(mktemp)

# Output file
out=outfile.txt

{
    # Print header lines
    for (( i = 1; i <= count; ++i )); do
        printf '%s %d\n' 'Graduate StudentID' "$i"
    done

    # Print existing contents
    cat "$out"
} > "$tmp"

# Rename intermediate file
mv "$tmp" "$out"

This assumes that your output file is named outfile.txt.
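
Since there will eventually be several .inp files and possibly other output files, the same steps could be wrapped in a function. The following is only a sketch: the function name `prepend_headers`, its two parameters, and the demo files are hypothetical and not part of the script above. It also assumes a `grep` that supports `--include` (GNU grep does).

```shell
#!/bin/bash

# prepend_headers SEARCH_DIR OUT_FILE   (hypothetical helper)
# Counts every "PhD" in the *.inp files under SEARCH_DIR and writes that many
# "Graduate StudentID N" lines on top of OUT_FILE.
prepend_headers() {
    local search_dir=$1 out=$2 count tmp i

    # -o prints each match on its own line, wc -l counts those lines
    count=$(grep -ro 'PhD' --include='*.inp' "$search_dir" | wc -l)

    tmp=$(mktemp) || return 1
    {
        # Header lines first
        for (( i = 1; i <= count; ++i )); do
            printf '%s %d\n' 'Graduate StudentID' "$i"
        done

        # Then the existing contents, if the file is already there
        [[ -f $out ]] && cat "$out"
    } > "$tmp"

    mv "$tmp" "$out"
}

# Demonstration on throwaway files
work=$(mktemp -d)
mkdir "$work/apps"
printf 'PhD, PhD\nMSc\n' > "$work/apps/a.inp"
printf 'PhD\n' > "$work/apps/b.inp"
printf 'existing line\n' > "$work/out.txt"

prepend_headers "$work/apps" "$work/out.txt"

result=$(cat "$work/out.txt")
printf '%s\n' "$result"

rm -rf "$work"
```

With the two demo files containing three occurrences in total, the output file ends up with three header lines followed by the original line.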

Benjamin W.
  • that's an elegant looking solution - a much better approach and highly customizable for other purposes. I will get on that asap to test. In the "wc -l" what is the -l used for? Thank you again. – jnorth Nov 15 '17 at 19:59
  • @jnorth It counts the lines (hence the "ell") of output from the pipe. Theoretically, `grep -c` could already do the counting, but that only counts the number of matching lines, even if they contain multiple occurrences of the search term per line. `-o` prints every match on its own line, and `wc -l` then counts the lines. – Benjamin W. Nov 15 '17 at 20:08
  • Thank you! I was doing a quick search before asking but couldn't find an accessible explanation. – jnorth Nov 15 '17 at 20:10
  • Quick question, if possible, is there a quick way to specify adding the text after a given line# or regex? ...that is ambiguous. I will study the cat function. – jnorth Nov 15 '17 at 20:19
  • @jnorth Something like https://stackoverflow.com/questions/16715373/insert-contents-of-a-file-after-specific-pattern-match or https://stackoverflow.com/questions/15559359/insert-line-after-first-match-using-sed maybe? – Benjamin W. Nov 15 '17 at 22:09
  • Cheers @BenjaminW :) – jnorth Nov 15 '17 at 22:20
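
The difference between `grep -c` and `grep -o | wc -l` discussed in the comments can be checked with a small experiment. This is a sketch using a made-up three-line temporary file; the variable names are chosen for the demonstration only.

```shell
#!/bin/bash

# Made-up test file: the first line contains two occurrences
demo=$(mktemp)
printf 'PhD and another PhD\nno match here\nPhD\n' > "$demo"

# -c counts matching lines, so the first line counts only once
line_count=$(grep -c 'PhD' "$demo")

# -o prints every match on its own line; wc -l counts those lines.
# $(( ... )) strips the padding some wc implementations add.
match_count=$(( $(grep -o 'PhD' "$demo" | wc -l) ))

printf 'matching lines: %d, occurrences: %d\n' "$line_count" "$match_count"

rm -f "$demo"
```

Here `grep -c` reports 2 while the `-o | wc -l` pipeline reports 3, which is why the answer uses the latter.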