I have a file structure that looks like this:

  • Folder1
    • file1.feature
    • file2.feature
    • file3.feature
  • Folder2
    • file1.feature
    • file2.feature
    • ...etc.

The files are Behat feature files which look like this:

Scenario: I am filling out a form
    Given I am logged in as User
    And I fill in "Name" with "My name"
    Then I fill in "Email" with "myemail@example.com"

I am trying to iterate over each file within the file structure to get matches on my regex:

/I fill in "[^"]+" with "([^"]+)"/gm

The regex looks for I fill in "x" with "y", and I would like to store the capture group "y" from each file where a line in the file matches the expression.

So far I can iterate through the folders and print out the file names in my Bash script like so:

#!/bin/bash

cd behat/features

files="*/*.feature"


for f in $files
do
    echo "$f"
done

I am trying to retrieve the capture group using Sed currently by doing this in my loop:

sed -r 's/^I fill in \"[^)]+\" with \"([^)]+)\"$/\1/'

But I fear that I am going down the wrong track, as this is returning all of the file content throughout all the files.

party-ring

1 Answer

You may use

cd behat/features && find . -name '*.feature' -type f -print0 | xargs -0 \
  sed -E -n 's/.*I fill in "[^"]+" with "([^"]+)"/\1/p' > outfile

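This command changes to the behat/features directory, recursively finds every regular file with the .feature extension, and writes the value of capture group 1 from each matching line to outfile. The -n option suppresses sed's default printing of every line, and the p flag prints only the lines on which the substitution succeeded. The leading .* consumes whatever precedes the match (indentation and the step keyword), so it is dropped from the output.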
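To see the effect of -n and p in isolation, here is a small sketch run against a throwaway copy of the scenario from the question. The temporary directory and the file name sample.feature are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical sample file mirroring the scenario in the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.feature" <<'EOF'
Scenario: I am filling out a form
    Given I am logged in as User
    And I fill in "Name" with "My name"
    Then I fill in "Email" with "myemail@example.com"
EOF

# Without -n, sed prints every line, whether or not the substitution applied.
all_lines=$(sed -E 's/.*I fill in "[^"]+" with "([^"]+)"/\1/' "$tmpdir/sample.feature")

# With -n and the p flag, only lines where the substitution succeeded are printed.
values=$(sed -E -n 's/.*I fill in "[^"]+" with "([^"]+)"/\1/p' "$tmpdir/sample.feature")

printf '%s\n' "$values"
rm -rf "$tmpdir"
```

The first call still returns the Scenario and Given lines unchanged, which is why a plain s/// appears to return the whole file. The second call returns only the two captured values.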

If need be, see "How to do a recursive find/replace of a string with awk or sed?" for more specific solutions to recursive file matching.
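If you would rather keep the loop from your script and also record which file each value came from, something along these lines may work. It is only a sketch: the temporary directory, the two sample files, and the "file: value" output format are assumptions made for the demonstration, not part of your project.

```shell
#!/bin/sh
# Hypothetical layout mirroring the question, created in a temp directory.
root=$(mktemp -d)
mkdir -p "$root/Folder1" "$root/Folder2"
printf '    And I fill in "Name" with "My name"\n' > "$root/Folder1/file1.feature"
printf '    Then I fill in "Email" with "myemail@example.com"\n' > "$root/Folder2/file1.feature"

cd "$root" || exit 1

# Collect one "file: value" line per match across all feature files.
results=$(
    for f in */*.feature; do
        # Skip the literal pattern if the glob matched nothing.
        [ -e "$f" ] || continue
        # Same sed expression as above, applied to one file at a time.
        sed -E -n 's/.*I fill in "[^"]+" with "([^"]+)"/\1/p' "$f" |
            while IFS= read -r value; do
                printf '%s: %s\n' "$f" "$value"
            done
    done
)

printf '%s\n' "$results"
```

Writing the glob directly in the for statement, and quoting "$f" everywhere, keeps file names with spaces intact.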

Wiktor Stribiżew