2

I have a list of files stored in a text file. If a Python file is found in that list, I want to run the corresponding test file using pytest.

My file looks like this:

/folder1/file1.txt
/folder1/file2.jpg
/folder1/file3.md
/folder1/file4.py
/folder1/folder2/file5.py

When the 4th and 5th files are found, I want to run pytest like this:

pytest /folder1/test_file4.py
pytest /folder1/folder2/test_file5.py

Currently, I am using this command:

cat /workspace/filelist.txt | while read line; do if [[ $$line == *.py ]]; then exec "pytest test_$${line}"; fi; done;

which does not work correctly, because the entries in the text file include the directory path as well as the file name. Any idea how to implement this?

Abhinav Dhiman
  • Why do you have this list in a file? What about `pytest folder1` or `find /folder1 -type f -name '*.py' -exec pytest {} \;`? If you really need to use the file, you can do `sed -n 's/.*\.py$/pytest "&"/p' | bash` – dan Dec 16 '21 at 07:53
  • I'm pulling the list of files included in a Github Pull Request and checking whether a Python file exists in those changed files, if yes, its corresponding pytest file needs to be executed. – Abhinav Dhiman Dec 16 '21 at 07:56
  • Why does `pytest /folder1/folder2/file5.py` not have a `test_` prefix? Is that a typo? – tripleee Dec 16 '21 at 09:05
  • Yes. Corrected it. – Abhinav Dhiman Dec 16 '21 at 09:15

3 Answers

2

Use Bash's parameter expansion (substring removal) to add the test_ prefix. As a one-liner:

$ while read line; do if [[ $line == *.py ]]; then echo "pytest ${line%/*}/test_${line##*/}"; fi; done < file

In more readable form:

while read line
do 
  if [[ $line == *.py ]]
  then 
    echo "pytest ${line%/*}/test_${line##*/}"
  fi
done < file

Output:

pytest /folder1/test_file4.py
pytest /folder1/folder2/test_file5.py

I don't know anything about Google Cloud Build, so I'll let you experiment with the double dollar signs.
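
To run the tests instead of printing the commands, the loop could be wrapped roughly as below. This is only a sketch: the function names and the RUNNER variable are hypothetical, introduced here so the loop can be dry-run with echo, and the separate branch for paths without a slash is an addition that the one-liner above does not make.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a source path to the path of its test file.
test_path_for() {
    case $1 in
        */*) printf '%s/test_%s\n' "${1%/*}" "${1##*/}" ;;
        *)   printf 'test_%s\n' "$1" ;;   # no directory part
    esac
}

# Run a command (pytest unless RUNNER is set) for every .py entry in a list.
run_tests_from_list() {
    while IFS= read -r line; do      # IFS= and -r keep the line intact
        case $line in
            # </dev/null keeps the command from consuming the list on stdin
            *.py) "${RUNNER:-pytest}" "$(test_path_for "$line")" </dev/null ;;
        esac
    done < "$1"
}
```

For a dry run, something like `RUNNER=echo run_tests_from_list /workspace/filelist.txt` prints the test paths without invoking pytest.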

Update:

In case some files in the list already have the test_ prefix, use this Bash script, which uses extglob in the substring removal:

shopt -s extglob                                           # notice
while read line
do
    if [[ $line == *.py ]]
    then
        echo "pytest ${line%/*}/test_${line##*/?(test_)}"  # notice
    fi
done < file
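
As a quick check of what the extglob pattern does, here is a small sketch that applies the same expansion to a prefixed and an unprefixed path. The function name to_test_path is hypothetical, and the sketch assumes every path contains at least one slash, as in the question's list.

```bash
#!/usr/bin/env bash
shopt -s extglob    # must be enabled before the pattern below is parsed

# Hypothetical helper: prefix the basename with test_ unless it already has it.
to_test_path() {
    # ?(test_) matches zero or one occurrence of "test_" after the last slash,
    # so an existing prefix is stripped before the new one is added.
    printf '%s\n' "${1%/*}/test_${1##*/?(test_)}"
}

to_test_path /folder1/file4.py
to_test_path /folder1/test_file4.py
```

Both calls print /folder1/test_file4.py, so the prefix is not doubled.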
James Brown
0

You may want to focus on the lines that end with ".py". You can select them with grep and a regex that matches only lines ending in .py, which eliminates the if statement.

IFS=$'\n'
for file in $(cat /workspace/filelist.txt|grep '\.py$');do pytest $file;done
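
A variant that keeps the grep filter but reads the matches with while read and quotes the variable might look like the sketch below. The wrapper name for_each_py is hypothetical, and the command to run is passed in as arguments so the sketch can be tried with something harmless before using pytest.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run a command once per .py line of a list file.
# Usage: for_each_py LIST COMMAND [ARGS...]
for_each_py() {
    list=$1; shift
    grep '\.py$' "$list" | while IFS= read -r file; do
        # Quoting "$file" keeps names with spaces or glob characters as one argument;
        # </dev/null stops the command from reading the remaining lines.
        "$@" "$file" </dev/null
    done
}
```

For example, `for_each_py /workspace/filelist.txt pytest` would call pytest once per matching line, with each path passed as a single argument.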
inverminx
  • [Don't read lines with `for`](http://mywiki.wooledge.org/DontReadLinesWithFor) – tripleee Dec 16 '21 at 08:49
  • Also, review [When to wrap quotes around a shell variable](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Dec 16 '21 at 08:55
  • @tripleee - updated the delimiter to new lines only. Which issue do you see with not quoting $file ? – inverminx Dec 16 '21 at 12:22
  • 1
    The linked question explains the problems with lack of quoting in quite a lot of detail. In brief, beginners test with trivial file names, and are surprised when their code breaks with nontrivial ones. Try with spaces, irregular spaces, shell metacharacters etc in the arguments. Roughly half of http://mywiki.wooledge.org/BashPitfalls are variations of this. – tripleee Dec 16 '21 at 12:40
0

You can easily refactor all your conditions into a simple sed script. This also gets rid of the useless cat and the similarly useless exec.

sed -n 's%[^/]*\.py$%test_&%p' /workspace/filelist.txt |
xargs -n 1 pytest

The regular expression matches anything after the last slash, which means the entire line if there is no slash; we include the .py suffix to make sure this only matches those files.

The pipe to xargs is a common way to convert standard input into command-line arguments. The -n 1 says to pass one argument at a time, rather than as many as possible. (pytest accepts several test paths in one invocation, so you can also take out the -n 1 and let xargs pass in as many as it can fit.)
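
The difference between the two xargs modes can be seen with a stand-in command. In this sketch the helper names are hypothetical, and the command is passed as arguments so that echo can stand in for pytest during a dry run.

```bash
#!/usr/bin/env bash
# Print the test path for every .py entry in the list (same sed script as above).
test_paths() {
    sed -n 's%[^/]*\.py$%test_&%p' "$1"
}

# One invocation of the command per path.
run_each() {
    list=$1; shift
    test_paths "$list" | xargs -n 1 "$@"
}

# A single invocation that receives every path at once.
run_batched() {
    list=$1; shift
    test_paths "$list" | xargs "$@"
}
```

With the question's list, `run_each /workspace/filelist.txt echo pytest` prints one pytest line per test file, while `run_batched /workspace/filelist.txt echo pytest` prints a single line with both paths. Two caveats: xargs splits its input on whitespace, so paths containing spaces need extra care, and GNU xargs runs the command once even on empty input unless -r is given.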

If you want to avoid adding the test_ prefix to files which already have it, one solution is to break up the sed script into two separate actions:

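sed -n 's%test_[^/]*\.py$%&%p;t;s%[^/]*\.py$%test_&%p' /workspace/filelist.txt |
xargs -n 1 pytest

The first substitution replaces an already-prefixed name with itself, and its p prints the line verbatim; the t says that if that substitution succeeded, skip the rest of the script for this input. (t only reacts to a successful s command, which is why the first action is written as a substitution rather than a plain /regex/p.)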

(MacOS / BSD sed will want a newline instead of a semicolon after the t command.)

sed is arguably a bit of a read-only language; this is already pressing towards the boundary where perhaps you would rewrite this in Awk instead.
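
For comparison, a possible Awk version is sketched below. The function name awk_test_paths is hypothetical; the script splits each line on slashes, prefixes the last field unless it already starts with test_, and prints only lines ending in .py.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the test path for every .py line of the list.
awk_test_paths() {
    awk -F/ -v OFS=/ '
        /\.py$/ {
            # $NF is the file name; assigning to it rebuilds the line with OFS
            if ($NF !~ /^test_/) $NF = "test_" $NF
            print
        }' "$1"
}
```

The output can be piped to xargs in the same way, for example `awk_test_paths /workspace/filelist.txt | xargs -n 1 pytest`.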

tripleee