
I'm trying to search through my codebase to find files that have inconsistent indentation. Basically, I don't care whether a file is indented with tabs or spaces, as long as it is internally consistent.

Obviously, I can run grep -Prn "^\t" src to find lines that begin with tabs, and grep -Prn "^ " src for spaces, but I don't know how to search for files that contain at least 1 match for both patterns.

The best I can come up with is something like

for f in `grep -Prl "^\t" src` ; do grep -Pl "^ " $f; done

but that is incredibly slow for a large codebase. Is there a faster way to do this with a single grep command?

kvantour
ewok
    [edit] your question to include a [mcve] with concise, testable sample input and expected output. Clarify what it means for a file to be "internally consistent". – Ed Morton May 07 '19 at 18:15

1 Answer


Are you trying to find the names of files under your src directory that have both tabs and blanks in the indentation (possibly on separate lines)? That'd be this using any find and any awk in any shell on any UNIX box:

find src -type f -exec awk '/^[[:space:]]*\t/{t=1} /^[[:space:]]* /{b=1} b&&t{print FILENAME; exit}' {} \;

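or, more efficiently (since awk will be called for batches of files instead of one at a time), with GNU find and GNU awk. The flags are reset on the first line of each file so that a tab seen in one file and a blank seen in a later file of the same batch are not mistaken for a mix within a single file:

find src -type f -exec awk 'FNR==1{b=t=0} /^[[:space:]]*\t/{t=1} /^[[:space:]]* /{b=1} b&&t{print FILENAME; nextfile}' {} +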
Ed Morton
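
If you would rather stay close to the grep commands in the question, one option is to let the first grep list the candidate files and hand them to a second grep in batches through xargs, instead of starting one grep per file. The following is only a minimal sketch, assuming GNU grep and GNU xargs (for -Z, -0 and -r). The function name find_mixed_indent is hypothetical, and it uses the question's simpler definition (some line starts with a tab and some line starts with a space), not the awk patterns above.

```shell
# Hypothetical helper: print files under directory "$1" that contain at least
# one line starting with a tab AND at least one line starting with a space.
# Assumes GNU grep (-r, -l, -Z) and GNU xargs (-0, -r).
find_mixed_indent() {
    # Build a literal tab portably instead of relying on grep -P.
    tab=$(printf '\t')

    # Stage 1: list files with a tab-indented line, NUL-terminated (-Z) so
    #          file names containing spaces or newlines survive the pipe.
    # Stage 2: xargs passes those names in batches to a second grep, which
    #          keeps only the files that also have a space-indented line.
    #          -r skips the second grep entirely when stage 1 finds nothing.
    grep -rlZ "^$tab" "$1" | xargs -0 -r grep -l -- '^ '
}
```

Calling find_mixed_indent src prints one matching file name per line. The pipeline's exit status is non-zero when any batch has no match, so check the output and not the status.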