I have a 6GB XML file that consists of a single line (verified with wc -l file.xml).
This is the command I'm using:
grep -o '<wd:Report_Entry>' file.xml | wc -l
It outputs 446441. This is supposed to be the right command, as mentioned at https://stackoverflow.com/a/14510665/5524175.
The correct count is 1521620. Surprisingly, this Rust solution gives the right count:
count_occurences '<wd:Report_Entry>' file.xml
It outputs 1521620.
The following command, mentioned in this accepted answer, also gives 446441:
sed 's/<wd:Report_Entry>/<wd:Report_Entry>\n/g' file.xml | grep -c "<wd:Report_Entry>"
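One more way to cross-check the count without handing grep a single 6GB line is to split the input on > first, so that every line grep sees is short. This is only a sketch under assumptions: count_open_tag is a hypothetical helper name, LC_ALL=C is my assumption to force byte-wise handling, and it assumes the tag always appears exactly as <wd:Report_Entry> (no attributes or whitespace inside the tag) and that the file does not end in a truncated tag.

```shell
# Hypothetical helper: count occurrences of an opening tag such as <wd:Report_Entry>.
# Usage: count_open_tag 'wd:Report_Entry' file.xml
count_open_tag() {
  tag="$1"
  file="$2"
  # Turn every '>' into a newline, so each tag ends its own short line.
  # Then count the lines that end with "<tag" (the closing tag "</tag" does not match).
  # LC_ALL=C is an assumption: it makes tr and grep treat the input as raw bytes.
  LC_ALL=C tr '>' '\n' < "$file" | LC_ALL=C grep -c "<${tag}\$"
}
```

Running count_open_tag 'wd:Report_Entry' file.xml should print a single number that can be compared against the two results above.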
I'm not sure what I'm missing. Do characters like <, > or : need to be escaped? I'm on macOS. This is my grep version:
➜ ~ grep --version
grep (BSD grep, GNU compatible) 2.6.0-FreeBSD