
I used the command below to count how many times a string appears in a large file (several gigabytes), but it only returns the number of lines that contain the string. That is a problem for me because the string appears multiple times per line.

Is there any way to count the number of times the string appears in a file from CMD, or will this require a batch file?

find /c "findthis9=""7""" *.xml > results.txt
steelthunder

3 Answers


I don't think it's possible. If you're on a later version of Windows, you can invoke PowerShell from the command line:

powershell -Command "&{(Get-Content c:\test.xml) | Foreach-Object {([regex]::matches( $_, 'findthis9=\"7\"'))} | Measure-Object | select -expand Count}"

To clarify: besides being runnable directly from cmd, this returns the number of occurrences of the string findthis9="7" in the file test.xml.

How it works: for each line in the file, find every match of findthis9="7", count the matches with Measure-Object, and show only the total number of occurrences.
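
Because the file in the question is several gigabytes, it may help to read it one line at a time instead of piping every line through the pipeline. The following is only a minimal sketch of the same match-and-count idea, not a tested solution for the asker's data. The function name Count-Occurrences and its parameters are hypothetical, and it assumes the search text should be matched literally.

```powershell
# Sketch: count every occurrence of a literal string, one line at a time.
# Count-Occurrences and its parameter names are hypothetical.
function Count-Occurrences {
    param (
        [string]$Path,
        [string]$Text
    )

    # Escape the text so regex metacharacters are matched literally
    $regex = [regex]([regex]::Escape($Text))

    # .NET resolves relative paths against the process directory,
    # so hand the reader a fully resolved path
    $reader = New-Object System.IO.StreamReader ((Resolve-Path $Path).ProviderPath)
    $count = 0
    try {
        while (($line = $reader.ReadLine()) -ne $null) {
            # Matches() returns every non-overlapping match on the line
            $count += $regex.Matches($line).Count
        }
    }
    finally {
        $reader.Close()
    }
    return $count
}

# Example call (assumes the file exists):
# Count-Occurrences -Path 'c:\test.xml' -Text 'findthis9="7"'
```

Reading with a StreamReader keeps only one line in memory at a time, which is the reason for choosing it here over Get-Content.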

S22

This can easily be done in batch (or the command line) if you have a utility that can insert a newline before and after each occurrence of the search string. The REPL.BAT hybrid JScript/batch utility can do this very easily. REPL.BAT is pure script that will run natively on any modern Windows machine from XP onward. It performs a regex search/replace on stdin and writes the result to stdout.

<test.xml repl "(findthis9=\q7\q)" \n$1\n x | find /c "findthis9=""7"""
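
For readers without REPL.BAT, the underlying trick (put every occurrence on its own line, then count lines) can be illustrated in PowerShell. This is only a sketch of the idea on a made-up sample string, not a replacement for the command above, and the variable names are assumptions.

```powershell
# Sketch of the "one occurrence per line" idea; the sample text is made up.
$text = 'x findthis9="7" y findthis9="7" z'
$pattern = 'findthis9="7"'

# Wrap each occurrence in newlines so it ends up alone on its own line
$wrapped = $text -replace "($pattern)", "`n`$1`n"

# Count the lines that contain the search string
$count = @($wrapped -split "`n" | Where-Object { $_ -match $pattern }).Count
$count
```

With the sample text, each of the two occurrences lands on a separate line, so counting matching lines gives the number of occurrences rather than the number of original lines.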
dbenham

If you're using Windows XP or later, you can use Windows PowerShell. Windows 7 and later include it by default. On XP or Vista, you need to make sure PowerShell is installed first. Here's the code:

# Windows PowerShell
# All text following a '#' is a comment, like the 'rem' keyword in cmd
# Read the file and split every line on spaces, so each token is tested separately
# (you can change the file name to *.xml if you wish)
$file = @((Get-Content MyFile.xml) -split " ")

# declare the pattern
$pattern = "findthis9=""7"""
# declare a variable to use as a counter for each occurrence
$counterVariable = 0

# note: a token is counted once, even if it contains the pattern more than once
for ($i = 0; $i -lt $file.Length; $i++)
{
    if ($file[$i] -match $pattern)
    {
        ++$counterVariable
    }
}

return $counterVariable

If you turn this into a function, you can also work file by file, because the function can return each file name together with the number of times the pattern appears in that file. See below:

function Count-NumberOfStringInstances
{
    [CmdletBinding()]

    # define the parameters
    param (

    # System.String[] means array, and will allow you to enter a list of paths
    [Parameter()]
    [System.String[]]$FilePath,

    [Parameter()]
    [System.String]$TextPattern
    )

    # Get-ChildItem expands wildcards such as *.xml into individual files
    $files = Get-ChildItem -Path $FilePath

    foreach ($f in $files)
    {
        # use a fresh counter for each file
        $counterVariable = 0

        # split every line on spaces so each token is tested separately
        $file = @((Get-Content -LiteralPath $f.FullName) -split " ")

        for ($i = 0; $i -lt $file.Length; $i++)
        {
            if ($file[$i] -match $TextPattern)
            {
                ++$counterVariable
            }
        }

        # return the file name together with its count
        New-Object PSObject -Property @{ File = $f.FullName; Count = $counterVariable }
    }
}
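
Splitting on spaces counts a token only once, even when the pattern occurs in it several times. As a hedged alternative, here is a sketch that uses Select-String with -AllMatches to count every occurrence per file. The function name Get-OccurrenceCountPerFile and its parameters are hypothetical, and the search text is assumed to be literal.

```powershell
# Sketch: per-file occurrence counts using Select-String -AllMatches.
# Get-OccurrenceCountPerFile and its parameter names are hypothetical.
function Get-OccurrenceCountPerFile {
    param (
        [string[]]$Path,
        [string]$Text
    )

    # Escape the text so regex metacharacters are matched literally
    $pattern = [regex]::Escape($Text)

    foreach ($item in Get-ChildItem -Path $Path) {
        $count = 0
        # One MatchInfo per matching line; -AllMatches keeps every match on it
        $hits = @(Select-String -Path $item.FullName -Pattern $pattern -AllMatches)
        foreach ($hit in $hits) {
            $count += $hit.Matches.Count
        }
        # Emit the file name together with its total
        New-Object PSObject -Property @{ File = $item.FullName; Count = $count }
    }
}

# Example call (assumes XML files in the current directory):
# Get-OccurrenceCountPerFile -Path *.xml -Text 'findthis9="7"'
```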
PSGuy