
I am new to batch files and the Windows command line. I am trying to write a batch file that runs in a given folder and validates the names of all the files in it.

File names can be validated using regex:

^[a-zA-Z0-9]{1,20}\-[a-zA-Z0-9]{1,40}\-[0-9]{8}(\-[0-9]+)?\.[a-zA-Z]{2,4}$

The regex above is known to be correct.

If a regex cannot be used, in plain language the file name should be:

alphaNumericMax20char-alphaNumericMax40char-8charDateOfFormatyyyyMMdd.extension

I previously wrote the same check in C#, as follows:

        rtbResult.Text = string.Empty;
        List<string> fileNamesNotValid = new List<string>();
        string[] allFilesInDirectory = new string[0];
        try
        {
            if (Directory.Exists(txtFolderLocation.Text))
            {
                allFilesInDirectory = Directory.GetFiles(txtFolderLocation.Text);
                foreach (string file in allFilesInDirectory)
                {
                    var fileName = Path.GetFileName(file);

                    if (Regex.IsMatch(fileName, @"^[a-zA-Z0-9]{1,20}\-[a-zA-Z0-9]{1,40}\-[0-9]{8}(\-[0-9]+)?\.[a-zA-Z]{2,4}$"))
                    {
                        var dataComponent = fileName.Split('-')[2].Split('.')[0];
                        try
                        {
                            DateTime date = DateTime.ParseExact(dataComponent, "yyyyMMdd",
                                                      System.Globalization.CultureInfo.InvariantCulture);
                            if (date > DateTime.Now)
                            {
                                fileNamesNotValid.Add(fileName);
                            }
                        }
                        catch (Exception)
                        {
                            fileNamesNotValid.Add(fileName);
                        }

                    }
                    else
                    {
                        fileNamesNotValid.Add(fileName);
                    }
                }
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message);
        }
        foreach (var item in fileNamesNotValid)
        {
            rtbResult.Text += item + "\n";
        }

        MessageBox.Show("Process Completed!!\n Total Files processed = " + allFilesInDirectory.Count() + "\n Total Invalid Files = " + fileNamesNotValid.Count);
Mofi
  • 4
    hi, since this is a Q&A forum; what is your exact question in here? – Mong Zhu Jun 10 '16 at 09:07
  • 1
    It is not clear what you want as you already seem to have the code to do it - you could run that from a batch file. If you are asking how you would perform an action on each file in a folder using a batch file then look at this : http://stackoverflow.com/questions/180741/how-to-do-something-to-each-file-in-a-directory-with-a-batch-script – PaulF Jun 10 '16 at 09:17

1 Answer


This one was quite the challenge, and I'm not entirely sure it does exactly what the OP asked for, but I couldn't resist trying to come up with a solution. Batch has no way to test file names against a full regular expression (FINDSTR only supports a limited subset, with no {n,m} quantifiers and no optional groups), so you have to be a bit creative.

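The batch starts by writing a listing of file names to a temporary file. As written, dir /b *.c* only matches files whose extension begins with "c", so widen the pattern if you need other files. It then iterates over those names, splitting each into 4 components at the dashes and the period: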

  • AlphaNumeric (Max 20 char)
  • AlphaNumeric (Max 40 char)
  • Date (8 digits)
  • Extension (at least 2 alphabetic characters)

If a name splits into more or fewer than 4 components, an error is raised. This automatically takes care of spaces in the file name.
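
To make the splitting step concrete, here is a minimal sketch of how FOR /F with tokens and delims cuts a name into chunks. The sample file name is made up for illustration, and the snippet is a standalone demonstration rather than part of the script itself.

```batch
@echo off
REM Sketch: split one sample name at "-" and "." the same way the script does.
REM The file name below is hypothetical.
for /f "tokens=1,2,3,4* delims=-." %%a in ("report-final-20160610.txt") do (
  echo part1=%%a
  echo part2=%%b
  echo date=%%c
  echo ext=%%d
  REM %%e receives whatever is left after the 4th token
  if "%%e"=="" (echo chunk count OK) else (echo too many chunks: %%e)
)
```

Note that a name such as a-b-20160610-2.txt splits into five chunks, so the optional -[0-9]+ suffix allowed by the OP's regex would be reported as "too many chunks" by this approach.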

Each of the components is first checked against its maximum allowed length. If it passes that check, it is then fed to FINDSTR to check for any character NOT in the provided list of characters (which is where the FindStr limitations on character classes are really evident, as you can't negate a full character set such as [^0-9], for instance, but oh well...). If any unwanted character is found in the component, an error is raised.
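
The two checks described above can be sketched in isolation roughly as follows. The variable name part and its value are assumptions chosen for illustration; the technique (substring test for length, then FINDSTR with a negated character list) mirrors what the script does.

```batch
@echo off
setlocal EnableDelayedExpansion
REM Sketch only: "part" and its value are made up for illustration.
set "part=abc123"

REM Length check: a character exists at offset 20 only if the value has more than 20 characters.
if not "!part:~20,1!"=="" (
  echo too long
  goto :eof
)

REM Character check: findstr returns errorlevel 0 when the line contains a character
REM outside the listed set, and 1 when it finds none.
REM There is no space before the pipe, so echo does not append a trailing space.
echo %part%| findstr /i "[^abcdefghijklmnopqrstuvwxyz0123456789]" >nul
if errorlevel 1 (echo only allowed characters) else (echo unwanted character found)
```

The /i switch makes the match case-insensitive, which is why only lowercase letters need to be listed.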

All filenames that pass all checks are written to the Success.txt file. All those with errors are written to the Error.txt file.

@Echo off
setlocal EnableDelayedExpansion
if exist error.txt del error.txt
if exist success.txt del success.txt
dir /b *.c* > tmp.tmp
set success=false
for /f %%i in (tmp.tmp) do (
  for /f "tokens=1,2,3,4* delims=-." %%a in ("%%i") do (
    REM 5 chunks means there's an error
    REM Spaces in filename will automatically cause an error (which is OK)
    if "%%e"=="" (
      REM less than 4 chunks means there's an error
      if "%%d"=="" (
        call :errProcess %%i "ERROR, missing chunks"
      ) else (
        REM At this stage we need to validate if:
        REM %%a contains [a-zA-Z0-9]{1,20}
        REM %%b contains [a-zA-Z0-9]{1,40}
        REM %%c contains [0-9]{8}
        REM %%d contains [a-zA-Z]{2,}
        set tmp=%%a
        if "!tmp:~20,1!" == "" (
          call :checkRegEx %%i %%a abcdefghijklmnopqrstuvwxyz0123456789
          if !success!==true (
            set tmp=%%b
            if "!tmp:~40,1!" == "" (
              call :checkRegEx %%i %%b abcdefghijklmnopqrstuvwxyz0123456789
              if !success!==true (
                set tmp=%%c
                if "!tmp:~8,1!" == "" (
                  call :checkRegEx %%i %%c 0123456789
                  if !success!==true (
                    set tmp=%%d
                    if "!tmp:~1,1!" == "" (
                      REM Extension must be at least 2 characters long
                      call :errProcess %%i "ERROR, ^"!tmp!^" is LSS than 2"
                    ) else (
                      call :checkRegEx %%i %%d abcdefghijklmnopqrstuvwxyz
                      if !success!==true (
                        call :success %%i
                      )
                    )
                  )
                ) else (
                  call :errProcess %%i "ERROR, ^"!tmp!^" is GTR than 8"
                )
              )
            ) else (
              call :errProcess %%i "ERROR, ^"!tmp!^" is GTR than 40"
            )
          )
        ) else (
          call :errProcess %%i "ERROR, ^"!tmp!^" is GTR than 20"
        )
      )
    ) else (
      call :errProcess %%i "ERROR, too many chunks"
    )
  )
)
del tmp.tmp
goto :eof

:errProcess
set strTmp=%2
Echo %1 : %strTmp:~1,-1%
Echo %1 >> Error.txt
goto :eof

:success
Echo %1 : SUCCESS
Echo %1 >> Success.txt
goto :eof

:checkRegEx
REM checkregEx filename "string" [regEx] (always a negated RegEx)
set zeFile=%1
set zeString=%2
set zeRegEx=%3
set success=false
set unwantedChar=
FOR /F "tokens=*" %%i IN ('echo %zeString%^| findstr /i "[^%zeRegEx%]"') DO (
  SET unwantedChar=%%i
  call :errProcess %zeFile% "ERROR, %zeString% does not match RegEx"
)
if "!unwantedChar!"=="" (
  set success=true
)
set zeString=
set zeRegEx=
goto :eof
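
Because the script lists files in the current directory and writes Error.txt and Success.txt there, it has to be started from the folder being checked. A possible wrapper is sketched below; both paths and the name validate.bat are hypothetical, so adjust them to your own setup.

```batch
@echo off
REM Hypothetical paths: the folder to check and the location of the saved script.
pushd "C:\data\incoming" || exit /b 1
call "C:\scripts\validate.bat"
if exist Error.txt (
  echo Invalid file names:
  type Error.txt
) else (
  echo No invalid file names were reported.
)
popd
```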
Filipus