I am a bit stuck. I need to convert some mass spec data with this program:

https://ccms-ucsd.github.io/GNPSDocumentation/fileconversion/ Data Conversion (Traditional)

My slightly modified code is this (it works). The original runs from a .bat file, which might be a problem:

cd 1_Input_Folder

FOR %%i IN (*.mzXML, *.mzML, .raw, *.d, *.raw, *.RAW, *.wiff) DO (
..\Installation\pwizLibraries-and-Installation\pwiz_Leave-Alone\msconvert %%i --filter "peakPicking true 1-" --64 --mzXML -o ..\2_Output_Folder --outfile %%~ni.mzXML >> ../log.txt 2>&1
)
FOR /D %%i IN (*.mzXML, *.mzML,*.raw, *.d, *.raw, *.RAW, *.wiff) DO (
..\Installation\pwizLibraries-and-Installation\pwiz_Leave-Alone\msconvert %%i --filter "peakPicking true 1-" --64 --mzXML -o ..\2_Output_Folder --outfile %%~ni.mzXML >> ../log.txt 2>&1
)
cd ..

My problem is that my input file contains multiple files in one file, so in my output I only get the last extracted file.

I was thinking of a counter like this: https://www.rgagnon.com/gp/gp-batch-increment-a-counter.html

But I can't really figure out how to fit my existing code into it.

So I have modified the script a bit instead, since I can't work out the approach with a counter. Now my problem is that it generates the file with the time in its name (which seems good) but still overwrites it. I am missing a part in the script here :/


@echo

cd "D:\@Convert_Analyst\GNPS_Vendor_Conversion\1_Input_Folder"

set Time=%time:~0,2%.%time:~3,2%.%time:~6,2%

FOR %%i IN (*.mzXML, *.mzML, .raw, *.d, *.raw, *.RAW, *.wiff) DO (..\Installation\pwizLibraries-and-Installation\pwiz_Leave-Alone\msconvert %%i --filter "peakPicking true 1-" --64 --mzXML -o ..\2_Output_Folder --outfile %%~ni_%Time%.mzXML >> ../log.txt 2>&1)

FOR /D %%i IN (*.mzXML, *.mzML,*.raw, *.d, *.raw, *.RAW, *.wiff) DO (..\Installation\pwizLibraries-and-Installation\pwiz_Leave-Alone\msconvert %%i --filter "peakPicking true 1-" --64 --mzXML -o ..\2_Output_Folder --outfile %%~ni_%Time%___.mzXML >> ../log.txt 2>&1)

I am guessing I need to add the code in a loop, but I don't know how :/

Edit: the @ in cd "D:\@Convert_Analyst\GNPS_Vendor_Conversion\1_Input_Folder" is part of the folder name.

Does anyone have an idea?

BR Tim

Timmmsa

1 Answer

The following batch file can be used for this task. It uses the dynamic variable TIME.

@echo off
setlocal EnableExtensions DisableDelayedExpansion
cd /D "D:\@Convert_Analyst\GNPS_Vendor_Conversion\1_Input_Folder" || exit /B
md "..\2_Output_Folder" 2>nul
if not exist "..\2_Output_Folder\" exit /B
del ..\log.txt 2>nul
for /R %%i in (*.mzXML *.mzML *.d *.raw *.wiff) do (
    set "FullName=%%i"
    set "FileName=%%~ni"
    setlocal EnableDelayedExpansion
    set "NameTime=!TIME:~0,2!.!TIME:~3,2!.!TIME:~6,2!"
    ..\Installation\pwizLibraries-and-Installation\pwiz_Leave-Alone\msconvert.exe "!FullName!" --filter "peakPicking true 1-" --64 --mzXML -o ..\2_Output_Folder --outfile "!FileName!_!NameTime!.mzXML" >> ..\log.txt 2>&1
    endlocal
)
endlocal

This solution does not work 100% as expected on my Windows machine. Executing echo %TIME% in a command prompt window outputs  8:31:17,29, and for that reason the command line echo %TIME:~0,2%.%TIME:~3,2%.%TIME:~6,2% outputs  8.31.17. In other words, the leading 0 is missing for hours less than 10 (with the time in 24-hour format).
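
As a minimal sketch of one possible workaround, assuming TIME has the layout shown above (hour, minute and second at fixed positions, with a space in place of the leading 0), the space can be replaced with the digit 0 by a string substitution. The variable name NameTime is taken from the batch file above. The substitution line is an illustration, not part of the tested solution.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Build HH.MM.SS from the dynamic variable TIME. The format is region dependent.
set "NameTime=%TIME:~0,2%.%TIME:~3,2%.%TIME:~6,2%"
rem Replace every space with the digit 0. This pads hours 0 to 9.
set "NameTime=%NameTime: =0%"
echo %NameTime%
endlocal
```

Inside the FOR loop of the batch file above, the same replacement would have to be written with delayed expansion, for example set "NameTime=!NameTime: =0!". It still depends on the region settings of the account, which is why the next solution takes a different route.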

A solution working on any Windows PC with Windows Vista or newer is:

@echo off
setlocal EnableExtensions DisableDelayedExpansion
cd /D "D:\@Convert_Analyst\GNPS_Vendor_Conversion\1_Input_Folder" || exit /B
md "..\2_Output_Folder" 2>nul
if not exist "..\2_Output_Folder\" exit /B
del ..\log.txt 2>nul
for /R %%i in (*.mzXML *.mzML *.d *.raw *.wiff) do (
    set "FileProcessed="
    for /F "tokens=4-6 delims=/: " %%j in ('%SystemRoot%\System32\robocopy.exe "%SystemDrive%\|" . /NJH') do if not defined FileProcessed (
        ..\Installation\pwizLibraries-and-Installation\pwiz_Leave-Alone\msconvert.exe "%%i" --filter "peakPicking true 1-" --64 --mzXML -o ..\2_Output_Folder --outfile "%%~ni_%%j.%%k.%%l.mzXML" >> ..\log.txt 2>&1
        set "FileProcessed=1"
    )
)
endlocal

This solution uses the time that ROBOCOPY outputs on an error, which is independent of the Windows time and language settings of the account used. For more details about the ROBOCOPY approach, see the answers at Time is set incorrectly after midnight.
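
The parsing can be tried in isolation with the sketch below. It assumes that the first non-empty line of the ROBOCOPY error output starts with date and time in the form yyyy/MM/dd hh:mm:ss, so that tokens 1 to 6 are year, month, day, hour, minute and second. The variable name Stamp and the choice to include the date are illustrative assumptions. The batch file above uses only tokens 4 to 6.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Stamp is a hypothetical variable name used only in this sketch.
set "Stamp="
rem The invalid source path makes ROBOCOPY print an error line with date and time.
for /F "tokens=1-6 delims=/: " %%I in ('%SystemRoot%\System32\robocopy.exe "%SystemDrive%\|" . /NJH') do if not defined Stamp set "Stamp=%%I-%%J-%%K_%%L.%%M.%%N"
rem Under the stated assumption the value has the form yyyy-MM-dd_hh.mm.ss.
echo %Stamp%
endlocal
```

Including the date would keep output files from different days apart, but it does not help with two files processed within the same second.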

Note: If two files with the same name are processed within one second, the second output file gets the same file name in the output folder as the first one. The output file of the first file is then overwritten by the output file of the second file.
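
One way around this limitation, sketched here under assumptions, is the counter idea mentioned in the question: a number incremented once per input file is appended to the output name instead of the time. The variable name Count is hypothetical, and the sketch only prints the names that would be passed to --outfile instead of running msconvert.exe. I have not tested it with the executable.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Count is a hypothetical variable name. It is incremented once per file found.
set "Count=0"
for /R %%i in (*.mzXML *.mzML *.d *.raw *.wiff) do (
    set /A Count+=1
    set "FileName=%%~ni"
    setlocal EnableDelayedExpansion
    rem Only print the output file name that would be used for this input file.
    echo !FileName!_!Count!.mzXML
    endlocal
)
endlocal
```

The counter is incremented outside the inner setlocal, so its value survives each endlocal. The names are unique within one run of the batch file, but not across runs, because the counter starts at 0 every time.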

I assume that the file msconvert has the file extension .exe. For that reason .exe is appended in both batch files, so that the file to execute is referenced with its complete file name.

The reasons why the code in the question does not work are:

  1. The CD command in the second code example in the question works only if the current drive on execution of the batch file is drive D:. Otherwise CD changes only the current directory remembered for drive D: and does not switch to that drive. The solution is to use the option /D to change both drive and directory.

  2. The command line set Time=%time:~0,2%.%time:~3,2%.%time:~6,2% defines the environment variable Time only once, based on the current value of the dynamic variable TIME. This makes it impossible for the rest of the script to get the current time from the dynamic variable TIME. See Difference between Dynamic Environment Variables and Normal Environment Variables in CMD for more details. The name of a dynamic variable or of a predefined Windows environment variable should never be used for a new environment variable inside a batch file.

  3. Comma and space are both interpreted as separators between multiple wildcard patterns. So it is enough to use either a space or a comma between the patterns.

  4. Windows always interprets file/folder names and wildcard patterns case-insensitively. For that reason *.raw matches the same files as *.RAW.

  5. .raw is not a wildcard pattern because it contains neither * nor ?. For that reason the string .raw itself is assigned to the loop variable i, instead of FOR searching for files with the file extension .raw and assigning the names of the found files one after the other to the loop variable i. So the usage of .raw in the code in the question results in running the first FOR loop once with .raw as the file name assigned to i, regardless of whether a file with the file extension .raw exists.

  6. The time string to use in the output file names is referenced with immediate expansion, which means %Time% is already replaced by the string assigned to the environment variable Time before FOR is executed at all. See Variables are not behaving as expected for more details about delayed expansion as used by the first code posted in this answer.

  7. While the first FOR loop in second code of the question searches for non-hidden files in current directory matching one of the wildcard patterns (with exception of .raw not being a wildcard pattern), the FOR /D loop searches in current directory for non-hidden directories with a directory name matching one of the wildcard patterns. That does not make sense at all as files should be processed and not directories.
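
The difference between immediate and delayed expansion described in point 6 can be seen with a small demonstration. This is a sketch for illustration only. The variable name Value and the loop values 1 and 2 are made up and have nothing to do with the conversion itself.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "Value=before"
for %%n in (1 2) do (
    set "Value=inside %%n"
    rem The percent form was expanded when the whole block was parsed.
    rem The exclamation mark form is expanded when the line is executed.
    echo immediate: %Value%   delayed: !Value!
)
endlocal
```

Both iterations print before for the immediate reference, while the delayed reference shows inside 1 and then inside 2. The same effect is why %Time% in the question never changes inside the FOR loop.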

Both posted solutions search recursively in the current directory and all its subdirectories for non-hidden files whose names match one of the wildcard patterns. They process these files and produce output files with the current time (hour, minute and second) in the file name.

To understand the commands used and how they work, open a command prompt window, execute there the following commands, and read the displayed help pages for each command, entirely and carefully.

  • cd /?
  • del /?
  • echo /?
  • endlocal /?
  • exit /?
  • for /?
  • if /?
  • md /?
  • robocopy /? (although not really used here for copying/moving files/folders)
  • set /?
  • setlocal /?

Mofi
  • Hello, thank you very much for your help, but it still doesn't work, the file still keeps overwriting itself :/ – Timmmsa Sep 23 '21 at 07:45
  • @TimSeidel I have not installed `..\Installation\pwizLibraries-and-Installation\pwiz_Leave-Alone\msconvert.exe` and don't know anything about this executable and its options. I tested the two scripts by inserting `echo` to the left of the command line running this executable, to see in `..\log.txt` the command line as it would be executed. So if the batch file still results in overwriting files, then this is caused by this executable and how it is called, or the executable is run for multiple files with the same file name within the same second, as I wrote in the answer. – Mofi Sep 23 '21 at 12:15
  • Hello @Mofi, yes, that seems to be a problem with the executable. I have contacted the owner on GitHub and we are having a meeting to see if that issue can be resolved. Thanks so much for your time :) – Timmmsa Sep 27 '21 at 11:31