I'm using a batch file to format a text file so that I can append it to an already populated .csv file automatically each hour. My problem is that the text file's intended formatting isn't showing up in Notepad: it opens as a single line with 43 tokens. I only need tokens 30-43. Since I cannot skip lines and I'm over the token limit, what are my options?
-
Your best option is to not use batch. There are many better choices, Powershell? Python? – abelenky Mar 02 '17 at 02:27
-
batch is not suitable for complex processing. A hybrid solution or powershell would be much better – phuclv Mar 02 '17 at 03:18
-
Please mark dbenham's answer as the answer for your question. – Squashman Mar 13 '17 at 01:47
1 Answer
You don't state what your token delimiter is, or whether any of your values contain quoted delimiters, or whether any tokens are empty (consecutive delimiters with nothing between them).
Pure batch solution
I will assume that your input is comma delimited. It is trivial to change the delimiter used by FOR /F.
I will also assume no values contain comma literals, and there are no consecutive commas. A simple FOR /F cannot handle either situation. Both could be solved with batch (assuming the line is < 8kb), but it is a bit painful. If you have these issues, then I think you are better off with some other language than batch.
A single FOR /F command cannot parse more than 31 tokens. For more info, see
- https://stackoverflow.com/a/8520993/1012053
- Number of tokens limit in a FOR command in a Windows batch script
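But you don't need to parse any tokens past 29 :-)
You can simply use FOR /F with "tokens=29*", and the * "token" will contain tokens 30-43. With %%A as the loop variable, token 29 lands in %%A and the unparsed remainder of the line (tokens 30-43, delimiters included) lands in %%B.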
for /f "usebackq tokens=29* delims=," %%A in ("yourInputFile.ext") do (echo(%%B) >>yourOutput.csv
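As a cross-check of the logic, here is a minimal sketch of the same "skip 29 tokens, keep the rest" idea in Python, one of the alternatives suggested in the comments. The function name `tail_after` is hypothetical. Note that `str.split` does not collapse consecutive delimiters the way FOR /F does, so the two only agree under the same assumption made above: no empty tokens and no quoted delimiters.

```python
def tail_after(line: str, skip: int = 29, delim: str = ",") -> str:
    """Return everything after the first `skip` tokens, delimiters included.

    Hypothetical helper: splits at most `skip` times, so the last element
    is the untouched remainder of the line (like the * token in FOR /F).
    """
    parts = line.split(delim, skip)
    # Fewer than skip+1 pieces means there is nothing after token `skip`.
    return parts[skip] if len(parts) > skip else ""


if __name__ == "__main__":
    sample = ",".join(f"t{i}" for i in range(1, 44))  # 43 made-up tokens
    print(tail_after(sample))
```

Because the split stops after 29 cuts, the remainder is never tokenized, which is the same reason the batch version stays under the FOR /F token limit.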
If your input delimiter is something other than a comma, then you can store tokens 30-43 in a variable, and then use find/replace to substitute commas for the original delimiter.
For example, if your input delimiter is a pipe, then
@echo off
setlocal
for /f "usebackq tokens=29* delims=|" %%A in ("yourInputFile.ext") do set "line=%%B"
setlocal enableDelayedExpansion
if defined line set "line=!line:|=,!"
(echo(!line!) >>yourOutput.csv
I did not put the manipulation and write operations within the loop because you state your input has only one line.
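For comparison, the pipe-delimited case (read one line, keep tokens 30-43, swap the delimiter, append to the CSV) might look like this in Python. This is only a sketch under the same assumptions as the batch script: a single input line, no empty tokens, and no pipes or commas inside values. The function name `append_tail_as_csv` and the file names are placeholders, not anything from the question.

```python
from pathlib import Path


def append_tail_as_csv(src: str, dst: str, skip: int = 29, delim: str = "|") -> str:
    """Append tokens after the first `skip` from the first line of `src` to `dst`.

    Hypothetical helper; returns the row that was written ("" if nothing was).
    """
    lines = Path(src).read_text().splitlines()
    if not lines:
        return ""  # empty input file: nothing to append
    parts = lines[0].split(delim, skip)
    if len(parts) <= skip:
        return ""  # the line has no tokens past `skip`
    row = parts[skip].replace(delim, ",")  # convert remaining delimiters to commas
    with open(dst, "a") as out:  # "a" appends, like >> in batch
        out.write(row + "\n")
    return row


if __name__ == "__main__":
    # Placeholder file names, mirroring the ones used in the batch example.
    print(append_tail_as_csv("yourInputFile.ext", "yourOutput.csv"))
```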
I do not enable delayed expansion until after the loop completes just in case your input contains ! literals. Expansion of FOR /F variables will corrupt values that contain ! if delayed expansion is enabled.
Robust JREPL.BAT solution (hybrid batch/JScript)
If your input violates any of the restrictions that I laid out in the pure batch solution, then you could use JREPL.BAT - A regular expression command line text processing utility. JREPL.BAT is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward - no 3rd party exe file required.
Since you did not specify your input format, I will assume it is CSV. The following solution will simply remove the first 29 tokens. It supports empty tokens, as well as quoted tokens with comma literals.
call jrepl "^(\q([^\q]|\q\q)*\q,|[^,]*,){29}" "" /x /f yourFile.ext >>yourOutput.csv
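To make the regular expression easier to read, here is a rough Python equivalent. It assumes that JREPL's `\q` escape (enabled by `/x`) stands for a double quote, so the pattern reads: 29 repetitions of either a quoted field (with `""` as an escaped quote) followed by a comma, or an unquoted, possibly empty field followed by a comma. The function name `drop_first_tokens` is hypothetical, and this is a sketch of the idea rather than a replacement for JREPL.

```python
import re


def drop_first_tokens(line: str, count: int = 29) -> str:
    """Remove the first `count` CSV fields (and their trailing commas) from `line`.

    Handles empty fields and quoted fields containing commas or doubled quotes.
    If the line has fewer than `count` leading fields, it is returned unchanged.
    """
    pattern = re.compile(r'^(?:"(?:[^"]|"")*",|[^,]*,){%d}' % count)
    return pattern.sub("", line, count=1)


if __name__ == "__main__":
    fields = [f"t{i}" for i in range(1, 44)]  # 43 made-up tokens
    fields[1] = ""                # an empty token
    fields[4] = '"x,y"'           # a quoted token containing a comma
    print(drop_first_tokens(",".join(fields)))
```

The quoted alternative is tried first, so a comma inside quotes is consumed as part of the field instead of being mistaken for a delimiter.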
I'm sure there is an efficient JREPL solution if your input is not csv format. But I am not going to waste my time trying to guess your format.
-
Arrived at work with a beautiful solution here waiting for me! Thanks so much! Your one line of code works perfectly. After your code I added JREPL to substitute double commas ",," with single comma delims "," and now everything is good. – Augustus DeVandry Mar 02 '17 at 17:15