
I have a text file containing:

http://website1.com
http://website2.com
http://website3.com
http://website4.com
http://website5.com
http://website6.com
http://website7.com
http://website8.com
http://website9.com
http://website10.com
http://website11.com
http://website12.com
http://website13.com
http://website14.com
http://website15.com

I want to group the URLs five to a line by deleting every carriage return/line feed except the 5th, 10th, 15th, and so on. The output should look something like this:

http://website1.comhttp://website2.comhttp://website3.comhttp://website4.comhttp://website5.com
http://website6.comhttp://website7.comhttp://website8.comhttp://website9.comhttp://website10.com
http://website11.comhttp://website12.comhttp://website13.comhttp://website14.comhttp://website15.com

What do I do to achieve this?

techdaemon

3 Answers


Assuming you want a batch file, this is fairly straightforward:

@echo off
rem We need delayed expansion inside the loop
setlocal enableextensions enabledelayedexpansion
rem Initialize the variables we are going to use to avoid using stale environment vars
set LIST=
set COUNT=0
rem Iterate over the lines in the text file
for /f "delims=" %%l in (list.txt) do (
  rem Append the current line to the list
  set LIST=!LIST!%%l
  rem Count how many we got
  set /a COUNT+=1
  rem If we have five items already
  set /a "COUNT%%=5"
  if !COUNT!==0 (
    rem Output them and reset the list
    echo !LIST!
    set LIST=
  )
)
rem Output the remainder if the list does not contain k×5 lines
if defined LIST echo !LIST!

Redirect the batch file's output to another file and copy that over the original if needed (never redirect to your input file :-)).

A variant that directly writes a new output file (list_new.txt):

@echo off
setlocal enableextensions enabledelayedexpansion
set LIST=
set COUNT=0
rem Start with a fresh output file, because the loop below appends to it
if exist list_new.txt del list_new.txt
for /f "delims=" %%l in (list.txt) do (
  set LIST=!LIST!%%l
  set /a COUNT+=1
  set /a "COUNT%%=5"
  if !COUNT!==0 (
    >>list_new.txt echo !LIST!
    set LIST=
  )
)
if defined LIST >>list_new.txt echo !LIST!
Joey
  • I renamed the text file to list.txt, saved your batch script as something.bat and ran it, but it doesn't work. – techdaemon Mar 29 '11 at 06:13
  • @techdaemon, can you be more specific than that? Any error messages? One thing that could be problematic is URIs with `&` in them, I guess. – Joey Mar 29 '11 at 06:15
  • It didn't change anything in the list.txt file, nor did it create a separate output file. – techdaemon Mar 29 '11 at 06:21
  • @techdaemon, that is to be expected. It prints the new contents on standard output. You're supposed to *redirect the output* into a file (that's why I said that). I'll adapt to directly write a separate file, though. – Joey Mar 29 '11 at 06:40
  • As expected Joey's code works fine, also with `&`, but it removes the `!`, I assume this can be ignored here, as `!` isn't allowed in domain names – jeb Mar 29 '11 at 06:44
  • @jeb: Elsewhere in URIs maybe? I didn't test very thoroughly, to be honest. – Joey Mar 29 '11 at 06:47
  • @jeb: It also has problems with `%`. Still unsure how I can deal with that reliably. – Joey Mar 29 '11 at 07:09
  • @Joey: My tests also work with `%`. But you should change the last line `echo %LIST%` to `echo !LIST!`, otherwise you get problems with the characters `&|"<>`. To be safe, you could use the delayed-expansion toggling technique – jeb Mar 29 '11 at 07:47
  • @jeb: Well, I adapted it to a subroutine by now, solving the problem with `!`. It's now more or less a straight port of the `bash` solution that now was deleted, using ` – Joey Mar 29 '11 at 08:00

If you have a choice of tools, here's a Ruby one-liner:

C:\work> ruby -ne 'print $.%5==0? $_ :$_.chomp' file
http://website1.comhttp://website2.comhttp://website3.comhttp://website4.comhttp://website5.com
http://website6.comhttp://website7.comhttp://website8.comhttp://website9.comhttp://website10.com
http://website11.comhttp://website12.comhttp://website13.comhttp://website14.comhttp://website15.com
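
For anyone who prefers a script to the one-liner, here is a minimal sketch of the same idea. The helper name group_lines and the GROUP_SIZE constant are my own illustrative choices, not part of the one-liner above, and the sample input is generated inline instead of being read from the file. It also assumes blank lines carry no data and can be skipped.

```ruby
# Join every GROUP_SIZE input lines into a single output line.
# GROUP_SIZE and group_lines are illustrative names (assumptions).
GROUP_SIZE = 5

def group_lines(lines, size = GROUP_SIZE)
  lines.map(&:chomp)      # strip the trailing line ending (\n, \r or \r\n)
       .reject(&:empty?)  # assumption: blank lines are not data
       .each_slice(size)  # walk the lines in groups of `size`
       .map(&:join)       # concatenate each group with no separator
end

# Sample input: seven URLs, so the last group is a partial one.
urls = (1..7).map { |i| "http://website#{i}.com\n" }
puts group_lines(urls)
```

To run it against the real file, the sample array could be replaced with File.readlines("list.txt"). Unlike the one-liner, this also ends a final partial group with a newline, because puts terminates every element it prints.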
kurumi

This is based on Joey's solution; its only purpose is to handle all special characters safely: %&|<>" and also !^.

It is only necessary if you expect ! in your file data.
In any other case, Joey's code is better and easier to read.

@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Initialize the variables we are going to use to avoid using stale environment vars
set LIST=
set COUNT=0
rem Iterate over the lines in the text file
rem Delayed expansion has to be toggled inside the loop:
rem disabled while expanding %%l, enabled while expanding the variables
for /f "delims=" %%l in (list.txt) do (
  rem Append the current line to the list, %%l is only safe if delayed expansion is disabled
  set "line=%%l"
  setlocal EnableDelayedExpansion
  rem To use the line variable, delayed expansion has to be enabled
  for %%a in ("!LIST!!line!") do (
    endlocal
    set "LIST=%%~a"
  )

  rem Count how many we got
  set /a COUNT+=1
  rem If we have five items already
  setlocal EnableDelayedExpansion
  if !COUNT! GEQ 5 (
    rem Output them and reset the list
    echo(!LIST!
    endlocal
    set "LIST="
    set COUNT=0
  ) ELSE ( 
    endlocal 
  )
)
setlocal EnableDelayedExpansion
rem Output the remainder if the list does not contain k×5 lines
if defined LIST echo(!LIST!

Why is it so complicated?

The problem is that FOR loop variables such as %%a are expanded just before delayed expansion is executed. If the content of %%a contains any !, you get problems, and you also lose any ^ (but only if at least one ! is present).
But you need delayed expansion to show or compare the content of variables inside the FOR loop (forget about call %%var%%).
Expansion with the delayed syntax !variable! is always safe, regardless of the content, because it is the last phase of the parser.

Unfortunately, enabling or disabling delayed expansion always creates a new variable context, and when you leave that context you lose all changes made to the variables.
Therefore I use the inner FOR loop to pass the value from the delayed-enabled context back to the delayed-disabled context, so that the LIST variable contains the correct data.

I hope someone understands what I am trying to explain.
Some more explanation of the parser phases can be found at how cmd.exe parse scripts

jeb