I have a huge text file and I want to delete the portions of it that fall between two particular words. For example:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi. Nam liber tempor cum soluta nobis eleifend option congue nihil imperdiet doming id quod mazim placerat facer possim assum. Typi non habent claritatem insitam; est usus legentis in iis qui facit eorum claritatem. Investigationes demonstraverunt lectores legere me lius quod ii legunt saepius. Claritas est etiam processus dynamicus, qui sequitur mutationem consuetudium lectorum. Mirum est notare quam littera gothica.

After deleting everything between the words "quis" and "gothica", it becomes:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis gothica.

In the actual huge file there are many occurrences of "quis" and "gothica", and I have to remove the text between every such pair.

I believe this can be achieved with a simple batch script, but I am new to the subject. Thanks in advance to anyone who can help.

SaiKiran
sinemgul
  • Out of curiosity, why a batch file? What have you tried so far? – user812786 Sep 22 '16 at 14:29
  • I tried type file.txt | findstr /v quis | findstr /v gothica, but it deletes only the lines that contain quis or gothica. I need to delete all the lines between them. I hope you can help after this explanation. – sinemgul Sep 22 '16 at 14:44
  • Please do not confuse StackOverflow with a free code writing service! So show us what you have tried and describe precisely what you are encountering trouble with. Please read the entire [tour page](http://stackoverflow.com/tour) and learn how this site works! – aschipfl Sep 22 '16 at 14:54
  • Ok, I see. The text you pasted shows up as all on one line, is that actually how the file looks? BTW, you may want to edit that into your question (that's probably why you got a downvote, people here like to see what you have tried on your own first). I'll try a couple things, but in the meantime this looks like it might be relevant? http://stackoverflow.com/questions/33638832/batch-file-find-two-lines-then-copy-everything-between-those-lines – user812786 Sep 22 '16 at 14:55
  • Your premise that this can be achieved via a simple batch file is false. It can be done (with some limitations, and slowly), but it is far from simple. In general, batch is not a good choice for manipulating text files. Simple solutions have many limitations. A robust solution requires a lot of arcane code. – dbenham Sep 22 '16 at 14:58
  • @whrrgarbl Yes, that topic is just what I need, thanks. I tried the code to understand the logic behind it, but it does not seem to work, and its answer has not been accepted by anyone, so I doubt the script is correct. – sinemgul Sep 22 '16 at 15:15
  • @dbenham But I need a simple solution, not a robust one, for now. – sinemgul Sep 22 '16 at 15:16
  • @whrrgarbl Perfect! thank you very much! – sinemgul Sep 23 '16 at 13:38

1 Answer

Here is the simplest solution I came up with, which I'm sure has issues with things like special characters, but works with the given example. I used filenames input.txt and output.txt.

@echo off
setlocal disableDelayedExpansion
set "FLAG=FALSE"
:: Define LF to contain a newline character
set LF=^


:: Do not remove above lines!
> output.txt (
    for /f "eol= tokens=*" %%A in (input.txt) do (
        set "ln=%%A"
        setlocal enableDelayedExpansion
        for %%L in ("!LF!") do (
            for /f "eol= delims=., " %%W in ("!ln: =%%~L!") do (
                if "%%W"=="quis" (
                    set "FLAG=TRUE"
                    <nul set /p=%%W 
                ) else if "%%W"=="gothica" (
                    <nul set /p=%%W 
                    set "FLAG=FALSE"
                ) else if "!FLAG!"=="FALSE" (
                    <nul set /p=%%W 
                )
            )
        )
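        rem endlocal discards variables set since setlocal, so carry FLAG across it
        for %%F in ("!FLAG!") do (
            endlocal
            set "FLAG=%%~F"
        )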
    )
)

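The script goes through the input word by word, printing each word until it finds quis, and resumes printing once it finds gothica (both marker words are printed too). I used <nul set /p=%%W to echo without printing a newline (see 2nd and 3rd links), which has the side effect of leaving an extra space at the end of the file, so be aware of that. Also note that, because . and , are listed in delims, punctuation attached to a word is dropped, and the whole output ends up on a single line.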
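The no-newline output trick can be tried on its own. Here is a minimal sketch, assuming it is saved as a separate script and run from cmd.exe; the text pieces are taken from the example in the question and are only for illustration:

```batch
@echo off
:: "set /p" normally shows its prompt text and then waits for input.
:: With stdin redirected from nul it returns immediately, so only the
:: prompt text is printed, and no newline is appended.
<nul set /p "=Ut wisi enim ad minim veniam, "
<nul set /p "=quis "
:: A plain echo prints the last word and ends the line.
echo gothica.
```

Run this way, the three pieces should appear on one line. The quoted "=text " form is used here only to make the trailing space visible in the source.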

References:

user812786