
Is it also possible to ignore some duplicate lines while removing other duplicates from an XML file? I am using the code below; my abx.xml sample follows it.

@echo off
setlocal disableDelayedExpansion
set "file=%~1"
set "line=%file%.line"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
>"%deduped%" (
  for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%file%") do (
    set "ln=%%A"
    setlocal enableDelayedExpansion
    >"%line%" (echo !ln:\=\\!)
    >nul findstr /xlg:"%line%" "%deduped%" || (echo !ln!)
    endlocal
  )
)
>nul move /y "%deduped%" "%file%"
2>nul del "%line%"

A batch script only, please.

<bookstores>
   <book id="parent">
      <name="it1"/>
      <name="it1"/>
      <name="it2"/>
   </book>
   <book id="child">
      <name="it1"/>
      <name="it1"/>
      <name="it2"/>
      <name="it3"/>
   </book>     
</bookstores>

Output should be:

<bookstores>
   <book id="parent">
      <name="it1"/>
      <name="it2"/>
   </book>
   <book id="child">
      <name="it3"/>
   </book>     
</bookstores>

But the output I am getting is below. Note that the second </book> tag is removed:

<bookstores>
   <book id="parent">
      <name="it1"/>
      <name="it2"/>
   </book>
   <book id="child">
      <name="it3"/>

</bookstores>

I have searched a couple of similar requests, but most of them delete all duplicate lines, and I am not sure how to ignore some duplicate lines:

Batch to remove duplicate rows from text file

kumar
  • You are trying to treat XML as a plain text file. Which it is, and it isn't. XML is a structure, and the link you posted is for a non-structured file. Sometimes you have to use the right tool for the job. Something like [`XSLT`](http://stackoverflow.com/questions/355691/how-to-remove-duplicate-xml-nodes-using-xslt) would be far more appropriate for this. – Gray Jul 25 '13 at 13:55
  • Post your code so that we don't have to re-write it. – RGuggisberg Jul 25 '13 at 14:00
  • http://stackoverflow.com/questions/11689689/batch-to-remove-duplicate-rows-from-text-file/17859683#17859683 the given link is having the code. – kumar Jul 25 '13 at 14:04
  • Please add code to your question. It is not readable in the comment box. – Endoro Jul 25 '13 at 14:05
  • code added to the question. – kumar Jul 25 '13 at 14:09
  • @Gray can you give a sample XSLT code for the following `` example. – kumar Jul 25 '13 at 14:11
  • @phani No disrespect intended, but sorry, I don't want to do that, no. My suggestion is to take a look at the link I gave (maybe you didn't notice it was a link, I didn't make it very clear) http://stackoverflow.com/questions/355691/how-to-remove-duplicate-xml-nodes-using-xslt If you decide to go with something like that, and can't figure it out, post a new question to SO with what you tried, but with the XSLT tag. I'm sure you will get prompt help, you just have to make the effort. – Gray Jul 25 '13 at 14:46
  • Thanks @Gary, looks like the approach that you suggested me is working. – kumar Jul 25 '13 at 16:23

1 Answer

This might work for you, if you put the lines that should always be printed into the %dict% file:

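@ECHO OFF &SETLOCAL ENABLEDELAYEDEXPANSION
REM Input file, output file, and the file listing lines that are always printed
SET "file=file"
SET "new=new"
SET "dict=dictionary"

REM findstr /n prefixes each line with its number and a colon: %%a is the number, %%b the text
(FOR /f "tokens=1*delims=:" %%a IN ('findstr /n "^" "%file%"') DO (
    SET "nr=%%a"
    SET "line=%%b"
    SET "this="
    REM A line that matches an entry in the dictionary file is always printed
    FINDSTR /l "!line!" "%dict%" >NUL 2>&1&& ECHO(!line! || (
        REM Otherwise compare it with every line already stored in a $ variable
        FOR /f "tokens=1*delims==" %%x IN ('set "$" 2^>nul') DO IF !line!==%%y SET "this=1"
        IF "!this!"=="" (
            REM First occurrence: print the line and remember it under its line number
            ECHO(!line!
            SET "$!nr!=!line!"
        )
    )
))>"%new%"
TYPE "%new%"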

Shell session:

    >type file
    <bookstores>
       <book id="parent">
          <name="it1"/>
          <name="it1"/>
          <name="it2"/>
       </book>
       <book id="child">
          <name="it1"/>
          <name="it1"/>
          <name="it2"/>
          <name="it3"/>
       </book>
    </bookstores>

    >type dictionary
    </book>

    >script.bat
    <bookstores>
       <book id="parent">
          <name="it1"/>
          <name="it2"/>
       </book>
       <book id="child">
          <name="it3"/>
       </book>
    </bookstores>
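
For comparison, the logic of the script above can be sketched outside of batch. The Python snippet below is only an illustration and not part of the original answer. The function name `dedupe_lines` and the `keep` set (standing in for the dictionary file) are hypothetical, and as a simplifying assumption it compares lines with surrounding whitespace stripped, which the batch script does not do.

```python
def dedupe_lines(lines, keep):
    """Drop repeated lines, but always emit lines listed in `keep`.

    Assumption: lines are compared after stripping surrounding whitespace.
    """
    seen = set()   # stripped text of the normal lines already emitted
    result = []
    for line in lines:
        key = line.strip()
        if key in keep:
            # Whitelisted line (for example a closing tag): always keep it.
            result.append(line)
        elif key not in seen:
            # First occurrence of a normal line: keep it and remember it.
            seen.add(key)
            result.append(line)
        # Otherwise the line repeats an earlier one and is dropped.
    return result


sample = """<bookstores>
   <book id="parent">
      <name="it1"/>
      <name="it1"/>
      <name="it2"/>
   </book>
   <book id="child">
      <name="it1"/>
      <name="it1"/>
      <name="it2"/>
      <name="it3"/>
   </book>
</bookstores>""".splitlines()

# "</book>" plays the role of the single entry in the dictionary file.
print("\n".join(dedupe_lines(sample, keep={"</book>"})))
```

With an empty `keep` set the second `</book>` line is dropped like any other duplicate, which is the behaviour described in the question.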
Endoro
  • Thanks for the snippet, Since this is an xml file, i have decided to use XSLT to process the xml file. Appreciate your help. – kumar Jul 25 '13 at 16:25