I have several files containing the following XML element:

     <table cellpadding="0" cellspacing="0" border="0"style="width:100%">

The part that reads border="0"style=" needs a space between the 0 value and the style attribute.

Unfortunately, there are too many files with this issue to insert the space by hand. I can edit attributes and values by using an XPath expression to get the table elements as a NodeList and then reading each node's attributes, but how would I add a space between one attribute's value and the next attribute's name?

Ian Roberts
vessel
  • I don't think you can use XPath, because your input file can't be parsed successfully. I think you should use simple text replacement, programmatically with any language you like. – potame Apr 14 '15 at 07:28
  • ahh okay.. will look into it, cheers.. – vessel Apr 14 '15 at 07:29
  • @vessel - do you know if the border is always set to a 1 char value? if either border or style have a consistent char length for value, this is super easy – that-ru551an-guy Apr 14 '15 at 07:29
  • Take a look at this question - http://unix.stackexchange.com/questions/112023/how-can-i-replace-a-string-in-a-files – ingenious Apr 14 '15 at 07:29
  • How about using a shell tool like `sed`? –  Apr 14 '15 at 07:30

3 Answers

We could always just use String.split("\""), i.e. split on the double quotes.

Here, try this:

    // StringUtils is presumably the Apache Commons Lang class.
    import org.apache.commons.lang3.StringUtils;

    // In reality you would probably read each file into a String, or read it
    // line by line; either way the fix is the same.
    String input = "<table cellpadding=\"0\" cellspacing=\"0\" border=\"0\"style=\"width:100%\">";
    String xmlTag = StringUtils.substringBetween(input, "<", ">");

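After the split, the array alternates between the text outside the quotes and the text inside them:

  0. XML tag name together with the first attribute name, e.g. table cellpadding=

EVEN INDICES ~ 2, 4, 6, and so on, contain: the next attribute name followed by =.

ODD INDICES ~ 1, 3, 5, and so on, contain: an attribute value, without its quotes.

    // Assumes double-quoted attributes and no " inside any value.
    String[] xmlCharValPairs = xmlTag.split("\"");
    int arrSize = xmlCharValPairs.length;
    StringBuilder sb = new StringBuilder("<");

    for (int i = 0; i < arrSize; i++) {
        if (i % 2 == 0) {
            // Even index: tag/attribute name; separate it from the previous value.
            if (i > 0)
                sb.append(' ');
            sb.append(xmlCharValPairs[i].trim());
        } else {
            // Odd index: attribute value; put the quotes back.
            sb.append('"').append(xmlCharValPairs[i]).append('"');
        }
    }
    sb.append('>');

    String returnXMLFormat = sb.toString();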

This will leave you with an XML String in your requested format :)

that-ru551an-guy

If the values have a consistent length, all you need to write is a simple string parser that adds the extra space at position X.

If the length isn't always the same, I think I would look at each " character, take the character just before it, and check whether the pair is =" or (some other character)", for example a".

width="100" vs width="100" anotherparam="...

This tells you whether the quote is at the beginning or the end of a param value. If it's the end, simply add a space char after it.

Obviously, you could then check whether the quote is followed by a letter or by a space, i.e. "(some letter) or "(space), to know whether there is already a space char after your quotation mark.

width="100" param2="..." vs width="100"param2=""
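The check described above could be sketched roughly as follows. This is only an illustration: the class and method names are made up, and it assumes double-quoted attributes with no whitespace around `=` and no `"` characters in text content.

```java
class QuoteScanner {
    // Walks the text once and adds a space after every closing quote
    // that is immediately followed by a letter.
    static String insertMissingSpaces(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            out.append(c);
            if (c != '"' || i == 0 || i + 1 >= text.length()) {
                continue;
            }
            // A quote right after '=' opens a value; any other quote closes one.
            boolean closing = text.charAt(i - 1) != '=';
            boolean nextIsLetter = Character.isLetter(text.charAt(i + 1));
            if (closing && nextIsLetter) {
                out.append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String broken = "<table cellpadding=\"0\" cellspacing=\"0\" border=\"0\"style=\"width:100%\">";
        System.out.println(insertMissingSpaces(broken));
    }
}
```

Quotes that already have a space, `>` or `/` after them are left alone, so running it twice on the same text changes nothing the second time.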

If you have, let's say, 200 files to edit, you could use something similar to this:

File folder = new File("your/path");
File[] listOfFiles = folder.listFiles();

Then simply open the files in a loop, edit them, and save them to new files with their original names or just overwrite the current files. It's up to you.
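A rough sketch of such a loop, overwriting the files in place. Several things here are assumptions and not part of the answer: the `.xml` extension, UTF-8 encoding, the class and method names, and the `fix` method, which is just a placeholder replacement you would swap for your own logic.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FolderFixer {
    // Placeholder fix: add a space after a closing quote that runs
    // straight into a letter (the start of the next attribute name).
    static String fix(String text) {
        return text.replaceAll("(?<!=)\"(?=[A-Za-z])", "\" ");
    }

    // Rewrites every *.xml file directly inside the folder (assumed UTF-8)
    // and returns how many files were actually changed.
    static int fixFolder(Path folder) throws IOException {
        int changed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(folder, "*.xml")) {
            for (Path file : files) {
                String original = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                String fixed = fix(original);
                if (!fixed.equals(original)) {
                    Files.write(file, fixed.getBytes(StandardCharsets.UTF_8));
                    changed++;
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) throws IOException {
        // "your/path" is the same placeholder path as in the snippet above.
        System.out.println(fixFolder(Paths.get("your/path")) + " file(s) changed");
    }
}
```

Since this overwrites files, it is probably worth trying it on a copy of the folder first.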

Dashovsky

Your file isn't well-formed XML, so you will need a tool that can handle files that aren't well-formed XML. That rules out anything in the XSLT/XQuery/XPath family.

You can probably fix nearly all occurrences of the problem, with low risk of adverse side effects, by using a regular expression that inserts a space after any occurrence of " that isn't immediately preceded by =. (This will add some unnecessary spaces, but the XML parser will ignore them.)
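In Java, that regular expression might look something like the sketch below. The class and method names are only illustrative, and it assumes double-quoted attributes with no whitespace between `=` and the opening quote.

```java
class AttributeSpaceRegex {
    // (?<!=)" matches any double quote that is NOT immediately preceded by '=',
    // which in well-behaved markup means a closing quote.
    static String fix(String markup) {
        return markup.replaceAll("(?<!=)\"", "\" ");
    }

    public static void main(String[] args) {
        String broken = "<table cellpadding=\"0\" cellspacing=\"0\" border=\"0\"style=\"width:100%\">";
        System.out.println(fix(broken));
    }
}
```

Every closing quote gets a space, so attributes that were already separated end up with two spaces and the last one gets a space before `>`. Adding a lookahead such as `(?=[A-Za-z])` to the pattern would restrict the match to the places where the space is actually missing.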

Michael Kay
  • Thanks, I actually ended up using a variation of the info here: http://stackoverflow.com/questions/5511096/java-convert-formatted-xml-file-to-one-line-string which allowed me to use a simple replaceAll(). – vessel Apr 14 '15 at 11:04