For small XML strings like the one you posted, I just use the getBetween() method provided below. Read the comments in the code:
/**
* Retrieves any string data located between the supplied string leftString
* argument and the supplied string rightString argument.<br><br>
* <p>
* This method will return all instances of a substring located between the
* supplied Left String and the supplied Right String which may be found
* within the supplied Input String.<br>
*
* @param inputString (String) The string to look for substring(s) in.<br>
*
* @param leftString (String) What may be to the Left side of the substring
* we want within the main input string. Sometimes the
* substring you want may be contained at the very
* beginning of a string and therefore there is no
* Left-String available. In this case you would simply
* pass an empty string ("") to this parameter which
* basically informs the method of this fact. Null can
* not be supplied and will ultimately generate a
* NullPointerException.<br><br>
*
* If the leftString is found to be escaped within the inputString then that
* escape sequence is converted to a "~:L:~" sequence within the
* inputString. If this new sequence ("~:L:~") is detected within a found
* substring then it is automatically converted back to its original escaped
* sequence before it is added to the returned List.<br>
*
* @param rightString (String) What may be to the Right side of the
* substring we want within the main input string.
* Sometimes the substring you want may be contained at
* the very end of a string and therefore there is no
* Right-String available. In this case you would simply
* pass an empty string ("") to this parameter which
* basically informs the method of this fact. Null can
* not be supplied and will ultimately generate a
* NullPointerException.<br><br>
*
* If the rightString is found to be escaped within the inputString then that
* escape sequence is converted to a "~:R:~" sequence within the
* inputString. If this new sequence ("~:R:~") is detected within a found
* substring then it is automatically converted back to its original escaped
* sequence before it is added to the returned List.<br>
*
* @param options (Optional - Boolean varArgs - 2 Parameters):<pre>
*
* ignoreLetterCase - Default is false. This option works against the
* string supplied within the leftString parameter
* and the string supplied within the rightString
* parameter. If set to true then letter case is
* ignored when searching for strings supplied in
* these two parameters. If left at default false
* then letter case is not ignored.
*
* trimFound - Default is true. By default this method will trim
* off leading and trailing white-spaces from found
* sub-string items. General sentences which obviously
* contain spaces will almost always give you a white-
* space within an extracted sub-string. By setting
* this parameter to false, leading and trailing white-
* spaces are not trimmed off before they are placed
* into the returned List.</pre>
*
* @return (List Interface of String [{@code List<String>}]) Returns a List
* of all the sub-strings found within the supplied Input String
* which are between the supplied Left-String and supplied
* Right-String. Returns null if the Input String is empty or if
* both the Left-String and the Right-String are empty.
*/
public static List<String> getBetween(String inputString, String leftString,
        String rightString, boolean... options) {
    // Return null if nothing was supplied.
    if (inputString.isEmpty() || (leftString.isEmpty() && rightString.isEmpty())) {
        return null;
    }

    // Prepare optional parameters if any supplied.
    // If none supplied then use Defaults...
    boolean ignoreCase = false; // Default.
    boolean trimFound = true;   // Default.
    if (options.length >= 1) {
        ignoreCase = options[0];
    }
    if (options.length >= 2) {
        trimFound = options[1];
    }

    // Remove any control characters from the supplied string (if they
    // exist). This also strips line breaks and tabs, so a match can span
    // what were separate lines in the original input.
    String modString = inputString.replaceAll("\\p{Cntrl}", "");

    // Establish a List to hold our found substrings between the
    // supplied Left String and supplied Right String.
    List<String> list = new ArrayList<>();

    // Swap escaped delimiters for placeholders so they are not matched.
    // String.replace() leaves the string unchanged if nothing is found.
    modString = modString.replace("\\" + leftString, "~:L:~");
    modString = modString.replace("\\" + rightString, "~:R:~");

    // Use Pattern Matching to locate our possible
    // substrings within the supplied Input String.
    String regEx = java.util.regex.Pattern.quote(leftString) + "{1,}"
            + (!rightString.isEmpty() ? "(.*?)" : "(.*)?")
            + java.util.regex.Pattern.quote(rightString);
    if (ignoreCase) {
        regEx = "(?i)" + regEx;
    }

    java.util.regex.Pattern pattern = java.util.regex.Pattern.compile(regEx);
    java.util.regex.Matcher matcher = pattern.matcher(modString);
    while (matcher.find()) {
        // Add the found substrings into the List.
        String found = matcher.group(1);
        if (trimFound) {
            found = found.trim();
        }
        // Restore any escaped delimiters that were swapped out above.
        found = found.replace("~:L:~", "\\" + leftString).replace("~:R:~", "\\" + rightString);
        list.add(found);
    }
    return list;
}
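The heart of the method is a quoted-delimiter regex with a reluctant capture group. Below is a minimal, stand-alone sketch of just that idea, stripped of the options and the escape handling. The class name BetweenSketch and the helper between() are hypothetical names used only for illustration, and the sketch uses Pattern.DOTALL instead of removing control characters, which is a simplification and not what getBetween() does:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenSketch {

    // Hypothetical helper (not part of the method above): returns every
    // substring found between the literal delimiters left and right.
    public static List<String> between(String input, String left, String right) {
        // Pattern.quote() makes the delimiters literal, so characters such
        // as '<', '/' or '.' are not treated as regex syntax.
        // (.*?) is reluctant: it stops at the first right delimiter.
        // DOTALL lets '.' match line breaks as well.
        Pattern pattern = Pattern.compile(
                Pattern.quote(left) + "(.*?)" + Pattern.quote(right),
                Pattern.DOTALL);
        Matcher matcher = pattern.matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group(1).trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String xml = "<name>Esquire</name>\n<name> Twitch </name>";
        System.out.println(between(xml, "<name>", "</name>"));
        // prints [Esquire, Twitch]
    }
}
```

Without Pattern.quote() a delimiter such as "." or "(" would be read as regex syntax, and with a greedy (.*) the first match would run on to the last closing delimiter in the input.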
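To use the above method, you might have code like this (the method and this snippet together need java.util.ArrayList, java.util.LinkedHashMap, java.util.List and java.util.Set imported):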
String xml = "<list xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:noNamespaceSchemaLocation=\"xsd/book.xsd\">\n"
        + " \n"
        + " <!-- Nicknames -->\n"
        + " <name>ElectricPlayer</name>\n"
        + " <name>Necromancer</name>\n"
        + " <name>Turnip King</name>\n"
        + " <name>Esquire</name>\n"
        + " <name>NeophyteBeliever</name>\n"
        + " <name>Twitch</name>\n"
        + "\n"
        + "<particle_death_sentence>What am I doing with my life can you tell me?</particle_death_sentence>\n"
        + " <particle_death_sentence>You really had to kill me? I'm farming over here</particle_death_sentence>\n"
        + "\n"
        + "</list>";

// Get a list of 'possible' data node tags
List<String> possibleDataNodes = getBetween(xml, "<", ">", false, true);

// Declare a Map to hold the required data.
LinkedHashMap<String, List<String>> map = new LinkedHashMap<>();

// Determine data node tags: a tag that is directly followed
// by its own closing tag is taken to be a data node.
for (int i = 0; i < possibleDataNodes.size() - 1; i++) {
    String nodeName = possibleDataNodes.get(i);
    if (possibleDataNodes.get(i + 1).equals("/" + nodeName)
            && !map.containsKey(nodeName)) {
        map.put(nodeName, getBetween(xml, "<" + nodeName + ">", "</" + nodeName + ">"));
    }
}

// DISPLAY THE MAP:
// Get keySet() into Set
Set<String> setOfKeySet = map.keySet();
for (String key : setOfKeySet) {
    // Display the Map Key which is String:
    System.out.printf("%-22s%-31s%n", "Detected Key Name:", key);
    System.out.printf("%-22s%-30s%n", "Values for:", key);
    System.out.println(String.join("", java.util.Collections.nCopies(50, "-")));
    // Display the Map Value which is a List<String>:
    for (String value : map.get(key)) {
        System.out.printf("%22s%-30s%n", "- ", value);
    }
    System.out.println(String.join("", java.util.Collections.nCopies(80, "=")));
    System.out.println();
}
And what you should see displayed within the Console window is:
Detected Key Name:    name
Values for:           name
--------------------------------------------------
                    - ElectricPlayer
                    - Necromancer
                    - Turnip King
                    - Esquire
                    - NeophyteBeliever
                    - Twitch
================================================================================

Detected Key Name:    particle_death_sentence
Values for:           particle_death_sentence
--------------------------------------------------
                    - What am I doing with my life can you tell me?
                    - You really had to kill me? I'm farming over here
================================================================================
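The tag-detection loop in the usage code relies on one observation: a tag that is immediately followed by its own closing tag holds text rather than child elements. Here is a small stand-alone sketch of that step on a shortened input. The class name LeafTagSketch, the helper leafTags() and the motto element are made up for the illustration, and a plain "<([^>]*)>" regex stands in for the getBetween(xml, "<", ">") call:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeafTagSketch {

    // Hypothetical helper: collects tag names whose opening tag is
    // immediately followed by the matching closing tag.
    public static Set<String> leafTags(String xml) {
        // Everything between '<' and the next '>' counts as one tag.
        List<String> tags = new ArrayList<>();
        Matcher matcher = Pattern.compile("<([^>]*)>").matcher(xml);
        while (matcher.find()) {
            tags.add(matcher.group(1).trim());
        }
        // A tag is treated as a data node when the next tag closes it.
        // LinkedHashSet keeps first-seen order and drops repeats.
        Set<String> leaves = new LinkedHashSet<>();
        for (int i = 0; i < tags.size() - 1; i++) {
            if (tags.get(i + 1).equals("/" + tags.get(i))) {
                leaves.add(tags.get(i));
            }
        }
        return leaves;
    }

    public static void main(String[] args) {
        String xml = "<list><!-- Nicknames --><name>Esquire</name>"
                + "<name>Twitch</name><motto>Farming</motto></list>";
        System.out.println(leafTags(xml));
        // prints [name, motto]
    }
}
```

The same limits apply here as in the usage code: an opening tag that carries attributes (for example name with an id attribute) is not recognized, because its text no longer equals the closing tag's name, and a comment containing '>' would split into bogus tags. For anything beyond small, regular snippets a real XML parser is the safer choice.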