I have a delimited string: item_1|item_2|item_3. In this example the separator is |.

My boss does not like using the split method to get the parts of the string: he thinks it is risky, but he is not sure what the risk is. Are there other ways to get a List from a delimited String?

pheromix
  • Split it manually, using `indexOf` and `substring`... messy, but it works. I'd be more curious about the "supposed risk". If they can't identify it, then what's the issue? – MadProgrammer Feb 20 '19 at 10:12
  • What a bizarre thing to not like... – BretC Feb 20 '19 at 10:12
  • Possible duplicate of [How to split a string in Java](https://stackoverflow.com/questions/3481828/how-to-split-a-string-in-java) – Tom Feb 20 '19 at 10:13
  • What 'risk' is there that wouldn't be there otherwise? That | might be used in one of the elements? Don't you think that would mess with any other way of doing it as well? – Stultuske Feb 20 '19 at 10:13
  • The duplicate contains several ways to split a String (also `#split(..)`, but you don't need to use that answer if you don't like it). – Tom Feb 20 '19 at 10:14
  • We need to know a bit more about the format. Can the items contain |, and if so, how is that escaped? (That is, how would a human know that the pipe belongs to the item, and is not a delimiter?) – yshavit Feb 20 '19 at 10:18
  • @yshavit I think you have guessed my boss's fear; so how do I escape the `|` if it belongs to the item? – pheromix Feb 20 '19 at 10:20
  • How about: make sure you get a separator that will never be used in one of the elements. – Stultuske Feb 20 '19 at 10:22
  • Our customer has agreed that the separator is `|`. – pheromix Feb 20 '19 at 10:22
  • Are you allowed to use floating point numbers where you work? Loads of risks there... – BretC Feb 20 '19 at 10:24
  • If you must use an alternative to split, you must also figure out why you should not use split; otherwise any alternative could have exactly the same issue as split (especially if you just re-implement split yourself). Try to figure out whether you have some bizarre CSV-like format, where a line such as `item_1|"item|2"|item_3` has 3 items: the second item is enclosed in quotes, so the `|` inside it is not a delimiter. Or whether your format allows traditional escaping, such that `item_1|item\|2|item_3` is also 3 items and the `|` in `\|` is not a delimiter. – nos Feb 20 '19 at 10:28
  • I mean... talk with your boss, seriously. There is absolutely no "less risky" way to do this; it does not matter which method is used. – aran Feb 20 '19 at 10:48
  • Maybe related: https://stackoverflow.com/questions/3870415/splitting-a-string-that-has-escape-sequence-using-regular-expression-in-java – Lino Feb 20 '19 at 10:57

3 Answers

4
import java.util.ArrayList;
import java.util.List;

public class SplitUsingAnotherMethodBecauseBossLikesWastingEveryonesTime {

    public static void main(String[] args) {
        System.out.println(split("Why would anyone want to write their own String split function in Java?", ' '));
        System.out.println(split("The|Split|Method|Is|Way|More|Flexible||", '|'));
    }

    private static List<String> split(String input, char delimiter) {
        List<String> result = new ArrayList<>();
        int idx = 0;
        int next;

        do {
            next = input.indexOf(delimiter, idx);

            if (next > -1) {
                result.add(input.substring(idx, next));
                idx = next + 1;
            }
        } while(next > -1);

        result.add(input.substring(idx));

        return result;
    }
}

Outputs...

[Why, would, anyone, want, to, write, their, own, String, split, function, in, Java?]
[The, Split, Method, Is, Way, More, Flexible, , ]
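For comparison, here is a small sketch of how the built-in `String.split` treats the same input. The class and method names are made up for illustration. It shows the two behaviours that are most likely to surprise: the argument is a regular expression (so a bare `|` must be escaped or quoted), and with the default limit trailing empty strings are dropped, unlike in the hand-written method above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class BuiltInSplitPitfalls {

    // Escaped pipe, default limit: trailing empty strings are discarded.
    static List<String> splitDroppingTrailing(String input) {
        return Arrays.asList(input.split("\\|"));
    }

    // Pattern.quote makes the separator a literal; a negative limit keeps trailing empty strings.
    static List<String> splitKeepingTrailing(String input) {
        return Arrays.asList(input.split(Pattern.quote("|"), -1));
    }

    public static void main(String[] args) {
        String input = "The|Split|Method|Is|Way|More|Flexible||";
        System.out.println(splitDroppingTrailing(input).size()); // 7
        System.out.println(splitKeepingTrailing(input).size());  // 9

        // An unescaped "|" is a regex alternation of two empty patterns,
        // so this does not split on the pipe character at all.
        System.out.println(Arrays.toString("a|b".split("|")));
    }
}
```

If the "risk" is one of these two points, `Pattern.quote` plus a limit of `-1` addresses it without writing a new splitter.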
BretC
  • Because maybe, just maybe, "why|wou|d|anyone|do|this" would also be a valid input. Should that give wou|d, or wou and d? – Stultuske Feb 20 '19 at 10:26
  • "Why would anyone want to write their own String split function in Java?" Presumably because the standard string split semantics aren't quite what they want -- which makes it not very useful to implement a method with those same semantics. :) – yshavit Feb 20 '19 at 10:35
  • Then we could call it "cutTheString". No matter what, no matter how, splitting is what it will do. Like saying, I want to add the values from these two integers, but I don't like "add" semantics. p.s, the example string is wonderful, and self declarative indeed – aran Feb 20 '19 at 10:51
1

You can just iterate over all the chars in the string and then use substring() to select the different substrings:

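// Needs: import java.util.LinkedList; import java.util.List;
public static List<String> split(String input, char delimiter) {
    List<String> output = new LinkedList<>();
    int lastIndex = 0;               // start of the item currently being read
    boolean doubleQuote = false;     // true while inside "..."
    boolean singleQuoteFound = false; // true while inside '...'
    char last = 0;                   // previous character, used to detect a backslash escape
    for (int i = 0, length = input.length(); i < length; i++) {
        char current = input.charAt(i);
        // A character preceded by a backslash is never treated as a quote or delimiter.
        if (last != '\\') {
            if (current == '"') {
                doubleQuote = !doubleQuote;
            } else if (current == '\'') {
                singleQuoteFound = !singleQuoteFound;
            } else if (current == delimiter && !doubleQuote && !singleQuoteFound) {
                // Unquoted, unescaped delimiter: close the current item.
                output.add(input.substring(lastIndex, i));
                lastIndex = i + 1;
            }
        }
        last = current;
    }
    // Whatever follows the last delimiter is the final item.
    output.add(input.substring(lastIndex));
    return output;
}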

This is a very crude approach, but from my testing it should take care of escaped delimiters and of delimiters inside single (') and/or double (") quotes. Note that the quotes and backslashes are kept in the resulting items.

Can be called like this:

List<String> splitted = split("Hello|World|\"No|split|here\"|\\|Was escaped|'Some|test'", '|');

Prints:

[Hello, World, "No|split|here", \|Was escaped, 'Some|test']
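If the format only uses backslash escaping (an assumption; the question does not say how the data is actually escaped), a variant that tracks the escape state explicitly can also strip the escape character from the items and cope with an escaped backslash directly before a delimiter. The following is only a sketch of that idea, and `splitEscaped` is a hypothetical name:

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeAwareSplit {

    // Splits on `delimiter`; `escape` followed by any character yields that character literally.
    static List<String> splitEscaped(String input, char delimiter, char escape) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                current.append(c);      // escaped character is taken literally
                escaped = false;
            } else if (c == escape) {
                escaped = true;         // drop the escape character itself
            } else if (c == delimiter) {
                result.add(current.toString());
                current.setLength(0);   // start the next item
            } else {
                current.append(c);
            }
        }
        // Assumption: a dangling escape character at the very end is silently dropped.
        result.add(current.toString());
        return result;
    }

    public static void main(String[] args) {
        // The input text is: item_1|item\|2|item_3
        System.out.println(splitEscaped("item_1|item\\|2|item_3", '|', '\\')); // [item_1, item|2, item_3]
        // The input text is: a\\|b (an escaped backslash, then a real delimiter)
        System.out.println(splitEscaped("a\\\\|b", '|', '\\'));                // [a\, b]
    }
}
```

Unlike the method above, this one ignores quotes entirely, so it only fits data where escaping is the sole mechanism.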
Lino
  • ok, and how would that solve the puzzle of the delimiter also being able to be part of an element? – Stultuske Feb 20 '19 at 10:30
  • @Stultuske Updated my answer with a working split method, which takes backslashes and quotes into account – Lino Feb 20 '19 at 11:13
-1

When you call split on a String, it internally creates a Pattern object, which adds overhead. That is only true before Java 7, though. Since Java 7, split uses indexOf for simple separators, so there is no overhead from the regular expression engine.

However, if you pass a more complex expression, it falls back to compiling a new pattern, and the behavior is then the same as on Java 6. In that case you can use a precompiled pattern to split the string:

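import java.util.regex.Pattern;
import java.util.stream.Stream;

public class MyClass {
    // Compiled once and reused; the pipe is escaped because it is a regex metacharacter.
    static final Pattern pattern = Pattern.compile("\\|");

    public static void main(String[] args) {
        String str = "item_1|item_2|item_3";
        // splitAsStream is available since Java 8.
        Stream<String> streamsName = pattern.splitAsStream(str);
        streamsName.forEach(System.out::println);
    }
}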
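
Since the question asks for a List rather than a Stream, the same precompiled pattern can be collected into one. This is a minimal sketch; the class name and the `toList` helper are illustrative only:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PrecompiledSplitToList {

    // Compiled once; "\\|" matches a literal pipe character.
    private static final Pattern PIPE = Pattern.compile("\\|");

    // Hypothetical helper: splits the input and collects the parts into a List.
    static List<String> toList(String input) {
        return PIPE.splitAsStream(input).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(toList("item_1|item_2|item_3")); // [item_1, item_2, item_3]
    }
}
```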