
How to split a string representation of an array and retrieve its data

Each object holds an array of values, and that array can be multi-dimensional (it can contain further arrays of objects).

Demo Input String

CONTAINER [1, "one", false, [CONTAINER [2, "two", false]], 42, true]

Expected Results

CONTAINER
1
"one"
false
[CONTAINER [2, "two", false]]
42
true

(I would then take group 5 and run it again to get the rest of the objects)

What is a good method of splitting the string to gain the data inside?

Can regex be used?


I do have the option of formatting the string differently if another layout would make it easier.

Greg
  • This is a pretty bad use case for regex. If your container object can have an arbitrarily deep level of self nesting, there's probably no pattern that can describe it consistently. Since you're already working in Java, it should be pretty simple to handle this object with a few separate lines of code instead. – CAustin Aug 09 '18 at 16:37
  • It's a little off-topic but is there actually a ```[``` right before ```CONTAINER```? It looks like ```[``` should follow the word ```CONTAINER```. – zhh Aug 09 '18 at 16:38
  • @CAustin I initially thought the same, but I couldn't figure out how you'd split by delimiter without affecting the internal ones. Could you post an example? Regex seemed to fit better as it is a pattern; I'm just not aware of how far regex supports recursion. – Greg Aug 09 '18 at 16:42
  • @zhh Yes because it's an array of containers in the array of variables. – Greg Aug 09 '18 at 16:42
  • Are you wanting to parse the integers? – GBlodgett Aug 09 '18 at 18:49
  • @GBlodgett Parse all the values in the array, including the array (arrays within arrays) – Greg Aug 09 '18 at 22:26
  • The one problem I see with this is that the inner container has fewer values than the outer one, which will make recursion difficult. – GBlodgett Aug 09 '18 at 23:55
  • @GBlodgett Yes precisely why I asked the question, the containers can have **any** number of values – Greg Aug 10 '18 at 15:48
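As a rough illustration of the non-regex approach suggested in the comments, the sketch below splits on commas only while no square bracket is open, so the delimiters inside nested containers are left alone. The names `splitTopLevel` and `splitContainer` are hypothetical, and the sketch assumes that string values contain no escaped double quotes.

```kotlin
// Hypothetical helper: splits on commas that are not inside square brackets or double quotes.
fun splitTopLevel(values: String): List<String> {
    val parts = mutableListOf<String>()
    val current = StringBuilder()
    var depth = 0          // number of '[' currently open
    var inQuotes = false   // true while inside a "..." literal (assumes no escaped quotes)

    for (ch in values) {
        when {
            ch == '"' -> { inQuotes = !inQuotes; current.append(ch) }
            inQuotes -> current.append(ch)
            ch == '[' -> { depth++; current.append(ch) }
            ch == ']' -> { depth--; current.append(ch) }
            ch == ',' && depth == 0 -> {
                // A comma at depth 0 ends the current top-level value.
                parts.add(current.toString().trim())
                current.setLength(0)
            }
            else -> current.append(ch)
        }
    }
    if (current.isNotBlank()) parts.add(current.toString().trim())
    return parts
}

// Hypothetical helper: turns "NAME [a, b, c]" into the name followed by its top-level values.
fun splitContainer(input: String): List<String> {
    val open = input.indexOf('[')
    val close = input.lastIndexOf(']')
    require(open >= 0 && close > open) { "Not a container: $input" }
    val name = input.substring(0, open).trim()
    return listOf(name) + splitTopLevel(input.substring(open + 1, close))
}
```

Calling `splitContainer` on the demo input should give the seven groups listed under Expected Results. The nested array comes back as one string, which can then be stripped of its outer brackets and passed to `splitTopLevel` and `splitContainer` again.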

2 Answers


The most trivial way is to use split() with several delimiters:

val input = """CONTAINER [1, "one", false, [CONTAINER [2, "two", false]], 42, true]"""

input.split(" ", "[", ",", "]").filter {
    it != ""
}.forEach {
    println(it)
}

Output:

CONTAINER
1
"one"
false
CONTAINER
2
"two"
false
42
true
Alexey Soshin
  • There's no way to determine whether "42" is a part of the first or second container from your output. I wasn't aware split could take multiple arguments; that helped in figuring out a solution, thanks. – Greg Aug 10 '18 at 16:04
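One way to keep track of which container each value belongs to is to build a tree instead of a flat list. The following is only a sketch of a small recursive-descent parser for the format shown in the question. All type and function names (`Node`, `Value`, `ArrayNode`, `Container`, `ContainerParser`) are made up for illustration, and it assumes quoted strings contain no escaped quotes.

```kotlin
// Hypothetical tree types for the parsed result.
sealed class Node
data class Value(val text: String) : Node()                            // 1, "one", false, ...
data class ArrayNode(val items: List<Node>) : Node()                   // [ ... ]
data class Container(val name: String, val values: List<Node>) : Node() // NAME [ ... ]

class ContainerParser(private val src: String) {
    private var pos = 0

    fun parse(): Node {
        val node = parseNode()
        skipSpaces()
        require(pos == src.length) { "Unexpected trailing input at $pos" }
        return node
    }

    private fun skipSpaces() {
        while (pos < src.length && src[pos] == ' ') pos++
    }

    private fun parseNode(): Node {
        skipSpaces()
        require(pos < src.length) { "Unexpected end of input" }
        if (src[pos] == '[') return ArrayNode(parseList())
        if (src[pos] == '"') return Value(parseQuoted())
        val start = pos
        while (pos < src.length && src[pos] !in " ,[]") pos++
        val word = src.substring(start, pos)
        require(word.isNotEmpty()) { "Expected a value at $start" }
        skipSpaces()
        // A bare word followed by '[' is a container name; otherwise it is a plain value.
        return if (pos < src.length && src[pos] == '[') Container(word, parseList()) else Value(word)
    }

    // Parses "[a, b, ...]" starting at the opening bracket.
    private fun parseList(): List<Node> {
        pos++ // consume '['
        val items = mutableListOf<Node>()
        skipSpaces()
        if (pos < src.length && src[pos] == ']') { pos++; return items }
        var done = false
        while (!done) {
            items.add(parseNode())
            skipSpaces()
            require(pos < src.length) { "Missing ']'" }
            val ch = src[pos++]
            when (ch) {
                ',' -> Unit
                ']' -> done = true
                else -> throw IllegalArgumentException("Expected ',' or ']' at ${pos - 1}")
            }
        }
        return items
    }

    // Returns the quoted literal including its surrounding quotes.
    private fun parseQuoted(): String {
        val end = src.indexOf('"', pos + 1)
        require(end != -1) { "Unterminated string at $pos" }
        val text = src.substring(pos, end + 1)
        pos = end + 1
        return text
    }
}
```

With this, `ContainerParser(input).parse()` on the demo string should return a `Container` whose fourth value is an `ArrayNode` holding the inner `Container`, so `42` stays unambiguously in the outer one.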

Because the array in the root container can be located by hand, I can replace its square brackets with parentheses, which makes the array contents easy to pull out with split():

input = input.replaceFirst(", [", ", (")
input = if (input.endsWith("]]]")) replaceLast(input, "]]]", "])]") else replaceLast(input, "]], ", "]), ")

val arraySplit = input.split("(", ")")

From there, a regex pattern can be applied repeatedly, working from the deepest level of nesting upwards, to extract each remaining container and replace it with a placeholder:

private val pattern = Pattern.compile("([A-Z]+\\s\\[[^\\[\\]]+])")
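To show why this pattern works from the inside out, here is a minimal sketch of the replace loop on its own. It uses Kotlin's `Regex` class rather than `java.util.regex.Pattern` (same pattern syntax on the JVM), and `flattenContainers` is a hypothetical name. Because the character class excludes brackets, only containers with no nested brackets can match, and each replacement exposes the next level up.

```kotlin
// An upper-case name, one whitespace character, then a bracket pair with no brackets inside.
val innermostContainer = Regex("[A-Z]+\\s\\[[^\\[\\]]+]")

// Hypothetical helper: swaps the leftmost innermost container for $0, $1, ... until none remain.
// Returns the flattened text together with the extracted containers, in placeholder order.
fun flattenContainers(text: String): Pair<String, List<String>> {
    var remaining = text
    val found = mutableListOf<String>()
    var match = innermostContainer.find(remaining)
    while (match != null) {
        // Replace exactly the matched range, so identical containers elsewhere are untouched.
        remaining = remaining.replaceRange(match.range, "\$${found.size}")
        found.add(match.value)
        match = innermostContainer.find(remaining)
    }
    return remaining to found
}
```

Note that this sketch numbers the containers in a slightly different order than the full class further down, since it replaces one match at a time. It also stops as soon as nothing matches, so an empty `[]` or a lower-case name is simply left in place.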

Not as clean as I would have liked, but it's functional. The main concern was supporting nesting several layers deep, as in this example:

Input

CONTAINER [1, "one", false, [CONTAINER [2, "two", CONTAINER [3, "three"], CONTAINER [false], true], CONTAINER [2, "string", false], CONTAINER [4]], 42, true]

Output

CONTAINER [1, "one", false, [$array], 42, true]
$array = $4, $2, $3
$0 = CONTAINER [3, "three"]
$1 = CONTAINER [false]
$2 = CONTAINER [2, "string", false]
$3 = CONTAINER [4]
$4 = CONTAINER [2, "two", $0, $1, true]
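As a sanity check on the placeholder output, the substitution can be reversed. The sketch below is not part of the original class. `expandPlaceholders` is a hypothetical helper that assumes no string value contains text that looks like `$n`, and that no container refers to itself.

```kotlin
// Matches a placeholder such as $0 or $12 and captures its index.
val placeholder = Regex("\\$(\\d+)")

// Hypothetical helper: substitutes each $n with containers[n] until no placeholders are left.
fun expandPlaceholders(text: String, containers: List<String>): String {
    var current = text
    while (placeholder.containsMatchIn(current)) {
        current = placeholder.replace(current) { m -> containers[m.groupValues[1].toInt()] }
    }
    return current
}
```

Expanding `$4, $2, $3` with the five containers listed above should reproduce the contents of the root array from the input.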

Thanks to @Alexey Soshin for split() example.

Full class:

import org.junit.Assert
import org.junit.Test
import java.util.regex.Pattern

class ContainerTest {

    private val pattern = Pattern.compile("([A-Z]+\\s\\[[^\\[\\]]+])")

    /**
     * Checks if string contains a full array in the middle or at the end of the container values
     * @return whether a verified container contains an array
     */
    private fun hasArray(string: String): Boolean {
        return string.contains(", [") && (string.contains("]], ") || string.endsWith("]]]"))
    }

    /**
     * Replaces last occurrence of a substring
     * Credit: https://stackoverflow.com/a/16665524/2871826
     */
    private fun replaceLast(string: String, substring: String, replacement: String): String {
        val index = string.lastIndexOf(substring)
        return if (index == -1) string else string.substring(0, index) + replacement + string.substring(index + substring.length)
    }

    /**
     * Splits root container and returns string contents of its array
     */
    private fun extractArray(string: String): String {
        if(!hasArray(string))
            return ""

        //Replace square brackets of array with regular so it's easier to differentiate
        var input = string
        input = input.replaceFirst(", [", ", (")
        input = if(input.endsWith("]]]")) replaceLast(input, "]]]", "])]") else replaceLast(input, "]], ", "]), ")

        val arraySplit = input.split("(", ")")
        return arraySplit[1]//Always the second index
    }

    private fun replaceArray(input: String, array: String): String {
        return input.replaceFirst(array, "\$array")
    }

    /**
     * Repeats pattern matching until all remaining containers are extracted
     * @return list of individual container strings
     */
    private fun extractContainers(string: String): ArrayList<String> {
        var array = string
        val containers = arrayListOf<String>()
        var index = 0

        //Nature of pattern matches deepest level first then works its way upwards
        while(array.contains("[")) {//while has more containers
            val matcher = pattern.matcher(array)
            var found = false

            while (matcher.find()) {
                found = true
                val match = matcher.group()
                containers.add(match)
                array = array.replace(match, "\$${index++}")
            }
            if (!found) break//nothing matched (e.g. an empty "[]"), so stop instead of looping forever
        }
        return containers
    }

    /**
     * Replaces container strings with placeholder indices
     */
    private fun replaceContainers(string: String, containers: ArrayList<String>): String {
        var array = string
        containers.forEachIndexed { index, s -> array = array.replaceFirst(s, "\$$index") }
        return array
    }

    /**
     * Splits container variables
     * @return array of values
     */
    private fun getVariables(string: String): List<String> {
        return string.substring(11, string.length - 1).split(", ")
    }

    @Test
    fun all() {
        val input = "CONTAINER [1, \"one\", false, [CONTAINER [2, \"two\", CONTAINER [3, \"three\"], CONTAINER [false], true], CONTAINER [2, \"string\", false], CONTAINER [4]], 42, true]"//"""CONTAINER [1, "one", false, [CONTAINER [2, "two", false]], 42, true]"""

        if(hasArray(input)) {
            val array = extractArray(input)

            val first = replaceArray(input, array)

            val containers = extractContainers(array)

            val final = replaceContainers(array, containers)

            println("$first ${getVariables(first)}")

            println("\$array = $final")

            containers.forEachIndexed { index, s -> println("\$$index = $s ${getVariables(s)}") }
        }
    }

    private val emptyLast = "CONTAINER [1, \"one\", false, []]"
    private val oneLast = "CONTAINER [1, \"one\", false, [CONTAINER [2, \"two\"]]]"
    private val twoLast = "CONTAINER [1, \"one\", false, [CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]]]"
    private val threeLast = "CONTAINER [1, \"one\", false, [CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]]]"

    private val empty = "CONTAINER [1, \"one\", false, [], 42]"
    private val one = "CONTAINER [1, \"one\", false, [CONTAINER [2, \"two\"]], 42]"
    private val two = "CONTAINER [1, \"one\", false, [CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]], 42]"
    private val three = "CONTAINER [1, \"one\", false, [CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]], 42]"

    @Test
    fun hasArray() {
        Assert.assertFalse(hasArray(emptyLast))
        Assert.assertTrue(hasArray(oneLast))
        Assert.assertTrue(hasArray(twoLast))

        Assert.assertFalse(hasArray(empty))
        Assert.assertTrue(hasArray(one))
        Assert.assertTrue(hasArray(two))
    }

    @Test
    fun extractArray() {
        Assert.assertTrue(extractArray(empty).isEmpty())
        Assert.assertTrue(extractArray(emptyLast).isEmpty())

        Assert.assertEquals(extractArray(oneLast), "CONTAINER [2, \"two\"]")
        Assert.assertEquals(extractArray(twoLast), "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]")
        Assert.assertEquals(extractArray(threeLast), "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]")
        Assert.assertEquals(extractArray(one), "CONTAINER [2, \"two\"]")
        Assert.assertEquals(extractArray(two), "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]")
        Assert.assertEquals(extractArray(three), "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]")
    }

    @Test
    fun replaceArray() {
        val last = "CONTAINER [1, \"one\", false, [\$array]]"
        val first = "CONTAINER [1, \"one\", false, [\$array], 42]"
        Assert.assertEquals(replaceArray(oneLast, "CONTAINER [2, \"two\"]"), last)
        Assert.assertEquals(replaceArray(twoLast, "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]"), last)
        Assert.assertEquals(replaceArray(threeLast, "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]"), last)

        Assert.assertEquals(replaceArray(one, "CONTAINER [2, \"two\"]"), first)
        Assert.assertEquals(replaceArray(two, "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]"), first)
        Assert.assertEquals(replaceArray(three, "CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]"), first)
    }

    @Test
    fun extractContainers() {
        Assert.assertEquals(extractContainers("CONTAINER [2, \"two\"]").size, 1)
        Assert.assertEquals(extractContainers("CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]").size, 3)
        Assert.assertEquals(extractContainers("CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]").size, 4)
        Assert.assertEquals(extractContainers("CONTAINER [2, \"two\"]").size, 1)
        Assert.assertEquals(extractContainers("CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"]").size, 3)
        Assert.assertEquals(extractContainers("CONTAINER [2, \"two\", CONTAINER [3, \"three\"]], CONTAINER [4, \"four\"], CONTAINER [5, \"five\"]").size, 4)
    }
}
Greg