
I need help extracting a JSON array string into an array of objects so that it can be later processed.

The JSON string is embedded as a value within a pipe delimited string that is itself an XML element value.

A sample string is as below

<MSG>registerProfile|.|D|D|B95||43|5000|43100||UBSROOT43|NA|BMP|508|{"biometrics":{"fingerprints":{"fingerprints":[{"position":"RIGHT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEAAAAAADYAAAA="}},{"position":"LEFT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEADoAAAA"}}]}}}</MSG>

How can I extract the JSON properties and store them in separate arrays like

Format[0] = BMP
Position[0] = RIGHT_INDEX
Data[0] = Qk12WQEAAAAAADYAAAA=
Format[1] = BMP
Position[1] = LEFT_INDEX
Data[1] = Qk12WQEADoAAAA

These objects would then be passed to a separate function like below

FingerprintImage(Format[0],Position[0],Data[0]);
// ...
FingerprintImage(Format[1],Position[1],Data[1]);
// ...

public FingerprintImage(String format, String position, String data) {
    setFormat(format);
    setPosition(position);
    setData(data);
}
Chris Schaller
Vishal5364
  • Yes..Earlier each field was occurring only once so everything was stored in a string. Now position and data is coming multiple times so need to change the code accordingly – Vishal5364 Mar 14 '20 at 13:48
  • reading [this](https://stackoverflow.com/questions/3429921/what-does-serializable-mean) may be helpful – fuggerjaki61 Mar 14 '20 at 13:53
  • @fuggerjaki61 How is serialisation applicable in this case ? – Vishal5364 Mar 14 '20 at 19:00
  • serialization is for converting classes to forms like a string and back. See [this](https://stackoverflow.com/questions/7290777/java-custom-serialization) to create a custom serializer – fuggerjaki61 Mar 14 '20 at 19:03
  • @fuggerjaki61 I am really not an expert in java and can do only little bit of coding. So this concept of serialisation is out of my understanding. Not sure how to use it in my case. – Vishal5364 Mar 14 '20 at 19:25
  • You are free to do what you want but serialization is one of the best ways – fuggerjaki61 Mar 14 '20 at 19:40
  • It is unfortunate to see this question get downvotes and closed, OP has specifically targeted the values that they want to extract, nominated the structure to extract them into and provided an implementation example... What more can we ask for... – Chris Schaller Mar 15 '20 at 15:42

1 Answer

I am not a Java developer, but the following is hopefully helpful to you, or to others who can provide more succinct Java syntax.

Firstly, we should identify the three different layers of data serialization in your value:

  1. <MSG></MSG> This is an outer XML element, so the first step is to interpret this value as an XML fragment and extract the XML Value.
    • The reason that we use XML deserialization at this top level, and not just use the string position, is that the inner values may have been XML escaped, so we need to parse the inner value using the XML encoding rules.
    • This leaves us with the string value: registerProfile|.|D|D|B95||43|5000|43100||UBSROOT43|NA|BMP|508|{"biometrics":{"fingerprints":{"fingerprints":[{"position":"RIGHT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEAAAAAADYAAAA="}},{"position":"LEFT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEADoAAAA"}}]}}}
  2. The next level is pipe-delimited, which is the same as CSV except that the delimiter is | instead of a comma. There are usually no other encoding rules, because | is rarely part of normal text and so rarely needs escaping.
    You could therefore split this string into an array.
    • The value we are interested in is the 15th element in the array (index 14). Either you know this in advance, or you can iterate through the elements to find the first one that starts with {
    • This leaves a JSON value: {"biometrics":{"fingerprints":{"fingerprints":[{"position":"RIGHT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEAAAAAADYAAAA="}},{"position":"LEFT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEADoAAAA"}}]}}}
  3. Now that we have isolated the inner value in JSON format, the usual thing to do next is deserialize this value into an object.
    I know OP is asking for arrays, but we can realize JSON objects as arrays if we really want to with the right tools.
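
For readers who want to follow the first two layers literally, here is a minimal sketch that uses only the JDK. The class name MsgJsonExtractor and the method name extractJson are made up for illustration. It assumes the message is a single well-formed XML element and that no | character appears inside the JSON itself.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original answer.
final class MsgJsonExtractor {

    /** Returns the first pipe-delimited field that starts with '{', or null if there is none. */
    static String extractJson(String xml) {
        String payload;
        try {
            // Layer 1: let an XML parser read the element so that escaped
            // characters such as &amp; are decoded correctly.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            payload = doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Message is not well-formed XML", e);
        }

        // Layer 2: split on the pipe. split() takes a regular expression,
        // so the pipe has to be escaped.
        // Assumption: the JSON value never contains a '|' character.
        String[] fields = payload.split("\\|");

        // Pick the first field that looks like a JSON object.
        for (String field : fields) {
            if (field.startsWith("{")) {
                return field;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<MSG>registerProfile|.|D|D|{\"biometrics\":{}}</MSG>";
        System.out.println(extractJson(xml));
    }
}
```

The returned string is the JSON layer, ready to be handed to whichever JSON library you prefer.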

In C# the above process is pretty simple. I'm sure it should be in Java as well, but my attempts keep throwing errors.

So, let's instead assume (I know... Ass-U-Me...) that there is only ever a single JSON value in the pipe-delimited array. With this knowledge we can isolate the JSON using String.indexOf() and String.lastIndexOf():

String xml = "<MSG>registerProfile|.|D|D|B95||43|5000|43100||UBSROOT43|NA|BMP|508|{\"biometrics\":{\"fingerprints\":{\"fingerprints\":[{\"position\":\"RIGHT_INDEX\",\"image\":{\"format\":\"BMP\",\"resolutionDpi\":\"508\",\"data\":\"Qk12WQEAAAAAADYAAAA=\"}},{\"position\":\"LEFT_INDEX\",\"image\":{\"format\":\"BMP\",\"resolutionDpi\":\"508\",\"data\":\"Qk12WQEADoAAAA\"}}]}}}</MSG>";

int start = xml.indexOf('{');
int end = xml.lastIndexOf('}') + 1; // +1 because we want to include the last character, so we need the index after it

String json = xml.substring(start, end);

results in: {"biometrics":{"fingerprints":{"fingerprints":[{"position":"RIGHT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEAAAAAADYAAAA="}},{"position":"LEFT_INDEX","image":{"format":"BMP","resolutionDpi":"508","data":"Qk12WQEADoAAAA"}}]}}}

Formatted to be pretty:

{
  "biometrics": {
    "fingerprints": {
      "fingerprints": [
        {
          "position": "RIGHT_INDEX",
          "image": {
            "format": "BMP",
            "resolutionDpi": "508",
            "data": "Qk12WQEAAAAAADYAAAA="
          }
        },
        {
          "position": "LEFT_INDEX",
          "image": {
            "format": "BMP",
            "resolutionDpi": "508",
            "data": "Qk12WQEADoAAAA"
          }
        }
      ]
    }
  }
}

One way would be to create a class structure that matches this entire JSON value, so that we can simply call .fromJson() on the whole value. Instead, let's meet halfway and define only the inner class structure for the data we will actually use.

From this structure we can see that there is an outer object with a single property called biometrics. Its value is again an object with a single property called fingerprints. The value of that property is another object with a single property, also called fingerprints, except that this time the value is an array.

The following is a proof of concept in Java. It first shows an example using deserialization (with the gson library), and after that a similar implementation that uses only the simple-JSON library to read the values into arrays.

Try it out on JDoodle.com

MyClass.java

import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

import com.google.gson.Gson;


public class MyClass {
    public static void main(String args[]) {

        String xml = "<MSG>registerProfile|.|D|D|B95||43|5000|43100||UBSROOT43|NA|BMP|508|{\"biometrics\":{\"fingerprints\":{\"fingerprints\":[{\"position\":\"RIGHT_INDEX\",\"image\":{\"format\":\"BMP\",\"resolutionDpi\":\"508\",\"data\":\"Qk12WQEAAAAAADYAAAA=\"}},{\"position\":\"LEFT_INDEX\",\"image\":{\"format\":\"BMP\",\"resolutionDpi\":\"508\",\"data\":\"Qk12WQEADoAAAA\"}}]}}}</MSG>";

        int start = xml.indexOf('{');
        int end = xml.lastIndexOf('}') + 1; // +1 because we want to include the last character, so we need the index after it

        String jsonString = xml.substring(start, end);

        JSONParser parser = new JSONParser();
        Gson gson = new Gson();
        try
        {
            // locate the fingerprints inner array using simple-JSON (org.apache.clerezza.ext:org.json.simple:0.4 )
            JSONObject jsonRoot = (JSONObject) parser.parse(jsonString);
            JSONObject biometrics = (JSONObject)jsonRoot.get("biometrics");
            JSONObject fpOuter = (JSONObject)biometrics.get("fingerprints");
            JSONArray fingerprints = (JSONArray)fpOuter.get("fingerprints");

            // Using de-serialization from gson (com.google.code.gson:gson:2.8.6)

            FingerPrint[] prints = new FingerPrint[fingerprints.size()];
            for(int i = 0; i < fingerprints.size(); i ++)
            {
                JSONObject fpGeneric = (JSONObject)fingerprints.get(i);
                prints[i] = gson.fromJson(fpGeneric.toString(), FingerPrint.class);
            }

            // Call the FingerprintImage function using the FingerPrint objects
            System.out.print("FingerPrint Object Index: 0");
            FingerprintImage(prints[0].image.format, prints[0].position, prints[0].image.data );
            System.out.println();
            System.out.print("FingerPrint Object Index: 1");
            FingerprintImage(prints[1].image.format, prints[1].position, prints[1].image.data );  
            System.out.println();

            // ALTERNATE Array Implementation (doesn't use gson)
            String[] format = new String[fingerprints.size()];
            String[] position = new String[fingerprints.size()];
            String[] data = new String[fingerprints.size()];
            for(int i = 0; i < fingerprints.size(); i ++)
            {
                JSONObject fpGeneric = (JSONObject)fingerprints.get(i);
                position[i] = (String)fpGeneric.get("position");
                JSONObject image = (JSONObject)fpGeneric.get("image");
                format[i] = (String)image.get("format");
                data[i] = (String)image.get("data");
            }

            System.out.print("Generic Arrays Index: 0");
            FingerprintImage(format[0], position[0], data[0] );
            System.out.println();
            System.out.print("Generic Arrays Index: 1");
            FingerprintImage(format[1], position[1], data[1] ); 
            System.out.println();
        }
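        catch (ParseException e) {
            // Report malformed JSON instead of silently ignoring it
            System.err.println("Could not parse JSON: " + e);
        }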


    }

    public static void FingerprintImage(String format, String position, String data) {
        setFormat(format);
        setPosition(position);
        setData(data);
    }
    public static void setFormat(String format) {
        System.out.print(", Format=" + format);
    }
    public static void setPosition(String position) {
        System.out.print(", Position=" + position);
    }
    public static void setData(String data) {
        System.out.print(", Data=" + data);
    }
}


FingerPrint.java

public class FingerPrint {
    public String position;
    public FingerPrintImage image;   
}

FingerPrintImage.java

public class FingerPrintImage {
    public String format;
    public int resolutionDpi; // field name must match the JSON property "resolutionDpi"
    public String data;
}

Deserialization techniques are generally considered superior to forced/manual parsing, especially when we need to pass around references to multiple parsed values. In the above example, by simply reading format, position and data into separate arrays, the relationship between them has become de-coupled. Our code can still use them together as long as it uses the same array index, but the structure no longer defines the relationship between the values. De-serializing into a typed structure preserves that relationship and simplifies the task of passing around values that are related to each other.


update

If you used serialization, you could pass the equivalent FingerPrint object to any method that needs it, instead of passing the related values individually. Further to this, you could simply pass around the entire array of FingerPrint objects.


public static void FingerprintImage(FingerPrint print) {
    setFormat(print.image.format);
    setPosition(print.position);
    setData(print.image.data);
}

To process multiple FingerPrint objects in a batch, change the method to accept an array: FingerPrint[]
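
As a rough sketch of that batch variant: the nested Print and Image classes below mirror the shape of FingerPrint and FingerPrintImage, and are redefined here only so the example compiles on its own. The names FingerPrintBatch and describeAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of the original answer.
final class FingerPrintBatch {

    // Stand-in for FingerPrintImage (only the fields used here).
    static final class Image {
        final String format;
        final String data;

        Image(String format, String data) {
            this.format = format;
            this.data = data;
        }
    }

    // Stand-in for FingerPrint.
    static final class Print {
        final String position;
        final Image image;

        Print(String position, Image image) {
            this.position = position;
            this.image = image;
        }
    }

    /** Processes every print in one call; each object carries its own related values. */
    static List<String> describeAll(Print[] prints) {
        List<String> lines = new ArrayList<>();
        for (Print print : prints) {
            lines.add("Format=" + print.image.format
                    + ", Position=" + print.position
                    + ", Data=" + print.image.data);
        }
        return lines;
    }

    public static void main(String[] args) {
        Print[] prints = {
            new Print("RIGHT_INDEX", new Image("BMP", "Qk12WQEAAAAAADYAAAA=")),
            new Print("LEFT_INDEX", new Image("BMP", "Qk12WQEADoAAAA"))
        };
        for (String line : describeAll(prints)) {
            System.out.println(line);
        }
    }
}
```

Because each element is a complete object, there is no index bookkeeping and no way for a position to end up paired with the wrong data.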

You could use the same technique to process arrays of each of the Format, Position and Data values, though it is really poor practice to do so. Passing around multiple arrays expects the receiving code to know that the arrays must be interpreted in sync, that is, that the same index in each array corresponds to the same fingerprint. This level of implementation detail is too ambiguous and will lead to maintenance nightmares down the track. It's far better to learn and become proficient in OO concepts and to create business objects for passing around related data elements, instead of packaging everything into disassociated arrays.

The following code can assist you in processing multiple items using OP's array method, but it should highlight why the practice is a bad habit to pick up:


public static void FingerprintImage(String[] formats, String[] positions, String[] datas) {
    // now you must iterate each of the arrays using the same index,
    // however as there are no restrictions on the arrays, for each array
    // and each index we should check that the index is still within
    // the array's length.
}

From an OO point of view, passing through multiple arrays like this raises a number of issues. Firstly, the developer simply needs to know that the same index must be used in each array to retrieve correlated information. The next important issue is error handling...

If datas only has 1 element, but positions has 2 elements, which of the 2 elements does the 1 data element belong to? Or does this indicate that the same data should be used for both?
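
One possible guard, if the parallel arrays have to stay, is to reject mismatched lengths up front rather than guess what the caller meant. This is only a sketch; the class name ParallelArrayCheck and the method name requireSameLength are invented for illustration.

```java
// Hypothetical sketch, not part of the original answer.
final class ParallelArrayCheck {

    /**
     * Returns the shared length of the three arrays, or throws if they
     * cannot be read in sync with a single index.
     */
    static int requireSameLength(String[] formats, String[] positions, String[] datas) {
        if (formats == null || positions == null || datas == null) {
            throw new IllegalArgumentException("formats, positions and datas must not be null");
        }
        if (formats.length != positions.length || positions.length != datas.length) {
            throw new IllegalArgumentException(
                "Array lengths differ: formats=" + formats.length
                + ", positions=" + positions.length
                + ", datas=" + datas.length);
        }
        return formats.length;
    }

    public static void main(String[] args) {
        String[] formats = {"BMP", "BMP"};
        String[] positions = {"RIGHT_INDEX", "LEFT_INDEX"};
        String[] datas = {"Qk12WQEAAAAAADYAAAA=", "Qk12WQEADoAAAA"};

        // Validate once, then it is safe to use the same index on every array.
        int count = requireSameLength(formats, positions, datas);
        for (int i = 0; i < count; i++) {
            System.out.println(formats[i] + " " + positions[i] + " " + datas[i]);
        }
    }
}
```

Note that this check is extra code that every method receiving the three arrays would need to repeat, which is part of the cost of the approach.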

There are many other issues; consider what happens when you expect 3 elements...
While you can get away with what seems like a shortcut in code if you really need to, you really shouldn't unless you absolutely understand what you are doing, you fully document the related code, and you take responsibility for the potential fallout down the track.

Chris Schaller
  • Thanks, this method works for me. Just one question, how can I pass all the values of data[] together into a function. Like I want to pass data[0], data[1] into a function in one go. – Vishal5364 Mar 16 '20 at 10:32
  • this is where serialization is better than separate arrays, but simply pass in `data` instead of the value at any index; you will have to modify your function to accept the array parameter. – Chris Schaller Mar 16 '20 at 22:56