1

I have a StringBuilder that contains the same result as shown at this link: https://hadoop.apache.org/docs/current/hadoop-project-dist/

I'm looking to extract the values of `pathSuffix` and return a list of Strings that contains all the file names.

My code is:

try {
    HttpURLConnection conHttp = (HttpURLConnection) url.openConnection();
    conHttp.setRequestMethod("GET");
    conHttp.setDoInput(true);

    InputStream in = conHttp.getInputStream();
    int ch;
    StringBuilder sb = new StringBuilder();
    // Read the response body one byte at a time into the builder
    while ((ch = in.read()) != -1) {
        sb.append((char) ch);
    }
    String response = sb.toString();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

How can I parse the JSON to take all the values of pathSuffix and return a list of Strings that contains the file names? Could you please give me a suggestion? Thanks

Isabelle
  • That's not really a String, that's JSON. Parsing JSON and parsing a String are a bit different. – Kayaman Feb 24 '21 at 17:22
  • You need to use a JSON library, e.g. Gson or Jackson. – vijayinani Feb 24 '21 at 17:25
  • Yes, it's JSON. I have the same output as in the link, but I converted it to a String while trying to find a way to parse it :) Assume I have JSON. In my code, up to String response = sb.toString();, how can I parse the JSON to extract the value of pathSuffix? Thanks a lot – Isabelle Feb 24 '21 at 20:27

3 Answers

2

That is JSON-formatted data. JSON is not a regular language, therefore trying to parse it with a regular expression is impossible, and trying to parse it out with substring and friends will take you a week and will be very error-prone.

Read up on what JSON is (no worries; it's very simple to understand!), then get a good JSON library (the standard json.org library absolutely sucks, don't get that one), such as Jackson or GSON, and the code to extract what you need will be robust and easy to write and test.

rzwitserloot
  • Could you please propose a solution? I would like to extract the values of pathSuffix and return a List of Strings that contains all the file names. Thanks in advance – Isabelle Feb 24 '21 at 20:42
  • Yes; add Jackson to your dependencies list, read up on how to use it. – rzwitserloot Feb 24 '21 at 23:59
1

The good option

Do the following steps:

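  1. Parse the response string into a JSON tree with a JSON library, for example Gson
  2. Get the array of entries, which with Gson (2.8.6 or later) looks like: JsonParser.parseString(response).getAsJsonObject().getAsJsonObject("FileStatuses").getAsJsonArray("FileStatus")
  3. Iterate over all objects in the array and read the value you want from each one, e.g. element.getAsJsonObject().get("pathSuffix").getAsString()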

The bad option

As mentioned, this is not recommended, but if you want to stay with Strings you can use:

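// Exact text that precedes each value; the spacing must match the response exactly.
String str_to_find = "\"pathSuffix\"      : \"";

while (str.indexOf(str_to_find) != -1) {
   // Drop everything up to and including the marker.
   str = str.substring(str.indexOf(str_to_find) + str_to_find.length());
   // The value runs up to the next double quote.
   String value = str.substring(0, str.indexOf("\""));
   System.out.println("Value is " + value);
}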
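
If adding a library really is not an option, the string search above can be made a little less brittle by matching with java.util.regex, so the amount of whitespace around the colon no longer matters, and by collecting the values into a List<String> as the question asks. This is only a sketch and not a JSON parser: it assumes the pathSuffix values contain no escaped quotes, the class and method names (PathSuffixExtractor, extract) are made up for illustration, and the JSON in main is invented sample data in the shape of the linked response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathSuffixExtractor {
    // Matches "pathSuffix" : "value" with any whitespace around the colon.
    // Assumption: the value itself never contains an escaped double quote.
    private static final Pattern PATH_SUFFIX =
            Pattern.compile("\"pathSuffix\"\\s*:\\s*\"([^\"]*)\"");

    // Returns every pathSuffix value found in the text, in order of appearance.
    static List<String> extract(String json) {
        List<String> names = new ArrayList<>();
        Matcher m = PATH_SUFFIX.matcher(json);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Invented sample data for illustration only.
        String json = "{\"FileStatuses\":{\"FileStatus\":["
                + "{\"pathSuffix\":\"a.patch\",\"type\":\"FILE\"},"
                + "{\"pathSuffix\"      : \"bar\",\"type\":\"DIRECTORY\"}]}}";
        System.out.println(extract(json)); // prints [a.patch, bar]
    }
}
```

With the code from the question, the call would be something like List<String> names = PathSuffixExtractor.extract(sb.toString());. A real JSON library is still the safer choice, since this pattern would also pick up a "pathSuffix" key anywhere else in the text.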
ALUFTW
  • Which language did you use? – Isabelle Feb 24 '21 at 20:21
  • It's Java, as the title mentions. – ALUFTW Feb 24 '21 at 22:59
  • It seems like quasi-Java. For example, the first line is not Java. I didn't understand your solution. In fact, what I'm looking for is to parse the JSON that is in the StringBuilder (sb), take all the values of pathSuffix and return a list of Strings that contains the file names. – Isabelle Feb 25 '21 at 08:23
  • Here, I added String before it. Now it's Java. I manipulated the String (str) for you. Please edit your message to be something like: "How can I parse JSON to take all the values of pathSuffix and return a list of strings that contains the file names?", and we'll answer it for you :) – ALUFTW Feb 25 '21 at 08:57
  • Do you have an idea, please? – Isabelle Feb 25 '21 at 12:46
0

I would not recommend building an API binding for Hadoop from scratch. Such a binding already exists for the Java language:

https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html#listLocatedStatus-org.apache.hadoop.fs.Path-org.apache.hadoop.fs.PathFilter-

digital illusion