
I am writing a Java program that reads an HTML page and extracts the data inside the element with the CSS class .report.

@RequestMapping(value = "/medindiaparser", method = RequestMethod.POST)
public ModelMap medindiaparser(@RequestParam String urlofpage) throws ClassNotFoundException, IOException {
    System.out.println("saveMedicineName");
    ModelMap mv = new ModelMap(urlofpage);
    System.out.println();

    // Fetch and parse the page with Jsoup
    String url = urlofpage;
    Document document = Jsoup.connect(url).get();

    // Combined text of every element that has the CSS class "report"
    String TITLE = document.select(".report").text();
    String[] news = TITLE.split(":");
    System.out.println("Question: " + TITLE);

    return mv;
}

This is what TITLE gives me:

name : aman kumar working in : home,outside what he does: program | sleep | eat

What I want is to get each value into an array, like this:

array[0] : aman kumar
array[1] : home,outside
array[2] : program | sleep | eat

That way I can set the array values on my model. Has anyone done this?

The .report element contains <h3> tags that hold the headers. It looks like this:

<report><h3>Name</h3>aman kumar<h3>working in </h3>home, outside .....</report>
assylias
Aman
  • Are your field names always the same and given in the same order ('name', 'working in', 'what he does')? If so, you can use substring on these parts. – L. Carbonne Apr 21 '17 at 09:28
  • Are you sure you want to call `text()` on `select(".report")`? Maybe this selection holds the data in a structure that would let you split it into separate parts more easily than with the `split` method. – Pshemo Apr 21 '17 at 09:52
  • As I said, what TITLE gives me is 'name : aman kumar working in : home,outside what he does: program | sleep | eat', and this is all I can select because the data is wrapped in the .report class. – Aman Apr 21 '17 at 09:58
  • I understand what TITLE holds, but that is not what I am asking. You are getting the text representation of what `select(".report")` was able to find. But maybe this select was able to find data in a form like `name : aman kumar working in : home,outside what he does: program | sleep | eat` (or similar) which provides the data in specific groups. Handling such groups one by one would be easier than having to guess which part belongs to which group, for example whether we should split `aman kumar working in` before `working`, before `kumar`, or maybe before `in`. – Pshemo Apr 21 '17 at 10:03
  • No, actually .report contains <h3> where the header is located. It goes like this: <report><h3>Name</h3>aman kumar<h3>working in </h3>home, outside .....</report> @Pshemo – Aman Apr 21 '17 at 10:10
  • So there is a structure that allows us to parse it properly. Post this structure in your question and we can try to help you solve this problem with a parser. Currently you are facing the problem described at http://stackoverflow.com/a/1732454 – Pshemo Apr 21 '17 at 10:13
  • But be sure to post proper data. For now you claim that it looks like `<report>..</report>`, but if that were true `select(".report")` wouldn't be able to find it because `.report` selects classes, not tag names. – Pshemo Apr 21 '17 at 10:18
  • @Pshemo check your answer's comment section – Aman Apr 21 '17 at 12:14

2 Answers


I have completely overhauled my answer to extract the name, working in, and what he does contents from your TITLE string. This can be done using a regex pattern matcher in Java.

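import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Labels are matched literally; each lazy group captures the text up to the next label
String pattern = "name\\s*:\\s*(.*?)\\s*working in\\s*:\\s*(.*?)\\s*what he does\\s*:\\s*(.*)";
Pattern r = Pattern.compile(pattern);
String line = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
Matcher m = r.matcher(line);
while (m.find()) {
    System.out.println(m.group(1)); // value after "name"
    System.out.println(m.group(2)); // value after "working in"
    System.out.println(m.group(3)); // value after "what he does"
}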

Output:

aman kumar
home,outside
program | sleep | eat

Demo here: Rextester
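Since the question asks for the values in an array, the same pattern can be wrapped in a small helper. This is only a sketch: the class name `ReportFields` and the method `extract` are hypothetical, and it assumes the three labels always appear in this order, each followed by a colon. Case-insensitive matching is an added assumption so that a label such as `Name` would also match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReportFields {
    // Assumption: the labels are fixed and always appear in this order.
    private static final Pattern FIELDS = Pattern.compile(
            "name\\s*:\\s*(.*?)\\s*working in\\s*:\\s*(.*?)\\s*what he does\\s*:\\s*(.*)",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the three values in label order, or an empty array if the text does not match.
    static String[] extract(String text) {
        Matcher m = FIELDS.matcher(text);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(1).trim(), m.group(2).trim(), m.group(3).trim() };
    }

    public static void main(String[] args) {
        String line = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
        for (String value : extract(line)) {
            System.out.println(value);
        }
    }
}
```

Returning an empty array on a failed match lets the caller check `length` before copying values into a model, rather than catching an exception.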

Tim Biegeleisen

Try this:

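import java.util.Arrays;

String s = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
String[] news = s.split(":");
// Labels that trail each value once the string has been split on ":"
String exclude = "(working in|what he does)";
int index = -1;
// Find the piece that holds the first label
for (int i = 0; i < news.length; i++) {
    if ("name".equals(news[i].trim())) {
        index = i;
        break;
    }
}
if (index != -1) {
    // Everything after the "name" piece is a value followed by the next label
    String[] content = Arrays.copyOfRange(news, index + 1, news.length);
    for (String string : content) {
        // Remove the label first, then trim, so no trailing space is left behind
        System.out.println(string.replaceAll(exclude, "").trim());
    }
}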
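An alternative to splitting on ":" and then stripping the labels is to split on the labels themselves. The sketch below assumes the same three labels, each followed by a colon; the class name `LabelSplitter` and the method `splitOnLabels` are hypothetical. Note that the plain alternation would also match a label embedded in a longer word (for example "name" inside "surname"), so it only suits input where the labels cannot occur inside the values.

```java
import java.util.ArrayList;
import java.util.List;

class LabelSplitter {
    // Assumption: these are the only labels and each is followed by a colon.
    private static final String LABELS = "\\s*(?:name|working in|what he does)\\s*:\\s*";

    static String[] splitOnLabels(String text) {
        List<String> values = new ArrayList<>();
        for (String part : text.split(LABELS)) {
            String trimmed = part.trim();
            // The label at the start of the text produces an empty leading piece; skip it.
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String s = "name : aman kumar working in : home,outside what he does: program | sleep | eat";
        for (String value : splitOnLabels(s)) {
            System.out.println(value);
        }
    }
}
```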
Darshan Mehta