
I want to extract particular fields from the URL of a Facebook page. I am not able to do so because the link format is not fixed. For example, given the input below, it should produce the desired output:

1) https://www.facebook.com/pages/Ice-cream/109301862430120?rf=102173023157556

Output: 109301862430120

What about this type of link?

Can anyone help me?

chopss

1 Answer


So in short, you want the part of the URL after the last / and before the ? (if there is one).

You can do this with the URI and File classes:

// Needs java.io.File and java.net.URI; new URI(...) can throw the checked URISyntaxException.
String data = "https://www.facebook.com/pages/Anti-Christian-sentiment/149675731889496?ref=br_tf";
System.out.println(new File(new URI(data).getRawPath()).getName());

Output: 149675731889496
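
For reference, here is a minimal runnable sketch of the same idea, wrapped in a method so the checked exception is handled. It is a variant, not the exact code above: it uses `URI.getPath()` plus `substring` instead of `File`, and the class name `LastPathSegment` and the helper `lastSegment` are made up for illustration. Returning `null` for an unparsable URL is an assumption about what the caller wants.

```java
import java.net.URI;
import java.net.URISyntaxException;

class LastPathSegment {
    // Hypothetical helper: returns the text after the last '/' of the URL path,
    // ignoring the query string. Returns null if the URL cannot be parsed
    // or has no path component.
    static String lastSegment(String url) {
        try {
            String path = new URI(url).getPath(); // e.g. "/pages/Ice-cream/109301862430120"
            if (path == null) {
                return null;
            }
            return path.substring(path.lastIndexOf('/') + 1);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // prints 109301862430120
        System.out.println(lastSegment(
                "https://www.facebook.com/pages/Ice-cream/109301862430120?rf=102173023157556"));
        // prints Federer
        System.out.println(lastSegment("https://www.facebook.com/Federer"));
    }
}
```

Note that a URL ending in `/` would give an empty string with this sketch, so you may want to strip a trailing slash first.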


If you need to use a regex, you can use (written as a Java string literal, so the backslash is doubled)

([^/?]+)(\\?|$)

and read the content of group 1 (the one in the first pair of parentheses).

If you don't want to use groups and want the regex to match only the part you need (without including the ? in the match), you can use a look-around mechanism such as look-ahead (?=...). The regex would then look like

[^/?]+(?=\\?|$)

Code example:

String data = "https://www.facebook.com/pages/Anti-Christian-sentiment/149675731889496?ref=br_tf";
Pattern p = Pattern.compile("([^/?]+)(\\?|$)");
Matcher m = p.matcher(data);
if (m.find()){
    System.out.println(m.group(1));
}

Output:

149675731889496
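
The look-ahead variant mentioned earlier can be sketched the same way. Because the `?` is only checked and not consumed, the whole match (`m.group()`) is already the wanted part and no capturing group is needed. The class name `LookaheadDemo` and the helper `lastPart` are hypothetical names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // One or more characters that are neither '/' nor '?',
    // followed (without consuming it) by '?' or the end of the input.
    private static final Pattern LAST_PART = Pattern.compile("[^/?]+(?=\\?|$)");

    // Hypothetical helper: returns the first match, or null if there is none.
    static String lastPart(String url) {
        Matcher m = LAST_PART.matcher(url);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // prints 149675731889496
        System.out.println(lastPart(
                "https://www.facebook.com/pages/Anti-Christian-sentiment/149675731889496?ref=br_tf"));
        // prints Federer
        System.out.println(lastPart("https://www.facebook.com/Federer"));
    }
}
```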

Pshemo
  • If I give https://www.facebook.com/Federer as input, it should give "Federer" as output. – chopss Mar 08 '14 at 10:27
  • @user3392249 It does if you use the `File` and `URI` classes as I showed at the start of my answer. If you need a regex, you can change `\\d` to `[^/?]` to accept, instead of digits (`\d`), all characters that are not `/` or `?`. – Pshemo Mar 08 '14 at 10:30
  • Can you give an example? I tried Pattern p = Pattern.compile("([^/?]+)\\?"); and it shows no output. I am very new to regular expressions. It is not working for this link: https://www.facebook.com/Federer – chopss Mar 08 '14 at 10:36
  • @user3392249 You are right. I forgot to make `?` optional. You can use `([^/?]+)(\\?|$)` instead. The part `(\\?|$)` means `?` or `end of line`. But consider using the `URI` and `File` classes. They were meant to handle tasks like this. – Pshemo Mar 08 '14 at 10:53
  • What about this type of URL: https://www.facebook.com/plugins/fan.php?connections=100&id=138986636119722 ? It is not giving 138986636119722 as output. – chopss Mar 10 '14 at 10:45
  • @chopu Of course it will not get `138986636119722`. The examples in your question showed that you were interested in the last part of the URL or, if there is a `?`, the part just before it. I'm starting to believe that what you are trying to achieve is not supposed to be done with regex but maybe with the Facebook API for Java. Maybe take a look [here](https://developers.facebook.com/docs/other-sdks) or at something like http://restfb.com/. But if you still want to handle links yourself, take a look at URLEncodedUtils from Apache HttpComponents. [Here](http://stackoverflow.com/a/21857855/1393766) is my short example. – Pshemo Mar 10 '14 at 13:10
  • @pschemo I already tried using the APIs. What I want is this: if I specify a keyword (e.g. interpol) in my console, it should output the people who like that page. I get them with jsoup by passing the page id in a link of the form https://www.facebook.com/plugins/fan.php?connections=100&id=pageid, so for this purpose I need the page id. – chopss Mar 11 '14 at 03:23
  • I want to ask you one more question: is it possible, through APIs such as restfb and facebook4j, to get details (personal info like location, studied at, work, etc.) of users who like a particular page? As far as I know, the only possibility is to access the activities of the current user. – chopss Mar 11 '14 at 03:25
  • Sorry to disappoint you, but I don't know how to use the Facebook API or its libraries yet (I don't even have a Facebook account), so I can't help you with that. If you want to find out whether a link contains `id`, just use `yourLink.contains("&id")` or `yourLink.contains("?id")`, and if it does, fetch the id with `URLEncodedUtils` (like I showed earlier) or use another regex like `[?&]id=(\\d+)` and read its group 1. – Pshemo Mar 11 '14 at 10:26
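
As a rough illustration of the last comment, the `[?&]id=(\\d+)` regex can be applied as in the sketch below. The class name `PageIdFromQuery` and the helper `pageId` are invented for this example, and returning `null` when no `id` parameter is present is an assumption about the desired behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PageIdFromQuery {
    // "id=" preceded by '?' or '&' (so that e.g. "pageid=" is not matched),
    // followed by one or more digits captured in group 1.
    private static final Pattern ID_PARAM = Pattern.compile("[?&]id=(\\d+)");

    // Hypothetical helper: returns the digits of the id parameter,
    // or null if the URL has no such parameter.
    static String pageId(String url) {
        Matcher m = ID_PARAM.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // prints 138986636119722
        System.out.println(pageId(
                "https://www.facebook.com/plugins/fan.php?connections=100&id=138986636119722"));
        // prints null, because this URL has no id parameter
        System.out.println(pageId("https://www.facebook.com/Federer"));
    }
}
```

A caller could try `pageId` first and fall back to the last path segment when it returns `null`, which would cover both kinds of link discussed in this thread.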