I'm trying to extract void <init>(java.lang.String) and java.io.File getOutputMediaFileUri(int) from the string below (and strings like it).

specialinvoke $r1.<java.io.File: void <init>(java.lang.String)>($r16)@<com.jpgextractor.PicViewerActivity: java.io.File getOutputMediaFileUri(int)>

Currently, I'm matching against this regex:

/.*<.* (\S+ .*)>.*<.* (\S+ .*)>.*/

This works, except that instead of void <init>(java.lang.String) I just get void (java.lang.String).

I'm a bit mystified as to where <init> has gone off to. I've tried several different ways to coax it back, but so far no luck.

Thanks folks!

  • What is the tool you are using to extract the text? – nhahtdh Sep 15 '13 at 23:04
  • javascript regex built-ins – bcr Sep 15 '13 at 23:04
  • You are getting the result: http://rubular.com/r/uWqCnA7c8N – acdcjunior Sep 15 '13 at 23:04
  • Ah, so it's not displaying in a web page? Needs to be escaped? – bcr Sep 15 '13 at 23:05
  • @bcr: The problem is how you are displaying the text. Can you show the code? – nhahtdh Sep 15 '13 at 23:06
  • Here's the result from a javascript regex tool: http://goo.gl/W71sBX . If you are printing the result in a page, the browser is probably interpreting `<init>` as a tag (**and thus showing nothing, as that tag has no meaning in HTML**). Try outputting the result with `console.log(RESULT)`. – acdcjunior Sep 15 '13 at 23:07
  • Yes, so I need to escape the tags. That does make sense. – bcr Sep 15 '13 at 23:09
  • Should I delete this question then, or update it to reflect the real problem? I'm aware of how to escape the resulting string, but perhaps this could be a useful reference for noobs like me – bcr Sep 15 '13 at 23:10
  • @bcr: Just answer yourself. – JayC Sep 15 '13 at 23:15

1 Answer

The issue had nothing to do with the regex. As acdcjunior and nhahtdh pointed out, the regex was working correctly; the problem was in how the text was displayed. I was inserting the output void <init>(java.lang.String) into a web page as unescaped HTML, so the browser interpreted <init> as an HTML tag and rendered nothing for it. The < and > characters need to be escaped before the string is put into the page.
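To confirm that the regex itself is fine, the match can be logged to the console instead of being written into the page. This is a minimal sketch using the string and regex from the question; the variable names `input`, `pattern`, and `match` are my own.

```javascript
// The sample string from the question.
const input =
  "specialinvoke $r1.<java.io.File: void <init>(java.lang.String)>($r16)" +
  "@<com.jpgextractor.PicViewerActivity: java.io.File getOutputMediaFileUri(int)>";

// The regex from the question, unchanged.
const pattern = /.*<.* (\S+ .*)>.*<.* (\S+ .*)>.*/;

const match = input.match(pattern);

// The console prints the raw string, so <init> is not swallowed as a tag.
console.log(match[1]); // void <init>(java.lang.String)
console.log(match[2]); // java.io.File getOutputMediaFileUri(int)
```

Both capture groups contain the full signatures, so the missing <init> only appears once the string is rendered as HTML.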

See "Fastest method to escape HTML tags as HTML entities?" for information on escaping such tags.
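For reference, one simple way to do the escaping is a chain of replacements. This is only a sketch, not necessarily the fastest method discussed in the linked question; `escapeHtml` is a hypothetical helper name, and it assumes the result goes into element content rather than an attribute value.

```javascript
// Hypothetical helper: replaces the characters that HTML treats as markup.
// "&" must be replaced first so the entities added afterwards are not re-escaped.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const escaped = escapeHtml("void <init>(java.lang.String)");
console.log(escaped); // void &lt;init&gt;(java.lang.String)
```

When the value is written into an existing DOM node, assigning it to the node's `textContent` property instead of `innerHTML` avoids the need for manual escaping altogether.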
