1

I need to take a String and delete every substring in it that starts with the character '[' and ends with the character ']'.

I don't know how to tackle this problem. I tried converting the String to a character array, overwriting everything from each opening '[' to its closing ']' with blank characters, and then converting it back to a String using the toString() method.

My code:

char[] lyricsArray = lyricsParagraphElements.get(1).text().toCharArray();
for (int i = 0; i < lyricsArray.length; i++)
{
    if (lyricsArray[i] == '[')
    {
        lyricsArray[i] = ' ';
        for (int j = i + 1; j < lyricsArray.length; j++)
        {
            if (lyricsArray[j] == ']')
            {
                lyricsArray[j] = ' ';
                i = j + 1;
                break;
            }
            lyricsArray[j] = ' ';
        }
    }
}
String songLyrics = lyricsArray.toString();
System.out.println(songLyrics);

But when I print songLyrics, I get weird output like this:

[C@71bc1ae4
[C@6ed3ef1
[C@2437c6dc
[C@1f89ab83
[C@e73f9ac
[C@61064425
[C@7b1d7fff
[C@299a06ac
[C@383534aa
[C@6bc168e5

I guess there is a simple method for this. Any help would be much appreciated.

For clarification: I want to convert "abcd[dsadsadsa]efg[adf%@1]d" into "abcdefgd".

God

5 Answers

3

Or simply use a regular expression to replace all occurrences of \\[.*?\\] with nothing:

String songLyrics = text.replaceAll("\\[.*?\\]", "");

Where text is, of course:

String text = lyricsParagraphElements.get(1).text();

What does \\[.*?\\] mean?

The first parameter of replaceAll is a string describing a regular expression. A regular expression defines a pattern to match in a string.

So let's split it up:

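\\[ matches exactly the character [. Since [ has a special meaning within a regular expression, it needs to be escaped with a backslash, and that backslash must itself be escaped inside a Java string literal (hence the double backslash).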

. matches any character (except line terminators, by default). Combined with the lazy zero-or-more quantifier *?, it matches as few characters as possible until it finally finds:

\\], which matches the character ]. Note the escaping again.
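
To make the lazy versus greedy difference concrete, here is a minimal sketch using the example input from the question. The class and method names are made up for illustration; only String.replaceAll is taken from the answer above.

```java
class BracketRegexDemo {
    // Lazy quantifier (*?): each [...] group is matched and removed on its own.
    static String stripLazy(String text) {
        return text.replaceAll("\\[.*?\\]", "");
    }

    // Greedy quantifier (*): one match runs from the first '[' to the last ']'.
    static String stripGreedy(String text) {
        return text.replaceAll("\\[.*\\]", "");
    }

    public static void main(String[] args) {
        String input = "abcd[dsadsadsa]efg[adf%@1]d";
        System.out.println(stripLazy(input));   // prints abcdefgd
        System.out.println(stripGreedy(input)); // prints abcdd
    }
}
```

The greedy version also swallows the text between two bracketed groups, which is why the lazy form is the one that fits the question.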

Tim
  • Can you explain what the first parameter `\\[.\\]` means? – God Mar 21 '16 at 16:22
  • Given the example input/output, you should use the ? quantifier on `.*`. – Andy Turner Mar 21 '16 at 16:30
  • @AndyTurner Please explain. – Tim Mar 21 '16 at 16:32
  • @Tim greedy matching changes `abcd[dsadsadsa]efg[adf%@1]d` to `abcdd`, not the desired output. – Andy Turner Mar 21 '16 at 16:34
  • That's more accurate but still not perfect. Fortunately I don't need it to be perfect. It's not perfect because it turns `"[Chorus 2x - Eminem:]` into `"2x"`, but as I said, that's OK for me. Thanks. – God Mar 21 '16 at 16:49
  • @God It shouldn't. I think there is another error in your code. – Tim Mar 21 '16 at 16:51
  • @God If I execute `System.out.println("a[Chorus 2x - Eminem:]b".replaceAll("\\[.*?\\]", ""));` it will print `ab`. So this should work for you. Can you maybe post your code on [ideone](https://ideone.com/)? – Tim Mar 21 '16 at 16:53
  • @Tim It's fine. My code is too big and complicated. But I got what I wanted and that's what matters. Thanks again. – God Mar 21 '16 at 16:55
2

Calling toString() on a char[] does not return its contents. It returns the default Object representation (the type name plus a hash code, such as [C@71bc1ae4), and that is what the code below prints.

String songLyrics = lyricsArray.toString();
System.out.println(songLyrics);

Replace the two lines above with:

String songLyrics = new String(lyricsArray);
System.out.println(songLyrics);
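
As a small illustration of the difference, the sketch below prints the same array both ways. The class and method names are hypothetical, and the hash code after [C@ varies from run to run. String.valueOf(char[]) is shown as an equivalent alternative to the constructor.

```java
class CharArrayToStringDemo {
    // Builds a String from the array's contents.
    static String viaConstructor(char[] chars) {
        return new String(chars);
    }

    // Equivalent alternative: String.valueOf(char[]) also copies the contents.
    static String viaValueOf(char[] chars) {
        return String.valueOf(chars);
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c'};
        // Default Object.toString(): "[C@" followed by a hash code.
        System.out.println(chars.toString());
        System.out.println(viaConstructor(chars)); // prints abc
        System.out.println(viaValueOf(chars));     // prints abc
    }
}
```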

Ideone1

Another way, without converting to a char array and back to a String:

String lyricsParagraphElements = "asdasd[asd]";

String songLyrics = lyricsParagraphElements.replaceAll("\\[.*?\\]", "");

System.out.println(songLyrics);

Ideone2

FallAndLearn
  • Yes, that's good. But is it really necessary to convert the `String` into a character array, delete, and then convert back? Is there no method in `java` that can do it in one line, like someone suggested: ` lyricsParagraphElements.get(1).text().replaceAll("[\\[\\]]", " ")`? – God Mar 21 '16 at 16:21
  • @FallAndLearn Little unnecessary to clone my answer. It isn't even correct: `lyricsParagraphElements` isn't of type `String`. – Tim Mar 21 '16 at 16:30
  • @Tim Hi. I didn't copy your answer. In fact I was the first one to post, and then he asked about the other way. The replaceAll() method is the simplest and most concise solution to the above problem. – FallAndLearn Mar 21 '16 at 16:32
  • @FallAndLearn Which you edited in 4 minutes after mine. Anyway, it isn't correct. – Tim Mar 21 '16 at 16:33
  • See the first comment on my answer. He asked another way to come to the required solution. – FallAndLearn Mar 21 '16 at 16:35
  • I just took an example string variable name as lyricsParagraphElements. It can be anything. – FallAndLearn Mar 21 '16 at 16:36
  • Which was posted after mine. However, the answer is still wrong; I guess that `lyricsParagraphElements` is a `Collection`. Besides, your regex matches greedily, which means that everything between the first `[` and the last `]` will be removed. Consider a string with multiple occurrences: `123[a]4[b]5[c]67[d]8[e]9` will become `1239` instead of `123456789`. I made the same mistake though. Andy Turner noted the error. – Tim Mar 21 '16 at 16:39
  • @Tim You are right, Tim. The `replaceAll("\\[.*\\]", "");` is not working. – God Mar 21 '16 at 16:45
  • I checked it. Above will only replace the characters inside first []. Thanks all. – FallAndLearn Mar 21 '16 at 16:55
1

You are getting "weird stuff" because you are printing the string representation of the array, not converting the array to a String.

Instead of lyricsArray.toString(), use

new String(lyricsArray);

But if you do this, you will find that you are not actually removing characters from the string, just replacing them with spaces.

Instead, you can shift all of the characters left in the array, and construct the new String only up to the right number of characters:

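int src = 0, dst = 0;
while (src < lyricsArray.length) {
  // Copy characters outside brackets towards the front of the array.
  while (src < lyricsArray.length && lyricsArray[src] != '[') {
    lyricsArray[dst++] = lyricsArray[src++];
  }
  if (src < lyricsArray.length) {
    // Skip the '[' and everything up to and including the next ']'.
    ++src;
    while (src - 1 < lyricsArray.length && lyricsArray[src - 1] != ']') {
      src++;
    }
  }
}
// Only the first dst characters hold the kept text.
String lyricsString = new String(lyricsArray, 0, dst);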
Andy Turner
1

You're printing a char[], and a Java char[] does not override toString(). Also, a Java String is immutable, but Java does have StringBuilder, which is mutable (and StringBuilder.delete(int, int) can remove arbitrary substrings). You could use it like this:

String songLyrics = lyricsParagraphElements.get(1).text();
StringBuilder sb = new StringBuilder(songLyrics);
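int p = 0;
while ((p = sb.indexOf("[", p)) >= 0) {
    int e = sb.indexOf("]", p + 1);
    if (e < 0) {
        break; // no closing bracket left, so nothing more to remove
    }
    // After the delete, the following text shifts left to index p,
    // so the next search restarts at p and adjacent groups are not skipped.
    sb.delete(p, e + 1);
}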
System.out.println(sb);
Elliott Frisch
1

This regex string matches your example exactly:

\\[([\\w\\%\\@]+)\\]

It gets harder when the plain string contains special symbols. I couldn't find a shorter regex that avoids listing each special symbol explicitly. Reference: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#cg

================

I read your new case: if the string contains the symbol "-" or any other symbol from !"#$%&'()*+,-./:;<=>?@\^_`{|}~, add it (with the prefix "\\") after \\@ in my regex string.
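
A different technique from the whitelist above, sketched here only as an alternative: a negated character class can stand for "any character except ]", so no symbol has to be listed by hand. This assumes the brackets are never nested; the class and method names are hypothetical.

```java
class NegatedClassDemo {
    static String stripBracketed(String text) {
        // \[      a literal '['
        // [^\]]*  zero or more characters that are not ']'
        // \]      a literal ']'
        return text.replaceAll("\\[[^\\]]*\\]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripBracketed("abcd[dsadsadsa]efg[adf%@1]d")); // prints abcdefgd
        System.out.println(stripBracketed("a[Chorus 2x - Eminem:]b"));     // prints ab
    }
}
```

Unlike . in the lazy pattern, the negated class also matches line breaks, so a bracketed group that spans two lines would be removed as well.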

  • So changing `String songLyrics = text.replaceAll("\\[.*?\\]", "");` to `String songLyrics = text.replaceAll("\[([\w\%\@]+)\]", "");`? will make it perfect? because `Eclipse` saying it's invalid escape sequence. – God Mar 21 '16 at 17:07
  • @God: I don't know why \\ (two backslashes) was replaced with \ (one backslash). I've corrected it. Please remember it's always \\, not \. – cuong hoang Mar 21 '16 at 17:13
  • @God: I think Stack Overflow confused me by automatically replacing two backslashes with one. Try the string I quoted in the code tag. – cuong hoang Mar 21 '16 at 17:26
  • Please refer to http://stackoverflow.com/questions/36205033/mystring-replaceallregex-not-working-as-expected. Thank you. – God Mar 24 '16 at 16:20