1

I have word sequences like

@ ABC
@ ABCCD
@CDSFSF
@SDDSD
@SFSFS

There are 100000 of these words. I need code to remove the @ symbol from every word sequence.

3 Answers

4

You can do this:

str = str.replaceAll("^@", "");

Demo on ideone.
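Note that the sample words have a space after the @ ("@ ABC"), which "^@" alone leaves in place. A minimal sketch of the line-by-line loop, assuming the words are already in a list; the class and method names are hypothetical, and the "\\s*" part is an addition to the pattern above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StripLeadingAt {
    // Removes one leading '@' plus any whitespace that follows it.
    // The "\\s*" part is an assumption added to the answer's "^@" pattern.
    static String strip(String word) {
        return word.replaceAll("^@\\s*", "");
    }

    // Applies strip() to every line, one line at a time.
    static List<String> stripAll(List<String> lines) {
        List<String> result = new ArrayList<>(lines.size());
        for (String line : lines) {
            result.add(strip(line));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("@ ABC", "@ ABCCD", "@CDSFSF");
        System.out.println(stripAll(words)); // prints [ABC, ABCCD, CDSFSF]
    }
}
```

Because "^" only matches at the start of the input by default, each word has to be passed in separately, as in the loop above.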

Sergey Kalinichenko
  • @karu 1000000 is not that large a number, I doubt that you would see issues running this in a loop. [ideone ran 1000000 replacements in 1.38s](http://ideone.com/ueeeIU). – Sergey Kalinichenko Mar 02 '13 at 10:19
  • @Some1.Kill.The.DJ Then he needs to do this in a loop, one line at a time (it looks like he's got one word per line). – Sergey Kalinichenko Mar 02 '13 at 10:19
  • @karu start position can be start of the line **or** first character in a word...what do you want? – Anirudha Mar 02 '13 at 10:26
  • @dasblinkenlight Thanks, got it! The time is spent accessing the corpora. The code works fine. –  Mar 02 '13 at 10:27
  • @Some1.Kill.The.DJ These are my words: @ ആവാം, @ കൊല്ലം, @ ഹൈവേ. I need ആവാം instead of @ ആവാം. The words are aligned line by line. –  Mar 02 '13 at 10:30
1

The simplest way to implement it is, of course, the replaceFirst method:

String exampleValue = "@ CDSFSF";

long start = System.currentTimeMillis();
for (int i = 0; i < 100000 ; i++) {
    exampleValue.replaceFirst("^@\\s+", "");
}
long end = System.currentTimeMillis();
System.out.println(end - start);

It takes about 350 milliseconds on my computer.

But the replaceFirst method compiles a new Pattern instance on every invocation. Compiling the pattern once, outside the loop, avoids that:

import java.util.regex.Pattern;

String exampleValue = "@ CDSFSF";
// Compile once and reuse the same Pattern for every word
Pattern pattern = Pattern.compile("^@\\s+");
long start = System.currentTimeMillis();
for (int i = 0; i < 100000 ; i++) {
    pattern.matcher(exampleValue).replaceFirst("");
}
long end = System.currentTimeMillis();
System.out.println(end - start);

It takes about 150 milliseconds on my computer, which is more than twice as fast.

But if all your cases look like "@ XXXXX", you can write code that finds the first letter in the word and takes the substring starting there:

String exampleValue = "@ CDSFSF";

long start = System.currentTimeMillis();
for (int i = 0; i < 100000 ; i++) {
    char[] array = exampleValue.toCharArray();
    int c = 0;
    for (; c < array.length;c++) {
        if (Character.isLetter(array[c])) {
            break;
        }
    }
    exampleValue.substring(c);
}
long end = System.currentTimeMillis();
System.out.println(end - start);

It takes about 30 milliseconds on my computer, which makes it the fastest of the three.

If I were you, I would use the second solution with the Pattern class, because it is simple and fast.
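The timing loops above discard their results, so it may be worth checking that the variants actually agree. The sketch below wraps them in hypothetical helper methods (the class name, method names and the "\\s*" variant are not from the answer) and runs them on three inputs:

```java
import java.util.regex.Pattern;

public class PrefixVariants {
    // Pattern from the answer: '@' followed by at least one whitespace character.
    private static final Pattern AT_SPACES = Pattern.compile("^@\\s+");
    // Assumed variant: the whitespace after '@' is optional.
    private static final Pattern AT_OPTIONAL_SPACES = Pattern.compile("^@\\s*");

    static String withRequiredSpace(String s) {
        return AT_SPACES.matcher(s).replaceFirst("");
    }

    static String withOptionalSpace(String s) {
        return AT_OPTIONAL_SPACES.matcher(s).replaceFirst("");
    }

    // Manual scan: drop everything before the first letter.
    static String fromFirstLetter(String s) {
        int c = 0;
        while (c < s.length() && !Character.isLetter(s.charAt(c))) {
            c++;
        }
        return s.substring(c);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"@ CDSFSF", "@CDSFSF", "@ 42nd"}) {
            System.out.println(withRequiredSpace(s) + " | "
                    + withOptionalSpace(s) + " | " + fromFirstLetter(s));
        }
        // prints:
        // CDSFSF | CDSFSF | CDSFSF
        // @CDSFSF | CDSFSF | CDSFSF
        // 42nd | 42nd | nd
    }
}
```

Two differences show up: "^@\\s+" leaves "@CDSFSF" untouched because it requires at least one whitespace character after the @, and the manual scan also drops leading digits or punctuation. Which behavior is right depends on the data.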

Michał Ziober
0

To remove the @ at the start of every word, use a lookbehind:

(?<=\s|^)@

In Java, it would be:

str = str.replaceAll("(?<=\\s|^)@", "");
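As a rough illustration of what the lookbehind does when several words share one string, here is a small sketch. The class and method names are hypothetical, and the second method, which also removes the whitespace after each @, is an assumed variant rather than part of the answer:

```java
public class LookbehindDemo {
    // Regex from the answer: removes an '@' that starts the input
    // or directly follows a whitespace character.
    static String stripAt(String text) {
        return text.replaceAll("(?<=\\s|^)@", "");
    }

    // Assumed variant: also removes the whitespace that follows each '@'.
    static String stripAtAndSpaces(String text) {
        return text.replaceAll("(?<=\\s|^)@\\s*", "");
    }

    public static void main(String[] args) {
        System.out.println(stripAt("@ABC @DEF x@y"));          // prints ABC DEF x@y
        System.out.println(stripAtAndSpaces("@ ABC @ ABCCD")); // prints ABC ABCCD
    }
}
```

An @ in the middle of a word, as in "x@y", is left alone because the character before it is neither whitespace nor the start of the input.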
Anirudha