3

I am looking for an efficient way to remove the last n lines from a String. By efficient I mean fast, and also not creating too many objects, so I would like to stay away from split(), especially because my strings can at times be a few hundred or even a few thousand lines long.

For instance, I receive a string such as:

This is a sample code line 1.
This is a sample code line 2.

Warm Regards,
SomeUser.

The last 3 lines (an empty line, "Warm Regards,", and "SomeUser.") are what I am trying to get rid of. Note that the content (including the last 3 lines) isn't fixed.

I am thinking of counting the lines first using this solution: https://stackoverflow.com/a/18816371/1353174 and then using a second, similar loop to find the position where line (lines - n) ends, and taking a substring up to that position.
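The two-pass idea described above could look roughly like this. It is only a sketch under stated assumptions: the names `TwoPassTrimmer` and `removeLastLines` are made up for illustration, the counting loop is a plain character scan rather than the code from the linked answer, lines are assumed to be separated by '\n', and a trailing '\n' is counted as starting one more (empty) line.

```java
final class TwoPassTrimmer {

    /** Hypothetical helper: returns s without its last n lines. */
    static String removeLastLines(String s, int n) {
        if (n <= 0) {
            return s;
        }

        // Pass 1: count the line breaks; the number of lines is one more.
        int breaks = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\n') {
                breaks++;
            }
        }
        int lines = breaks + 1;
        if (n >= lines) {
            return ""; // nothing left to keep
        }

        // Pass 2: walk forward to the line break that ends the last kept line.
        int keep = lines - n;
        int pos = -1;
        for (int i = 0; i < keep; i++) {
            pos = s.indexOf('\n', pos + 1);
        }
        return s.substring(0, pos);
    }
}
```

No intermediate String objects are created apart from the final substring, but the text is scanned up to twice.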

However, I am posting the problem here to find out whether there are other, perhaps more efficient, ways to achieve this. External library-based solutions (like Apache Commons StringUtils) are also welcome.

kpatil
  • You want to stay away from `split`, yet the top answer at the link you provided uses `split`. Maybe because it's been optimised? Would using `split` really be a bottleneck in your program? – AntonH Sep 23 '14 at 19:00
  • No, the link I provided is supposed to take you to an answer by user "Veger" and that does not use split. – kpatil Sep 24 '14 at 04:28
  • Which is why I said **top answer**, not **your answer**. – AntonH Sep 24 '14 at 13:16

4 Answers

3

You can use String.lastIndexOf to find the third-to-last occurrence of the '\n' character and then use String.substring to get the result.

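    public static void main(String[] args) {
        String s = "This is a sample code line 1.\n" +
                "This is a sample code line 2.\n" +
                "\n" +
                "Warm Regards,\n" +
                "SomeUser.";

        // Start just past the end of the string and walk backwards.
        int truncateIndex = s.length();

        // Each pass moves to the previous '\n'; after three passes the index
        // sits on the line break that precedes the last three lines.
        for (int i = 0; i < 3; i++) {
            System.out.println(truncateIndex); // trace the current index
            truncateIndex = s.lastIndexOf('\n', truncateIndex - 1);
        }

        System.out.println(s.substring(0, truncateIndex));
        System.out.println("--");
    }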

This code snippet intentionally ignores corner cases, such as when there are fewer than three lines in the input string, to keep the code simple and readable.
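If the corner cases do matter, the same lastIndexOf loop can be wrapped in a small helper along these lines. This is a sketch, not part of the snippet above: the names `LineTrimmer` and `removeLastLines` are hypothetical, returning an empty string when the input has n or fewer lines is an assumed policy, and dropping a '\r' left over from a Windows line ending is an extra assumption as well.

```java
final class LineTrimmer {

    /**
     * Hypothetical helper: returns s without its last n lines.
     * Returns "" when s has n or fewer lines (an assumed policy).
     */
    static String removeLastLines(String s, int n) {
        if (n <= 0) {
            return s;
        }
        int end = s.length();
        for (int i = 0; i < n; i++) {
            // A negative fromIndex makes lastIndexOf return -1, so an
            // empty string or a leading '\n' is handled here too.
            end = s.lastIndexOf('\n', end - 1);
            if (end < 0) {
                return ""; // ran out of line breaks before removing n lines
            }
        }
        // Assumption: also drop the '\r' of a "\r\n" line ending.
        if (end > 0 && s.charAt(end - 1) == '\r') {
            end--;
        }
        return s.substring(0, end);
    }
}
```

Without the `end < 0` check, an input with too few lines would make lastIndexOf return -1 and the final substring call would throw a StringIndexOutOfBoundsException.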

Aivean
  • Thanks, mate. Your solution is the one I am using. – kpatil Sep 24 '14 at 04:44
  • @searchengine27 I don't quite understand what you mean, but the first part, when you state `If you say s.lastIndexOf('\n', 2000); in a string that only has 3 new line characters, it will always give you the index of the first new line character in the string.` is simply not true: `"\n\n\n".lastIndexOf('\n', 2000)` returns `2`. – Aivean Jul 28 '15 at 21:42
1
public static final String SAMPLE_TEXT = "This is a sample code line 1.\nThis is a sample code line 2.\r\n\nWarm Regards,\r\nSomeUser.";

public static void main (String[] args) throws java.lang.Exception {
    String[] lines = SAMPLE_TEXT.split("\\r?\\n"); // also matches Windows line endings (\r\n)
    for (int i = 0; i < lines.length - 3; i++) {   // lines.length - 3 to discard the last 3 lines
        System.out.println(lines[i]);
    }
}

Here's a runnable example:

http://ideone.com/nwaMcD

  • 1
    He did explicitly say that he wanted to avoid using String.split(). That being said, I cannot imagine what kind of constraints he actually has that would prevent him from using it... – user3062946 Sep 23 '14 at 19:15
  • No hard constraints as such. I am just a bit paranoid about having split() create many String objects. My text could be around a couple of thousand lines per entity. In a worst-case scenario, there would be between 500 and 800 such entities to process, and the process has to finish within 10 minutes. It then has to sleep for 10 minutes, start all over again, and keep doing this 24x7. And all this would be happening on a very demanding server. – kpatil Sep 24 '14 at 04:31
0
  @scala.annotation.tailrec
  def rmLines(in: String, nlines: Int): String =
    if (nlines == 0) {
      in
    } else {
      val lastBreakIndex = in.lastIndexOf('\n')
      if (lastBreakIndex == -1) {
        in
      } else {
        rmLines(in.substring(0, lastBreakIndex), nlines - 1)
      }
    }

solyd
-2

Use regular expressions to do it: http://docs.oracle.com/javase/tutorial/essential/regex/
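One way the regular-expression idea might be sketched is shown below. The names `RegexLineTrimmer` and `removeLastLines` are hypothetical, and the pattern is just one possible choice: it matches n repetitions of "a line break followed by the rest of that line", anchored at the end of the input with `\z`. Note the assumed behaviour when the input has fewer than n line breaks: nothing matches and the string comes back unchanged.

```java
import java.util.regex.Pattern;

final class RegexLineTrimmer {

    /** Hypothetical helper: removes the last n lines of s using a regex. */
    static String removeLastLines(String s, int n) {
        if (n <= 0) {
            return s;
        }
        // (?:\r?\n[^\n]*){n}\z : n line breaks, each followed by the text
        // of its line, ending exactly at the end of the input.
        Pattern lastLines = Pattern.compile("(?:\\r?\\n[^\\n]*){" + n + "}\\z");
        return lastLines.matcher(s).replaceFirst("");
    }
}
```

The regex engine searches from the start of the string and may retry at many line breaks before it finds the match at the end, so on long inputs this is probably slower than walking backwards with lastIndexOf.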

Kraal