2

In my program I want to read a PL/SQL file and delete the comments that start with --.
I put every comment on its own line so I can delete that specific line (sometimes the code and the comment are on the same line; that's why I replace "--" with "\n--").
I exported my program to a JAR file and it works fine on my desktop, but on another computer (reading different PL/SQL files) it gives me a Java heap space error, even when I try

java -Xmx256m -jar myjar.jar

The error is:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)

Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Unknown Source)
    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuffer.append(Unknown Source)
    at ParserDB.ScriptNoComment(ParserDB.java:142)
    at ParserDB.GetTheName(ParserDB.java:54)
    at Rapport.SearchCcInDB(Rapport.java:189)
    at Rapport.listDB(Rapport.java:77)
    at Rapport.main(Rapport.java:472)
    ... 5 more

My code is:

public static String ScriptNoComment(String fileName){
    String result = null ;      
    try{
        FileInputStream fstream = new FileInputStream(fileName);
        DataInputStream in = new DataInputStream(fstream);
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        StringBuffer strOut = new StringBuffer();
        StringBuilder Out = new StringBuilder();
        String strLine;

         while ((strLine = br.readLine()) != null)   {

            if(strLine.contains("--")){
                strLine = strLine.replaceAll("--","\n--");
            }
            strOut.append(strLine+"\n");
        }

        in.close();   
        //delete comment
        String[] lines = strOut.toString().split("\\n");
        for(String s: lines){
            if(s.contains("--")){
                s="";
            }
            Out.append(s+"\n");
        }

        result = Out.toString();
        result = result.toUpperCase();      
        result = result.replaceAll("\"", "");
        result = result.replaceAll("\\r\\n|\\r|\\n", " ");
        result = result.replaceAll("\\s+", " ");

    } catch (Exception e) {
        System.err.println("Error: " + e.getMessage());
    }

    return result ;

}

Is there any way to optimize my code? Thanks in advance.

EDIT
1) I checked the heap size on the other computer with this command:

java -XX:+PrintFlagsFinal -version | findstr /i "HeapSize PermSize ThreadStackSize"

The result was min: 16M and max: 256M, so I should pass -Xmx512m to java -jar instead of -Xmx256m.

2) I removed (just as a test) the StringBuilder and all the replaceAll calls and still got the same error, because my file was too big.

So what I did is count the lines of each file I'm reading and (depending on the line count) read, for example, only the first 50 lines and apply my methods to only those 50 lines, as in the sketch below.
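A minimal sketch of that workaround (the method name is made up, the limit of 50 is just the arbitrary value mentioned above, and the usual java.io imports are needed):

public static String readFirstLines(String fileName, int maxLines) throws IOException {
    // Hypothetical helper: reads at most maxLines lines instead of the whole file.
    BufferedReader br = new BufferedReader(new FileReader(fileName));
    StringBuilder out = new StringBuilder();
    try {
        String line;
        int count = 0;
        while (count < maxLines && (line = br.readLine()) != null) {
            out.append(line).append('\n');
            count++;
        }
    } finally {
        br.close();
    }
    return out.toString();
}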

Thank you all for your answers.

maryam
  • 147
  • 3
  • 11
  • Maybe just give it more heap space? -Xmx2g e.g. – Uwe Allner Apr 19 '16 at 10:01
  • Better use the Stream approach. If your text is large, it will consume a lot of memory because each operation allocates a `new String()`. If you are in trouble when reading the file, you will be in bigger trouble with the regexp operations. – gaborsch Apr 19 '16 at 10:03
  • 2
    This program is way too complicated. At the end, you have a StringBuffer containing the input, a String array with the same content, and a StringBuilder which also contains everything but the comments. That's three times the memory requirement. Instead, you could just read the file line by line, check whether the line contains a comment (or is entirely a comment), and omit/shorten it if so. You could also do the replaces on each line, so you don't need them on the result. Doing that, your memory requirement will be greatly reduced. – Erich Kitzmueller Apr 19 '16 at 10:06
  • @UweAllner is there any harm in using -Xmx2g? I mean, can java -Xmx2g -jar myjar.jar freeze my computer? – maryam Apr 19 '16 at 10:08
  • @ammoQ yes, but sometimes I have the code and the comment on the same line; that's why I'm doing "\n--" – maryam Apr 19 '16 at 10:12
  • Why are you only removing comments with "--"? There can also be comments with the "/* */" syntax. – vl4d1m1r4 Apr 19 '16 at 10:48
  • @maryam: In such cases, you can shorten the string using `strLine=strLine.substring(0, strLine.indexOf("--"))` – Erich Kitzmueller Apr 19 '16 at 12:30
  • Use a `FileOutputStream` instead of `StringBuffer` + `StringBuilder`. – Dávid Horváth Apr 19 '16 at 13:24

3 Answers

2

If you have Java 8, you can try this code, which edits the lines inline as you process them:

public static String scriptNoComment(String fileName) {

  Path filePath = Paths.get(fileName);
  try (Stream<String> stream = Files.lines(filePath)) {

    List<String> linesWithNoComments = new ArrayList<String>();

    stream.forEach(line -> {

      if (line.startsWith("--")) {
        return;
      }

      String currentLine = line;

      int commentStartIndex = line.indexOf("--");
      if (commentStartIndex != -1) {
        currentLine = line.substring(0, commentStartIndex);
      }

      currentLine = currentLine.toUpperCase();
      currentLine = currentLine.replaceAll("\"", "");
      currentLine = currentLine.replaceAll("\\r\\n|\\r|\\n", " ");
      currentLine = currentLine.replaceAll("\\s+", " ").trim();

      if (currentLine.isEmpty()) {
        return;
      }

      linesWithNoComments.add(currentLine);

    });

    return String.join("\n", linesWithNoComments);

  } catch (IOException e) {
    e.printStackTrace(System.out);
    return "";
  }
}

If Java 8 is not an option, you can use Apache Commons' StringUtils.join and FileUtils.lineIterator to achieve the same result, as in the sketch below. Hope this solves the problem.
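A rough sketch of that Commons-based variant, assuming commons-io and commons-lang3 are on the classpath (the method name is illustrative):

public static String scriptNoCommentPreJava8(String fileName) throws IOException {
  // Requires org.apache.commons.io.FileUtils / LineIterator and
  // org.apache.commons.lang3.StringUtils.
  List<String> kept = new ArrayList<String>();
  LineIterator it = FileUtils.lineIterator(new File(fileName), "UTF-8");
  try {
    while (it.hasNext()) {
      String line = it.nextLine();
      int commentStartIndex = line.indexOf("--");
      if (commentStartIndex != -1) {
        line = line.substring(0, commentStartIndex); // keep only the code before the comment
      }
      line = line.toUpperCase().replace("\"", "").replaceAll("\\s+", " ").trim();
      if (!line.isEmpty()) {
        kept.add(line);
      }
    }
  } finally {
    LineIterator.closeQuietly(it);
  }
  return StringUtils.join(kept, "\n");
}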

EDIT

Following Nicolas Filotto's suggestion, I added writing to a file after a certain number of processed lines (the batch size was chosen arbitrarily). I tested both methods: the first one fails with files whose size is close to the heap size (joining the lines into a single String has the same problem as the OP's code). With the second approach I tested a 2 GB file, and after about 2 minutes of execution I had the ${fileName}_noComments file next to the input file.

public static int LINES_BATCH = 10000;

private static void scriptNoComment(String fileName) {

  Path filePath = Paths.get(fileName);
  try (Stream<String> stream = Files.lines(filePath); BufferedWriter fileOut = getFileOutWriter(fileName)) {

    List<String> linesWithNoComments = new ArrayList<String>();

    stream.forEach(line -> {

      if (line.startsWith("--")) {
        return;
      }

      String currentLine = line;

      int commentStartIndex = line.indexOf("--");
      if (commentStartIndex != -1) {
        currentLine = line.substring(0, commentStartIndex);
      }

      currentLine = currentLine.toUpperCase();
      currentLine = currentLine.replaceAll("\"", "");
      currentLine = currentLine.replaceAll("\\r\\n|\\r|\\n", " ");
      currentLine = currentLine.replaceAll("\\s+", " ").trim();

      if (currentLine.isEmpty()) {
        return;
      }

      linesWithNoComments.add(currentLine);

      // write out a full batch and clear the list to keep memory usage bounded
      if (linesWithNoComments.size() >= LINES_BATCH) {
        writeCurrentBatchToFile(fileOut, linesWithNoComments);
      }

    });

    // flush the last partial batch so the tail of the file is not lost
    writeCurrentBatchToFile(fileOut, linesWithNoComments);

  } catch (IOException e) {
    e.printStackTrace(System.err);
  }
}

private static BufferedWriter getFileOutWriter(String fileName) {
  BufferedWriter fileOut;
  try {
    fileOut = new BufferedWriter(new FileWriter(fileName + "_noComments", false));
    return fileOut;
  } catch (IOException e) {
    throw new RuntimeException("Error while creating out writer", e);
  }
}

private static void writeCurrentBatchToFile(BufferedWriter fileOut, List<String> linesWithNoComments) {
  try {

    for (String line : linesWithNoComments) {
      fileOut.write(line + " ");
    }

    linesWithNoComments.clear();
  } catch(IOException e) {
    throw new RuntimeException("Unable to write lines to file", e);
  }
}
vl4d1m1r4
  • 1,688
  • 12
  • 21
  • `ArrayList` is not better than a `StringBuilder`. Better use a [StringWriter](https://docs.oracle.com/javase/7/docs/api/java/io/StringWriter.html) with an initial capacity of the filesize. – gaborsch Apr 19 '16 at 12:09
2

Assuming that your PL/SQL file is huge, your problem is probably due to the fact that you load the entire file into memory, which is not a good approach in this case. You should read it line by line and write the result to a temporary file instead of returning the content as a String.

It is a little more complex to write, but it is a much more scalable approach. Indeed, let's say that today you increase your heap size to 4 GB; if tomorrow the file is twice as big, will you double your heap size again?
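For illustration, a minimal sketch of that line-by-line approach, reusing the same "--" rule and clean-up steps as the question (the method and output file names are made up; the usual java.io imports are needed):

public static File scriptNoCommentToFile(String fileName) throws IOException {
    // Hypothetical output file next to the input; a real temp file would work too.
    File out = new File(fileName + "_noComments.tmp");
    BufferedReader reader = new BufferedReader(new FileReader(fileName));
    BufferedWriter writer = new BufferedWriter(new FileWriter(out));
    try {
        String line;
        while ((line = reader.readLine()) != null) {
            int commentStartIndex = line.indexOf("--");
            if (commentStartIndex != -1) {
                line = line.substring(0, commentStartIndex); // drop only the comment part
            }
            line = line.toUpperCase().replace("\"", "").replaceAll("\\s+", " ").trim();
            if (!line.isEmpty()) {
                writer.write(line);
                writer.write(' '); // same single-line output as the original method
            }
        }
    } finally {
        reader.close();
        writer.close();
    }
    return out;
}

This way only one line is held in memory at a time, regardless of the file size.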

Nicolas Filotto
  • 43,537
  • 11
  • 94
  • 122
1

You are using:

    strLine = strLine.replaceAll("--","\n--");

and then you are writing to a StringBuffer and then to a StringBuilder.

Since you just want to remove these comments, replace

    if(strLine.contains("--")){
        strLine = strLine.replaceAll("--","\n--");
     }
    strOut.append(strLine+"\n");

with

    int chk = strLine.indexOf("--");
    if (chk != -1) {
        strLine = strLine.substring(0, chk);
    }
    Out.append(strLine + "\n");

Hopefully this solves your problem: you won't be using the intermediate StringBuffer, so you will use less memory.