I am trying to duplicate the data below 1 million times and write it to a file.

row1,Test,2.0,1305033.0,3.0,sdfgfsg,2452345,sfgfsdg,asdfgsdfg,Gasdfgfsdgh,sdgh,sdhd sdgh,sdgh,sdgh,,sdhg,,sdgh,,,,,,,sdgh,,,,,,,,,05/12/1954,,,,,,sdghdgsh,sdfhgd,,12/25/1981,,,,12/25/1981,,,,,,,,,,,,,sdgh, dsghgh; sdgh,,,,,1.0,sdfsdf,sfgggf,34f

Each time, I want to update the first column to the record number, so my second row will be

row2,Test,2.0,1305033.0,3.0,sdfgfsg,2452345,sfgfsdg,asdfgsdfg,Gasdfgfsdgh,sdgh,sdhd sdgh,sdgh,sdgh,,sdhg,,sdgh,,,,,,,sdgh,,,,,,,,,05/12/1954,,,,,,sdghdgsh,sdfhgd,,12/25/1981,,,,12/25/1981,,,,,,,,,,,,,sdgh, dsghgh; sdgh,,,,,1.0,asrg,awrgtwag,245sfgsfg

I tried using a StringBuilder, but I am not able to append more than 10,000 rows. The program becomes very slow.

Any suggestions?

I'm fine with writing the code in another language.

Below is the code snippet that prepares the data to write to the file. In my app I'll get the data as Object[].

   private static void writecsv(Map<String, Object[]> data) throws Exception{
            Set<String> keyset = data.keySet();
            StringBuilder sb =new StringBuilder();;
             for(int count=0; count < OUTPUT_RECORD_COUNT;count++)
                {    

                 for (String key : keyset)
                    {
                     Object[] objArr = data.get(key);
                     for (Object obj : objArr)
                        {
                            if(obj ==null)
                                obj=BLANK;
                            sb.append(obj.toString() + COMMA);
                            sb.toString();
                        }    
                     sb.setLength(sb.length()-1);
                     sb.append(NEW_LINE);
                    }
                }
             System.out.print(  sb.toString());             
        }
Brian Tompsett - 汤莱恩
upog
  • Probably you need to use flush(). I'm not sure, but say you flush every 1000 rows. Again, I'm not sure, but I suppose the data is not written to the file but stored in an in-memory buffer. – milandjukic88 Dec 23 '13 at 21:18
  • Please show us your code. – vanje Dec 23 '13 at 21:26
  • Where are you writing to the file? Are you referring to System.out.println (SOP)? Call setLength(0) at regular intervals. – Mani Dec 23 '13 at 21:32
  • My actual requirement is to write to a file; I just printed to check whether the data is processed correctly. – upog Dec 23 '13 at 21:34
  • I am not able to complete even the processing, as the StringBuilder is not able to hold more than 10,000 records. – upog Dec 23 '13 at 21:35
  • You are probably running out of memory. Can you share your code? – Wilson Soethe Cursino Dec 23 '13 at 21:19
  • Maybe it would be better to show us your real code, including your attempt at writing to a file. And where is the part that changes the first column? – vanje Dec 23 '13 at 21:52

4 Answers

1

If you print to System.out directly in your inner for-loop, you won't have to buffer everything in memory in the StringBuilder.
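
A minimal sketch of this idea, assuming the record arrives as an Object[] that holds every column except the leading "rowN" one. The class name StreamRows and the method writeRows are hypothetical names chosen for illustration. Each line is handed to the stream as soon as it is built, so memory use does not grow with the record count.

```java
import java.io.PrintStream;

class StreamRows {
    // Writes `count` copies of `fields` to `out`, each prefixed with "row<N>".
    // Nothing is accumulated: every line goes to the stream immediately.
    static void writeRows(PrintStream out, Object[] fields, int count) {
        // Build the part of the line that never changes only once.
        StringBuilder tail = new StringBuilder();
        for (Object field : fields) {
            tail.append(',').append(field == null ? "" : field.toString());
        }
        String suffix = tail.toString();

        for (int i = 1; i <= count; i++) {
            out.print("row");
            out.print(i);
            out.print(suffix);
            out.print('\n');
        }
        out.flush();
    }

    public static void main(String[] args) {
        // Sample data is made up; a null field becomes an empty column.
        writeRows(System.out, new Object[] {"Test", 2.0, null, "sdgh"}, 3);
    }
}
```

The only StringBuilder left holds a single line, not the whole output.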

Markus
  • Opening and writing data to a file for each record will be costlier. – upog Dec 23 '13 at 21:33
  • Why would you open your file for each record? You can open the file beforehand, write line after line and after the loop close the file. – vanje Dec 23 '13 at 21:55
1

You want to write to a file, but I don't see any OutputStream or FileWriter in your code.

Don't use a StringBuilder as a buffer.

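import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.util.Map;
import java.util.Set;

private static final int OUTPUT_RECORD_COUNT = 1000000;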
private static final String BLANK = "";
private static final String COMMA = ",";
private static final String FILE_ENCODING = "Cp1252"; // Windows-ANSI


/*
 * Creates a String for the fields in array fields by joining 
 * the String values with COMMA separator.
 * First character is also a COMMA because later we will put one field
 * in front of the resulting string.
 */
private static String createLine(Object[] fields) {
  StringBuilder sb = new StringBuilder();
  for(Object field: fields) {
    sb.append(COMMA).append(field == null ? BLANK : field.toString());
  }
  return sb.toString();
}


/*
 * Added the fileName parameter.
 */
private static void writecsv(Map<String, Object[]> data, String fileName) throws Exception {
  Set<String> keyset = data.keySet();

  // Use a
  // - FileOutputStream to write bytes to file
  // - OutputStreamWriter to convert text strings to bytes according to a character encoding
  // - BufferedWriter to use an in-memory buffer for writing to the file
  // - PrintWriter for convenience methods like println()
  PrintWriter out = new PrintWriter(new BufferedWriter(
      new OutputStreamWriter(new FileOutputStream(fileName), FILE_ENCODING)));

  try {
    // It seems each key represents one original line
    for (String key : keyset) {
      // Create each line - at least the part after the "rowX" - only once.
      String line = createLine(data.get(key));

      // And you want every line duplicated OUTPUT_RECORD_COUNT times
      for (int count = 0; count < OUTPUT_RECORD_COUNT; count++) {
        // Put "rowX" in front of every line, where X is the 1-based record number.
        out.print("row");
        out.print(count + 1);
        out.println(line);
      }
    }
  } finally {
    // Close the Writer even in case of an exception (close() also flushes).
    out.close();
  }
}
vanje
0

Ummm, have you tried using bash?

#!/bin/bash
var=1
while [ $var -le 1000000 ]
do
    echo "$var" >> temp
    var=$(( $var + 1 ))
done

I ran the program and it took around a couple of minutes to finish appending 1 million lines.

JoeC
0

Your code is keeping all the data in memory, which is why it cannot scale. Instead, you should open the file beforehand and then write to it line by line.

See, e.g., this answer for a simple example on how to do this.
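
As a rough sketch of that pattern (not taken from the linked answer), the following opens the file once with try-with-resources and writes one line per iteration. The class name LineByLineWriter, the method writeFile, the UTF-8 encoding, and the sample suffix are all assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LineByLineWriter {
    // Opens the file once, writes `count` numbered lines, and closes it
    // automatically when the try block exits, even on an exception.
    static void writeFile(Path target, String suffix, int count) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (int i = 1; i <= count; i++) {
                out.write("row");
                out.write(Integer.toString(i));
                out.write(suffix);
                out.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Writes to a temporary file so the sketch can run anywhere.
        Path target = Files.createTempFile("rows", ".csv");
        writeFile(target, ",Test,2.0,1305033.0", 1_000_000);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

The BufferedWriter batches the small writes, so the file is opened and closed exactly once no matter how many records are produced.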

Also note that when you are serious about writing proper CSV, you should consider using a library for that, such as opencsv. Then things like proper quoting will be handled for you.
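
To illustrate what "proper quoting" involves, here is a hand-rolled sketch following the common RFC 4180 convention: a field containing a comma, a double quote, or a line break is wrapped in double quotes, and embedded double quotes are doubled. This is not opencsv's API; the class CsvQuote and its methods are hypothetical, and a library remains the safer choice for real data.

```java
class CsvQuote {
    // Returns the CSV representation of a single field.
    static String quote(Object field) {
        if (field == null) {
            return "";
        }
        String s = field.toString();
        boolean needsQuotes = s.indexOf(',') >= 0 || s.indexOf('"') >= 0
                || s.indexOf('\n') >= 0 || s.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return s;
        }
        // Double every embedded quote, then wrap the field in quotes.
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    // Joins the quoted fields with commas into one CSV line (no line ending).
    static String toLine(Object[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(fields[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLine(new Object[] {"row1", "a,b", "say \"hi\"", null, 1.0}));
    }
}
```

Without this step, a value such as "a,b" would silently split into two columns when the file is read back.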

Community
Alex Krauss