I have a function that reads a tab-delimited file, puts each column in a list, and returns a list of lists holding all the values from each column. It works fine for the small test file I used, with 1 column and 1850 rows, but I am now trying it with ~30k columns and it has been running for a few hours and still has not finished.
How can I modify the code below to make this faster? If reading a file of 30k rows with 1850 columns is faster, I can also transpose the input files.
    public static List<List<String>> readTabDelimited(String filepath) {
        List<List<String>> allColumns = new ArrayList<List<String>>();
        try {
            BufferedReader buf = new BufferedReader(new FileReader(filepath));
            String lineJustFetched = null;
            for (;;) {
                lineJustFetched = buf.readLine();
                if (lineJustFetched == null) {
                    break;
                }
                lineJustFetched = lineJustFetched.replace("\n", "").replace("\r", "");
                for (int i = 0; i < lineJustFetched.split("\t").length; i++) {
                    try {
                        allColumns.get(i).add(lineJustFetched.split("\t")[i]);
                    } catch (IndexOutOfBoundsException e) {
                        List<String> newColumn = new ArrayList<String>();
                        newColumn.add(lineJustFetched.split("\t")[i]);
                        allColumns.add(newColumn);
                    }
                }
            }
            buf.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return allColumns;
    }
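One likely cause of the slowdown is that the inner loop calls `lineJustFetched.split("\t")` again in the loop condition and in the loop body for every field, so a line with ~30k fields gets re-split tens of thousands of times. Below is a minimal sketch, not a definitive fix, that splits each line once and grows the column list with a size check instead of catching `IndexOutOfBoundsException`. The class name `TabDelimitedReader` and the helper `readColumns` are hypothetical names introduced for illustration. The sketch keeps the original `split("\t")` semantics, so trailing empty fields are still dropped, and it assumes exceptions may be propagated to the caller rather than printed.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is an assumption for this sketch.
class TabDelimitedReader {

    // Reads tab-delimited text and returns one list per column.
    static List<List<String>> readColumns(Reader source) throws IOException {
        List<List<String>> allColumns = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader buf = new BufferedReader(source)) {
            String line;
            // readLine() already strips the line terminator.
            while ((line = buf.readLine()) != null) {
                // Split once per line instead of once (or more) per field.
                String[] fields = line.split("\t");
                // Add any missing columns up front with a plain size check.
                while (allColumns.size() < fields.length) {
                    allColumns.add(new ArrayList<>());
                }
                for (int i = 0; i < fields.length; i++) {
                    allColumns.get(i).add(fields[i]);
                }
            }
        }
        return allColumns;
    }

    // Same entry point as in the question, delegating to readColumns.
    static List<List<String>> readTabDelimited(String filepath) throws IOException {
        return readColumns(new FileReader(filepath));
    }
}
```

Calling `TabDelimitedReader.readTabDelimited(filepath)` should return the same column lists as the original function for well-formed input. Each line is now split exactly once, so the work per line grows linearly with the number of fields instead of quadratically.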
`List<List<String>>` is a definite sign of object phobia...
– Boris the Spider Feb 21 '16 at 15:43