3

Hey, I need to read a text file in Java. The problem is that the file has the following format:

Id time1 time2 time3 ...
ID2 time1 time2 time3 ...

I need to be able to first read all the IDs, then all the time1 values, then all the time2 values, and so on. Can anyone give me some hints on how to do this in Java? Efficiency is important here, since this needs to be done thousands of times, and that is my problem. Thanks in advance for your help.

Vincent Ramdhanie
tzer
  • Please see Google for approximately 1 billion examples of how to read in a file line-by-line in java. Or search SO. – Richard H Apr 19 '11 at 12:10
  • The problem is efficiency, I have already naively implemented this reading line by line and getting to the specified timer but it is taking quite long. – tzer Apr 19 '11 at 12:12
  • @Richard I don't think his question had to do with reading in a text file, but reading a text file of that particular structure efficiently... – Diego Apr 19 '11 at 12:12
  • @tzer: you can only read in a file as fast as your disk-access will allow. AFAIK you can't really do better than BufferedReader or whatever. – Richard H Apr 19 '11 at 12:18
  • http://stackoverflow.com/questions/4716503/best-way-to-read-a-text-file , http://stackoverflow.com/questions/2714385/read-text-file-in-java , http://stackoverflow.com/questions/2864117/read-data-from-a-text-file-using-java , ... – Joris Meys Apr 19 '11 at 12:45
  • @tzer: If the problem is efficiency, why not add your code and see if people can optimize it? – Joris Meys Apr 19 '11 at 12:48

5 Answers

2

Transpose the file: IDs on line 1, time1 values on line 2, and so on. Of course, this only pays off if the transposition can be done once and the transposed file is then read many times.
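A minimal sketch of such a one-off transposition, assuming whitespace-separated fields, the same number of fields on every line, and a file small enough to hold in memory while it is rewritten. The class and method names (`FileTransposer`, `transpose`, `transposeFile`) are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class FileTransposer {

    // Turns rows into columns: result.get(k) holds field k of every input line.
    // Assumes every non-blank line has the same number of fields.
    static List<List<String>> transpose(List<String> lines) {
        List<List<String>> columns = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("\\s+");
            for (int k = 0; k < fields.length; k++) {
                if (k == columns.size()) {
                    columns.add(new ArrayList<>());
                }
                columns.get(k).add(fields[k]);
            }
        }
        return columns;
    }

    // Reads 'in', writes the transposed table to 'out', one former column per line.
    static void transposeFile(Path in, Path out) throws IOException {
        List<List<String>> columns =
                transpose(Files.readAllLines(in, StandardCharsets.UTF_8));
        try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (List<String> column : columns) {
                writer.write(String.join(" ", column));
                writer.newLine();
            }
        }
    }
}
```

After this step, line 1 of the output holds all the IDs and line k+1 holds all the values of time k, so each later read is a single `readLine()` followed by a split.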

Bozho
2

The simplest way would be to read the whole file line by line once, parsing the lines as you go - then you can very easily get "all the IDs" followed by "all the first times" etc.

If the file is too large to do that, you may want to consider writing a tool to change the file structure - open up several files for writing (one per column) then you can read an input line, write the output data to each file, move onto the next line etc. You can do this once and then read each file as and when you need it.
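A rough sketch of such a splitting tool, assuming whitespace-separated fields and a column count small enough that one open writer per column stays within the operating system's open-file limit. The class name `ColumnSplitter` and the output naming scheme `col<k>.txt` are invented for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical tool: splits a row-oriented file into one file per column.
class ColumnSplitter {

    // Writes field k of every input line to outDir/col<k>.txt, one value per line.
    // Returns the created files in column order (index 0 is the id column).
    static List<Path> split(Path input, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        List<Path> paths = new ArrayList<>();
        List<BufferedWriter> writers = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split("\\s+");
                for (int k = 0; k < fields.length; k++) {
                    if (k == writers.size()) {
                        // First time this column is seen: open its output file.
                        Path p = outDir.resolve("col" + k + ".txt");
                        paths.add(p);
                        writers.add(Files.newBufferedWriter(p, StandardCharsets.UTF_8));
                    }
                    BufferedWriter writer = writers.get(k);
                    writer.write(fields[k]);
                    writer.newLine();
                }
            }
        } finally {
            for (BufferedWriter writer : writers) {
                writer.close();
            }
        }
        return paths;
    }
}
```

Only one input line is held in memory at a time, so this should work for files that are too large to load whole. Reading "all the time2 values" later means reading just `col2.txt`.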

Jon Skeet
2

One solution is to parse the file once and build an index of the position of each id in the file. Then you can reposition the reading 'cursor' to any id as needed.

EDIT

This solution is practical if the whole file content cannot be loaded into memory. To limit the number of physical reads, an LRU cache keeping the most recently read or used id-times combinations could improve performance.
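One way this could look, as a sketch only: it assumes an ASCII file with unique ids, uses `RandomAccessFile` for the seekable 'cursor', and a `LinkedHashMap` in access order as the LRU cache. The class name `IdIndex` and its methods are hypothetical. Note that `RandomAccessFile.readLine()` is unbuffered, so the initial indexing pass is simple rather than fast.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical index: maps each id to the byte offset of its line.
class IdIndex implements AutoCloseable {
    private final RandomAccessFile file;
    private final Map<String, Long> offsets = new HashMap<>();
    private final Map<String, String[]> cache;

    IdIndex(String path, final int cacheSize) throws IOException {
        file = new RandomAccessFile(path, "r");
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        cache = new LinkedHashMap<String, String[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String[]> eldest) {
                return size() > cacheSize;
            }
        };
        // Single pass over the file: remember where each line starts.
        long start = 0;
        String line;
        while ((line = file.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                offsets.put(trimmed.split("\\s+", 2)[0], start);
            }
            start = file.getFilePointer();
        }
    }

    // Returns the fields of the line for 'id' (id first, then the times),
    // or null if the id is not in the index.
    String[] read(String id) throws IOException {
        String[] cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        Long offset = offsets.get(id);
        if (offset == null) {
            return null;
        }
        file.seek(offset);
        String[] fields = file.readLine().trim().split("\\s+");
        cache.put(id, fields);
        return fields;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Only the id-to-offset map and at most `cacheSize` parsed lines stay in memory; everything else is fetched from disk on demand with `seek`.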

Jérôme Verstrynge
1

We can't read files column by column. Read the whole file into memory (with a FileReader from java.io, or with java.nio) and parse the content (String#split on each line) into a data structure like

Map<String, List<String>>

where the map's key is the id (ID, ID2, ...) and the value is a simple list that contains all the time values for that id.
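A small sketch of that structure, assuming whitespace-separated fields and unique ids; the class `TimesTable` and its methods are hypothetical names. A `LinkedHashMap` is used here so that iteration follows the order of the lines in the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that loads the whole file into a map of id -> times.
class TimesTable {

    // Parses every line once; the first field is the id, the rest are the times.
    static Map<String, List<String>> parse(BufferedReader reader) throws IOException {
        Map<String, List<String>> table = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] fields = line.trim().split("\\s+");
            if (fields[0].isEmpty()) {
                continue; // blank line
            }
            table.put(fields[0], Arrays.asList(fields).subList(1, fields.length));
        }
        return table;
    }

    // Collects time column k (0-based, not counting the id) across all ids,
    // in file order. Assumes every id has at least k + 1 time values.
    static List<String> column(Map<String, List<String>> table, int k) {
        List<String> values = new ArrayList<>(table.size());
        for (List<String> times : table.values()) {
            values.add(times.get(k));
        }
        return values;
    }
}
```

With the table in memory, "all the IDs" is `table.keySet()` and "all the time1 values" is `column(table, 0)`, with no further disk access.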

Andreas Dolk
0

If you're on a Linux/UNIX platform, you could do some preprocessing with the cut command, which extracts selected fields (columns) from every line of a file.

cvh