
I need some help parsing a badly designed CSV (comma-separated values) file. The file contains current meteorological data and is updated every 2.5 s. Here is the structure of the file:

1.00 - Csv File Version ID (XX.XX) Floating Point
2012 - Year (yyyy format) Integer
10 - Month Integer
31 - Day Integer
10 - Hour (24-hour format) Integer
58 - Minute Integer
45 - Second Integer
2 - Wind speed 10min average (mph) Floating Point
3 - Wind speed (mph) Floating Point
103 - Wind Direction(degrees) Floating Point
48 - Inside Humidity (%) Floating Point
91 - Outside Humidity (%) Floating Point
67,5 - Inside Temperature (°F) Floating Point
36,5 - Outside Temperature (°F) Floating Point
29,867 - Barometer (in) Floating Point
35,969 - Total Rain (in) Floating Point
0,00 - Daily Rain (in) Floating Point

Here is an example of an actual record:

1.00,2012,11,3,18,36,16,3,4,281,49,74,73,1,55,5,29,890,37,055,0,00

I have already written a parser in Java. I am using two additional libraries:

  • JodaTime 2.1
  • OpenCsv 2.3

    // First we read the file.
    CSVReader reader = new CSVReader(new FileReader("/VPLive/data.csv"));
    
    List<String[]> data = reader.readAll();
    reader.close();
    
    // Actual data is in first element, which contains string array.
    String[] records = data.get(0);
    
    // First we parse date and time.
    DateTime dateTime = new DateTime(
            Integer.parseInt(records[1]), Integer.parseInt(records[2]), Integer.parseInt(records[3]),
            Integer.parseInt(records[4]), Integer.parseInt(records[5]), Integer.parseInt(records[6]));
    
    // Then we parse air temperature.
    double airTemperatureFahrenheit = Double.parseDouble(records[14] + "." + records[15]);
    

The problem with this approach is that the file also uses a comma to separate the integer part of a value from its decimal part. On its own that is solvable, as I have shown for the air temperature in the code example. But get this:

When the air temperature is, for example, exactly 55°F, the file contains only 55. There is no trailing decimal zero after the integer part. The same goes for wind speed. The values that can have a decimal part are:

  • Wind speed 10min average
  • Wind speed (mph)
  • Inside Temperature (°F)
  • Outside Temperature (°F)

So there are 2^4 = 16 possible combinations of file structure, since each of the four values either has a decimal part or does not. I am currently stuck, as I do not know how to solve this problem. I am thinking about setting a reference point; for example, I know that the barometer reading must lie in a given interval.
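The 16 combinations and the "reference interval" idea can be sketched together: try every layout whose number of decimal tokens matches the token count, and discard layouts that produce implausible values. This is only a minimal sketch, not a definitive solution. The class name `AmbiguousRecordParser`, the plausibility limits in `MIN` and `MAX`, and the rule that a decimal part is always a single digit (as in `67,5`) are all assumptions made for illustration; the assumption that barometer and both rain values always occupy two tokens each follows from the list above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the plausible readings of one ambiguous record. */
class AmbiguousRecordParser {

    static final int FIRST_FIELD = 7;  // tokens 0..6 are version and date/time
    static final int MIN_TOKENS = 20;  // token count when no optional decimal is present

    // Assumed plausibility limits (illustrative guesses only), in field order:
    // wind avg, wind, wind direction, inside humidity, outside humidity,
    // inside temperature, outside temperature.
    static final double[] MIN = {0, 0, 0, 0, 0, 32, -60};
    static final double[] MAX = {200, 200, 360, 100, 100, 120, 140};
    // Which of those fields may be followed by a decimal token.
    static final boolean[] OPTIONAL_DECIMAL = {true, true, false, false, false, true, true};

    /** Returns one double[7] per layout that survives all plausibility checks. */
    static List<double[]> candidates(String[] tokens) {
        List<double[]> result = new ArrayList<double[]>();
        int extra = tokens.length - MIN_TOKENS; // number of decimal tokens present
        if (extra < 0 || extra > 4) {
            return result; // unexpected layout
        }
        // Each bit of mask says whether one of the four fields has a decimal token.
        for (int mask = 0; mask < 16; mask++) {
            if (Integer.bitCount(mask) != extra) {
                continue; // this layout would need a different token count
            }
            double[] values = tryLayout(tokens, mask);
            if (values != null) {
                result.add(values);
            }
        }
        return result;
    }

    /** Parses the seven fields under one layout; returns null if it is implausible. */
    static double[] tryLayout(String[] tokens, int mask) {
        double[] values = new double[7];
        int pos = FIRST_FIELD;
        int bit = 0;
        try {
            for (int field = 0; field < 7; field++) {
                String text = tokens[pos++];
                if (OPTIONAL_DECIMAL[field] && ((mask >> bit++) & 1) == 1) {
                    String fraction = tokens[pos++];
                    // Assumption: a decimal part is exactly one digit.
                    if (fraction.length() != 1 || !Character.isDigit(fraction.charAt(0))) {
                        return null;
                    }
                    text = text + "." + fraction;
                }
                values[field] = Double.parseDouble(text);
                if (values[field] < MIN[field] || values[field] > MAX[field]) {
                    return null;
                }
            }
        } catch (NumberFormatException e) {
            return null;
        }
        return values;
    }
}
```

Calling `AmbiguousRecordParser.candidates(records)` on the example record above leaves a single candidate under these assumed limits (inside temperature 73.1, outside temperature 55.5). Other records can still yield several candidates, in which case tighter limits or another source of information is needed to choose between them.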

Edit: I forgot to mention that the single row in data.csv is overwritten every 2.5 s, so I cannot see previous values. I do have archived values for every minute, but I need to read this file whenever it is updated because I am implementing live functionality.

Jernej Jerin
  • If it reads as often as every 2.5 seconds, you can use the last reading as a reference point, the changes should be minimal. To get the first reading, yes, you need some "reasonable values" reference point to be sure to get valid data the first time. – Joachim Isaksson Nov 03 '12 at 18:07
  • I think this is wrong. The wind is updated every 2.5s, where the temperature is updated every 5s. Here is the kicker. How do I know when I start reading what are the conditions? Also I have slightly updated my question. – Jernej Jerin Nov 03 '12 at 18:11
  • If the temperature is constant for 2 readings (i.e. 5 seconds), you have 2 values that will be constant for every 2 readings in the "middle" of the others. That will limit the combinations somewhat. Also, wind direction and humidity have rather obvious limits. – Joachim Isaksson Nov 03 '12 at 18:18
  • Wind direction and humidity is not the problem. Those values are always fixed point. But when I start the program that periodically reads this file, how do I get the reference point? – Jernej Jerin Nov 03 '12 at 18:24
  • Just to clarify, are you saying that there is not always 22 fields? i.e. in cases where the decimal values are whole numbers, there are actually less fields because it doesn't put a 0, it omits the record completely. It's very similar to [this](http://stackoverflow.com/questions/11678238/using-csvbeanreader-to-read-a-csv-file-with-a-variable-number-of-columns) question, but you have more than 2 possibilities, which makes it a lot harder! – James Bassett Nov 06 '12 at 23:01
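The "last reading as a reference point" suggestion from the comments could look roughly like the sketch below: when several interpretations of a record remain plausible, keep the one closest to the previous accepted reading. The class `ClosestCandidate` and its `pick` method are hypothetical names, and the distance used (a plain sum of absolute differences) is an assumption; it mixes units and ignores wind direction wrapping around at 360 degrees.

```java
import java.util.List;

/** Hypothetical helper: chooses the interpretation nearest to the previous reading. */
class ClosestCandidate {

    /**
     * Returns the candidate with the smallest summed absolute difference
     * to the previous reading, or null if there are no candidates.
     * Every array is expected to hold the same fields in the same order.
     */
    static double[] pick(List<double[]> candidates, double[] previous) {
        double[] best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (double[] candidate : candidates) {
            double distance = 0;
            for (int i = 0; i < previous.length; i++) {
                distance += Math.abs(candidate[i] - previous[i]);
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = candidate;
            }
        }
        return best;
    }
}
```

This does not answer how to obtain the very first reference point; the one-minute archive mentioned in the question might serve as a starting value, but that is a guess rather than something the file format guarantees.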
