I need some help parsing a badly designed CSV (comma-separated values) file. The file contains current meteorological data and is updated every 2.5 s. Here is the structure of the file:
1.00 - Csv File Version ID (XX.XX) Floating Point
2012 - Year (yyyy format) Integer
10 - Month Integer
31 - Day Integer
10 - Hour (24-hour format) Integer
58 - Minute Integer
45 - Second Integer
2 - Wind speed 10min average (mph) Floating Point
3 - Wind speed (mph) Floating Point
103 - Wind Direction (degrees) Floating Point
48 - Inside Humidity (%) Floating Point
91 - Outside Humidity (%) Floating Point
67,5 - Inside Temperature (°F) Floating Point
36,5 - Outside Temperature (°F) Floating Point
29,867 - Barometer (in) Floating Point
35,969 - Total Rain (in) Floating Point
0,00 - Daily Rain (in) Floating Point
Here is an example of an actual recording:
1.00,2012,11,3,18,36,16,3,4,281,49,74,73,1,55,5,29,890,37,055,0,00
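To make the mismatch concrete, here is a minimal sketch that splits the recording above on commas, the way any CSV reader would see it. The class name `TokenCount` is made up for this illustration. The line yields 22 tokens although the structure lists only 17 fields, because under the most plausible reading both temperatures, the barometer and both rain values each occupy two tokens.

```java
import java.util.Arrays;

final class TokenCount {
    // Splits one raw line on commas, the same way a CSV reader sees it.
    static String[] tokens(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        String line = "1.00,2012,11,3,18,36,16,3,4,281,49,74,73,1,55,5,29,890,37,055,0,00";
        String[] t = tokens(line);
        // The structure describes 17 fields, but the delimiter doubles as the decimal mark.
        System.out.println(t.length + " tokens: " + Arrays.toString(t));
    }
}
```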
I have already written a parser in Java. I am using two additional libraries:
- JodaTime 2.1
- OpenCsv 2.3
    import au.com.bytecode.opencsv.CSVReader;
    import java.io.FileReader;
    import java.util.List;
    import org.joda.time.DateTime;

    // First we read the file.
    CSVReader reader = new CSVReader(new FileReader("/VPLive/data.csv"));
    List<String[]> data = reader.readAll();
    reader.close();

    // The actual data is in the first element, which contains a string array.
    String[] records = data.get(0);

    // First we parse the date and time.
    DateTime dateTime = new DateTime(
            Integer.parseInt(records[1]), Integer.parseInt(records[2]), Integer.parseInt(records[3]),
            Integer.parseInt(records[4]), Integer.parseInt(records[5]), Integer.parseInt(records[6]));

    // Then we parse the air temperature (integer part and decimal part are separate tokens).
    double airTemperatureFahrenheit = Double.parseDouble(records[14] + "." + records[15]);
The problem with this approach is that the file separates the integer part from the decimal part with a comma, which is also the field delimiter. That alone is solvable, as I have shown for the air temperature in the code example. But get this:
When the air temperature is, for example, exactly 55°F, the file contains only 55, with no decimal zero after the integer part. The same goes for wind speed. The values that may or may not have a decimal part are:
- Wind speed 10min average
- Wind speed (mph)
- Inside Temperature (°F)
- Outside Temperature (°F)
So there are 2^4 = 16 possible variations of the file structure, since each of these four values either has a decimal part or does not. I am currently stuck, as I do not know how to solve this problem. I am thinking about using a reference point: for example, I know that the barometer value must lie within a given interval.
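A rough sketch of how the reference-point idea could look, not a tested solution. It tries all 16 combinations and keeps only those that match the token count and pass plausibility checks. Everything here is an assumption on my part: the class and method names (`AmbiguousRecord`, `candidates`), the wind and temperature limits, that the barometer and both rain values always take two tokens, and that an optional decimal part is exactly one digit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AmbiguousRecord {
    // Hypothetical plausibility limits (not from the file format); tune as needed.
    static final double MAX_WIND_MPH = 200.0;
    static final double MIN_TEMP_F = -60.0;
    static final double MAX_TEMP_F = 140.0;

    /**
     * Returns every reading of the ambiguous middle section that passes the checks.
     * Each result holds {windAvg, wind, direction, inHum, outHum, inTemp, outTemp}.
     */
    static List<double[]> candidates(String[] rec) {
        List<double[]> result = new ArrayList<>();
        int start = 7;                 // skip version, date and time
        int end = rec.length - 6;      // assumed: barometer and both rain values always use 2 tokens
        int extra = (end - start) - 7; // how many of the 4 optional decimal parts are present
        if (extra < 0 || extra > 4) {
            return result;
        }
        for (int mask = 0; mask < 16; mask++) {   // all 2^4 combinations
            if (Integer.bitCount(mask) != extra) {
                continue;                          // token count does not match
            }
            double[] v = parse(rec, start, mask);
            if (v != null && plausible(v)) {
                result.add(v);
            }
        }
        return result;
    }

    // Bits 0..3 of mask say whether windAvg, wind, inTemp, outTemp carry a decimal part.
    private static double[] parse(String[] rec, int pos, int mask) {
        double[] v = new double[7];
        int bit = 0;
        try {
            for (int f = 0; f < 7; f++) {
                boolean optional = f == 0 || f == 1 || f == 5 || f == 6;
                boolean hasDecimal = optional && ((mask >> bit) & 1) == 1;
                if (optional) {
                    bit++;
                }
                String text = rec[pos++];
                if (hasDecimal) {
                    String fraction = rec[pos++];
                    if (fraction.length() != 1) {
                        return null;               // assumed: exactly one decimal digit
                    }
                    text = text + "." + fraction;
                }
                v[f] = Double.parseDouble(text);
            }
        } catch (NumberFormatException e) {
            return null;                           // not a number under this reading
        }
        return v;
    }

    private static boolean plausible(double[] v) {
        return inRange(v[0], 0, MAX_WIND_MPH) && inRange(v[1], 0, MAX_WIND_MPH)
            && inRange(v[2], 0, 360) && inRange(v[3], 0, 100) && inRange(v[4], 0, 100)
            && inRange(v[5], MIN_TEMP_F, MAX_TEMP_F) && inRange(v[6], MIN_TEMP_F, MAX_TEMP_F);
    }

    private static boolean inRange(double x, double lo, double hi) {
        return x >= lo && x <= hi;
    }

    public static void main(String[] args) {
        String line = "1.00,2012,11,3,18,36,16,3,4,281,49,74,73,1,55,5,29,890,37,055,0,00";
        for (double[] v : candidates(line.split(","))) {
            System.out.println(Arrays.toString(v));
        }
    }
}
```

For the example recording only one reading survives these checks, but that is not guaranteed in general: when values are small, several combinations can pass, so the method returns a list rather than a single result and further reference points would be needed to pick one.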
Edit: I forgot to mention that the single row in data.csv is overwritten every 2.5 s, so I cannot see previous values. I do have archived values for every minute, but I need to read this file whenever it is updated because I am implementing live functionality.