I am trying to get the average (mean), maximum and minimum temperature and humidity from a text file. The file logs temperature and humidity readings every day. My problem is with the format of the data, which looks like this:

2017-05-02 17:31:13 24.00,49.00
2017-05-02 17:32:13 24.00,49.00
2017-05-02 17:33:13 24.00,49.00  
2017-05-02 17:34:14 24.00,49.00  
2017-05-02 17:35:14 24.00,49.00 
2017-05-02 17:36:14 24.00,49.00 
2017-05-02 17:37:14 24.00,49.00  
2017-05-02 17:38:14 24.00,49.00

Here, I am not able to split the columns properly because the lines mix several delimiters (spaces and a comma). I can calculate the average and the other statistics myself, but first the program has to read the temperature and humidity columns.

Data description: 1st column: date; 2nd column: time; 3rd column: temperature; 4th column: humidity.

Can someone please help me read both the temperature and the humidity properly, so that I can calculate the average and the other statistics?

Hackaholic
Avionix
  • Can you show us what you have tried so far? – Hackaholic May 03 '17 at 09:38
  • One simple idea is to use the "," to locate your two values: 5 characters before it and 5 after. At least that is what I would do in Excel. – Solar Mike May 03 '17 at 09:38
  • Replace the comma with a whitespace, then split the line at the whitespaces. There are tons of recipes: http://stackoverflow.com/questions/3277503/how-do-i-read-a-file-line-by-line-into-a-list – Moritz May 03 '17 at 09:41
  • Furthermore, I would try to change the data logger so that it does not include this stupid comma. I would use pandas to analyse the data (if the format is correct). – Moritz May 03 '17 at 09:43
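
The comma-to-whitespace idea from the comments can be sketched with the standard library alone. The helper name `parse_line` and the sample values below are assumptions made up for illustration; they are not taken from the original poster's program or data file.

```python
import statistics

def parse_line(line):
    """Split one log line into (date, time, temperature, humidity)."""
    # Replace the comma with a space, then split on any run of whitespace.
    # split() with no argument also discards trailing spaces and the newline.
    date, time, temp, hum = line.replace(",", " ").split()
    return date, time, float(temp), float(hum)

# Invented sample lines in the question's format (note the trailing spaces).
lines = [
    "2017-05-02 17:31:13 24.00,49.00",
    "2017-05-02 17:32:13 25.00,50.00  ",
    "2017-05-02 17:33:13 26.00,51.00 ",
]

rows = [parse_line(line) for line in lines if line.strip()]
temps = [row[2] for row in rows]
hums = [row[3] for row in rows]

print("temperature:", statistics.mean(temps), max(temps), min(temps))
print("humidity:", statistics.mean(hums), max(hums), min(hums))
```

Calling `split()` without an argument, rather than `split(' ')`, is a deliberate choice here: it treats any run of whitespace as one separator, so lines that end in extra spaces still yield exactly four fields.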

1 Answer

For example:

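import pandas as pd

data = []
with open('data.txt', 'r') as f:
    for line in f:
        # turn the comma into a space, then split on any run of whitespace;
        # split() with no argument also drops trailing spaces and the newline
        fields = line.replace(',', ' ').split()
        if fields:  # skip blank lines
            data.append(fields)

df = pd.DataFrame.from_records(data, columns=['date', 'time', 'temperature', 'humidity'])
# the values are read as strings, so convert the measurement columns to numbers
df[['temperature', 'humidity']] = df[['temperature', 'humidity']].apply(pd.to_numeric)
# you could use mean, max, min, median, etc.
print(df['humidity'].mean())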
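
As a possible follow-up, the mean, maximum and minimum of both columns can be computed in one call with `DataFrame.agg`. This is only a sketch: it uses an in-memory `io.StringIO` object with invented values as a stand-in for `data.txt`, and the variable names `sample` and `summary` are assumptions for illustration.

```python
import io
import pandas as pd

# Stand-in for the log file; the readings are invented for illustration.
sample = io.StringIO(
    "2017-05-02 17:31:13 24.00,49.00\n"
    "2017-05-02 17:32:13 25.00,50.00  \n"
    "2017-05-02 17:33:13 26.00,51.00 \n"
)

# Same parsing idea as in the answer: comma -> space, split on whitespace.
rows = [line.replace(",", " ").split() for line in sample if line.strip()]
df = pd.DataFrame(rows, columns=["date", "time", "temperature", "humidity"])
df[["temperature", "humidity"]] = df[["temperature", "humidity"]].astype(float)

# One row per statistic, one column per measurement.
summary = df[["temperature", "humidity"]].agg(["mean", "max", "min"])
print(summary)
```

Individual values can then be looked up by label, for example `summary.loc["max", "humidity"]`.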
Moritz