I am new to Python, and over the last few days I have taught myself to open and process data stored in txt, xls, and asc files with pandas, but I still get confused when working with dataframes.

I have a .wac file that is correctly formatted (it is later used as an input file for another program) but contains some wrong values, and an .xlsx file that contains the right values.

I have transferred the data to two dataframes with this code (I used skiprows to skip through the string data in both files):

data_format = pd.read_csv('Example.wac', skiprows=11, delim_whitespace=True, names=["Date", "Hour", "Temp gap North [C]", "RH %"])

data_WUFI = pd.read_excel('Temperature_RH_North.xlsx', skiprows=1, header=None, dtype=float, names=["Hour", "Temp gap [C]", "RH %"])

Now I need to make the following modifications to the dataframes, but I do not know where to start, and I hope I have come to the right place for help. For data_format:

- The column 'Date' is in the format *2018-01-01* and runs to *2019-12-31*. Each date repeats for 24 rows and then increases by one day. I need to extend that column up to *2027-12-31* (without leap years)
- The column 'Hour' is in the format *01:00*. Values run from *01:00* to *24:00*. I need to add rows so that after every 24 hours the date in the first column increases by one day and the hour numbering restarts at *01:00*
- The column 'RH %' contains the same value in all rows, i.e. 0.5 
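The three requirements above could be met by building the extended table from scratch instead of appending rows. This is only a sketch: `build_calendar` is a hypothetical helper name, and it assumes that "without leap years" means every 29 February is dropped and that hours *01:00* to *24:00* all belong to the same date, as described in the list.

```python
import numpy as np
import pandas as pd


def build_calendar(start, end, rh=0.5):
    """Hypothetical helper: one row per hour, 24 rows per date."""
    days = pd.date_range(start, end, freq="D")
    # Assumption: "without leap years" means 29 February is skipped.
    days = days[~((days.month == 2) & (days.day == 29))]
    # Hour labels 01:00 ... 24:00, as in the existing file.
    hours = [f"{h:02d}:00" for h in range(1, 25)]
    return pd.DataFrame({
        # Each date is repeated 24 times in a row.
        "Date": np.repeat(days.strftime("%Y-%m-%d").to_numpy(), 24),
        # The 24 hour labels restart for every date.
        "Hour": np.tile(hours, len(days)),
        # Placeholder, to be filled from data_WUFI later.
        "Temp gap North [C]": np.nan,
        # Constant value in every row.
        "RH %": rh,
    })


data_format_NEW = build_calendar("2018-01-01", "2027-12-31")
```

With those assumptions the result has 10 years of 365 days each, which should match the length of data_WUFI if that file covers the same period.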

Here is a snapshot of data_format to make this clearer:

[image: snapshot of data_format]

Once the new dataframe is created, e.g. data_format_NEW I can substitute the values in 'Temp gap North [C]' with the correct values from data_WUFI (already of the right size):

data_format_NEW['Temp gap North [C]'] = data_WUFI['Temp gap [C]']
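One thing to watch with the assignment above: when a Series is assigned to a column, pandas aligns it on index labels, so rows whose labels do not match end up as NaN. A cautious variant checks the lengths and copies the values by position. The helper name `replace_temperature` and the small demo frames are made up for illustration only.

```python
import pandas as pd


def replace_temperature(target, source,
                        target_col="Temp gap North [C]",
                        source_col="Temp gap [C]"):
    """Hypothetical helper: copy one column by position, not by index label."""
    if len(target) != len(source):
        raise ValueError(
            f"row counts differ: {len(target)} vs {len(source)}"
        )
    out = target.copy()
    # to_numpy() drops the index, so values are matched row by row.
    out[target_col] = source[source_col].to_numpy()
    return out


# Tiny made-up frames standing in for data_format_NEW and data_WUFI.
demo_format = pd.DataFrame({
    "Date": ["2018-01-01"] * 3,
    "Hour": ["01:00", "02:00", "03:00"],
    "Temp gap North [C]": [0.0, 0.0, 0.0],
    "RH %": [0.5] * 3,
})
demo_wufi = pd.DataFrame(
    {"Hour": [1.0, 2.0, 3.0], "Temp gap [C]": [4.1, 3.9, 3.7], "RH %": [0.5] * 3},
    index=[10, 11, 12],  # deliberately different index labels
)
demo_new = replace_temperature(demo_format, demo_wufi)
```

If both dataframes keep the default 0, 1, 2, ... index and have the same length, the plain assignment gives the same result.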

At that point I will write data_format_NEW in a .wac file:

data_format_NEW.to_csv('Example_NEW.wac', index=False, sep=' ')

but the first 12 rows will have to contain string values as in the picture:

[image: the header lines at the top of the .wac file]
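One possible way to keep those leading text lines is to copy them verbatim from the original file and then append the table without a column header. This is a sketch, not a tested .wac writer: `write_with_header` is a hypothetical helper, the number of header lines is a parameter that has to be checked against the real file, and the demo header text below is a placeholder, not the real .wac header. It also writes single-space-separated values, which may or may not match the column widths the target software expects.

```python
import os
import tempfile
from itertools import islice

import pandas as pd


def write_with_header(df, template_path, out_path, n_header_lines):
    """Hypothetical helper: copy the first lines of a template, then the data."""
    # newline="" keeps the template's line endings exactly as they are.
    with open(template_path, "r", newline="") as src:
        header = list(islice(src, n_header_lines))
    with open(out_path, "w", newline="") as dst:
        dst.writelines(header)
        # No column names and no index, values separated by one space.
        df.to_csv(dst, sep=" ", header=False, index=False)


# Demo with placeholder content in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    template = os.path.join(tmp, "Example.wac")
    output = os.path.join(tmp, "Example_NEW.wac")
    with open(template, "w", newline="") as f:
        f.write("header line 1\nheader line 2\n2018-01-01 01:00 0.0 0.5\n")
    demo = pd.DataFrame({
        "Date": ["2018-01-01", "2018-01-01"],
        "Hour": ["01:00", "02:00"],
        "Temp gap North [C]": [4.1, 3.9],
        "RH %": [0.5, 0.5],
    })
    write_with_header(demo, template, output, n_header_lines=2)
    with open(output, "r", newline="") as f:
        written = f.read()
```

For the real file the call would use the actual paths and the actual number of leading text lines in place of the demo values.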

I am not sure whether I have planned this correctly, but I hope I have explained myself clearly enough.

AMaz
  • I don't know what a `.wac` file is. But that error isn't because "python" doesn't recognize the file format (python doesn't really know anything about file formats, all files are just bytes), that error is coming because the file is not being *found at all*. What is your working directory? Because that is what matters if you don't want to use a full path. It is irrelevant that "It is in the same folder as the py file." – juanpa.arrivillaga Apr 25 '18 at 09:44
  • Yes, you're very correct, thanks! I feel dumb now. Solved it! I will edit my question now because I now have the issue I was expecting after reading the file - @juanpa.arrivillaga – AMaz Apr 25 '18 at 09:58
  • Try `df = pd.read_csv('file.wac', skiprows=10, delim_whitespace=True)` – cs95 Apr 25 '18 at 10:00
  • Maybe you can use `skiprows=11` to ignore the first 11 rows. – Lambda Apr 25 '18 at 10:01
