
I've expanded my data files one by one, but now I need to replace two columns in my data_lv1_%Y%m%d.csv files with two columns from my spektrum_%Y%m%d.csv files.

So my folder is packed with data_lv1_%Y%m%d.csv and spektrum_%Y%m%d.csv files, and I need to find each matching pair of 'data_lv1....' and 'spektrum.....' files, take columns 2 and 3 from the spektrum file, and paste them into columns 42 and 43 of the data_lv1_%Y%m%d.csv file. The timestamps for each line within the files are the same: %Y-%m-%d %H:%M

So a data_lv1_%Y%m%d.csv looks like this:

2014-11-05 08:00    0.19    0.16    0.00    0.16    9   14.0    14.1    14.0    69  8.4 0.0 SSE 0.00    0.4 SSE 14.0    13.6    13.6    14.7    1002.7  0.00    0.0 111 0.16    111 0.0 0.00    0.0 0.003   0.000   25.5    42  11.6    25.1    0.03    17  1   73.9    1   1.264390582977165   982.2914032430997
2014-11-05 08:01    0.19    0.15    0.00    0.15    9   14.1    14.1    14.1    69  8.5 0.4 SSE 0.03    1.8 SSE 14.1    13.6    13.6    14.9    1002.7  0.00    0.0 116 0.17    116 0.0 0.00    0.0 0.003   0.000   25.5    42  11.6    25.1    0.00    9   1   39.1    1   1.2642303482739734  982.4159036331284   47  40  62  65
2014-11-05 08:02    0.18    0.15    0.00    0.15    9   14.1    14.1    14.1    69  8.5 0.0 SSW 0.00    0.4 SSW 14.1    13.7    13.7    15.0    1002.9  0.00    0.0 118 0.17    118 0.0 0.00    0.0 0.003   0.000   25.6    42  11.7    25.2    0.00    16  1   69.6    1   1.2643326687934051  982.3363982086155
2014-11-05 08:03    0.18    0.14    0.00    0.14    9   14.1    14.2    14.1    69  8.5 0.0 SSW 0.00    0.9 SSW 14.1    13.7    13.7    15.1    1002.7  0.00    0.0 121 0.17    121 0.0 0.00    0.0 0.003   0.000   25.6    42  11.7    25.2    0.00    11  1   47.8    1   1.264371779179926   982.3060119275706
2014-11-05 08:04    0.18    0.14    0.00    0.14    9   14.2    14.2    14.2    69  8.6 0.4 S   0.03    1.3 S   14.2    13.7    13.7    15.1    1002.7  0.00    0.0 121 0.17    121 0.0 0.00    0.0 0.003   0.000   25.7    42  11.8    25.3    0.00    9   1   39.1    1   1.2642078834510273  982.4333610462828
2014-11-05 08:05    0.16    0.13    0.00    0.13    9   14.2    14.2    14.2    69  8.6 0.4 S   0.03    2.2 S   14.2    13.8    13.8    15.3    1002.8  0.00    0.0 125 0.18    125 0.0 0.00    0.0 0.003   0.000   25.7    42  11.8    25.3    0.00    18  1   78.3    1   1.2641495894315147  982.4786642208419
2014-11-05 08:06    0.16    0.13    0.00    0.13    9   14.3    14.3    14.2    69  8.7 3.1 S   0.19    6.3 SSW 13.6    13.8    13.1    14.7    1002.8  0.00    0.0 128 0.18    128 0.0 0.00    0.0 0.003   0.000   25.8    42  11.9    25.4    0.00    11  1   47.8    1   1.2641310901311893  982.4930418182402
2014-11-05 08:07    0.16    0.13    0.00    0.13    9   14.3    14.3    14.3    69  8.7 0.4 SSE 0.03    1.3 S   14.3    13.8    13.8    15.4    1002.7  0.00    0.0 128 0.18    128 0.0 0.00    0.0 0.003   0.000   25.8    42  11.9    25.4    0.00    9   1   39.1    1   1.2640878249125196  982.5266690516158
2014-11-05 08:08    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    69  8.7 0.0 SSW 0.00    0.4 SSW 14.3    13.9    13.9    15.5    1002.8  0.00    0.0 128 0.18    128 0.0 0.00    0.0 0.003   0.000   25.9    42  12.0    25.6    0.00    10  1   43.5    1   1.2640420201364455  982.5622726259796
2014-11-05 08:09    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    69  8.7 0.9 S   0.05    2.7 S   14.3    13.9    13.9    15.6    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   25.9    42  12.0    25.6    0.00    13  1   56.5    1   1.2640722442946564  982.5387794136938
2014-11-05 08:10    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    68  8.5 0.4 S   0.03    1.8 S   14.3    13.8    13.8    15.6    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.0    42  12.1    25.7    0.00    14  1   60.9    1   1.263973775777945   982.6153230398941
2014-11-05 08:11    0.16    0.12    0.00    0.12    9   14.3    14.4    14.3    68  8.5 0.4 SSE 0.03    1.8 S   14.3    13.8    13.8    15.6    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.1    41  11.8    25.7    0.00    12  1   52.2    1   1.2639454425247079  982.6373498520063
2014-11-05 08:12    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    68  8.5 0.4 SSE 0.03    0.9 SSE 14.3    13.8    13.8    15.6    1002.9  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.2    41  11.9    25.8    0.00    10  1   43.5    1   1.2639898014116238  982.6028648434778
2014-11-05 08:13    0.15    0.12    0.00    0.12    9   14.4    14.4    14.4    68  8.6 0.4 S   0.03    1.8 S   14.4    13.9    13.9    15.7    1002.8  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.3    41  12.0    25.8    0.00    10  1   43.5    1   1.263874621956528   982.6924114334497
2014-11-05 08:14    0.15    0.12    0.00    0.12    9   14.4    14.4    14.4    68  8.6 0.0 SSE 0.00    0.4 S   14.4    13.9    13.9    15.7    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.5    41  12.2    26.1    0.00    14  1   60.9    1   1.2638742617490812  982.6926915033389
2014-11-05 08:15    0.15    0.11    0.00    0.11    9   14.4    14.4    14.4    68  8.6 0.4 S   0.03    1.3 S   14.4    13.9    13.9    15.9    1002.7  0.00    0.0 156 0.22    156 0.0 0.00    0.0 0.003   0.000   26.7    41  12.4    26.2    0.00    23  1   100.0   1   1.2638991682523613  982.6733264785339
2014-11-05 08:16    0.15    0.11    0.00    0.11    9   14.4    14.4    14.4    68  8.6 0.0 S   0.00    1.8 S   14.4    13.9    13.9    16.0    1002.8  0.00    0.0 158 0.23    158 0.0 0.00    0.0 0.003   0.000   26.9    41  12.6    26.4    0.00    23  1   100.0   1   1.2639438134514795  982.6386163546646   60  49  88  82
2014-11-05 08:17    0.15    0.12    0.00    0.12    9   14.4    14.5    14.4    68  8.6 0.4 S   0.03    1.8 S   14.4    13.9    13.9    16.1    1002.6  0.00    0.0 160 0.23    160 0.0 0.00    0.0 0.003   0.000   27.1    41  12.7    26.6    0.00    24  1   100.0   1   1.2637607975085479  982.780920605032

The spektrum_%Y%m%d.csv files look like this:

2014-11-05 06:54    1.2821175173539188  968.709953018414
2014-11-05 06:55    1.2816088505861456  969.0944311378387
2014-11-05 06:56    1.2812624875214302  969.3564059637908
2014-11-05 06:57    1.2810378335850467  969.5264007341631
2014-11-05 06:58    1.280514458802534   969.9226677701473
2014-11-05 06:59    1.2804410978592646  969.9782380278693
2014-11-05 07:00    1.2799436634193884  970.3552081987569
2014-11-05 07:01    1.2793760789237318  970.7856981700221
2014-11-05 07:02    1.2788601016214594  971.1773777485712
2014-11-05 07:03    1.2787234324253685  971.2811766061762
2014-11-05 07:04    1.2786275370664746  971.3540213982033
2014-11-05 07:05    1.2786632282014252  971.3269081390605

It would be great if I could overwrite columns in the data_lv1.... file with the columns from the spektrum_.... file, but I wouldn't mind if the data from spektrum_%Y%m%d.csv is just appended at the end of each row, as long as it doesn't interfere with the last four columns of the 'data...' files, which only have values every 15 minutes....

I would appreciate some help on how to get started and which modules to use. Is pandas an option here? The simplest and fastest thing would be to simply append the columns from the spektrum file to the end of the data_lv1... rows. Thank you for any suggestions.

Edit: Sorry for the unclear question. The merge criterion is that for every data_lv1_TIMESTAMP file there is a spektrum_TIMESTAMP file in the folder; the timestamp is %Y%m%d. Both files have rows for every minute, and I want to merge the data rows. The spektrum file has more rows (ranging from 00:00 to 23:59); the data_lv1 file has fewer lines. So the rows from the spektrum file that match a timestamp in the data_lv1 file should be added to the data_lv1 file, either by overwriting the existing columns 42 and 43 or by appending them to the end of each row.
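A sketch of how the file pairing described above could be done, assuming the filenames follow exactly the stated patterns (the `pair_files` helper and the example filenames are hypothetical):

```python
import re

def pair_files(filenames):
    """Return {date: (data_file, spektrum_file)} for dates that have both files."""
    data, spektrum = {}, {}
    for name in filenames:
        # match either prefix followed by an 8-digit %Y%m%d date stamp
        m = re.match(r'(data_lv1|spektrum)_(\d{8})\.csv$', name)
        if not m:
            continue
        kind, date = m.groups()
        (data if kind == 'data_lv1' else spektrum)[date] = name
    # keep only dates present in both dictionaries
    return {d: (data[d], spektrum[d]) for d in data if d in spektrum}

pairs = pair_files(['data_lv1_20141105.csv', 'spektrum_20141105.csv',
                    'data_lv1_20141106.csv'])
```

Each resulting pair can then be loaded and merged; a date with no spektrum counterpart (like 20141106 above) is simply skipped.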

Peter S

1 Answer


You can use merge, and if you want to remove columns with NaN, use the function dropna.

You may be unable to read the first CSV because of this error:

pandas.parser.CParserError: Error tokenizing data. C error: Expected 42 fields in line 2, saw 46

The problem is the number of columns. By default, pandas determines the number of columns from the first row of data (which can be the header). Here the first row suggests 42 columns, but the second line has 46.
One solution is to pad the first row with tabs. If you want to remove all columns with NaN values afterwards, you can add that many tabs or more, because dropna removes them anyway.
Another solution is to define names for all 46 columns. link
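A minimal sketch of that second approach, assuming 46 columns total; the column names `col1` ... `col45` and the two tiny sample rows are made up for illustration:

```python
import io
import pandas as pd

# explicit names for all 46 columns, so short rows are padded with NaN
# instead of triggering the tokenizing error
names = ['timestamp'] + ['col%d' % i for i in range(1, 46)]

# first sample row has 3 fields, second has 4 - both parse fine now
sample = u"2014-11-05 08:00\t0.19\t0.16\n2014-11-05 08:01\t0.19\t0.15\t0.00\n"
df = pd.read_csv(io.StringIO(sample), sep='\t', header=None, names=names)
```

With `names` given, every row is widened to 46 columns and the missing trailing fields become NaN.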

I use the first approach, padding the first row with tabs.

import pandas as pd
import numpy as np
import io

temp=u"""2014-11-05 08:00   0.19    0.16    0.00    0.16    9   14.0    14.1    14.0    69  8.4 0.0 SSE 0.00    0.4 SSE 14.0    13.6    13.6    14.7    1002.7  0.00    0.0 111 0.16    111 0.0 0.00    0.0 0.003   0.000   25.5    42  11.6    25.1    0.03    17  1   73.9    1   1.264390582977165   982.2914032430997               
2014-11-05 08:01    0.19    0.15    0.00    0.15    9   14.1    14.1    14.1    69  8.5 0.4 SSE 0.03    1.8 SSE 14.1    13.6    13.6    14.9    1002.7  0.00    0.0 116 0.17    116 0.0 0.00    0.0 0.003   0.000   25.5    42  11.6    25.1    0.00    9   1   39.1    1   1.2642303482739734  982.4159036331284   47  40  62  65
2014-11-05 08:02    0.18    0.15    0.00    0.15    9   14.1    14.1    14.1    69  8.5 0.0 SSW 0.00    0.4 SSW 14.1    13.7    13.7    15.0    1002.9  0.00    0.0 118 0.17    118 0.0 0.00    0.0 0.003   0.000   25.6    42  11.7    25.2    0.00    16  1   69.6    1   1.2643326687934051  982.3363982086155
2014-11-05 08:03    0.18    0.14    0.00    0.14    9   14.1    14.2    14.1    69  8.5 0.0 SSW 0.00    0.9 SSW 14.1    13.7    13.7    15.1    1002.7  0.00    0.0 121 0.17    121 0.0 0.00    0.0 0.003   0.000   25.6    42  11.7    25.2    0.00    11  1   47.8    1   1.264371779179926   982.3060119275706
2014-11-05 08:04    0.18    0.14    0.00    0.14    9   14.2    14.2    14.2    69  8.6 0.4 S   0.03    1.3 S   14.2    13.7    13.7    15.1    1002.7  0.00    0.0 121 0.17    121 0.0 0.00    0.0 0.003   0.000   25.7    42  11.8    25.3    0.00    9   1   39.1    1   1.2642078834510273  982.4333610462828
2014-11-05 08:05    0.16    0.13    0.00    0.13    9   14.2    14.2    14.2    69  8.6 0.4 S   0.03    2.2 S   14.2    13.8    13.8    15.3    1002.8  0.00    0.0 125 0.18    125 0.0 0.00    0.0 0.003   0.000   25.7    42  11.8    25.3    0.00    18  1   78.3    1   1.2641495894315147  982.4786642208419
2014-11-05 08:06    0.16    0.13    0.00    0.13    9   14.3    14.3    14.2    69  8.7 3.1 S   0.19    6.3 SSW 13.6    13.8    13.1    14.7    1002.8  0.00    0.0 128 0.18    128 0.0 0.00    0.0 0.003   0.000   25.8    42  11.9    25.4    0.00    11  1   47.8    1   1.2641310901311893  982.4930418182402
2014-11-05 08:07    0.16    0.13    0.00    0.13    9   14.3    14.3    14.3    69  8.7 0.4 SSE 0.03    1.3 S   14.3    13.8    13.8    15.4    1002.7  0.00    0.0 128 0.18    128 0.0 0.00    0.0 0.003   0.000   25.8    42  11.9    25.4    0.00    9   1   39.1    1   1.2640878249125196  982.5266690516158
2014-11-05 08:08    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    69  8.7 0.0 SSW 0.00    0.4 SSW 14.3    13.9    13.9    15.5    1002.8  0.00    0.0 128 0.18    128 0.0 0.00    0.0 0.003   0.000   25.9    42  12.0    25.6    0.00    10  1   43.5    1   1.2640420201364455  982.5622726259796
2014-11-05 08:09    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    69  8.7 0.9 S   0.05    2.7 S   14.3    13.9    13.9    15.6    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   25.9    42  12.0    25.6    0.00    13  1   56.5    1   1.2640722442946564  982.5387794136938
2014-11-05 08:10    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    68  8.5 0.4 S   0.03    1.8 S   14.3    13.8    13.8    15.6    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.0    42  12.1    25.7    0.00    14  1   60.9    1   1.263973775777945   982.6153230398941
2014-11-05 08:11    0.16    0.12    0.00    0.12    9   14.3    14.4    14.3    68  8.5 0.4 SSE 0.03    1.8 S   14.3    13.8    13.8    15.6    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.1    41  11.8    25.7    0.00    12  1   52.2    1   1.2639454425247079  982.6373498520063
2014-11-05 08:12    0.15    0.12    0.00    0.12    9   14.3    14.3    14.3    68  8.5 0.4 SSE 0.03    0.9 SSE 14.3    13.8    13.8    15.6    1002.9  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.2    41  11.9    25.8    0.00    10  1   43.5    1   1.2639898014116238  982.6028648434778
2014-11-05 08:13    0.15    0.12    0.00    0.12    9   14.4    14.4    14.4    68  8.6 0.4 S   0.03    1.8 S   14.4    13.9    13.9    15.7    1002.8  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.3    41  12.0    25.8    0.00    10  1   43.5    1   1.263874621956528   982.6924114334497
2014-11-05 08:14    0.15    0.12    0.00    0.12    9   14.4    14.4    14.4    68  8.6 0.0 SSE 0.00    0.4 S   14.4    13.9    13.9    15.7    1002.7  0.00    0.0 137 0.20    137 0.0 0.00    0.0 0.003   0.000   26.5    41  12.2    26.1    0.00    14  1   60.9    1   1.2638742617490812  982.6926915033389
2014-11-05 08:15    0.15    0.11    0.00    0.11    9   14.4    14.4    14.4    68  8.6 0.4 S   0.03    1.3 S   14.4    13.9    13.9    15.9    1002.7  0.00    0.0 156 0.22    156 0.0 0.00    0.0 0.003   0.000   26.7    41  12.4    26.2    0.00    23  1   100.0   1   1.2638991682523613  982.6733264785339
2014-11-05 08:16    0.15    0.11    0.00    0.11    9   14.4    14.4    14.4    68  8.6 0.0 S   0.00    1.8 S   14.4    13.9    13.9    16.0    1002.8  0.00    0.0 158 0.23    158 0.0 0.00    0.0 0.003   0.000   26.9    41  12.6    26.4    0.00    23  1   100.0   1   1.2639438134514795  982.6386163546646   60  49  88  82
2014-11-05 08:17    0.15    0.12    0.00    0.12    9   14.4    14.5    14.4    68  8.6 0.4 S   0.03    1.8 S   14.4    13.9    13.9    16.1    1002.6  0.00    0.0 160 0.23    160 0.0 0.00    0.0 0.003   0.000   27.1    41  12.7    26.6    0.00    24  1   100.0   1   1.2637607975085479  982.780920605032"""

#for testing use io.StringIO(temp)
df = pd.read_csv(io.StringIO(temp), sep="\t", header=None)
#instead of io.StringIO(temp) use the file
#df = pd.read_csv('data.csv', sep="\t", header=None)
print(df)

temp1=u"""2014-11-05 06:54  1.2821175173539188  968.709953018414
2014-11-05 07:55    1.2816088505861456  969.0944311378387
2014-11-05 07:56    1.2812624875214302  969.3564059637908
2014-11-05 07:57    1.2810378335850467  969.5264007341631
2014-11-05 07:58    1.280514458802534   969.9226677701473
2014-11-05 07:59    1.2804410978592646  969.9782380278693
2014-11-05 08:00    1.2799436634193884  970.3552081987569
2014-11-05 08:01    1.2793760789237318  970.7856981700221
2014-11-05 08:02    1.2788601016214594  971.1773777485712
2014-11-05 08:03    1.2787234324253685  971.2811766061762
2014-11-05 08:04    1.2786275370664746  971.3540213982033
2014-11-05 08:05    1.2786632282014252  971.3269081390605"""

#for testing use io.StringIO(temp1)
df1 = pd.read_csv(io.StringIO(temp1), sep="\t", header=None)
#instead of io.StringIO(temp1) use the file
#df1 = pd.read_csv('spectrum.csv', sep="\t", header=None)
print(df1)

#you can remove the columns where there are NaN values
df = df.dropna(axis=1, how='any')
print(df)
#merge by the first column - by dates
result = pd.merge(df, df1, on=0)
#fill NaN values with an empty string
result = result.fillna('')
print(result)
print(result.to_csv(header=False, index=False))
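If you prefer the overwrite variant from the question (replacing two existing columns instead of appending), one way is to map the spektrum values onto the data rows via the timestamp column. This is a sketch with tiny made-up frames; in your real files the target columns would be 42 and 43 rather than 2 and 3:

```python
import io
import pandas as pd

# hypothetical miniature versions of a data_lv1 and a spektrum file
data = u"2014-11-05 08:00\t0.19\t1.1\t900.0\n2014-11-05 08:01\t0.18\t1.2\t901.0"
spek = u"2014-11-05 08:00\t2.5\t950.0\n2014-11-05 08:01\t2.6\t951.0"
df = pd.read_csv(io.StringIO(data), sep="\t", header=None)
df1 = pd.read_csv(io.StringIO(spek), sep="\t", header=None)

# index the spektrum frame by its timestamp column, then look up each
# data-row timestamp and overwrite the target columns in place
s = df1.set_index(0)
df[2] = df[0].map(s[1])   # overwrite with spektrum column 1
df[3] = df[0].map(s[2])   # overwrite with spektrum column 2
```

Timestamps present in df but missing from the spektrum file would produce NaN in the overwritten columns, which you could again handle with fillna.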
jezrael