I have two CSV files with time-series data, and I want to merge their columns using bash. The merge itself is simple:
paste -d , file1.csv file2.csv > combined.csv
produces a merged file with the extra columns.
The problem is that paste pairs rows by position, and the timestamps in the two files don't line up. I want the merged rows to be aligned by the timestamp in the first column instead.
The same problem is described in this SO question, but that one concerns the R programming language, so the answers don't work for bash.
# File 1 # File 2
| Time | Datapoint | | Time | Datapoint |
| 2021-04-01T00:00:00Z | 43 | | 2021-04-01T00:00:05Z | 51 |
| 2021-04-01T00:00:01Z | 44 | | 2021-04-01T00:00:10Z | 52 |
| 2021-04-01T00:00:02Z | 45 | | 2021-04-01T00:00:15Z | 53 |
| 2021-04-01T00:00:03Z | 46 | | 2021-04-01T00:00:20Z | 54 |
| 2021-04-01T00:00:04Z | 47 | | 2021-04-01T00:00:25Z | 55 |
| 2021-04-01T00:00:05Z | 48 | | 2021-04-01T00:00:30Z | 56 |
| 2021-04-01T00:00:06Z | 49 | | 2021-04-01T00:00:35Z | 57 |
# Desired File
| Time | Datapoint | Datapoint |
| 2021-04-01T00:00:00Z | 43 | |
| 2021-04-01T00:00:01Z | 44 | |
| 2021-04-01T00:00:02Z | 45 | |
| 2021-04-01T00:00:03Z | 46 | |
| 2021-04-01T00:00:04Z | 47 | |
| 2021-04-01T00:00:05Z | 48 | 51 |
| 2021-04-01T00:00:06Z | 49 | |
I know I can write a script that reads both files and writes out the data for each timestamp separately, but I wondered whether there is another way of doing this using bash utilities?
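One candidate that might fit is the coreutils join utility, which matches lines on a key field rather than by position. The following is only a sketch under some assumptions: both files are plain comma-separated with exactly one header line, no quoted fields or embedded commas, Unix line endings, and timestamps in the same ISO 8601 format so that a byte-wise sort orders them correctly. The function name merge_by_time is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: left-join the second CSV onto the first by the timestamp in column 1.
# merge_by_time is a hypothetical helper name, not an existing utility.
set -euo pipefail
export LC_ALL=C   # sort and join must agree on the collation order

merge_by_time() {
  local left=$1 right=$2

  # Header line: the left header plus the name of the right file's 2nd column.
  paste -d , <(head -n 1 "$left") <(head -n 1 "$right" | cut -d , -f 2)

  # Data lines:
  #   -t ,           use a comma as the field separator
  #   -a 1           also print left rows that have no match on the right
  #   -o 0,1.2,2.2   print key, left value, right value (empty when missing)
  # join requires both inputs to be sorted on the join field.
  join -t , -a 1 -o 0,1.2,2.2 \
    <(tail -n +2 "$left"  | sort -t , -k 1,1) \
    <(tail -n +2 "$right" | sort -t , -k 1,1)
}
```

It would be called as merge_by_time file1.csv file2.csv > combined.csv. Because only -a 1 is given, timestamps that appear solely in the second file are dropped, which matches the desired file above; adding -a 2 would keep them as well, and -e NA would fill the empty cells with a placeholder.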