3

I'm trying to write some data from my InfluxDB database to a csv. While the influxdb-cli does allow me to export data to csv, I am not allowed to run the cli on the server myself. I'm using the influx-db python package to query the database.

Now, here's the code that I'm using:

import csv
from influxdb import InfluxDBClient

USER='root'
PASSWORD='root'
DBNAME='sensordata'
HOST='localhost'
PORT=8086

client = InfluxDBClient(HOST,PORT,USER,PASSWORD,DBNAME)
query="select * from home where time > '2017-11-15 14:55:00' AND time <= '2017-11-15 14:55:05' tz('Asia/Kolkata')"
result = client.query(query,epoch='ns')
exported_data = list(result.get_points())
with open("output.csv", "a", newline='') as fp:
    writer = csv.writer(fp, dialect='excel')
    for line in exported_data:
        print(line)
        writer.writerow(line)

The problem is that when I print the lines, I get both key and values like so:

{'time': 1510737900336919297, 'value1': 18.84, 'value10': 19.83, 'value2': 18.56, 'value3': 12.61, 'value4': 17.57, 'value5': 16.6, 'value6': 16.81, 'value7': 12.84, 'value8': 11.54, 'value9': 14.26}
{'time': 1510737901370333995, 'value1': 11.32, 'value10': 12.98, 'value2': 12.34, 'value3': 12.22, 'value4': 11.08, 'value5': 12.07, 'value6': 17.62, 'value7': 14.68, 'value8': 16.87, 'value9': 11.4}
{'time': 1510737902403461281, 'value1': 12.37, 'value10': 16.18, 'value2': 18.83, 'value3': 14.59, 'value4': 11.79, 'value5': 18.52, 'value6': 11.25, 'value7': 17.28, 'value8': 10.54, 'value9': 19.1}
{'time': 1510737903436997966, 'value1': 13, 'value10': 12.04, 'value2': 10.02, 'value3': 14.28, 'value4': 14.51, 'value5': 17.3, 'value6': 16.14, 'value7': 15.04, 'value8': 13.16, 'value9': 10.47}
{'time': 1510737904470366806, 'value1': 16.2, 'value10': 10.83, 'value2': 12.64, 'value3': 13.51, 'value4': 13.74, 'value5': 11.52, 'value6': 13.42, 'value7': 13.14, 'value8': 16.6, 'value9': 11.24}

But, the csv file contains just the keys like so:

time,value1,value10,value2,value3,value4,value5,value6,value7,value8,value9
time,value1,value10,value2,value3,value4,value5,value6,value7,value8,value9
time,value1,value10,value2,value3,value4,value5,value6,value7,value8,value9
time,value1,value10,value2,value3,value4,value5,value6,value7,value8,value9
time,value1,value10,value2,value3,value4,value5,value6,value7,value8,value9

What am I doing wrong here?

Also, is it possible for me to have the csv formatted like:

time_heading, value1_heading, value2_heading ....
time_value, value1_value, value2_value ....
time_value, value1_value, value2_value ....
time_value, value1_value, value2_value ....
.
.
.

I did try out a few solutions that I googled but none of them helped. Does anyone know what's going wrong?

Edit

I also wanted to "natural_sort" the headers in since it makes sense to have it in that format. To do that I referred to this answer on SO. But, to do that, I needed to ignore the first item which is "time" and sort the "value" headers only which I did using this

One last thing that I needed is to calculate the offset of the text since some of the measurements have "val" as the field name and some have "value". The offset is 3 and 5 respectively which I will use in the natural sorting snippet. To do that, I referred to this. I finally ended up having the data look like

time,value1,value2,value3,....value10 and not time,value1,value10,value2,value3,...value9 which is what I was looking for.

Here's what the final code looks like:

import csv
from influxdb import InfluxDBClient

USER = 'root'
PASSWORD = 'root'
DB_NAME = 'sensordata'
HOST = 'localhost'
PORT = 8086

client = InfluxDBClient(HOST, PORT, USER, PASSWORD, DB_NAME)
query = "select * from home where time > '2017-11-15 14:55:00' AND time <= '2017-11-15 14:55:05' tz('Asia/Kolkata')"
result = client.query(query, epoch='ns')
exported_data = list(result.get_points())
header_list = list(exported_data[0].keys())

with open("output.csv", "w", newline='') as fp:
    writer = csv.writer(fp, dialect='excel')
    # print(header_list[1:])
    value_header = header_list[1]
    offset = sum(c.isalpha() for c in value_header)
    # print(offset)
    header_list[1:] = sorted(header_list[1:], key=lambda x: int(x[offset:]))
    # print(header_list)
    writer.writerow(header_list)
    for line in exported_data:
        # print(line)
        writer.writerow([line[kn] for kn in header_list])

Thanks to @be_good_do_good for the answer

3 Answers3

1

Since you don't just have iterables of values, but dictionaries of key:value pairs, you should use a csv.DictWriter instead of a csv.writer.

timgeb
  • 76,762
  • 20
  • 123
  • 145
  • As you can see, I'm using a list here: `exported_data = list(result.get_points())` Will a DictWriter help in this case? I'm assuming it won't because of how it's named but I'll give it a whirl either way. – karanrajpal14 Nov 20 '17 at 08:11
  • @karanRajpal the point is that your `list` contains `dicts`. That's exactly what the `DictWriter` was made for. Each `dict` in your `list` is treated as a row by the `DictWriter`. – timgeb Nov 20 '17 at 08:18
1

You need to do it in below way. This way has a problem of random order of writing data in the csv:

with open("output.csv", "a", newline='') as fp:
    writer = csv.writer(fp, dialect='excel')
    writer.writerow(exported_data[0].keys())
    for line in exported_data:
        print(line)
        writer.writerow(line.values())

You did everything right, but when you pass a dictionary, keys are taken as list, so you need to specify what needs to be written to csv - line.values()

If you are interested in the order of the keys, then do it below way:

with open("output.csv", "a", newline='') as fp:
    writer = csv.writer(fp, dialect='excel')
    header_list = ['time,value1,value10,value2,value3,value4,value5,value6,value7,value8,value9']
    writer.writerow(header_list)
    for line in exported_data:
        print(line)
        writer.writerow([line[kn] for kn in header_list])
be_good_do_good
  • 4,311
  • 3
  • 28
  • 42
  • `line.values()` will be in arbitrary order and therefore produce garbage results. – timgeb Nov 20 '17 at 07:39
  • But it will be the same arbitrary as keys are, isn't it? – be_good_do_good Nov 20 '17 at 07:41
  • The keys have no order. `line.values()` may give you a different ordering with respect to the keys which are represented in every iteration of the `for` loop. – timgeb Nov 20 '17 at 07:43
  • i double checked it. line.values are printed as same order of line.keys() – be_good_do_good Nov 20 '17 at 07:48
  • Changed the answer in such a way that headers now match the values – be_good_do_good Nov 20 '17 at 07:49
  • *if* the order of keys and values are stable in *this* interpreter session for *you* then due to implementation details which are not guaranteed to work all the time and on which you should not rely. For example, if the dictionaries were created differently, then `keys` can return the keys in different order for each `line`. – timgeb Nov 20 '17 at 07:53
  • To be clear, `keys` and `values` will correspond for each individual `line` (see [this](https://stackoverflow.com/questions/835092/python-dictionary-are-keys-and-values-always-the-same-order)) but it is not guaranteed that they correspond *across* lines. – timgeb Nov 20 '17 at 07:58
  • Thank you, now added to satisfy the order, can you please review and remove the downvote? – be_good_do_good Nov 20 '17 at 07:59
  • The first block of code is still "wrong" in the sense that it can - from time to time - produce unexpected results, but whatever, I removed the downvote anyway. :p – timgeb Nov 20 '17 at 08:01
  • agreed, let me add that part to the answer – be_good_do_good Nov 20 '17 at 08:02
  • @be_good_do_good your method works really well and yes, the order matters so I was trying to get the second method to work but I couldn't. I'm using the exact same as yours but I'm facing a couple of issues. Firstly, the last line, `[line[kn] for kn in header_list.keys()]` gives me the error `AttributeError: 'list' object has no attribute 'keys'`. This is a Python 3 specific issue as stated here: [stackoverflow_keys_issue](https://stackoverflow.com/questions/44570561/how-can-i-correct-the-error-attributeerror-dict-keys-object-has-no-attribut) – karanrajpal14 Nov 20 '17 at 09:56
  • **Note: Pardon the multiple comments. There's a character limit that I have to stick to.** @be_good_do_good To overcome that I changed the last line to `[line[kn] for kn in list(header_list)]` and now I get this error `File "/mnt/ECA0327EA0324EF8/pycharmProjects/csvParse.py", line 23, in writer.writerow([line[kn] for kn in list(header_list)]) File "/mnt/ECA0327EA0324EF8/pycharmProjects/csvParse.py", line 23, in writer.writerow([line[kn] for kn in list(header_list)]) KeyError: 'time,value1,value10,value2,value3,value4,value5,value6,value7,value8,value9'` – karanrajpal14 Nov 20 '17 at 10:01
  • @be_good_do_good I've accepted your answer for now since you've helped me resolve the main issue. I'll edit your answer as well just to add a few points regarding sorting the headers. – karanrajpal14 Nov 20 '17 at 10:35
0

Use csv.DictWriter() for dictionaries and csv.writer() for lists.

with open('output.csv','a') as file:
    writer = csv.DictWriter(file, exported_data[0].keys())
    writer.writeheader()
    writer.writerows(exported_data)
Sagun Shrestha
  • 1,188
  • 10
  • 23
  • I get this error `File "/usr/lib/python3.6/csv.py", line 143, in writeheader header = dict(zip(self.fieldnames, self.fieldnames)) TypeError: zip argument #1 must support iteration` From what I could gather with a little bit of searching, a dict writer would only be useful if my "exported_data" is a dictionary, correct? In this case, I'm using a list as you can see here: `exported_data = list(result.get_points())` What would you suggest me to do here? – karanrajpal14 Nov 20 '17 at 08:19