1

Below is the code that is grabbing some data from elastic search and exporting that data to a csv file called ‘mycsvfile’.

I want to change the column names so that it is readable by a human.

Below is the code:

from elasticsearch import Elasticsearch
import csv

es = Elasticsearch(["9200"])

# Replace the following Query with your own Elastic Search Query
res = es.search(index="search", body=
                {
                    "_source": ["DTDT", "TRDT", "SPLE", "RPLE"],
                    "query": {
                        "bool": {
                            "should": [
                                {"wildcard": {"CN": "TEST1"}}

                            ]
                        }
                    }
}, size=10)



with open('mycsvfile.csv', 'w') as f:  # Just use 'w' mode in 3.x
    header_present  = False
    for doc in res['hits']['hits']:
        my_dict = doc['_source'] 
        if not header_present:
            w = csv.DictWriter(f, my_dict.keys())
            w.writeheader()
            header_present = True


        w.writerow(my_dict)

when I run the above query the CSV file data look like below:

DTDT    TRDT    SPLE    SACL    RPLE

20170512    12/05/2017 15:39    1001    0   0

20170512    12/05/2017 15:39    1001    0   0

20170908    08/09/2017 02:42    1001    0   0

20170908    08/09/2017 06:30    1001    0   0

As you can see the column names are the same as in the query and I want to give them readable names when the file is being generated.

Could someone show and fix my code up for me to enter column names to the CSV file?

Thank you in advance

hyamanieu
  • 1,055
  • 9
  • 25
Rich
  • 177
  • 2
  • 3
  • 13

3 Answers3

0

What you want is to rename the dictionary keys. To do that, you can for example pop it to a new column name. Example to change DTDT to date.

my_dict['date'] = my_dict.pop('DTDT')

I think you can figure out how to rename your other columns. After that you can call w.writerow in your for-loop.

hyamanieu
  • 1,055
  • 9
  • 25
  • thanks for taking a look and i did not figure that out, i used examples online that is why i am struggle to do this rename the column name, i appropriate it if you could show me how this is done in my code please with a example? – Rich Oct 11 '17 at 14:18
  • This is one of the most trivial thing to do to copy and paste a line and change the strings. If I do it for you, you will never learn. – hyamanieu Oct 11 '17 at 14:18
  • i understand and i completely agree, but this is all new to me that is why i went for the copy and paste and changing lines etc. and i have been stuck on this for days now and its really getting to a point where i feel like giving up because this coding is not my strongest point. i hope you can understand – Rich Oct 11 '17 at 14:21
0

How about that? Define a dict with translations at the top, e.g. right after initializing your es client, on line 4 or 5 or so :

readableColumnNames = {"DTDT" : "Date", "BLI" : "Blub"}

and then replace this line of code:

w = csv.DictWriter(f, [readableColumnNames[colName] for colName in my_dict.keys()])

Didn't test this, to be honest, but it should do the job, and it's simple and transparent. You probably want to make sure translations for all field names are actually provided.

Or else, how about using sensible field names in Elasticsearch :-) ?

Uli R
  • 66
  • 6
  • Hi, where should i put the readableColumnNames = {"DTDT" : "Date", [...]}?. Are you able to update my code and show me. i think i may be placing it in the wrong place. thank you those funny fields are from the database which i did not create. – Rich Oct 11 '17 at 15:39
  • I clarified my answer to that end. Whats the error you get? I also changed the first line a bit. The [...] bit was an ellipsis that I wanted you to replace with something meaningful. Now the example works as is,but you need to provide translations for all field names, or you'll get a KeyError. – Uli R Oct 11 '17 at 16:12
  • i tired what you suggested, i put readableColumnNames = {"DTDT" : "Date", "BLI" : "Blub"} right after the es line and then i replaced my line which is - w.writerow(my_dict) with your line - w = csv.DictWriter(f, [readableColumnNames[colName] for colName in my_dict.keys()]). However it is not working. the error i am getting is - File "C:/Users/.PyCharmCE2017.2/config/scratches/test1.py", line 30, in w = csv.DictWriter(f, [readableColumnNames[colName] for colName in my_dict.keys()])w = csv.DictWriter(f, [readableColumnNames[colName] for colName in my_dict.keys()]) KeyError: 'DF' – Rich Oct 12 '17 at 07:32
  • This is exactly what I predicted in my last comment: you need to provide translations for ALL the field names. The error tells you: "I can not translate the field name 'DF' ". So just extend your readableColumnNames and provide translations FOR ALL FIELD NAMES. – Uli R Oct 12 '17 at 10:21
-2

You could try using pandas if you don't mind or are willing. If you used pandas your solution would be:

import pandas as pd

df = pd.DataFrame.read_csv('mycsvfile.csv')

print(df.columns)

also I just wanted to add that if your header situation is different or your index column not there, you might need to make some adjustments to the df creation statement. Here's the docs link:

Kevin Frankola
  • 29
  • 1
  • 1
  • 5
  • Hi Kevin, i tried to do that just now and didnt work. could you show me in my code and updated your answer? – Rich Oct 11 '17 at 14:07
  • This is not answering the question. – hyamanieu Oct 11 '17 at 14:15
  • Sorry that's my bad. I didn't read the question properly. first consider [this](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rename.html) it's the built in function for renaming pandas dataframe columns. in your code/based on what I have above, try: `df.rename(index=str, columns={'DTDT': 'datestamp', 'RPLE': 'code'})` after you make the changes you want, you can do a: `pd.DataFrame.to_csv('newcsv')` to get a new csv after your edits. – Kevin Frankola Oct 11 '17 at 14:35