python csv to dictionary using csv or pandas module

Question

I am using Python's csv.DictReader to read in values from a CSV file to create a dictionary where keys are first row or headers in the CSV and other rows are values. It works perfectly as expected and I am able to get a dictionary, but I only want certain keys to be in the dictionary rather than all of the column values. What is the best way to do this? I tried using csv.reader but I don't think it has this functionality. Maybe this can be achieved using pandas?

Here is the code I was using with CSV module where Fieldnames was the keys that I wanted to retain in my dict. I realized it isn't used for what I described above.

import csv
with open(target_path+target_file) as csvfile:
    reader = csv.DictReader(csvfile,fieldnames=Fieldnames)
    for i in reader:
        print i

[`pandas.DataFrame.to_dict`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_dict.html#pandas.DataFrame.to_dict) — blacksite, Jun 04 '17 at 18:21
try to use "pandas.read_csv" method with parameter "usecols" — Konstantin Purtov, Jun 04 '17 at 18:28

Onel Harrison · Accepted Answer · 2017-06-04T18:41:32.603

13

You can do this very simply using pandas.

import pandas as pd

# get only the columns you want from the csv file
df = pd.read_csv(target_path + target_file, usecols=['Column Name1', 'Column Name2'])
result = df.to_dict(orient='records')

Sources:

edited Jun 04 '17 at 18:41

answered Jun 04 '17 at 18:28

Onel Harrison

1,244
12
14

1

thanks for reply but is the end result a list of dicts? i wanted to convert each row into a dict with with some keys. let me try this . thanks :) – anekix Jun 04 '17 at 18:31
1

It didn't before, but the updated answer should work for you once you change the column names to be what you have in your data. – Onel Harrison Jun 04 '17 at 18:42
can i also rename some of the existing key names like i provide a mapping between old keys and new keys? – anekix Jun 04 '17 at 19:04
1

Here's a SOA for renaming existing key names https://stackoverflow.com/questions/11346283/renaming-columns-in-pandas#11354850 – Onel Harrison Jun 04 '17 at 23:48
thanks again just one last thing there are some columns with empty values in csv file so in dictionary it become `nan` .is there any way to make hem default to `0` for empty cells in csv. should i ask a seperate question for it? – anekix Jun 05 '17 at 02:31
i treis .fillna(0) but `nan` is replaced with `0.0` instead of `0` – anekix Jun 05 '17 at 02:43
.also i noticed some field are being converted to `floats` by adding `.0` to them. is it a know behaviour?? – anekix Jun 05 '17 at 02:48
@anekix your follow up questions deviate from main question. I'd recommend investing sometime in thoroughly learning pandas as it solves many of the problems you are asking and creating a new question if you run into more problems. – Onel Harrison Jun 10 '17 at 21:50

score 5 · Answer 2 · answered Jun 04 '17 at 18:35

5

You can use the to_dict method to get a list of dicts:

import pandas as pd

df = pd.read_csv(target_path+target_file, names=Fieldnames)

records = df.to_dict(orient='records')

for row in records:
    print row

to_dict documentation:

In [67]: df.to_dict?
Signature: df.to_dict(orient='dict')
Docstring:
Convert DataFrame to dictionary.

Parameters
----------
orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
    Determines the type of the values of the dictionary.

    - dict (default) : dict like {column -> {index -> value}}
    - list : dict like {column -> [values]}
    - series : dict like {column -> Series(values)}
    - split : dict like
      {index -> [index], columns -> [columns], data -> [values]}
    - records : list like
      [{column -> value}, ... , {column -> value}]
    - index : dict like {index -> {column -> value}}

      .. versionadded:: 0.17.0

    Abbreviations are allowed. `s` indicates `series` and `sp`
    indicates `split`.

Returns
-------
result : dict like {column -> {index -> value}}
File:      /usr/local/lib/python2.7/dist-packages/pandas/core/frame.py
Type:      instancemethod

answered Jun 04 '17 at 18:35

sirfz

4,097
23
37

thanks a lot for your reply wish i could accept more than one answer.but can you please tell if there is any way that i can rename some existing keys by providing a mapping or something? – anekix Jun 04 '17 at 19:05
The `names` attribute allows you to provide your own column names. If your file contains a header, just include `header=0` (i.e. 1st line) to make sure you replace the header with your own mappings in `names`. In addition, if you wish to only keep specific columns, you can specify them in the `usecols` parameter. – sirfz Jun 04 '17 at 19:09
i want to keep specefic columns but after that rename those columns – anekix Jun 04 '17 at 19:13
That's exactly what the `names` parameter allows you to do. Refer to `pandas.read_csv` documentation for details and play around with it to get a better grasp of its features. – sirfz Jun 04 '17 at 19:14
looking into it – anekix Jun 04 '17 at 19:15

python csv to dictionary using csv or pandas module

2 Answers2