
I'm learning Python and having trouble understanding how to write a function that reads a file and returns its contents as a dictionary. I'm aware I need to open the file and then use .read(), but so far I'm not sure how to sort the data. Since there will be multiple "titles," I'm trying to sort them so that upper-case letters come before all lower-case ones. Any advice on how to proceed?

Code I have so far:

def read_text(textname):
    d = {}
    with open(textname) as f:
        for line in f:
            (title, year, height, width, media, country) = line.split() # I need to skip the first line in the file as well which just shows the categories.

Text file example:

text0='''"Artist","Title","Year","Total Height","Total Width","Media","Country"
"Leonardo da Vinci","Mona Lisa","1503","76.8","53.0","oil paint","France"
"Leonardo da Vinci","The Last Supper","1495","460.0","880.0","tempera","Italy" 

What I want the function to return:

{'Leonardo da Vinci': [("Mona Lisa",1503,76.8,53.0,"oil paint","France"),
('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}
AthenAl
  • @UnholySheep that's a csv file – Patrick Haugh Nov 13 '16 at 19:17
  • What's going on? - there are [more](https://stackoverflow.com/questions/40566245/function-read-a-file-then-add-multiple-items-to-dictionary) and [more](https://stackoverflow.com/questions/40577549/converting-csv-file-to-dictionary-python) questions on this particular problem... – Maurice Nov 13 '16 at 19:22
  • Possible duplicate of [Sort a Python dictionary by value](http://stackoverflow.com/questions/613183/sort-a-python-dictionary-by-value) – AthenAl Nov 13 '16 at 19:28
  • What part are you struggling with? It looks like you're not even attempting to create a dictionary, or return anything. – Bryan Oakley Nov 13 '16 at 19:44

3 Answers


One approach is to use the csv module and the setdefault method for dicts:

>>> import csv
>>> with open('data.csv') as f:
...   d = {}
...   reader = csv.reader(f)
...   header = next(reader) # skip the header row; keep it if you need the column names
...   for line in reader:
...     artist, *rest = line
...     d.setdefault(artist,[]).append(tuple(rest))
... 
>>> d
{'Leonardo da Vinci': [('Mona Lisa', '1503', '76.8', '53.0', 'oil paint', 'France'), ('The Last Supper', '1495', '460.0', '880.0', 'tempera', 'Italy')]} 

A more Pythonic way is to use a defaultdict:

>>> import csv
>>> from collections import defaultdict
>>> with open('data.csv') as f:
...   d = defaultdict(list)
...   reader = csv.reader(f)
...   header = next(reader) # skip the header row
...   for line in reader:
...     artist, *rest = line
...     d[artist].append(tuple(rest))
... 
>>> d
defaultdict(<class 'list'>, {'Leonardo da Vinci': [('Mona Lisa', '1503', '76.8', '53.0', 'oil paint', 'France'), ('The Last Supper', '1495', '460.0', '880.0', 'tempera', 'Italy')]})

Figuring out the best way to get the data types you need is left as an exercise... as apparently this whole thing was from the beginning.
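As one possible take on that exercise, here is a minimal sketch that converts the year to int and the two size columns to float. It assumes every row has exactly the seven columns shown in the question's sample, in that order; the function name read_paintings and the in-memory sample are made up for illustration.

```python
import csv
import io
from collections import defaultdict

def read_paintings(lines):
    """Group rows by artist, converting year to int and sizes to float."""
    paintings = defaultdict(list)
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    # Assumes exactly seven columns per row, in the order used in the question.
    for artist, title, year, height, width, media, country in reader:
        paintings[artist].append(
            (title, int(year), float(height), float(width), media, country)
        )
    return dict(paintings)

# In-memory stand-in for the file, built from the question's sample rows.
sample = io.StringIO(
    '"Artist","Title","Year","Total Height","Total Width","Media","Country"\n'
    '"Leonardo da Vinci","Mona Lisa","1503","76.8","53.0","oil paint","France"\n'
    '"Leonardo da Vinci","The Last Supper","1495","460.0","880.0","tempera","Italy"\n'
)
result = read_paintings(sample)
print(result)
```

With a real file, the same function should work on an open file object, for example `with open('data.csv', newline='') as f: read_paintings(f)`. A blank or short row would raise a ValueError during unpacking, so messy input would need extra checks.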

juanpa.arrivillaga

Your input file is a CSV file (comma separated values). There's a module called csv for reading them.

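import csv
import ast

def convert(value):
    # csv.reader has already stripped the quotes, so only numeric fields parse as literals
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value  # plain text such as 'Mona Lisa' stays a string

def our_function(filename):
    output = {}
    with open(filename, newline='') as f:
        r = csv.reader(f)
        next(r)  # ignore the header line
        for line in r:
            head, *tail = map(convert, line)  # make values the right types
            if head in output:
                output[head].append(tuple(tail))
            else:
                output[head] = [tuple(tail)]
    return output

ast.literal_eval turns strings such as '1503' and '76.8' into 1503 and 76.8. Because csv.reader has already removed the surrounding quotes, text fields such as 'Mona Lisa' are not valid Python literals and would raise an error, so convert returns them unchanged.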

Patrick Haugh

A solution using a csv.reader object and the enumerate function:

import csv

picture_info = {}
# let's say: `pictures.csv` is your initial file
with open('pictures.csv', 'r', newline='') as fh:
    r = csv.reader(fh)
    for k, line in enumerate(r):
        if k == 0: continue
        if not picture_info.get(line[0], None):
            picture_info[line[0]] = [tuple(line[1:])]
        else:
            picture_info[line[0]].append(tuple(line[1:]))

print(picture_info)

The output:

{'Leonardo da Vinci': [('Mona Lisa', '1503', '76.8', '53.0', 'oil paint', 'France'), ('The Last Supper', '1495', '460.0', '880.0', 'tempera', 'Italy')]}
RomanPerekhrest
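On the sorting part of the question: Python compares strings by code point, so 'A' to 'Z' already sort before 'a' to 'z' with a plain sorted() call. The following is a small sketch that sorts each artist's list by title; the 'untitled study' entry is a hypothetical row added only to show a lower-case title, and the tuples are assumed to have the title as their first element.

```python
paintings = {
    'Leonardo da Vinci': [
        ('untitled study', 1500, 10.0, 10.0, 'ink', 'Italy'),  # hypothetical row, for illustration only
        ('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy'),
        ('Mona Lisa', 1503, 76.8, 53.0, 'oil paint', 'France'),
    ]
}

# Sort each artist's works by title (element 0 of each tuple).
# Default string ordering puts upper-case letters before lower-case ones.
sorted_paintings = {
    artist: sorted(works, key=lambda work: work[0])
    for artist, works in paintings.items()
}

titles = [work[0] for work in sorted_paintings['Leonardo da Vinci']]
print(titles)
```

If case-insensitive ordering were wanted instead, `key=lambda work: work[0].lower()` would be the usual adjustment.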