1

There are similar questions/answers on SO, but this refers to a specific error, and I have referred to the relevant SO topics to solve this, but with no luck.

The code I have seeks to retrieve lines from a text file and read them into a dictionary. It works, but as you can see below, not completely.

File

"['a', 5]"
"['b', 2]"
"['c', 3]"
"['d', 0]"

Code

def readfiletodict():

   with open("testfile.txt","r",newline="") as f:
     mydict={} #create a dictionary called mydict
     for line in f:
        (key,val) = line.split(",")
        mydict[key]=val
     print(mydict) #test
     for keys in mydict:
       print(keys) #test to see if the keys are being retrieved correctly


readfiletodict()     

Desired output:

I want the dictionary to hold the keys a, b, c, d and the corresponding values as shown in the file, without the unwanted characters. Similarly, I need the values to be stored in the dictionary as integers (so that they can be worked with later).

For quick replication see: https://repl.it/KgQe/0 for the whole code and problem

Current (erroneous) output:

Python 3.6.1 (default, Dec 2015, 13:05:11)
[GCC 4.8.2] on linux

{'"[\'a\'': ' 5]"\r\n', '"[\'b\'': ' 2]"\r\n', '"[\'c\'': ' 3]"\r\n', '"[\'d\'': ' 0]"\r\n'}
"['a'
"['b'
"['c'
"['d'

The Stack Overflow answer I have used in my current code is from: Python - file to dictionary? but it doesn't quite work for me...

Compoot
  • Since some, or most, of the answers depend in some way on discarding characters it's worth considering the following question and the answers to it: https://stackoverflow.com/questions/3939361/remove-specific-characters-from-a-string-in-python/21357173. – Bill Bell Sep 04 '17 at 20:41

6 Answers

2

The efficient way to do this would be using python lists as suggested by @Tico.

However, if for some reason you can't, you can try this.

lineFormat = re.sub('[^A-Za-z0-9,]+', '', line)

This needs import re, and it will transform "['a', 5]" to a,5. Now you can apply your split function:

(key,val) = lineFormat.split(",")
mydict[key]=val
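
Putting those pieces together, a minimal sketch might look like this. The helper name `parse_line` and the sample lines are assumptions for illustration only, and the `int()` call is added so the values can be used for arithmetic later:

```python
import re

def parse_line(line):
    # Hypothetical helper: drop everything except letters, digits and commas,
    # so '"[\'a\', 5]"\r\n' becomes 'a,5'.
    cleaned = re.sub(r'[^A-Za-z0-9,]+', '', line)
    key, val = cleaned.split(',')
    return key, int(val)

# Assumed sample data, mimicking two lines of the file in the question.
sample_lines = ['"[\'a\', 5]"\r\n', '"[\'b\', 2]"\r\n']
mydict = dict(parse_line(line) for line in sample_lines)
print(mydict)  # {'a': 5, 'b': 2}
```

Note that this pattern also throws away a minus sign, so it only suits non-negative integers.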

Anusha
  • Thanks, I do need this to be a dictionary for various reasons. Are there other methods to do this other than the use of regular expressions, that I was trying to avoid? I suppose I could also ensure that when it was written to file the additional characters aren't there...that's a whole new question! – Compoot Sep 04 '17 at 16:54
  • @MissComputing. Your last sentence sums up exactly the right way to solve your problem: i.e. make sure you write your files in a way that is easy to read. – ekhumoro Sep 04 '17 at 16:59
  • Thanks ekhumoro. Also Anusha - that works - perfect! Thank you - and can I just check that the values are integers, so could be worked with if I wanted to find an average. OR would I have to further strip the ' ...? – Compoot Sep 04 '17 at 17:01
  • Well, I can think of a cruder way. If you do `list(line)`, it returns a list of single characters, e.g. `['[', "'", 'a', "'", ',', ' ', '5', ']']` for `['a', 5]`. If you are very sure this will be the format you will be using throughout your application, then try this. Otherwise, regex is a better bet. – Anusha Sep 04 '17 at 17:07
  • use `int('5')` to make sure you are saving values as an int – Anusha Sep 04 '17 at 17:10
2

Your code, slightly modified. The key is to strip out all the chars that we don't care about ([Python]: str.strip([chars])):

def readfiletodict():
    with open("testfile.txt", "r") as f:
        mydict = {} #create a dictionary called mydict
        for line in f:
            key, val = line.strip("\"\n[]").split(",")
            mydict[key.strip("'")] = val.strip()
    print(mydict) #test
    for key in mydict:
        print(key) #test to see if the keys are being retrieved correctly


readfiletodict()

Output:

(py35x64_test) c:\Work\Dev\StackOverflow\q46041167>python a.py
{'d': '0', 'c': '3', 'a': '5', 'b': '2'}
d
c
a
b
CristiFati
  • Works and without any additions. Your answer is the closest to what I wanted, in terms of correcting my existing code. Useful comments and other answers upvoted noted. Thanks! – Compoot Sep 04 '17 at 17:05
  • Nice, unusual application of strip. – Bill Bell Sep 04 '17 at 17:08
  • One small note: the values in the dictionary are still strings. To convert them to ints, simply replace the line `mydict[key.strip("'")] = val.strip()` with `mydict[key.strip("'")] = int(val.strip())`. I left the conversion out so that the code, without extra handling, still supports entries in the input file like `"['d', x]"`. – CristiFati Sep 04 '17 at 17:17
1

It's much easier if you transform your string_list into a real Python list, so you don't need to parse it yourself. Use json.loads:

import json 

...
  list_line = json.loads(line)
...

Hope it helps!
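
As the comments point out, a single `json.loads(line)` is not enough for lines like `"['a', 5]"`. A rough sketch of the extra step, assuming every line has exactly that shape and that keys never contain quote characters (the helper name `line_to_list` is made up for illustration):

```python
import json

def line_to_list(line):
    # First pass: the outer double quotes make the whole line a JSON string,
    # so this yields the text "['a', 5]" rather than a list.
    inner = json.loads(line)
    # Second pass: JSON strings need double quotes, so swap them in first.
    # Assumption: no quote characters appear inside the keys themselves.
    return json.loads(inner.replace("'", '"'))

key, val = line_to_list('"[\'a\', 5]"\n')
print(key, val)  # a 5
```

Here `val` is already an int, so no separate conversion is needed.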

Alex
  • Thanks, I need to do this without the use of json. – Compoot Sep 04 '17 at 16:53
  • @ekhumoro That's rude dude... Anyway. I've tested it json.loads('["a",2]') returns ['a',2]. Oh.. the quotes.. ok you need to replace that first. – Alex Sep 04 '17 at 16:54
  • @Tico. Try it with `"['a', 5]"`, which is shown in the OP's question. – ekhumoro Sep 04 '17 at 16:56
  • Hey @MissComputing, use the regex suggested by Anusha! Good answer there! – Alex Sep 04 '17 at 16:56
  • @ekhumoro I switched the quotes. OK, you need one more step: switching the quotes inside the string. Anyway, even if I hadn't tested, that was really rude, dude... you shouldn't do that... – Alex Sep 04 '17 at 16:58
  • shouldn't do what? – Compoot Sep 04 '17 at 16:59
  • @Tico. It is not in any way rude to point out problems with someone's answer. That is how SO is supposed to work. The way to respond to constructive criticism is to improve your answer. I don't understand why you are taking this so personally. – ekhumoro Sep 04 '17 at 17:03
  • Ok. Good manners is as follows: I've tested this answer with json.loads("['a',2]") and it didn't work. Can you elaborate on that? Or did you try it? And I would answer: Oh! sorry about that. You need to invert quotes. Just be polite. Your 'if you had taken the time to actually test this'' statement is just rude. People won't like you if you do that... if you care about that anyway. – Alex Sep 04 '17 at 17:10
1

Using only a very basic knowledge of Python:

>>> mydict = {}
>>> with open('temp.txt') as the_input:
...     for line in the_input:
...         values = line.replace('"', '').replace("'", '').replace(',', '').replace('[', '').replace(']', '').rstrip().split(' ')
...         mydict[values[0]] = int(values[1])
...         
>>> mydict
{'a': 5, 'b': 2, 'c': 3, 'd': 0}

In other words, discard all of the punctuation, leaving only the blank between the two values needed for the dictionary. Split on that blank, then put the pieces from the split into the dictionary.
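
If the chain of `replace` calls feels long, the same "discard the punctuation, keep the blank" idea can be sketched with `str.translate`. This is a swapped-in technique rather than the code above, and the function name `parse_lines` and the sample lines are assumptions for illustration:

```python
# Translation table that deletes the unwanted characters;
# the blank between key and value is deliberately kept.
PUNCTUATION = str.maketrans('', '', '"\'[],')

def parse_lines(lines):
    # `lines` is any iterable of strings, e.g. an open file object.
    result = {}
    for line in lines:
        # split() with no argument also discards the trailing newline.
        key, val = line.translate(PUNCTUATION).split()
        result[key] = int(val)
    return result

print(parse_lines(['"[\'a\', 5]"\n', '"[\'d\', 0]"\n']))  # {'a': 5, 'd': 0}
```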

Edit: In a similar vein, using a regex (this needs import re first). re.sub looks for the alternative characters given by its first argument, and any that are found are replaced by its second argument, an empty string. The alternatives are delimited by the '|' character in the regex pattern. Some of them, such as '[', must be escaped with a '\' because on their own they have special meanings within a regex.

>>> mydict = {}
>>> with open('temp.txt') as the_input:
...     for line in the_input:
...         values = re.sub(r'"|\'|\,|\[|\]|,', '', line).split(' ')
...         mydict[values[0]] = int(values[1])
... 
>>> mydict
{'a': 5, 'b': 2, 'c': 3, 'd': 0}
Bill Bell
  • Thank you for this - it seems a bit cumbersome, but may well work. Just waiting on any other answers. I'd obviously like the simplest method – Compoot Sep 04 '17 at 16:57
  • I couldn't agree more. I have offered an alternative. – Bill Bell Sep 04 '17 at 17:05
  • Thank you Bill Bell - are you able to comment what precisely the regular expression handling is doing on line 4 – Compoot Sep 04 '17 at 18:49
1

You can use regex and a dict-comprehension to do that:

#!/usr/bin/env python

import re

with open('file.txt', 'r') as f:
    lines = f.read().splitlines()
# key: the letters on each line; value: the digits on that line, converted to int
d = {''.join(re.findall(r'[a-zA-Z]+', i)): int(''.join(re.findall(r'\d', i))) for i in lines}

Result:

{'a': 5, 'c': 3, 'b': 2, 'd': 0}
coder
0

You were almost there, missing two things:

  • stripping the keys
  • converting the values

The following code does what you need (I think):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

output = dict()

with open('input', 'r') as inputfile:
    for line in inputfile:
        line = line.strip('"[]\n')
        key, val = line.split(',')
        output[key.strip("'")] = int(val)

Be careful, however: this code is very brittle and won't correctly process any variation on the input format you have provided. To build on it, I'd recommend at least catching ValueError around the int conversion and reconsidering which characters to strip.
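
A slightly more defensive variant could look like the sketch below. The function name `read_pairs`, the choice to skip bad lines instead of aborting, and the sample input are all assumptions on my part, so adjust to taste:

```python
def read_pairs(lines):
    # Hypothetical helper; `lines` is any iterable of strings, e.g. an open file.
    output = {}
    for lineno, line in enumerate(lines, start=1):
        # Remove surrounding whitespace first, then the quote and brackets.
        stripped = line.strip().strip('"[]')
        if not stripped:
            continue  # skip blank lines
        try:
            key, val = stripped.split(',')
            output[key.strip(" '")] = int(val)
        except ValueError:
            # Raised by a wrong number of fields or by a non-integer value.
            print("skipping malformed line %d: %r" % (lineno, line))
    return output

result = read_pairs(['"[\'a\', 5]"\n', '"[\'d\', x]"\n', '\n'])
print(result)  # {'a': 5}
```

Whether a malformed line should be skipped, logged, or treated as a hard error depends on how much you trust the file.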

Bart Van Loon