
I am new to Python and would like to read a text file into a dictionary. The problem is that my code reads every line but keeps only the last record. I want to read in all the data and store everything in a Python dict.

This is just simple text-file reading and storing in a Python dictionary, so I am not sure why it does not work. I would appreciate any help.

book_data = {}

with open('test_data.txt', 'r', encoding='utf8') as raw_data:
        for item in raw_data:
            if ':' in item:
                key,value = item.split(':', 1)
                book_data[key]=value.lower()

test_data.txt

Book_ID: #111
Book_Title: Python 101
Book_description: This is a book about Python for beginners. 

Book_ID: #222
Book_Title: Java 101
Book_description: This is a book about  Java  for beginners. 


Book_ID: #333
Book_Title: Ruby 101
Book_description: This is a book about  Ruby for beginners. 


Book_ID: #444
Book_Title: C# 101
Book_description: This is a book about  C#  for beginners. 

My output is just one record instead of 4 records.

for k,v in book_data.items():
    print(k," : ", v)

Output:

Book_ID  :   #444

Book_Title  :   c# 101

Book_description  :   this is a book about  c#  for beginners.
petezurich
  • Dictionaries can't hold duplicate keys. So if you keep adding under the same key, you end up with only the last added value. – Austin Jun 02 '19 at 12:57
  • Each dictionary key is associated with exactly one value. You only have 3 keys, which you keep reusing. Each time you reuse a key, the previous value is replaced with the new one. So the final dict entry corresponds to the last record in your file, and the others are gone. Perhaps you should use a list of records rather than a dict. Or else, find a unique key for each record and store all of the data for that record under that key. – Tom Karzes Jun 02 '19 at 12:57
  • Probably the best thing for you to do at this point is (1) find a good Python tutorial, and read about lists and dicts (2) think about how you want to store your data, and if you use a dict at some level, think carefully about what keys and what values it should contain (3) *then* rewrite your code to implement your design. – Tom Karzes Jun 02 '19 at 13:02
  • Can you recommend one? I thought I knew enough, but obviously I do not. Is there a web site that I can learn from? –  Jun 02 '19 at 14:37
  • Could you update your question with the desired output from your example? – palvarez Jun 02 '19 at 16:25
  • Thanks for your comments. It is working now. –  Jun 03 '19 at 05:02
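
The list-of-records idea suggested in the comments above could look roughly like the sketch below. It assumes every record starts with a Book_ID line, and it uses a hard-coded list (`sample_lines`, a made-up name) in place of the real file so that it runs on its own.

```python
# Sample lines standing in for the contents of test_data.txt (assumption).
sample_lines = [
    "Book_ID: #111",
    "Book_Title: Python 101",
    "",
    "Book_ID: #222",
    "Book_Title: Java 101",
]

records = []  # one dict per book, in file order
for line in sample_lines:
    if ':' not in line:
        continue  # skip blank lines and anything without a key
    key, value = line.split(':', 1)
    if key == 'Book_ID':
        # Assumption: a Book_ID line marks the start of a new record.
        records.append({})
    records[-1][key] = value.strip()

for record in records:
    print(record)
```

Because each book gets its own dict, reusing the key names Book_ID and Book_Title no longer overwrites anything.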

3 Answers


You are overwriting the values each time. The key "Book_ID", for example, is first saved as 111, then 222, then 333, and finally 444 by the time you print. You are probably using the wrong data structure for your problem. If you want to use the ID as a key, you should create a new dict for each of the objects and insert them into book_data with the ID as the key.

kumalka

You overwrote the value for the same key in your book_data every iteration.
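
A minimal illustration of that overwriting, using two hard-coded assignments instead of the file loop:

```python
book_data = {}
book_data['Book_ID'] = '#111'
book_data['Book_ID'] = '#222'  # same key, so this replaces '#111'

print(book_data)       # {'Book_ID': '#222'}
print(len(book_data))  # 1
```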


You're using three keys: Book_ID, Book_Title and Book_description. A dict can hold only one value for a key. You need to find a more suitable data structure to represent that file in memory.

What you probably want is a dict where the keys are 111, 222, 333, 444. And the value for each key would be another dict with keys Book_ID, Book_Title and Book_description.
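
To make that target structure concrete, here is a hard-coded sketch with two of the books. Keeping the IDs as strings such as '#111', exactly as they appear in the file, is an assumption; you could just as well strip the '#' or convert them to integers.

```python
# Outer dict: Book_ID -> inner dict holding that book's fields.
book_stash = {
    '#111': {
        'Book_ID': '#111',
        'Book_Title': 'Python 101',
        'Book_description': 'This is a book about Python for beginners.',
    },
    '#222': {
        'Book_ID': '#222',
        'Book_Title': 'Java 101',
        'Book_description': 'This is a book about Java for beginners.',
    },
}

# Look up one book by ID, then one field of that book.
print(book_stash['#222']['Book_Title'])

# Walk over all books.
for book_id, book in book_stash.items():
    print(book_id, '->', book['Book_Title'])
```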

Loop through the file, processing keys as you do now. But whenever you encounter an empty line, put the dict you have collected up to that point into a parent dict. Then continue scanning with an empty dict. At the end of the file, put the last dict into the parent as well.
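
The blank-line approach described above could be sketched like this. The function name `parse_books` and the in-memory sample are made up for illustration, and the sketch assumes every record contains a Book_ID line.

```python
import io


def parse_books(lines):
    """Group 'key: value' lines into one dict per blank-line-separated record."""
    book_stash = {}  # parent dict: Book_ID -> record dict
    book_data = {}   # record currently being collected

    def flush():
        # Store the collected record, if any, under its Book_ID.
        # Assumption: every record has a Book_ID (KeyError otherwise).
        if book_data:
            book_stash[book_data['Book_ID']] = dict(book_data)
            book_data.clear()

    for line in lines:
        line = line.strip()
        if not line:
            flush()  # an empty line ends the current record
        elif ':' in line:
            key, value = line.split(':', 1)
            book_data[key.strip()] = value.strip()
    flush()  # the last record may not be followed by an empty line
    return book_stash


# In-memory stand-in for the file; an open file object works the same way.
sample = io.StringIO(
    "Book_ID: #111\n"
    "Book_Title: Python 101\n"
    "\n"
    "\n"
    "Book_ID: #222\n"
    "Book_Title: Java 101\n"
)
books = parse_books(sample)
print(books)
```

Several empty lines in a row are harmless here, because flushing an empty record does nothing.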

Here's some skeleton code, taking into account the comments:

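book_stash = {}

with open(...) as raw_data:
    for item in raw_data:
        if ':' in item:
            key, value = item.split(':', 1)
            value = value.strip()  # drop the leading space and trailing newline
            if key == 'Book_ID':
                # A Book_ID line starts a new record; store it under its ID.
                book_data = {}
                book_stash[value] = book_data
            else:
                book_data[key] = value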
Roland Weber
  • Thank you for all your comments. I am really grateful for such in depth feedback. OK, upon further consideration, I would like to use just the Book_ID as the key and the values should contain both Book_Title and Book_Description. Can someone provide me with some direction? Your feedback is very much appreciated. –  Jun 02 '19 at 13:41
  • You cannot put two values for a key. You have to combine the two values into a single object. Either as a dict, or a tuple, or a list, or some other way. I'd go with a dict, it's the most flexible approach. – Roland Weber Jun 02 '19 at 13:55
  • @Tony See my answer on how to put three values into a dict, and then store that dict as a value in another. If you only want two values in the dict, then leave one out. – Roland Weber Jun 02 '19 at 14:40
  • @Tony I've added a code skeleton to my answer, adjust as required. If you're still in doubt what to do, I recommend that you first write a program that hard-codes some example book info into the kind of data structure you want. Then, after you're clear about the data structure, write the code that reads the file into the structure. – Roland Weber Jun 02 '19 at 16:06
  • Thank you for your pointers. I followed your code skeleton and it is working now. I am able to read all the records into the Python dictionary. –  Jun 03 '19 at 05:02
  • Roland Weber, another question: how do you remove words of length greater than 5 (len(value)>5) for values in a Python dictionary? I have tried and failed. I am still new to dictionaries and find them intimidating. I would appreciate it if you could provide some examples. –  Jun 03 '19 at 23:41
  • @Tony You should post new questions as separate questions. Somebody might have answered by now. Or it might have been closed as duplicate ;-) See https://stackoverflow.com/questions/5384914/how-to-delete-items-from-a-dictionary-while-iterating-over-it – Roland Weber Jun 04 '19 at 19:55
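
For the follow-up question in the comments, one possible reading is "drop every entry whose value is longer than 5 characters". Under that assumption, a dict comprehension that builds a new dict avoids deleting from a dict while iterating over it. The sample data and the name `short_values` are made up for illustration.

```python
book_data = {'Book_ID': '#111', 'Book_Title': 'Python 101'}

# Build a new dict instead of deleting during iteration; changing a dict's
# size while looping over it raises RuntimeError.
short_values = {k: v for k, v in book_data.items() if len(v) <= 5}

print(short_values)  # {'Book_ID': '#111'}
```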