Python text file into directory list

Question

I need help loading the text file below into a dictionary. Can anyone help me do it?
I have a data.txt like this:

A1234 161
A1234 106
A456  185
A456  108
037   125

**Output:**
directory = {
"A1234": [161,106],
"A456": [185,108],
"037": [125],
}

Thank you in advance for your help.

Stijn B · Answer 1 · 2022-03-04T14:39:57.600

1

Using the data.txt from the question, read the file into a dictionary:

with open('data.txt', 'r') as file:
    data_lines = file.readlines()

directory = {}

for line in data_lines:
    if not line.strip():
        continue  # skip blank lines, which would break the unpacking below
    # the first word is the key; all remaining words go into the list b
    a, *b = line.split()
    # convert all elements of b into integers:
    b = [int(item) for item in b]
    if a in directory:
        directory[a].extend(b)
    else:
        directory[a] = b

print(directory)
# {'A1234': [161, 106], 'A456': [185, 108], '037': [125]}
# prettified:
"""
{
    'A1234': [161, 106],
    'A456': [185, 108],
    '037': [125]
}
"""

edited Mar 04 '22 at 14:39

answered Mar 02 '22 at 22:14

Stijn B


Your code works fine for the file above. Thank you. But when I use another file, which has 9375 rows of data, it throws an error: "ValueError: too many values to unpack (expected 2)". – Shailesh Patel Mar 02 '22 at 22:46
I'm trying to load an ICD mapping file into the directory dynamically. – Shailesh Patel Mar 02 '22 at 22:48
OK, I think `ValueError: too many values to unpack (expected 2)` has nothing to do with the amount of data. It comes from the line `a, b = line.split()`, which unpacks the result of the split (a list) into two variables. Does your data file contain lines with more than 2 'words'? I guess so. If so, show me an example and I will adapt the code. See more info [here](https://itsmycode.com/valueerror-too-many-values-to-unpack-expected-2/#:~:text=ValueError%3A%20too%20many%20values%20to%20unpack%20(expected%202)%20occurs,you%20get%20a%20value%20error.) – Stijn B Mar 02 '22 at 22:50
I tested the code with the example input data just by changing the first line to `A1234 161 55` and that's throwing the same error. I edited my answer, now it should work fine with any amount of 'words' or 'numbers' per line :) – Stijn B Mar 02 '22 at 23:04
Awesome, Stijn B Sir, It works. Thank you. – Shailesh Patel Mar 03 '22 at 14:49
Hi Stijn B, in the answer's output, how can I remove the quotes to get this? { "A1234": [161,106], "A456": [185,108], "037": [125], } – Shailesh Patel Mar 03 '22 at 19:41
Thanks. You mean how to make this `['161', '106']` look like this `[161, 106]`? (going from strings to integers) – Stijn B Mar 03 '22 at 21:27
Correct, String to int – Shailesh Patel Mar 04 '22 at 14:31
@ShaileshPatel answer edited. Now it outputs lists of integers – Stijn B Mar 04 '22 at 14:40
Hi SB, thank you. I got this error: ValueError: invalid literal for int() with base 10: 'A' – Shailesh Patel Mar 04 '22 at 14:57
Well if you have something like `A1234 161 A85` in your file it will obviously not work (Python cannot convert `A85` to an integer). Please provide the line which is producing that error – Stijn B Mar 04 '22 at 15:04
Hi SB, thank you. Your code is perfectly fine; my data had an issue, and I fixed it. Some rows had a space inside the code, like A1234 A 161. I changed that to A1234A 161. – Shailesh Patel Mar 04 '22 at 18:03
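
The two errors discussed in these comments can be reproduced and located with a small sketch like the one below. It assumes the format from the question (a key followed by integer values) and, rather than stopping at the first bad row, records the offending line numbers so the data can be fixed. `parse_lines` and the `bad` list are hypothetical names made up for this illustration, and the sample rows are inlined so no file is needed.

```python
def parse_lines(lines):
    """Group integer values by the first word of each line.

    Returns (directory, bad), where bad holds (line_number, line) pairs
    whose values could not be converted to int.
    """
    directory = {}
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        # star-unpacking accepts any number of values after the key,
        # unlike `a, b = line.split()`, which needs exactly two words
        key, *values = line.split()
        try:
            numbers = [int(v) for v in values]
        except ValueError:
            # e.g. 'A1234 A 161': the stray 'A' is not an integer
            bad.append((number, line.rstrip("\n")))
            continue
        directory.setdefault(key, []).extend(numbers)
    return directory, bad


lines = ["A1234 161 55\n", "A1234 A 161\n", "A456  185\n", "\n", "037   125\n"]
directory, bad = parse_lines(lines)
print(directory)  # {'A1234': [161, 55], 'A456': [185], '037': [125]}
print(bad)        # [(2, 'A1234 A 161')]
```

Printing `bad` on a large file, such as the 9375-row one mentioned above, would point directly at the rows that need cleaning.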

score 0 · Accepted Answer · answered Mar 02 '22 at 22:14

0

Try this:

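output_dict = {}
# read the file and split every non-blank line into its words
with open("data.txt") as f:
    data = [line.split() for line in f if line.strip()]
# assumes exactly two words per line: a key and a value
for k, v in data:
    try:
        output_dict[k].append(v)
    except KeyError:
        # first time this key is seen
        output_dict[k] = [v]
print(output_dict)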

Output:

{'A1234': ['161', '106'], 'A456': ['185', '108'], '037': ['125']}

answered Mar 02 '22 at 22:14

Hussain Bohra


This works, but using a bare exception clause is generally considered to be a bad practice. See here for more. https://stackoverflow.com/questions/14797375/should-i-always-specify-an-exception-type-in-except-statements – Matthew Borish Mar 02 '22 at 22:39
Your code works fine for the file above. Thank you. But when I use another file, which has 9375 rows of data, it throws an error: "ValueError: too many values to unpack (expected 2)". – Shailesh Patel Mar 02 '22 at 22:46
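
To illustrate the point about the bare `except`: only a missing key should trigger the creation of a new list, so catching `KeyError` specifically, or avoiding the exception altogether with `dict.setdefault`, keeps unrelated errors visible. A small sketch with the question's rows inlined; the names `rows`, `grouped_try`, and `grouped_setdefault` are made up for this example.

```python
# Rows from the question, already split into (key, value) pairs.
rows = [("A1234", "161"), ("A1234", "106"), ("A456", "185"),
        ("A456", "108"), ("037", "125")]

# Variant 1: catch only the error we expect (a key seen for the first time).
grouped_try = {}
for k, v in rows:
    try:
        grouped_try[k].append(v)
    except KeyError:
        grouped_try[k] = [v]

# Variant 2: no exception handling; setdefault inserts an empty list the
# first time a key appears and returns the stored list either way.
grouped_setdefault = {}
for k, v in rows:
    grouped_setdefault.setdefault(k, []).append(v)

print(grouped_try == grouped_setdefault)  # True: both build the same mapping
```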

Matthew Borish · Answer 3 · 2022-03-02T23:12:48.133

0

Here's a pandas solution. First we call read_csv() with one or more spaces as the delimiter. We specify the string (object) dtype so that the list items are strings. If you want ints, you could skip the dtype argument and pandas will infer them. Next we group by the first column (0) and convert the values in column 1 to a list with apply. Finally, we use .to_dict() to get a dictionary.

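import pandas as pd

# one or more spaces separate the columns; a regex separator needs the Python engine
df = pd.read_csv('data.txt', header=None, sep=r"[ ]{1,}", dtype='object', engine='python')

# group by the key column (0) and collect column 1 into one list per key
directory = df.groupby(0)[1].apply(list).to_dict()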

output:

{'037': ['125'], 'A1234': ['161', '106'], 'A456': ['185', '108']}
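
For the integer lists shown in the question, one option is to give each column its own dtype instead of dropping the argument entirely. Keeping column 0 as a string protects codes such as `037` from losing the leading zero if every key in a file happened to be numeric. A sketch under that assumption, with the question's rows inlined through `io.StringIO` in place of the real file so the snippet is self-contained:

```python
import io

import pandas as pd

# Rows from the question; StringIO stands in for the real file.
text = io.StringIO("A1234 161\nA1234 106\nA456  185\nA456  108\n037   125\n")

# Column 0 stays a string (keeps '037' intact); column 1 is parsed as int.
df = pd.read_csv(text, header=None, sep=r"\s+", dtype={0: str, 1: int})

# Same grouping as above: one list of values per key.
directory = df.groupby(0)[1].apply(list).to_dict()
print(directory)
```

If a value in column 1 is not a valid integer, read_csv raises an error here, much like `int()` does in the plain-Python approach.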

edited Mar 02 '22 at 23:12

answered Mar 02 '22 at 22:16

Matthew Borish


Your code works fine for the file above. Thank you. But when I use another file, which has 9375 rows of data, it throws an error: "ValueError: too many values to unpack (expected 2)". – Shailesh Patel Mar 02 '22 at 22:46
You likely have some irregularities in your .txt file, but it's tricky to account for them without seeing the full data. Can you post a larger sample? – Matthew Borish Mar 02 '22 at 22:56
