I can't merge two lists into a dictionary. I tried the following:

Map two lists into a dictionary in Python

I tried all the solutions there and I still get an empty dictionary.

from sklearn.feature_extraction import DictVectorizer
from itertools import izip
import itertools

text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')


diction = dict(itertools.izip(words,lines))
new_dict = {k: v for k, v in zip(words, lines)}
print new_dict

I get the following:

{'word': ''} ['word=']

The two lists are not empty.

I'm using Python 2.7.

EDIT:

Output from the two lists (I'm only showing a few because it's a vector with 11k features)

//lines
['change', 'I/O', 'fcnet2', 'ifconfig',....
//words
['word', 'word', 'word', .....

EDIT:

Now at least I have some output @DamianLattenero

{'word\n': 'XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n'}
['word\n=XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n']
  • Print out what `lines` and `words` are to make sure that worked OK – MrJLP Jun 05 '17 at 17:45
  • You have extra imports not needed too. `DictVectorizer` isn't used in this example, and probably `itertools` isn't required either as shown in answer below – MrJLP Jun 05 '17 at 17:46
  • @MrJLP That's correct, the problem should be in the loading of the data – developer_hatch Jun 05 '17 at 17:46
  • For clarity I'd remove `itertools`, `DictVectorizer` imports and `diction` assignment as it's not relevant to the example – MrJLP Jun 05 '17 at 17:51
  • Use this `new_dict = dict(zip(words,lines))`. The answer below has it, but for the variable `diction` which isn't being printed out – MrJLP Jun 05 '17 at 17:53
  • How are the words laid out in the file? Stacked, one above the other? Or as a consecutive list, one beside the other? – developer_hatch Jun 05 '17 at 18:05
  • Have you tried doing it the "C" way? `dict = {}`, `i = 0` `while i < len(lines):`, `dict[lines[i]] = words[i]`, `i += 1`. Does this have the expected output? – lpares12 Jun 05 '17 at 18:07
  • It's a text file with each word in a separate line. – Veselin Ivanov Jun 05 '17 at 18:07
  • If you're getting output like above the original problem has been solved. Having newline `\n` on each item is a different problem – MrJLP Jun 05 '17 at 18:08
  • That's right, I added an observation for you in the answer. Consider that a professional courtesy (John Wick 2 :P) – developer_hatch Jun 05 '17 at 18:11
  • @MrJLP with the exception that I have only 1 value for the key...so the dictionary is not full. – Veselin Ivanov Jun 05 '17 at 18:13

2 Answers

1

I think the root of a lot of confusion is code in the example that is not relevant.

Try this:

text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')

# Strip any trailing newline or whitespace from what was read in.
# The results must be assigned back; a bare map() call discards them.
lines = [line.rstrip() for line in lines]
words = [word.rstrip() for word in words]

new_dict = dict(zip(words,lines))
print new_dict

Python's built-in zip() pairs up the items of its arguments as tuples. Passing those tuples to the dict() constructor creates a dictionary in which each item in words is a key and the item at the same position in lines is its value.
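One consequence of words supplying the keys: dictionary keys are unique, so repeated words collapse into a single entry. The sample in the question shows words as ['word', 'word', 'word', ...], which would likely explain a dictionary of length 1. A minimal sketch with made-up data (the `distinct_words` values are hypothetical):

```python
# Made-up sample data mirroring the shape shown in the question:
# every entry in `words` is the same string.
words = ['word', 'word', 'word']
lines = ['change', 'I/O', 'fcnet2']

# Dictionary keys are unique, so each pair overwrites the previous one
# and only the last value survives.
collapsed = dict(zip(words, lines))
print(collapsed)  # {'word': 'fcnet2'}

# With distinct keys (hypothetical values), every pair is kept.
distinct_words = ['w0', 'w1', 'w2']
full = dict(zip(distinct_words, lines))
print(len(full))  # 3
```

If the keys really are all identical, it may be worth swapping the arguments, or checking that the right file is being read as the key list.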

Also note that zip() stops at the end of the shorter list: if words and lines have different lengths, the extra items in the longer one are silently dropped, not padded with empty values or a None key.
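A quick way to check how zip() treats lists of different lengths is to try it on small made-up data. In this sketch it stops at the shorter input, so a length check before building the dictionary can make a mismatch visible:

```python
# Made-up data: four keys but only two values.
words = ['a', 'b', 'c', 'd']
lines = [1, 2]

# zip() stops as soon as the shorter input is exhausted.
pairs = list(zip(words, lines))
print(pairs)  # [('a', 1), ('b', 2)]

# Report the mismatch instead of silently losing 'c' and 'd'.
if len(words) != len(lines):
    print("length mismatch: %d words, %d lines" % (len(words), len(lines)))
```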

MrJLP
0

I tried this and it worked for me. I created two lists, one with the numbers 1 to 4 and one with the letters a to d, and the code creates the dictionary fine. I didn't need to import itertools, and the dictionary comprehension is an extra line that isn't needed:

lines = [1,2,3,4]
words = ["a","b","c","d"]


diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

If that works but your original code does not, the problem must be in how the lists are loaded. Try loading them like this:

def create_list_from_file(file):
  with open(file, "r") as ins:
    my_list = []
    for line in ins:
      my_list.append(line)
    return my_list

lines = create_list_from_file("/home/vesko_/evnt_classification/bag_of_words")
words = create_list_from_file("/home/vesko_/evnt_classification/sdas")

diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)

Observation: if your files look like this:

1
2
3
4

and

a
b
c
d

the result will have four keys in the dictionary, one per line:

{'a\n': '1\n', 'b\n': '2\n', 'c\n': '3\n', 'd': '4'}

But if your files look like:

1 2 3 4

and

a b c d

the result will be {'a b c d': '1 2 3 4'}, with only one entry.
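To avoid the trailing '\n' on keys and values shown above, the loader can strip each line as it reads. This is only a sketch: `read_items` is a hypothetical helper name, and skipping blank lines is an assumption about what the data files should contain. A temporary file stands in for the real files here.

```python
import os
import tempfile

def read_items(path):
    """Return the non-empty lines of a text file, without line endings."""
    with open(path, "r") as handle:
        # strip() removes the trailing newline and surrounding whitespace;
        # lines that are empty after stripping are skipped (an assumption).
        return [line.strip() for line in handle if line.strip()]

# Demo with a throwaway file standing in for the real data files.
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
tmp.write("a\nb\nc\nd\n")
tmp.close()

items = read_items(tmp.name)
os.remove(tmp.name)
print(items)  # ['a', 'b', 'c', 'd']
```

The same helper could then be called once per file before passing both lists to dict(zip(...)).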

developer_hatch
  • For some strange reason this doesn't work in my case @DamianLattenero – Veselin Ivanov Jun 05 '17 at 17:46
  • @VeselinIvanov even with those two test lists? If those lists work for you and do the job, the problem is in how you load the real lists from the files... did you try it with the values from those lists? – developer_hatch Jun 05 '17 at 17:48
  • @VeselinIvanov I updated the answer, let me know if that works for you. There is no need to import anything – developer_hatch Jun 05 '17 at 17:54
  • @DamianLattenero with the example lists it works, but unfortunately I can't build the lists manually, so I need to figure out why merging the two lists as in the other Stack Overflow answers didn't work for me. – Veselin Ivanov Jun 05 '17 at 17:54
  • @VeselinIvanov Did you try loading the lists like that? I updated the answer – developer_hatch Jun 05 '17 at 17:55
  • I updated my question. Yes, this time I at least have something, but it still only has length 1. As far as I can see, it took the last element only. – Veselin Ivanov Jun 05 '17 at 18:02