1

I have a dictionary with the following values:

test_dict = {'a': ['a1', 'a2'], 'b': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555']}

I need to replace the comma (,) between 3.3.3.3:3333 and 4.4.4.4:4444 with ', ' (single quote, comma, space, single quote) so that the two addresses are separated like the other entries.

I tried the code below, but the output contains double quotes ("):

val = ','
valnew = '\', \''  # using escape characters - all are single quotes
for k, v in test_dict.items():
    for i, s in enumerate(v):
        if val in s:
            v[i] = s.replace(val, valnew)

print(test_dict)

Output:

{'a': ['a1', 'a2'], 'b': ['1.1.1.1:1111', '2.2.2.2:2222', "3.3.3.3:3333', '4.4.4.4:4444", '5.5.5.5:5555']}

Expected Output:

{'a': ['a1', 'a2'], 'b': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333', '4.4.4.4:4444', '5.5.5.5:5555']}

Please suggest.

wjandrea
Pynewbie
  • `'3.3.3.3:3333,4.4.4.4:4444'` is a single string. Perhaps you have a bug in whatever wrote the list in the first place? Could any of these lists have the problem or is it always list index 2? – tdelaney Jun 13 '20 at 00:47
  • Thanks, @tdelaney! There is no bug in the dictionary. The source we get this combination from is the same as what I have posted in my question. – Pynewbie Jun 13 '20 at 09:45
  • Does this answer your question? [How to split strings inside a list by whitespace characters](https://stackoverflow.com/questions/44085616/how-to-split-strings-inside-a-list-by-whitespace-characters). You will just have to use comma instead of whitespace: `line.split(',')`. – Georgy Jun 18 '20 at 11:29

4 Answers

2

print is displaying a representation of the dict, as if print(repr(test_dict)) was called.

[repr returns] a string containing a printable representation of an object. For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval() ..

Since the string itself contains a ', repr delimits it with " instead. Example:

print(repr("helloworld"))   # -> 'helloworld'
print(repr("hello'world"))  # -> "hello'world"

This representation should generally only be used for diagnostic purposes. If needing to write this special format, the dict has to be walked and the values printed explicitly "per requirements".

If you want reliable output with well-defined serialization rules, use a common format like JSON, XML, YAML, etc.
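As a minimal sketch of the JSON suggestion, assuming the standard-library `json` module is acceptable for the use case: the data below is a shortened version of the question's dict after the `replace()` attempt, and the names `encoded` and `decoded` are only illustrative.

```python
import json

# Shortened data as it looks after the replace() attempt in the question:
# one element now contains literal single quotes.
test_dict = {
    'a': ['a1', 'a2'],
    'b': ['1.1.1.1:1111', "3.3.3.3:3333', '4.4.4.4:4444"],
}

# json.dumps always delimits strings with double quotes,
# regardless of which characters the string contains.
encoded = json.dumps(test_dict)
print(encoded)

# Decoding gives back an equal dict, so the format round-trips.
decoded = json.loads(encoded)
print(decoded == test_dict)
```

Unlike `repr`, the quoting here does not depend on the content of each string, which is what makes the output predictable for other programs.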

user2864740
2

You're confusing data with representation. The single quotes, space, and comma ', ' are part of the representation of strings inside a list, not the string itself.

What you're actually trying to do is split a string on a comma, e.g.

>>> '3,4'.split(',')
['3', '4']

You can do this within a list by splitting and flattening, like this:

[s1 for s0 in v for s1 in s0.split(',')]

So:

>>> b = ['1', '2', '3,4', '5']  # Using simpler data for example
>>> b = [s1 for s0 in b for s1 in s0.split(',')]
>>> print(b)
['1', '2', '3', '4', '5']
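Applied to the dict from the question, the same split-and-flatten comprehension could be wrapped in a dict comprehension. This is a sketch; the name `flattened` is only illustrative.

```python
test_dict = {
    'a': ['a1', 'a2'],
    'b': ['1.1.1.1:1111', '2.2.2.2:2222',
          '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555'],
}

# For every key, split each string on commas and flatten the pieces
# into a single list. Strings without a comma pass through unchanged.
flattened = {
    key: [part for item in values for part in item.split(',')]
    for key, values in test_dict.items()
}
print(flattened)
```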
wjandrea
  • Thanks for the explanation. This is really helpful. Got the difference between data and representation. – Pynewbie Jun 13 '20 at 09:49
2

'3.3.3.3:3333,4.4.4.4:4444' is a single string, and the outer quote marks are just Python's way of showing that. The same goes for "3.3.3.3:3333', '4.4.4.4:4444": it is a single string. The outer double quotes are just Python's way of showing you the string. The internal single quotes and comma are literally those characters in the string.

Your problem seems to be that some values in the list have been merged, most likely by whatever wrote the list in the first place. We can fix it by splitting each string on the comma and extending a new list with the result. An item without an embedded comma splits into a one-item list, so it extends the new list by that single item and nothing changes. An item with a comma splits into a two-item list and extends the new list by two items.
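The two cases can be checked in isolation with a short sketch (the variable names are only illustrative):

```python
# An item without a comma splits into a one-item list.
single = 'a1'.split(',')
print(single)

# An item with one comma splits into a two-item list.
merged = '3.3.3.3:3333,4.4.4.4:4444'.split(',')
print(merged)

# extend() appends each element of its argument, so the new list
# grows by one item in the first case and by two in the second.
new_list = []
new_list.extend(single)
new_list.extend(merged)
print(len(new_list))
```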

test_dict = {'a': ['a1', 'a2'], 'b': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555']}

def list_expander(alist):
    """Return list where values with a comma are expanded"""
    new_list = []
    for value in alist:
        new_list.extend(value.split(","))
    return new_list

new_dict = {key: list_expander(val) for key, val in test_dict.items()}
print(new_dict)

The result is

{'a': ['a1', 'a2'], 'b': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333', '4.4.4.4:4444', '5.5.5.5:5555']}
tdelaney
1

Try something like this:

test_dict["b"] = ",".join(test_dict["b"]).split(",")
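To see why this works, here is a sketch that shows the intermediate string (the names `ips`, `joined`, and `fixed` are only illustrative). It also shows one edge case: an empty list comes back as a list holding one empty string.

```python
ips = ['1.1.1.1:1111', '2.2.2.2:2222',
       '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555']

# Joining with "," puts the merged element's own comma on the same
# footing as the separators between elements.
joined = ",".join(ips)
print(joined)

# Splitting on "," then yields every address as its own element.
fixed = joined.split(",")
print(fixed)

# Edge case: joining an empty list gives "", and "".split(",") is [""].
empty_result = ",".join([]).split(",")
print(empty_result)
```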

Updated:

import re

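# sample data taken from the follow-up comment below
sample_list = [
    {'host': ['a1', 'a2'], 'ip': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555']},
    {'host': ['c1', 'c2'], 'ip': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555']},
]

# compile this once for the entire list; note that the loop below does not use it yet
do_joinsplit_regex = re.compile(
    r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,4}"
)

for d in sample_list:
    for k, v in d.items():
        # skip values that are not lists, and empty lists
        if not isinstance(v, list) or len(v) < 1:
            continue
        d[k] = ",".join(v).split(",")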

  • Why do you join just to split again? – wjandrea Jun 13 '20 at 00:39
  • Because the problem is that one element in that list should actually be two, and a quick way to deal with it is to just concatenate or “join” it with the other elements of the list. If you split that string on the comma you’ll have all entries. As for the quotation mark issue, I apologize. I am using the mobile app, and it’s not as reliable as my pc keyboard. Feel free to edit it! –  Jun 13 '20 at 00:46
  • OK, I fixed the quotes for you. While I was there I noticed the comprehension was redundant so I removed it too. Much shorter now! :) – wjandrea Jun 13 '20 at 00:58
  • Thank you! Also, I should’ve said my mobile device keyboard is not as reliable as my pc keyboard. –  Jun 13 '20 at 01:06
  • Thanks will_f and wjandrea. This looks short and sweet :). Works like a charm. – Pynewbie Jun 13 '20 at 09:51
  • Thanks! I have a follow-up question. This works fine for one dictionary, but I now have multiple dictionaries stored in a list, e.g. `sample_list=[{'host': ['a1', 'a2'], 'ip': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555']}, {'host': ['c1', 'c2'], 'ip': ['1.1.1.1:1111', '2.2.2.2:2222', '3.3.3.3:3333,4.4.4.4:4444', '5.5.5.5:5555']}]`. I am calling update in a loop, and here is the code: `for d in sample_list: j.update((v, (",".join(d["ip"]).split(","))) for k,v in d.items() if v ==ip)`. It doesn't throw any error, but it doesn't work. Could you please suggest a fix? – Pynewbie Jun 15 '20 at 17:21
  • I have (slowly) made an update that works well enough. Be sure to research a better regular expression for ip:port pairs (ie one that can handle ipv4 and ipv6) –  Jun 15 '20 at 17:57
  • This is awesome! :) Thanks will_f. I had to then copy the values in a text file in form of columns. E.g. for each value in 1st dict, i had to copy all corresponding values from the 2nd dict in a column manner in new lines one after the other. I was able to achieve it by doing like this: with open('test.txt', 'w') as f: for d in sample_list: for i in product(d['host'], d['ip']): f.write('{} {}\n'.format(*i)). I imported product from itertools for this. I am new to this and would like to thank you again :) – Pynewbie Jun 15 '20 at 19:22
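The approach described in the last comment, laid out as a runnable sketch. The helper name `write_pairs` is hypothetical, the sample data is shortened, and an in-memory `io.StringIO` buffer stands in for the `test.txt` file so the example has no side effects.

```python
import io
from itertools import product

# Shortened sample data, already split as in the answer above.
sample_list = [
    {'host': ['a1', 'a2'], 'ip': ['1.1.1.1:1111', '2.2.2.2:2222']},
]

def write_pairs(dicts, f):
    """Write every (host, ip) combination of each dict, one per line."""
    for d in dicts:
        for host, ip in product(d['host'], d['ip']):
            f.write('{} {}\n'.format(host, ip))

# Any writable text file object works; open('test.txt', 'w') would too.
buffer = io.StringIO()
write_pairs(sample_list, buffer)
output = buffer.getvalue()
print(output, end='')
```

`product` varies its last argument fastest, so all IPs are listed for the first host before moving on to the second.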