3

I wonder whether it's possible to print the duplicate values from a dictionary.

For example, I have this dictionary:

responses={
    'greet':'Hello! How can I help you?',
    'types':'Our coffee types are: light roasted, medium roasted, medium dark roasted, dark roasted.',
    'light':'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'dark':'Koa Coffee Estate, Atlas Coffee Club, Lifeboost Coffee Organic Dark Roast, Peets House Blend, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'fruit':'Frutis notes coffee: Coffee Bros Paraideli Cup Of Excellence, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee, Volcanica Coffee Kenya AA, Peets House Blend.',
    'vanilla':'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
    'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'timings':'We are open from 9AM to 5PM, Monday to Friday. We are closed on weekends and public holidays.',
    'fallback':'I dont quite understand. Could you repeat that?',
}

So if I pick two different keys like:

'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',

They have coffee names in common, such as:

Purity Coffee Flow, Out Of The Grey Costa Rica La Minita

So if I enter those keys, for example chocolate, medium, the program needs to print only those two duplicates:

Purity Coffee Flow, Out Of The Grey Costa Rica La Minita

Is it possible to print in the console just those 2 items that are duplicated?

The only thing I have managed to get working is printing duplicate values when the values are completely the same, but that's not my use case.

maximus383
  • 1
    This could be a solution: https://stackoverflow.com/questions/18715688/find-common-substring-between-two-strings. But you need to provide more specific rules on what counts as a duplicate – Yevhen Kuzmovych Nov 11 '22 at 10:54

4 Answers

3

EDIT: After some clarification from the OP, the keys need to be entered from the console, so I will keep the old answer as well and add a way to get the keys from user input:

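import argparse

# responses and find_common_values are defined in the original answer below
parser = argparse.ArgumentParser()
parser.add_argument("key1")
# nargs="?" makes the second key optional; it is None when omitted
parser.add_argument("key2", nargs="?")

# Input something like "chocolate medium" (commas are treated as separators too)
command = input()
args = parser.parse_args(command.replace(",", " ").split())

if args.key2 is None:
    # If the user has entered only one key, print the value for that key
    print(responses[args.key1])
else:
    # Otherwise print what the two keys have in common
    print(find_common_values(responses, args.key1, args.key2))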


ORIGINAL ANSWER:

This code first builds all the pairs of keys in your dict with itertools.product, which computes the Cartesian product of two iterables.

It then uses a function to find the elements common to two lists. The idea is the following:

    common = [k for k in list1 if k in list2]

However, instead of list1 and list2 we can use the values stored in your dict. They are strings, so I used the split method to split them at the commas.

from itertools import product

responses={
    'greet':'Hello! How can I help you?',
    'types':'Our coffee types are: light roasted, medium roasted, medium dark roasted, dark roasted.',
    'light':'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'dark':'Koa Coffee Estate, Atlas Coffee Club, Lifeboost Coffee Organic Dark Roast, Peets House Blend, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'fruit':'Frutis notes coffee: Coffee Bros Paraideli Cup Of Excellence, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee, Volcanica Coffee Kenya AA, Peets House Blend.',
    'vanilla':'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
    'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'timings':'We are open from 9AM to 5PM, Monday to Friday. We are closed on weekends and public holidays.',
    'fallback':'I dont quite understand. Could you repeat that?',
}

def find_common_values(d, key1, key2):
    # Split both values at the commas and keep the pieces of key1 that also occur in key2
    items2 = d[key2].split(",")
    return [k for k in d[key1].split(",") if k in items2]

# Get all ordered pairs of keys
keys = responses.keys()
pairs = list(product(keys, keys))

for p in pairs:
    # Skip pairs made of the same key twice
    if p[0] != p[1]:
        comm = find_common_values(responses, p[0], p[1])
        if len(comm) != 0:
            print(p, comm)

Which gives:

('light', 'fruit') [' Peets Coffee Costa Rica Aurora']
('medium', 'chocolate') [' Purity Coffee Flow', ' Out Of The Grey Costa Rica La Minita']
('dark', 'chocolate') [' Lifeboost Coffee Organic Dark Roast', ' Coffee Bros Dark Roast', ' Death Wish Coffee', ' Kicking Horse Coffee Grizzly Claw.']
('fruit', 'light') [' Peets Coffee Costa Rica Aurora']
('chocolate', 'medium') [' Purity Coffee Flow', ' Out Of The Grey Costa Rica La Minita']
('chocolate', 'dark') [' Lifeboost Coffee Organic Dark Roast', ' Coffee Bros Dark Roast', ' Death Wish Coffee', ' Kicking Horse Coffee Grizzly Claw.']
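
Note that the split pieces keep their leading spaces, and a piece that carries a label such as "Chocolate notes coffee:" or the final period will not match its plain counterpart. That is why "Coffee Bros Decaf" is missing from the ('medium', 'chocolate') line above. Below is a minimal sketch of one way to normalise the items before comparing them. The helpers split_items and find_common_items are hypothetical names, not part of the answer, and the dict is cut down to two keys from the question so the snippet runs on its own:

```python
# Two entries copied from the question's dict, so the sketch is self-contained.
responses = {
    'medium': 'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'chocolate': 'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
}


def split_items(value):
    """Hypothetical helper: turn one dict value into a list of clean item names."""
    # Drop a leading label such as "Chocolate notes coffee:" if there is one.
    _, _, body = value.rpartition(":")
    # Split at the commas, then strip spaces and the trailing period.
    return [item.strip(" .") for item in body.split(",")]


def find_common_items(d, key1, key2):
    """Items of d[key1] that also occur in d[key2], in d[key1] order."""
    items2 = set(split_items(d[key2]))
    return [item for item in split_items(d[key1]) if item in items2]


print(find_common_items(responses, 'chocolate', 'medium'))
# ['Coffee Bros Decaf', 'Purity Coffee Flow', 'Out Of The Grey Costa Rica La Minita']
```

This assumes every value has at most one label, ending in a colon, and that item names never contain commas; values such as 'timings' would need different handling.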
Fra93
  • 1
    It works this way. But I want to get just a specific combination when I type it in the console. E.g. when I type "medium, chocolate" I want to print just those values, not all of them like in your output, and that is not working. – maximus383 Nov 11 '22 at 12:39
  • Well, if you want only two keys, use `find_common_values(responses,"medium","chocolate")` – Fra93 Nov 11 '22 at 14:07
  • 1
    There is a problem with the latest version. If I enter the values "medium" "chocolate", for example, it prints in the console the duplicate elements for those 2 keys. But when I type only "medium" I want to print the value for that key from my dictionary, and instead it prints the same thing as when I type "chocolate". – maximus383 Nov 11 '22 at 14:35
  • I modified the answer to accept input from the user. – Fra93 Nov 11 '22 at 14:55
3
a = 'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.'

b = 'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.'

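# Split at the commas and strip the surrounding whitespace from each item
a_list = [item.strip() for item in a.split(",")]
b_list = [item.strip() for item in b.split(",")]

a_set = set(a_list)
b_set = set(b_list)

# Items that appear in both strings
print(a_set.intersection(b_set))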

How about this?

2

You can split each dict value at punctuation into partial sentences, then iterate through the dict and check whether each partial has been seen already.

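import re

# A partial is any run of word characters and whitespace between punctuation marks
pattern = r'[\w\s]+'
partials = set()
for value in responses.values():
    for partial in re.findall(pattern, value):
        if partial in partials:
            # Seen before in an earlier position, so it is a duplicate
            print(partial)
        else:
            partials.add(partial)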

Output

 Peets Coffee Costa Rica Aurora
 Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee
 Peets House Blend
 Lifeboost Coffee
 Driftaway Coffee Colombia Antioquia And Burundi Kayanza
 Coffee Bros Decaf
 Purity Coffee Flow
 Out Of The Grey Costa Rica La Minita
 Lifeboost Coffee Organic Dark Roast
 Coffee Bros Dark Roast
 Death Wish Coffee
 Kicking Horse Coffee Grizzly Claw
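
The loop above reports which partials repeat, but not which keys they repeat under. Here is a small sketch of a variation that also records the keys for each partial. The name keys_by_partial is an assumption for illustration, and only two entries of the question's dict are copied in so the snippet runs on its own:

```python
import re
from collections import defaultdict

# Two entries copied from the question's dict, so the sketch is self-contained.
responses = {
    'light': 'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'vanilla': 'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
}

pattern = r'[\w\s]+'

# Maps each cleaned partial to the list of keys whose value contains it.
keys_by_partial = defaultdict(list)
for key, value in responses.items():
    for partial in re.findall(pattern, value):
        name = partial.strip()  # ignore leading/trailing whitespace when comparing
        if name and key not in keys_by_partial[name]:
            keys_by_partial[name].append(key)

# Print only the partials that occur under more than one key.
for name, keys in keys_by_partial.items():
    if len(keys) > 1:
        print(name, keys)
```

Stripping each partial before storing it means an item at the start of a value and the same item after a comma are treated as equal, which the unstripped comparison above does not do.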
bn_ln
2

Using regular expressions and set.intersection:

import re

responses={
    'greet':'Hello! How can I help you?',
    'types':'Our coffee types are: light roasted, medium roasted, medium dark roasted, dark roasted.',
    'light':'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'dark':'Koa Coffee Estate, Atlas Coffee Club, Lifeboost Coffee Organic Dark Roast, Peets House Blend, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'fruit':'Frutis notes coffee: Coffee Bros Paraideli Cup Of Excellence, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee, Volcanica Coffee Kenya AA, Peets House Blend.',
    'vanilla':'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
    'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'timings':'We are open from 9AM to 5PM, Monday to Friday. We are closed on weekends and public holidays.',
    'fallback':'I dont quite understand. Could you repeat that?',
}

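def split_string(text):
    # Split at punctuation, strip whitespace and drop pieces that end up empty
    return [s.strip() for s in re.split(r'[:,.!?;]', text) if s.strip()]

def common_strings(str_a, str_b):
    # Compare the two values as sets of cleaned pieces
    set_a = set(split_string(str_a))
    set_b = set(split_string(str_b))
    return list(set_a.intersection(set_b))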

common_strings(responses['chocolate'], responses['medium'])

# ['Purity Coffee Flow',
# 'Out Of The Grey Costa Rica La Minita',
# 'Coffee Bros Decaf']
Dan Nagle