
Given this list of tuples:

lists = [('the', 'DT'), ('cat', 'NN'), ('drink', 'NN'), ('the', 'DT'), ('soup', 'NN')]

where

DT NN NN DT NN

are the part-of-speech tags of the words, I converted the list into a dictionary:

my_dict = dict(lists)

It gave me this output:

{'soup': 'NN', 'the': 'DT', 'drink': 'NN', 'cat': 'NN'}

As you can see, there is only one 'the': 'DT', and the order has changed. What I expected was a converted list like this:

{'the': 'DT','cat': 'NN','drink': 'NN','the': 'DT','soup': 'NN'}

Then, using pypyodbc, I want to query my database (SQL Server) for the Tagalog value of each key in my_dict:

myDatabase
+---------+---------+
| English | Tagalog |
+---------+---------+
| cat     | pusa    |
| soup    | sopas   |
| the     | ang     |
| drink   | inom    |
+---------+---------+

and display the output as a string like this:

ang pusa inom ang sopas
Martijn Pieters

2 Answers

0

Dictionaries are mappings of unique keys to a value. Note the unique there; they contain key-value mappings, but there is only ever one copy of a key.

This restriction gives the dictionary implementation its power; you can look up the value for any key in constant time. Regardless of how many (unique) keys you put into a dictionary, you can expect that in the common case looking up a key takes no longer than it would in a small dictionary.

To manage this feat, dictionaries do not care about the order the keys are given in; the implementation will put them in an order (internally) that is more convenient to the dictionary than it is to you. See Why is the order in Python dictionaries and sets arbitrary? (Since Python 3.7, dictionaries do preserve insertion order, but keys are still unique, so the second 'the' is dropped either way.)
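As a minimal sketch of the uniqueness point, using only the list from the question: converting to a dict collapses the repeated 'the', while a plain list of the first elements keeps every word in sentence order.

```python
lists = [('the', 'DT'), ('cat', 'NN'), ('drink', 'NN'), ('the', 'DT'), ('soup', 'NN')]

# A dict keeps one entry per unique key, so the second 'the' is lost.
my_dict = dict(lists)
print(len(lists))    # 5 tuples in the input
print(len(my_dict))  # 4 unique keys in the dict

# A list of the first elements keeps duplicates and the original order.
words = [word for word, tag in lists]
print(words)  # ['the', 'cat', 'drink', 'the', 'soup']
```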

All this just means you misunderstood what dictionaries are for. You just want to extract the first elements of your list so you can pass them to a query:

queryparams = [l[0] for l in lists]

then give those to a pypyodbc SQL query using parameters:

query = 'SELECT tagalog FROM myDatabase WHERE english in ({})'.format(
    ', '.join(['?'] * len(queryparams)))
cursor.execute(query, queryparams)
for row in cursor:
    print('Tagalog:', row[0])

I used a WHERE <column> IN (<value1>, <value2>, .., <valueN>) query here to limit what Tagalog words should be looked up. To make that work with query parameters you need to generate a list of ? placeholders first.
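To see what the placeholder generation produces, here is a small sketch that only builds the query string and the parameter list; it does not touch a database, and the table and column names are the ones from the question.

```python
lists = [('the', 'DT'), ('cat', 'NN'), ('drink', 'NN'), ('the', 'DT'), ('soup', 'NN')]

# One query parameter per word in the sentence.
queryparams = [word for word, tag in lists]

# One '?' placeholder per parameter, joined with commas.
placeholders = ', '.join(['?'] * len(queryparams))
query = 'SELECT tagalog FROM myDatabase WHERE english IN ({})'.format(placeholders)

print(query)
# SELECT tagalog FROM myDatabase WHERE english IN (?, ?, ?, ?, ?)
```

Only the placeholders are formatted into the SQL text; the words themselves are passed separately as parameters, so they are never interpolated into the query string.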

An SQL IN membership test treats the elements as a set (unique values only, again), so you may as well make queryparams a set here and avoid sending duplicate words to the database:

queryparams = list({l[0] for l in lists})

The set is turned back into a list because I don't know if pypyodbc accepts sets as query parameter value input.

If you needed to use the input order to map English to Tagalog, use the database results as a dictionary:

query = 'SELECT english, tagalog FROM myDatabase WHERE english in ({})'.format(
    ', '.join(['?'] * len(queryparams)))
cursor.execute(query, queryparams)
english_to_tagalog = dict(cursor) # use each (english, tagalog) pair as a mapping

output = [english_to_tagalog[l[0]] for l in lists]
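The whole approach can be tried end to end without a SQL Server instance. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module with an in-memory database instead of pypyodbc, and fills myDatabase with the four rows from the question. The query-building and lookup logic is the same as above, with a final join to get the sentence as a string.

```python
import sqlite3

lists = [('the', 'DT'), ('cat', 'NN'), ('drink', 'NN'), ('the', 'DT'), ('soup', 'NN')]

# Stand-in database (assumption: sqlite3 in memory instead of pypyodbc / SQL Server).
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE myDatabase (english TEXT, tagalog TEXT)')
cursor.executemany(
    'INSERT INTO myDatabase VALUES (?, ?)',
    [('cat', 'pusa'), ('soup', 'sopas'), ('the', 'ang'), ('drink', 'inom')])

# Send each unique word once; a list is used because sets are not sequences.
queryparams = list({word for word, tag in lists})
query = 'SELECT english, tagalog FROM myDatabase WHERE english IN ({})'.format(
    ', '.join(['?'] * len(queryparams)))
cursor.execute(query, queryparams)

# Each row is an (english, tagalog) pair, which dict() accepts directly.
english_to_tagalog = dict(cursor.fetchall())

# Walk the original list to restore sentence order, duplicates included.
sentence = ' '.join(english_to_tagalog[word] for word, tag in lists)
print(sentence)  # ang pusa inom ang sopas

conn.close()
```

A word that is missing from the table would raise a KeyError in the final lookup; english_to_tagalog.get(word, word) is one way to fall back to the untranslated word.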

If your list of words gets very long, you may have to switch to using a temporary table, insert all your words in there (all of them, not just unique words) and use an inner join query to have SQL Server translate the words for you. You can have SQL Server preserve the order of your original input list that way too, so the final query result gives you Tagalog words in the same order.
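A rough sketch of the temporary-table idea follows. It again uses sqlite3 as a stand-in, so the SQL is SQLite's dialect; on SQL Server the temporary table would be created differently (for example as a #-prefixed table). The input_words table and its position column are hypothetical names chosen for this illustration.

```python
import sqlite3

lists = [('the', 'DT'), ('cat', 'NN'), ('drink', 'NN'), ('the', 'DT'), ('soup', 'NN')]

# Stand-in database (assumption: sqlite3 in memory instead of SQL Server).
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE myDatabase (english TEXT, tagalog TEXT)')
cursor.executemany(
    'INSERT INTO myDatabase VALUES (?, ?)',
    [('cat', 'pusa'), ('soup', 'sopas'), ('the', 'ang'), ('drink', 'inom')])

# Hypothetical temporary table holding every input word with its position.
cursor.execute('CREATE TEMP TABLE input_words (position INTEGER, english TEXT)')
cursor.executemany(
    'INSERT INTO input_words VALUES (?, ?)',
    [(i, word) for i, (word, tag) in enumerate(lists)])

# The inner join translates each word; ORDER BY restores the input order.
cursor.execute('''
    SELECT t.tagalog
    FROM input_words AS w
    INNER JOIN myDatabase AS t ON t.english = w.english
    ORDER BY w.position
''')
translated = [row[0] for row in cursor.fetchall()]
print(' '.join(translated))  # ang pusa inom ang sopas

conn.close()
```

Note that an inner join silently drops input words that have no row in myDatabase; a left join would keep them with a NULL translation instead.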

Martijn Pieters
  • Yes, I really don't know well about `dictionary` of python but anyway, thank you sir, it works. But how to sort now the output to be like that of the expected output `ang pusa inom ang sopas`? Is there a way to be like that? –  Mar 07 '15 at 11:52
  • @TomByler: you can include the english word with each row, then use that pairing as a dictionary; so mapping `the` to `ang`, and use another loop over the original input list to look up the translation. – Martijn Pieters Mar 07 '15 at 11:58
  • @TomByler: added that approach into my answer. – Martijn Pieters Mar 07 '15 at 12:00
  • THANK YOU VERY MUCH SIR!! You're a life saver. It does what I exactly want. Atleast, I'm now able to translate sentence in word by word. You're a brilliant python programmer sir! Thank you very much. Next step is quite complicated. –  Mar 07 '15 at 12:59
  • In case you're familiar with NLTK Sir @Martijn http://stackoverflow.com/q/28926143/4501264 –  Mar 08 '15 at 12:48
-1

A dictionary in Python cannot have duplicate keys, and before Python 3.7 it had no guaranteed order either, so you cannot get what you expect from a dict. See the dictionaries documentation for more information.

You could look at collections.defaultdict, which seems closer to what you're trying to achieve.

d6bels
  • What would be the better method sir to get the output I want? I've been trying to program a translator, basically a word by word translator for now. I'm a beginner in python language. –  Mar 06 '15 at 14:27
  • I've added a link to my answer for defaultdict – d6bels Mar 06 '15 at 14:32
  • A defaultdict is not the solution here; it looks like the OP wants a list of dictionaries instead. – Martijn Pieters Mar 06 '15 at 15:35
  • What do you mean @MartijnPieters? Ok, let's do it this way, from the given tuples example above in my codes: `lists = [('the', 'DT'), ('cat', 'NN'), ('drink', 'NN'), ('the', 'DT'), ('soup', 'NN')]`, I want to get the `Tagalog` value of the words from `myDatabase` by accessing the first element of each tuples and display the output in string format. –  Mar 07 '15 at 01:43
  • @TomByler: you just want to get a list with the first indices then; `queryparams = [l[0] for l in lists]`. You misunderstood what dictionaries are about, in any case. – Martijn Pieters Mar 07 '15 at 08:35
  • Well it may not be THE solution but it's one anyway. But Martijn is right about you misunderstanding what dicts are – d6bels Mar 07 '15 at 11:07