Converting all the names to identifiers in Python

Question

Let's assume we have a list with names of people given as strings:

people = ["john", "mike", "patrick", "vince", "mike"]

I want a list in which each name is replaced by some other identifier for that person, for example:

people_ids = ["p1", "p2", "p3", "p4", "p2"]

There are 2 things I want to point out:

1) the id format is not important; the numbering can start from 0 if that makes things easier

2) when a given name is repeated, I want the corresponding id to be repeated as well, in the same position in people_ids

How do I achieve this? Most probably using a dictionary, right?

With `people = ["john", "mike", "patrick", "vince", "mike"]` you can refer to vince as `people[3]`, meaning his "id" is `3`. The question then becomes: why is this not enough? – Lennart Regebro, Oct 28 '13 at 10:33
And can you give us a sample input and output with duplicate names? – Martijn Pieters, Oct 28 '13 at 10:35
@MartijnPieters yes, I want to generate `people_ids`. I do not have any sample inputs at the moment, just what I wrote in the question – Kristof Pal, Oct 28 '13 at 15:24
@LennartRegebro sorry, I am not sure I quite understood your comment – Kristof Pal, Oct 28 '13 at 15:25
@WolfgangKuehne: You want a list of ids for the items in a list. Well, you already have that. Their ids are 0, 1, 2, 3, 4, 5, 6... You access them with `people[0]`, `people[1]`, `people[2]` etc. In other words: each item in that list **already has an id**. You don't need to generate new ids. – Lennart Regebro, Oct 28 '13 at 15:37
Dear @LennartRegebro, rather than calling them IDs I use the word index. My use of the word ID here refers to something that can be used for identification, e.g. a university ID, a username, etc. You are thinking a bit too Pythonic in this case :) – Kristof Pal, Oct 28 '13 at 15:55
@WolfgangKuehne: I don't see the difference. The index is an ID. It is the identification of that item in a list. If your question is more specific than that, I suggest you ask the specific question you have, including the specific context and use case. As it stands now, your question doesn't make much sense. – Lennart Regebro, Oct 28 '13 at 16:36
@LennartRegebro If the index were an ID, and the order of commenting in this thread were saved in a list, that would mean that 7 different people (including me) have taken part in the discussion. In the real world you need an ID to identify something uniquely as well as to avoid redundancy. You cannot use the index as an ID because some values may repeat along the way while having different IDs (list indexes) – Kristof Pal, Oct 28 '13 at 17:20
@WolfgangKuehne: OK, then I suggest you use the name itself as ID. No, your question still makes no sense. You still should ask a concrete question with a concrete use case. – Lennart Regebro, Oct 28 '13 at 17:34

score 1 · Accepted Answer · answered Oct 28 '13 at 10:40

Are you looking for something like this?

people = ["john", "mike", "patrick", "vince", "mike", "foo"]

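def build(l):
    d = {}  # maps each name to the id it was given on first appearance
    i = 1   # number to use for the next new name
    for p in l:  # iterate over the argument rather than the global list
        if not p in d:
            d[p] = 'p' + str(i)
            i += 1
        yield d[p]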

people_ids = list(build(people))
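For reference, here is a self-contained sketch of the same dictionary idea written as a plain function that returns a list. The name `assign_ids` and the `prefix` and `start` parameters are made up for this illustration and are not part of the code above; `start` is there only because the question says numbering from 0 would also be fine.

```python
def assign_ids(names, prefix="p", start=1):
    """Return one id per name; a repeated name gets the id of its first appearance."""
    ids = {}      # name -> id, filled in order of first appearance
    result = []
    for name in names:
        if name not in ids:
            # len(ids) counts the distinct names seen so far
            ids[name] = f"{prefix}{start + len(ids)}"
        result.append(ids[name])
    return result


people = ["john", "mike", "patrick", "vince", "mike"]
people_ids = assign_ids(people)
print(people_ids)  # ['p1', 'p2', 'p3', 'p4', 'p2']
```

With the list from the question, both occurrences of "mike" end up with "p2", which is the output the question asked for.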

sloth

@Xi Huan thanks for the response, this seems to work fine. In the first line after `for`, why not use `if p not in d:` rather than `if not p in d:`? Is there a difference in syntax or logic? The results are the same, though. – Kristof Pal Oct 28 '13 at 15:35

@WolfgangKuehne `x not in y` and `not x in y` produce the same bytecode, so it doesn't matter which one you use. It's a matter of personal preference in the end. – sloth Oct 28 '13 at 15:41
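If you want to check this on your own interpreter, a small sketch with the standard `dis` module can help. The two function names are made up for this illustration, and whether the compiled bytecode is byte-for-byte identical may depend on the Python version, so the script prints the comparison instead of assuming the answer.

```python
import dis


def with_not_in(x, container):
    return x not in container


def with_not_prefix(x, container):
    return not x in container


names = ["john", "mike"]

# Both spellings give the same result for present and absent values.
for value in ("mike", "zoe"):
    assert with_not_in(value, names) == with_not_prefix(value, names)

# Show the disassembly of each function for a side-by-side look.
dis.dis(with_not_in)
dis.dis(with_not_prefix)

# Compare the raw bytecode of the two functions.
same_bytecode = with_not_in.__code__.co_code == with_not_prefix.__code__.co_code
print("identical bytecode:", same_bytecode)
```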
