
I know someone might think this question has already been answered here, but that answer doesn't cover what I want to achieve.

I have a very large list of phone numbers. Many of them start with 08, and there is a lot of duplication, which is what I am trying to remove. I need to put them in a list or set so that I can use them in my program, but Python reports "Invalid token", as shown in the picture below:

(screenshot of the Invalid token error)

Python treats any number that starts with 0 as octal. How do I devise a means to bypass this and get these numbers into a list and then into a set?

Yax
  • Phone numbers should be strings. – interjay Apr 21 '15 at 17:04
  • @interjay: I already have these numbers copied from somewhere, and converting them to strings would mean quoting each one. There are too many numbers to do that by hand. – Yax Apr 21 '15 at 17:10
  • Based on the error, you are using Python 3, in which leading zeros are not permitted. In Python 2, a leading zero marks an octal number. – Malik Brahimi Apr 21 '15 at 17:23
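
The point made in the comments can be checked with a small sketch. It assumes Python 3 and uses a made-up number; `compile` is used only so that the syntax error can be caught at run time instead of stopping the whole script.

```python
# Assumption: Python 3. The number 08123 is a made-up example.
literal_rejected = False
try:
    # A decimal integer literal with a leading zero is rejected by the parser.
    compile("n = 08123", "<example>", "exec")
except SyntaxError:
    literal_rejected = True

# Converting the text to an int works, but the leading zero is lost.
as_int = int("08123")

# Keeping the number as a string preserves it exactly.
as_str = "08123"

print(literal_rejected, as_int, as_str)
```

This is why the comments suggest treating phone numbers as strings: they are identifiers, not quantities.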

2 Answers


If you need to keep the leading 08, use strings instead of ints.

a = ["08123", "08234", "08123"]
a = list(set(a))  # duplicates removed: "08123" and "08234" remain (order is not guaranteed)
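
Because a set does not keep insertion order, the resulting list may come back in a different order than the input. If the original order matters, one possible alternative (not part of the answer above) is `dict.fromkeys`, which keeps the first occurrence of each key in order on Python 3.7+. A minimal sketch with the same sample values:

```python
# Sample values from the answer above, kept as strings.
numbers = ["08123", "08234", "08123"]

# dict keys are unique and (on Python 3.7+) keep insertion order,
# so this drops duplicates while preserving first-seen order.
unique_in_order = list(dict.fromkeys(numbers))

print(unique_in_order)
```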

Since (as you say) you don't have an easy way of surrounding the numerous numbers with quotes, go to http://www.regexr.com/ and enter the following:

Expression: ([0-9]+)

Text: Your numbers

Substitution (expandable pane at the bottom of the screen): "$&"
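
The same expression can also be applied inside Python itself, which avoids quoting each number by hand. The sketch below assumes the numbers have been pasted into one triple-quoted string (`raw_text` and its contents are hypothetical); `re.findall` returns every run of digits as a string, so the leading zeros survive.

```python
import re

# Hypothetical sample: the copied numbers pasted as one block of text,
# separated by commas, spaces, or newlines.
raw_text = """
08123, 08234
08123 08999
"""

# Same pattern as the expression above; each match is returned as a string.
numbers = re.findall(r"[0-9]+", raw_text)

# Putting the strings in a set removes the duplicates.
unique_numbers = set(numbers)

print(sorted(unique_numbers))
```

Only one pair of triple quotes is needed, no matter how many numbers are pasted in.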

EvenLisle
  • Sorry, but this cannot solve my problem. I can't put double quotes around each of the numbers. That would take forever; there are thousands of them. – Yax Apr 21 '15 at 17:08
  • Where do you get the numbers from? – EvenLisle Apr 21 '15 at 17:09
  • Updated the answer to contain a solution to your problem – EvenLisle Apr 21 '15 at 17:21
  • If this is a one-off conversion (i.e., all future numbers will always be entered as strings), import the list of numbers into an Excel spreadsheet, pad them and wrap them in quotes using an Excel formula, and import them back into your script (I like easy answers, where possible :-D ). Or, if not, save the number list in a text file and import them into your script as a string. – Deacon Apr 21 '15 at 17:22

Read your phone input file and save each number as a string in a set. Duplicates are removed because a set only holds unique elements, and you can then do further work on them.

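def get_unique_phones_set():
    phones_set = set()
    with open("/path/to/your/duplicated_phone_file", "r") as inputs:
        for line in inputs:
            # Each line is read as a string, so leading zeros are preserved
            phone = line.strip()
            if phone:  # skip blank lines so "" does not end up in the set
                phones_set.add(phone)
    return phones_set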
Haifeng Zhang