I have a file with numbers and I need to find the missing ones.

cat head.txt
7045000000
7045000001
7045000003

As you can see, the number 7045000002 is missing. This code is not working:

check=int(7044999999)
with open('head.txt' , 'r') as f:
    for line in f:
        myl = line[:5]
        if myl == '70450':
            if int(line) == check+1:
                check = int(line)
            else:
                check = int(line)+1
                print check

The numbers should start with 70450, which is why the "myl" variable is necessary.

shantanuo

1 Answer

Here is the logic:

for i in range(7045000000,9045000000):
    if i not in line:
        print ("{} is missing".format(i))

Just use a basic for loop. 9045000000 is an arbitrary upper bound. I don't know your data, so you should change the last argument of range() to match it.

Here is a demo:

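with open("coz.txt") as f:
    # Read every line and drop the trailing newline.
    x = [t.strip("\n") for t in f.readlines()]
    # Report each number in the expected range that is absent from the file.
    for i in range(7045000000, 7045000004):
        if str(i) not in x:
            print("{} is missing".format(i))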

Output:

>>> 
7045000002 is missing
>>> 

And this is the .txt file I ran the code against:

[image of the .txt file]

GLHF
  • But since the first line is `7045000000`, this will immediately say "7045000001 is missing", which is false (it's the second line!). (I didn't downvote, just pointing out the glaring bug.) – Alex Martelli Jan 26 '15 at 04:49
  • If you have enough memory to read everything in at once (rather than going by line as the first snippet suggests by its variable names) your latest code will work, but not as efficiently as `set`s would. – Alex Martelli Jan 26 '15 at 05:08
  • Yes. set seems to be much faster. I have benchmark tests to prove it. I cannot post the answer here because the question is now closed as a duplicate. – shantanuo Jan 26 '15 at 05:56
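
The `set` approach mentioned in the comments was never posted, so here is a minimal sketch of what it might look like. The function name `find_missing` and the `prefix` parameter are made up for illustration, and the sketch assumes Python 3, one number per line, and that the range to check runs from the smallest to the largest matching number in the file (so there is no hard-coded upper bound). Like the demo above, it loads every number into memory.

```python
def find_missing(path, prefix="70450"):
    """Return the numbers absent between the smallest and largest
    prefix-matching numbers found in the file at `path`."""
    with open(path) as f:
        # Keep only lines that start with the prefix and are purely digits.
        # A set gives average O(1) membership tests, unlike a list.
        present = {
            int(line)
            for line in f
            if line.startswith(prefix) and line.strip().isdigit()
        }

    if not present:
        return []

    # Assumption: the expected range is bounded by the data itself.
    expected = range(min(present), max(present) + 1)
    return [n for n in expected if n not in present]
```

On the sample `head.txt` from the question, `find_missing("head.txt")` should return `[7045000002]`. The difference from the list-based demo is the membership test: `str(i) not in x` scans the whole list for every candidate, while `n not in present` is a hash lookup.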