1

i'm attempting to solve: http://orac.amt.edu.au/cgi-bin/train/problem.pl?problemid=980&set=aio17int

The algorithm exceeds the time limit for larger files. I'm fairly new to python, and can't find a more efficient solution online for checking if an element is in a list and then appending accordingly. Please help improve the speed without using any imports. Thank you.

input file:

8 5
2 7
1 8
8 4
7 5
8 6
re = 1
bl = 1
red = [1]
blue = [2]
input_file = open("tagin.txt","r")
n, m = map(int, input_file.readline().split())
for i in range(m):
    a, b = map(int, input_file.readline().split())
    if a in red:
        red.append(b)
        re+=1
    elif a in blue:
        blue.append(b)
        bl+=1

output_file = open("tagout.txt", "w")
output_file.write(str(re) + " " + str(bl))
output_file.close()

output file:

4 3

Also please advise if stack overflow is the wrong platform to ask this question, if so what should I use?

  • `item in list` is a relativity expensive way of doing things (theoretically O(n) for each such operation), you may want to switch to a dictionary (which lets you check "existence" in O(1)). – Nullman Feb 27 '22 at 08:29
  • @Nullman thank you, could you please post an example of how to integrate that? – Checkmate1941 Feb 27 '22 at 08:31
  • looking at the code again I realize that you don't even need `red` and `blue` as lists, you are just counting occurrences, you don't need to save them. just have a couple of incrementing counters – Nullman Feb 27 '22 at 08:37
  • @Nullman I've updated the code to use counters, but I don't see how it's possible without saving the values – Checkmate1941 Feb 27 '22 at 08:55
  • This question has been answered well already, but it's important to mention that using the [`with`](https://stackoverflow.com/questions/7395542/is-explicitly-closing-files-important) keyword is much better than calling `open` and `close` on files (so you can't forget to call `close()` on input_file, as happens here). Also, if you're new to Python, the core of your issue was [not using `set`](https://stackoverflow.com/questions/8929284/what-makes-sets-faster-than-lists), so both of these posts would be very helpful reading in the future. – kcsquared Feb 27 '22 at 09:48
  • I read the link and i have mislead you. Olvin's answer below is the correct approach – Nullman Feb 27 '22 at 10:22
  • @kcsquared thank you, initially i used "with open", but oddly removing it somehow decreased my runtime. – Checkmate1941 Feb 27 '22 at 10:41
  • Any performance gain from not using `with` is [almost certainly](https://stackoverflow.com/questions/31334061/file-read-using-open-vs-with-open) because you never called close() manually and didn't cleanup. In CPython (the standard Python you're probably using), the cleanup will happen anyway during eventual garbage collection or process exit. In other Pythons (like Pypy), the file handle might not get returned even after exit. The official Python tutorial also recommends using `with` for file I/O as a good practice. – kcsquared Feb 27 '22 at 11:01

2 Answers2

2

Some things you can improve:

  • Use a set instead of a list for collecting team members. That will increase the efficiency of the in operator.

  • There is no need to keep track of the size of the team, as you can get that with the len function

  • Only collect the members of one team. We can derive the size of the other team from m, which represents how many people get tagged. If we add 2 to that, we also take into account the first two which are already tagged. Subtract from that the size of one team, and you get the size of the other.

Code:

input_file = open("tagin.txt","r")
n, m = map(int, input_file.readline().split())
red = {1}  # A set, only for the red team
for i in range(m):
    a, b = map(int, input_file.readline().split())
    if a in red:  # Better efficiency as red is now a set
        red.add(b)

output_file = open("tagout.txt", "w")
# Size of blue team can be deduced from what we already know:
output_file.write(str(len(red)) + " " + str(m + 2 - len(red)))
output_file.close()
trincot
  • 317,000
  • 35
  • 244
  • 286
2

If I understand the problem correctly then this should fulfil the objective:

with open('tagin.txt') as tagin:
    _, M = map(int, next(tagin).split())
    red = {1}
    blue = {2}
    for _ in range(M):
        a, b = map(int, next(tagin).split())
        (red if a in red else blue).add(b)
    print(len(red), len(blue))

Output:

4 3
DarkKnight
  • 19,739
  • 3
  • 6
  • 22
  • just wondering is using _ a general convention? if so, when are you meant to do so? – Checkmate1941 Feb 27 '22 at 10:48
  • _ is actually a legal variable name. I, and many others, use it to indicate that the value is irrelevant as is the case here. It does however (in Python 3.10+) have a special meaning in the context of *match* – DarkKnight Feb 27 '22 at 11:29