
I want to match numbers that are present in two arrays (not of equal length) and output them to another array if there is a match. The numbers are floating point.

Currently I have a working program in Python but it is slow when I run it for large datasets. What I've done is two nested for loops.

The first nested for loop runs through array1 and checks whether any numbers from array2 are in array1. If there is a match, I write it to an array called arrayMatch1.

I then check array2 against arrayMatch1 and write the final result to arrayFinal.

arrayFinal will contain all numbers that exist in both array1 and array2.

My problem: two nested for loops give me a complexity of O(n^2). This method works fine for arrays shorter than 25000 elements but slows down significantly for larger ones. How can I make it more efficient? The numbers are floating point and are always in the format ######.###

I want to speed up my program but keep using Python because of the simplicity. Are there better ways to find matches between two arrays?

2 Answers

Why not just find the intersection of the two lists?

a = [1, 2, 3, 4.3, 5.7, 9, 11, 15]
b = [4.3, 5.7, 6.3, 7.9, 8.1]

def intersect(a, b):
    # Convert both lists to sets; & keeps only the values present in both.
    return list(set(a) & set(b))

print(intersect(a, b))

Output (the order may vary, since sets are unordered):

[5.7, 4.3]

Gotten from this question.
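If the order of the matches matters, or duplicates in the first list should be kept, one possible variation is to build a set from only one list and filter the other against it. This is a minimal sketch, not part of the original answer; the name `intersect_ordered` is made up for illustration.

```python
def intersect_ordered(a, b):
    """Return the items of a that also occur in b, in a's original order."""
    lookup = set(b)  # set membership tests are O(1) on average
    # One pass over a, so roughly O(len(a) + len(b)) overall.
    return [x for x in a if x in lookup]

a = [1, 2, 3, 4.3, 5.7, 9, 11, 15]
b = [4.3, 5.7, 6.3, 7.9, 8.1]

print(intersect_ordered(a, b))  # [4.3, 5.7]
```

Unlike `set(a) & set(b)`, this keeps repeated values from `a` if they appear more than once.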

Rivasa

So what you're basically trying to do is find the intersection (the logically correct term) of two lists.

First you need to eliminate the duplicates within each list. A set is a great way to do that. Then you can just & the two sets and you will be good to go.

a = [23.3213, 23.123, 43.213, 12.234]  # first list
b = [12.234, 23.345, 34.224]  # second list

def intersect(a, b):
    # set() drops duplicates; & keeps only the values present in both sets.
    return list(set(a) & set(b))

print(intersect(a, b))
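One caveat with floats: set intersection compares values exactly, so two numbers that print the same but were computed differently may not match. Since the question says the values always have three decimal places, a possible workaround is to round both sides before intersecting. The sketch below is an assumption about what would suit that data, not something from the question; `intersect_rounded` and its `places` parameter are hypothetical names.

```python
def intersect_rounded(a, b, places=3):
    """Intersect two float sequences after rounding to `places` decimals."""
    # Assumption: values are meant to be exact to `places` decimal digits.
    keys_a = {round(x, places) for x in a}
    keys_b = {round(x, places) for x in b}
    # Sort so the result order is deterministic.
    return sorted(keys_a & keys_b)

a = [123456.789, 0.1 + 0.2, 42.5]  # 0.1 + 0.2 is 0.30000000000000004
b = [0.3, 123456.789, 7.125]

print(set(a) & set(b))          # exact comparison misses 0.3
print(intersect_rounded(a, b))  # [0.3, 123456.789]
```

Rounding can still separate two values that fall on opposite sides of a rounding boundary, so if the data is read straight from text in the ######.### format, the plain set intersection is probably sufficient.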
harshil9968