I have created a list:

arraynums = [ 0.3888553  0.3898553  0.3908553  0.3918553  0.3928553  0.3938553
  0.3948553  0.3958553  0.3968553  0.3978553  0.3988553]

and a dictionary that has been sorted by key values (here's a portion of the dictionary):

sd =({'0.3880434': ['GGATCG'], '0.3883449': ['TTCACG'], '0.388449': ['ATGGCG'], '0.3890966': ['ACTCGC'], '0.3893325': ['GTGGAT'], '0.3893478': ['GATACG'], '0.3900749': ['CAGAAG'], '0.3900875': ['CGAGAG'], '0.3900915': ['ATCGGG'], '0.3901032': ['CACCGG'], '0.3901743': ['AAAGAC'], '0.3906361': ['TACGGC'], '0.390682': ['CCATCG'], '0.3909258': ['GGATGA'], '0.3910728': ['AAGATA'], '0.391648': ['GCAACG'], '0.3919125': ['AGGACT', 'GATCGC'], '0.3921844': ['AGAGAA'], '0.3922956': ['CGGGAA'], '0.3927617': ['ATGGAA'], '0.3927763': ['TTGTCG'], '0.3928683': ['ACAGAC'], '0.39309': ['CGCGCT'], '0.3938553': ['AGGACG'], '0.3940998': ['AAGAGC'], '0.3941768': ['GTCGGA'], '0.394966': ['CGTTCC'], '0.395116': ['TGGAAG'], '0.3954179': ['CCGTCC'], '0.3955623': ['AATCGC'], '0.3956923': ['GGACGG']})

I have been using this code to find the closest values to the values listed in the above list:

for k  in arraynums:
    index = sd.bisect(k)
    key = sd.iloc[index]
    seq = sd[key]

However, the `key` and `seq` values printed by this code are not the closest matches for `k`. I'm not sure what is going wrong. I think it might have to do with the way I created the arraynums list, which I did like this:

arraynums = numpy.arange(float(middlevalue) - 0.005, float(middlevalue) + 0.005, 0.001)

EDIT: A note on the dictionary above: some of the values are negative, and the output for each key is the same negative value... I've also sorted the dictionary using SortedDict().

cosmictypist
  • dictionaries are not sorted. – njzk2 Feb 29 '16 at 19:41
  • @njzk2 I sorted the dictionary using `SortedDict(d2)` – cosmictypist Feb 29 '16 at 19:42
  • Why does sd have '0.3875103' in front? My compiler doesn't run it. A SortedDict is created like this after all: http://stackoverflow.com/a/25988933/2734863 – Frikster Feb 29 '16 at 20:07
  • Are you perhaps giving SortedDict an unsorted dictionary? It looks like you might be doing the "this does NOT work" thing mentioned here: https://code.djangoproject.com/wiki/SortedDict – Frikster Feb 29 '16 at 20:14
  • @DirkHaupt I'm sorry. That was a mistake. I've fixed the dictionary provided. – cosmictypist Feb 29 '16 at 21:59
  • @DirkHaupt It's weird because if I just try a value that is not within the list, the `index`, `key`, and `seq` are the correct output. The problem seems to be with the list. – cosmictypist Feb 29 '16 at 22:02
  • I think I've misunderstood how `.bisect` works. I thought it finds the value in a dictionary closest to the input value. – cosmictypist Feb 29 '16 at 22:06

1 Answer

Beyond the type mismatch (the keys of sd are str while the elements of arraynums are float), an efficient approach is:

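import numpy as np

pairs = sorted((float(k), v) for k, v in sd.items())  # convert str keys to float, sort numerically
keys = np.array([k for k, _ in pairs])                # sorted keys as floats
values = np.array([v[0] for _, v in pairs])           # first sequence for each key, same order as keys
indices = np.searchsorted(keys, arraynums)            # insertion point of each target in keys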

In [390]: indices
Out[390]: array([ 3,  6, 13, 16, 21, 23, 26, 31, 31, 31, 31], dtype=int64)

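indices says that arraynums[0] lies between keys[2] and keys[3], and so on (see the documentation for searchsorted). There is a problem with the last values only: they are larger than every key shown, so they all map to 31, one past the last of the 31 keys, and have no right-hand neighbour. This can be avoided by choosing different bounds for arraynums. All that remains is to compare each value with its two neighbouring keys and keep the closer one.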
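That final comparison step could look something like the sketch below. It is only an illustration, not the asker's code: it swaps in the standard-library `bisect_left` for `np.searchsorted` (both return the left insertion point), `closest_keys` and `sd_sample` are hypothetical names, and it assumes the dictionary is non-empty and that every key is a numeric string, as in the question.

```python
from bisect import bisect_left

def closest_keys(sd, targets):
    """Return (target, closest_key, sequences) for each target.

    Hypothetical helper: assumes `sd` is non-empty and its keys are
    numeric strings, as in the question.
    """
    # Sort numerically (not as strings) and remember each original key.
    pairs = sorted((float(k), k) for k in sd)
    floats = [f for f, _ in pairs]
    results = []
    for t in targets:
        i = bisect_left(floats, t)  # left insertion point of t
        # Neighbours: the key just below and the key at/above t,
        # skipping any index that falls outside the list.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(floats)]
        best = min(candidates, key=lambda j: abs(floats[j] - t))
        key = pairs[best][1]
        results.append((t, key, sd[key]))
    return results

# A small subset of the dictionary from the question.
sd_sample = {
    '0.3880434': ['GGATCG'],
    '0.3890966': ['ACTCGC'],
    '0.3938553': ['AGGACG'],
    '0.3956923': ['GGACGG'],
}
results = closest_keys(sd_sample, [0.3888553, 0.3938553, 0.3988553])
for target, key, seqs in results:
    print(target, key, seqs)
```

Filtering the candidate indices handles targets that fall below the first key or above the last one, so the bounds of arraynums no longer matter. Sorting on the float value rather than the string should also keep negative keys in the right order. On a tie, the lower key wins.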

B. M.