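I am trying to calculate the average distance between elements in a list of strings, for example the average distance between two of the names below. By distance I mean the number of other names between an occurrence of one name and the next occurrence of the other.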

testList = ['sarah', 'jon', 'mark', 'jessica', 'sarah', 'sarah', 'liz',
'jon', 'liz', 'sarah', 'jon', 'mark']

So Sarah --> Jon: [0, 2, 1, 1, 0]. The average distance between them is 0.8.

Sarah --> Mark: [1, 1, 1]. The average distance between them is 1.

I started writing code for this, but got stuck. I am also new to Python and would like to learn how to write this code using basic Python fundamentals like loops, instead of using a library.

def nameDistance(nameOne, nameTwo, nameList):
  distance = 0
  distanceList = []
  for name in nameList:
    if name == nameOne or name == nameTwo:
      #continue parsing through nameList but increment distance by 1 for each name 
      #that is not nameOne or nameTwo
      #once we get to the other name passed in the function
      #append distance to distanceList and set distance back to zero
      #continue to do this for the rest of nameList
      #take the average of distanceList 
      pass  # placeholder so the unfinished function still parses
zfa
  • I don't have time to write an answer, but you should use dir(nameList) to learn about all of the methods available for lists. One useful method is index: you can get the index of each of two items and then use those indexes to calculate distances. A cheap way to start is to create a set of the names, so you only have the unique values, and then index each name relative to the others. – PyNEwbie Mar 14 '16 at 00:20
  • Would mark --> sarah have -1 as the first value? – PyNEwbie Mar 14 '16 at 00:22
  • mark --> sarah would have 1 as the first value; it doesn't matter which name shows up first. I'll play around with index and see if I can figure it out. – zfa Mar 14 '16 at 00:25
  • What is the average distance between strings? – Ivan Gritsenko Mar 14 '16 at 00:25
  • Look at this: http://stackoverflow.com/questions/6294179/how-to-find-all-occurrences-of-an-element-in-a-list Sorry, interesting question, but I have to put steaks on the grill. – PyNEwbie Mar 14 '16 at 00:27

1 Answer

Here's a fully 'vanilla' version of the method, using only loops and built-ins, that works on a list of any element type.

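def avgdistance(item1, item2, lst):
    # Gaps (number of elements in between) for each item1 <-> item2 switch.
    distlist = []
    # If either item is missing there is no pair to measure (and
    # lst.index would raise ValueError), so report 0 like the empty case.
    if item1 not in lst or item2 not in lst:
        return 0
    # Start tracking whichever of the two items appears first.
    curr = item1 if (lst.index(item1) < lst.index(item2)) else item2
    index = lst.index(curr)
    for i in range(len(lst)):
        item = lst[i]
        if (item == item1) or (item == item2):
            if item == curr:
                # Same item again: measure from its latest occurrence.
                index = i
            else:
                # Switched to the other item: record the gap, swap roles.
                distlist.append(i - index - 1)
                curr = item
                index = i
    if (len(distlist) > 0):
        # float() avoids integer division on Python 2.
        return float(sum(distlist)) / len(distlist)
    else:
        return 0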

This is a generalized version of the method. If you want me to explain any part of the algorithm, just ask.
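As a quick check against the examples in the question, here is a minimal sketch (Python 3 assumed) that returns the list of gaps as well as the average. The names `gaps_between` and `average_gap` are hypothetical helpers made up for this illustration; they follow the same idea of measuring from the latest occurrence of one name to the next occurrence of the other, and return 0 when the two names never alternate.

```python
def gaps_between(item1, item2, lst):
    """Return the gap size each time the list switches between item1 and item2."""
    gaps = []
    last_item = None   # which of the two items was seen most recently
    last_index = None  # position of that most recent occurrence
    for i, item in enumerate(lst):
        if item != item1 and item != item2:
            continue  # not one of the two items we are tracking
        if last_item is not None and item != last_item:
            # Count only the elements strictly between the two positions.
            gaps.append(i - last_index - 1)
        last_item = item
        last_index = i
    return gaps


def average_gap(item1, item2, lst):
    """Average of the gaps, or 0 when the two items never alternate."""
    gaps = gaps_between(item1, item2, lst)
    # Python 3 assumed: `/` is true division.
    return sum(gaps) / len(gaps) if gaps else 0


testList = ['sarah', 'jon', 'mark', 'jessica', 'sarah', 'sarah', 'liz',
            'jon', 'liz', 'sarah', 'jon', 'mark']

print(gaps_between('sarah', 'jon', testList))   # [0, 2, 1, 1, 0]
print(average_gap('sarah', 'jon', testList))    # 0.8
print(gaps_between('sarah', 'mark', testList))  # [1, 1, 1]
print(average_gap('sarah', 'mark', testList))   # 1.0
```

Keeping the gap list separate from the average makes it easy to compare intermediate results with the hand-computed lists in the question.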

MutantOctopus