0

My goal is to identify the odd element in the list below.

list_1=['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']

The odd item is taskb2, as the other four items all start with taska.

All items have the same length, so discriminating with the len function will not work. Any ideas? Thanks.

Tiger1
  • I'm guessing it's the one that contains the `b`? Seriously though - what exactly do you expect anyone to be able to help with here? There is no code you're having problems with and the logic about correctly identifying odd items is whatever you can suitably define it to be... – Jon Clements Sep 22 '13 at 20:10
  • Is your goal to identify the odd element in _that_ list or in any list? They're completely different problems and require different definitions of "odd". For _that_ list just look for the one where the penultimate character is `b`... – Ben Sep 22 '13 at 20:13
  • The list I gave is just an example; the actual problem involves a long, complex list. – Tiger1 Sep 22 '13 at 20:15
  • Then you need to define what "odd" is. If this data is not representative then you can't expect any help. Are the first 4 letters of every non-"odd" string in your list the same? Is there anything else anyone can use to help you here? – Ben Sep 22 '13 at 20:18

3 Answers

3

If you simply want to find the item that does not start with 'taska', then you could use the following list comprehension:

>>> list_1=['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
>>> print([l for l in list_1 if not l.startswith('taska')])
['taskb2']

Another option is to use filter + lambda:

>>> list(filter(lambda l: not l.startswith('taska'), list_1))
['taskb2']
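
If the common prefix is not known in advance, one possible generalization is to count the prefixes and report the items that do not share the most frequent one. This is only a sketch: the helper name `odd_by_prefix` is hypothetical, and it assumes every item is a prefix followed by trailing digits, which may not hold for the real data.

```python
from collections import Counter

def odd_by_prefix(items):
    """Return the items whose prefix differs from the most common prefix."""
    if not items:
        return []
    # Assumption: an item is "prefix + trailing digits", e.g. 'taska' + '1'
    prefixes = [item.rstrip('0123456789') for item in items]
    # most_common(1) returns [(prefix, count)] for the dominant prefix
    common_prefix, _ = Counter(prefixes).most_common(1)[0]
    return [item for item, prefix in zip(items, prefixes)
            if prefix != common_prefix]

list_1 = ['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
print(odd_by_prefix(list_1))  # ['taskb2']
```

If two prefixes are equally frequent, `most_common` simply picks one of them, so "odd" is not well defined in that case.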
Mingyu
1

For this particular list, an alphabetical sort is enough, because the odd item sorts last:

print(sorted(list_1)[-1])

To avoid sorting, max gives the same result in O(n) time with O(1) extra space:

print(max(list_1))
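
A quick check of when this holds may be useful. The sketch below uses a second, made-up list (`list_2` is not from the question) to show that `max` returns the lexicographically largest string, which is the odd item only if the odd item happens to sort last.

```python
list_1 = ['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
largest_1 = max(list_1)
print(largest_1)  # taskb2, which is the odd item here

# Hypothetical counter-example: the majority prefix is 'taskb'
list_2 = ['taskb1', 'taskb2', 'taska3', 'taskb7']
largest_2 = max(list_2)
print(largest_2)  # taskb7, even though taska3 is the odd item
```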
Shashank
0

If you know what the basic structure of the items will be, then it's easy.

If you don't know the structure of your items a priori, one approach is to score each item by its similarity to all the others. Using info from this question for the standard library module difflib:

import difflib
import itertools

list_1=['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']

# Initialize a dict, keyed on the items, with 0.0 score to start
score = dict.fromkeys(list_1, 0.0)

# Arrange the items in pairs with each other
for w1, w2 in itertools.combinations(list_1, 2):
    # Compute the similarity ratio once per pair - see difflib docs
    ratio = difflib.SequenceMatcher(a=w1, b=w2).ratio()
    # Increment the "match" score for both items in the pair
    score[w1] += ratio
    score[w2] += ratio

# Print the results

>>> score
{'taska1': 3.166666666666667,
 'taska2': 3.3333333333333335,
 'taska3': 3.166666666666667,
 'taska7': 3.1666666666666665,
 'taskb2': 2.833333333333333}

It turns out that taskb2 has the lowest score!
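
To pick out the lowest-scoring item programmatically rather than by reading the dict, the scoring loop can be wrapped in a function. This is a minimal sketch: the name `least_similar` is hypothetical, and it assumes the items are unique (duplicates would share one dict key) and that the list is not empty.

```python
import difflib
import itertools

def least_similar(items):
    """Return the item with the lowest total similarity to the others."""
    # Assumption: items are unique and the list is non-empty
    score = dict.fromkeys(items, 0.0)
    for w1, w2 in itertools.combinations(items, 2):
        ratio = difflib.SequenceMatcher(a=w1, b=w2).ratio()
        score[w1] += ratio
        score[w2] += ratio
    # min over the keys, ordered by their accumulated score
    return min(score, key=score.get)

list_1 = ['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
print(least_similar(list_1))  # taskb2
```

Since every pair is compared, the cost grows quadratically with the number of items, which may matter for a long list.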

Caleb Hattingh