Given that I have:
- a list of words
- points/scores that indicate the "simplicity" of each word
- the difficulty level of each word
E.g.
>>> words = ['apple', 'pear', 'car', 'man', 'average', 'older', 'values', 'coefficient', 'exponential']
>>> points = ['9999', '9231', '8231', '5123', '4712', '3242', '500', '10', '5']
>>> bins = [0, 0, 0, 0, 1, 1, 1, 2, 2]
Currently, the word list is ordered by the simplicity points.
What if I want to model the simplicity as a "quadratic curve", i.e. going from the highest points down to a low point and then back up to high points? In other words, I want to produce a word list that looks like this, with the corresponding points:
['apple', 'pear', 'average', 'coefficient', 'exponential', 'older', 'values', 'car', 'man']
I have tried this but it's painfully crazy:
>>> from collections import Counter
>>> Counter(bins)[0]
4
>>> num_easy, num_mid, num_hard = Counter(bins)[0], Counter(bins)[1], Counter(bins)[2]
>>> num_easy
4
>>> # slice the (already sorted) word list into one chunk per bin
>>> easy_words = words[:num_easy]
>>> mid_words = words[num_easy:num_easy+num_mid]
>>> hard_words = words[-num_hard:]
>>> easy_words, mid_words, hard_words
(['apple', 'pear', 'car', 'man'], ['average', 'older', 'values'], ['coefficient', 'exponential'])
>>> # split the easy and mid chunks in half; the hard chunk stays whole in the middle
>>> easy_1 = easy_words[:num_easy // 2]
>>> easy_2 = easy_words[len(easy_1):]
>>> mid_1 = mid_words[:num_mid // 2]
>>> mid_2 = mid_words[len(mid_1):]
>>> # finish with easy_2 (not easy_1 again) so that every word appears exactly once
>>> new_words = easy_1 + mid_1 + hard_words + mid_2 + easy_2
>>> new_words
['apple', 'pear', 'average', 'coefficient', 'exponential', 'older', 'values', 'car', 'man']
Now imagine that the number of bins is greater than 3, or that I want the "points" of the words to fit a sine-shaped curve.
Note that this is not exactly an NLP question, nor does it have anything to do with the Zipf distribution or with creating something to match or reorder the ranking of the words.
Imagine there's a list of integers, each with an object (in this case a word) mapped to it, and you want to reorder the list of objects so that the integers fit a quadratic curve.
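To make the "more than 3 bins" case concrete, here is a minimal sketch of how the split-each-bin-in-half idea might generalize. The function name `valley_by_bins` is hypothetical (it is not from any library), and the sketch assumes that a lower bin number means an easier word and that the words within each bin are already in the desired order.

```python
from collections import defaultdict

def valley_by_bins(words, bins):
    """Hypothetical helper: order words easy -> hard -> easy for any number of bins."""
    # Group the words by difficulty level, keeping their original order.
    groups = defaultdict(list)
    for word, level in zip(words, bins):
        groups[level].append(word)

    levels = sorted(groups)  # assumption: lower level = easier
    left, right = [], []
    for level in levels[:-1]:
        group = groups[level]
        half = len(group) // 2
        left.extend(group[:half])   # first half goes on the way down
        right.append(group[half:])  # second half is kept for the way back up

    # The hardest bin stays whole in the middle.
    hardest = groups[levels[-1]] if levels else []
    # Walk the kept halves from hardest back to easiest.
    tail = [word for group in reversed(right) for word in group]
    return left + hardest + tail

words = ['apple', 'pear', 'car', 'man', 'average', 'older', 'values',
         'coefficient', 'exponential']
bins = [0, 0, 0, 0, 1, 1, 1, 2, 2]
print(valley_by_bins(words, bins))
# ['apple', 'pear', 'average', 'coefficient', 'exponential', 'older', 'values', 'car', 'man']
```

This only produces a single valley; it does not help with other shapes such as a sine curve.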
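One possible way to express that more generally, as a rough sketch rather than a definitive solution: evaluate a target curve at evenly spaced positions, then put the highest-scoring object at the position where the curve is highest, the second highest at the next-highest position, and so on. The helper name `reorder_to_curve` and the two curve functions are assumptions made for illustration, and the points are written as integers here rather than strings.

```python
import math

def reorder_to_curve(items, scores, curve):
    """Hypothetical helper: place items so their scores follow the shape of `curve`.

    `curve` is any function of x in [0, 1]; only the relative order of its
    values matters, not their magnitude.
    """
    n = len(items)
    # Evaluate the target curve at n evenly spaced x positions in [0, 1].
    xs = [i / (n - 1) if n > 1 else 0.0 for i in range(n)]
    targets = [curve(x) for x in xs]
    # Positions ranked from the highest curve value to the lowest
    # (the sort is stable, so ties keep left-to-right order).
    positions = sorted(range(n), key=lambda i: -targets[i])
    # Items ranked from the highest score to the lowest.
    ranked = sorted(zip(scores, items), key=lambda pair: -pair[0])
    result = [None] * n
    for pos, (score, item) in zip(positions, ranked):
        result[pos] = (item, score)
    return result

words = ['apple', 'pear', 'car', 'man', 'average', 'older', 'values',
         'coefficient', 'exponential']
points = [9999, 9231, 8231, 5123, 4712, 3242, 500, 10, 5]

# Quadratic "valley": high at both ends, lowest in the middle.
valley = reorder_to_curve(words, points, lambda x: (x - 0.5) ** 2)
print([word for word, _ in valley])
# ['apple', 'car', 'average', 'values', 'exponential', 'coefficient', 'older', 'man', 'pear']
print([score for _, score in valley])
# [9999, 8231, 4712, 500, 5, 10, 3242, 5123, 9231]

# One full sine period over the list: rises, falls below the start, comes back.
wave = reorder_to_curve(words, points, lambda x: math.sin(2 * math.pi * x))
```

Because only the rank order of the curve values is used, the result follows the shape of the curve but not its actual values; matching real magnitudes would need a different approach, such as a proper curve fit.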