
I have an SArray called word_count in an SFrame called sf. Every row in the word_count SArray is a dict. I also have a list called selected_words, and I am trying to loop through every row to see which of the words from selected_words appear in it. If a word appears, I take its value and write it into a new column. Here is an example for just one word ('great'):

selected_words = ['awesome ', 'great']
def word_count(row):
    if 'great' in row:
           sf['great']=row['great']
    else:
         abc="a" #nothing should happen
sf['word_count'].apply(word_count)

+-------------------------------+
|           word_count          |
+-------------------------------+
| {'and': 5, '6': 1, 'stink'... |
| {'and': 3, 'love': 1, 'it'... |
| {'and': 2, 'quilt': 1, 'it... |
| {'ingenious': 1, 'and': 3,... |
| {'and': 2, 'parents!!': 1,... |
| {'and': 2, 'this': 2, 'her... |
| {'shop': 1, 'noble': 1, 'i... |
| {'and': 2, 'all': 1, 'righ... |
| {'and': 1, 'help': 1, 'giv... |
| {'journal.': 1, 'nanny': 1... |
+-------------------------------+


print sf['great']
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... ]

As far as I understand, the same value (1) gets applied to every row, but I only need it in the rows where the word 'great' was actually found. How can I do this?

Insane Skull
ustl

1 Answer


The problem in your code is that sf['great'] = row['great'] assigns to the whole column sf['great'] on every call of word_count, so each call overwrites the value in every row. Here's another approach:

def word_count(d):
    return d['great'] if 'great' in d else 0

Then apply this function to the sf['word_count'] column and assign the result to the new column:

sf['great'] = sf['word_count'].apply(word_count)
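
The same idea might be extended to every word in selected_words. The sketch below is only an illustration: it uses a plain list of made-up dicts (rows) as a stand-in for sf['word_count'] so that it runs without an SFrame, and make_counter is a hypothetical helper name. It also assumes the entries in selected_words have no stray whitespace; a key such as 'awesome ' with a trailing space would never match 'awesome'.

```python
# Stand-in for sf['word_count']; these rows are made up for illustration.
rows = [
    {'and': 5, 'great': 2},
    {'and': 3, 'love': 1},
    {'awesome': 1, 'great': 1},
]

selected_words = ['awesome', 'great']


def make_counter(word):
    """Return a function that looks up `word` in one word-count dict."""
    def counter(d):
        # dict.get falls back to 0 when the word is absent from this row.
        return d.get(word, 0)
    return counter


counts = {}
for word in selected_words:
    counter = make_counter(word)
    # With an SFrame, this step would presumably be:
    #     sf[word] = sf['word_count'].apply(counter)
    counts[word] = [counter(d) for d in rows]
```

Building each counter through make_counter binds word at creation time, which avoids the usual pitfall of a lambda defined in a loop seeing only the last value of the loop variable.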
ig-melnyk