
I use NumPy to convert sentences (`sents`) into arrays of int32 word indices (`words_idxs`), but I don't know how to reverse this step, that is, how to convert an int32 array back into a sentence.

import numpy as np

def build_input_data(voc, sents, tags, tags_uni, cues, scopes, labels):
    # Map each word to its vocabulary index; unknown words fall back to "<UNK>".
    words_idxs = [np.array([voc['w2idxs'][w] if w in voc['w2idxs'] else voc['w2idxs']["<UNK>"] for w in sent], dtype=np.int32) for sent in sents]
    # Map tags, universal tags, and labels through their own vocabularies.
    tags_idxs = [np.array([voc['t2idxs'][t] for t in tag_sent], dtype=np.int32) for tag_sent in tags]
    tags_uni_idxs = [np.array([voc['tuni2idxs'][tu] for tu in tag_sent_uni], dtype=np.int32) for tag_sent_uni in tags_uni]
    y_idxs = [np.array([voc['y2idxs'][y] for y in y_array], dtype=np.int32) for y_array in labels]
    # Binary flags: 1 for "CUE" / "S", 0 for anything else.
    cues_idxs = [np.array([1 if c == "CUE" else 0 for c in c_array], dtype=np.int32) for c_array in cues]
    scope_idxs = [np.array([1 if s == "S" else 0 for s in s_array], dtype=np.int32) for s_array in scopes]

    return words_idxs, tags_idxs, tags_uni_idxs, cues_idxs, scope_idxs, y_idxs
eesiraed
  • Some of these arrays are going to be impossible to reverse engineer, because you are losing data. For example, in your `cues_idxs`, you can only say if a word was `"CUE"` or not, but you can't determine what the other words were because they are all represented by `0`. – user3483203 Jan 03 '19 at 23:11
  • It seems like `voc` is a dictionary with keys all the words in your vocabulary and values their index value. If you [reverse this mapping](https://stackoverflow.com/questions/483666/python-reverse-invert-a-mapping) you should be able to convert your integers back to words. If that's not what you're looking for, you ought to explain what all those objects are used for, ideally with some example code so we get a better idea. – Reti43 Jan 04 '19 at 01:14
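The inversion suggested in the comments could look roughly like the sketch below. It assumes `voc['w2idxs']` is a plain dict that maps each word to a unique integer; the names `decode_sentences` and `idx2w` and the toy vocabulary are hypothetical and do not come from the question.

```python
import numpy as np

def decode_sentences(voc, words_idxs):
    """Map arrays of word indices back to lists of words (hypothetical helper)."""
    # Invert word -> index into index -> word; assumes every index is unique.
    idx2w = {idx: w for w, idx in voc['w2idxs'].items()}
    # int(i) turns each np.int32 element into a plain Python int for the lookup.
    return [[idx2w[int(i)] for i in sent_idxs] for sent_idxs in words_idxs]

# Toy vocabulary and sentences, made up for illustration only.
voc = {'w2idxs': {"<UNK>": 0, "the": 1, "cat": 2, "sat": 3}}
sents = [["the", "cat", "sat"], ["the", "dog", "sat"]]

# Encode the same way as words_idxs in the question.
unk = voc['w2idxs']["<UNK>"]
words_idxs = [np.array([voc['w2idxs'].get(w, unk) for w in sent], dtype=np.int32)
              for sent in sents]

decoded = decode_sentences(voc, words_idxs)
print(decoded)
```

In this toy run the out-of-vocabulary word "dog" comes back as "<UNK>", so the round trip is only exact for words that are in the vocabulary. The 0/1 arrays `cues_idxs` and `scope_idxs` cannot be decoded this way at all, for the reason given in the first comment.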

0 Answers