How to print tokenized data in keras?

Asked Aug 10 '19 at 12:18

Active Aug 10 '19 at 12:25

Viewed 127 times

I am training a sentiment analysis neural network using Keras with TensorFlow as the back-end. After tokenizing, I want to print the data and inspect it.

I tried the print() function, but it outputs a very large array of numbers.

I only want to see the first few entries, like the head() function does in pandas. How can I achieve this?

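import re

import pandas as pd
from keras.preprocessing.text import Tokenizer

data = pd.read_csv('training.csv', encoding='latin-1', usecols=[0, 5],
                   names=['sentiment', 'text'])

# Lowercase, then keep only letters, digits and whitespace.
data['text'] = data['text'].apply(lambda x: x.lower())
data['text'] = data['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))

# Replace every 'rt' with a space. Assigning to the rows yielded by
# iterrows() does not change the DataFrame, so update the column directly.
data['text'] = data['text'].str.replace('rt', ' ', regex=False)

batch_size = 3000
# num_words limits the tokenizer to the most frequent words.
tokenizer = Tokenizer(num_words=batch_size, split=' ')
tokenizer.fit_on_texts(data['text'].values)
X = tokenizer.texts_to_sequences(data['text'].values)
print(X)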

edited Aug 10 '19 at 12:25

asked Aug 10 '19 at 12:18

Manusha

Use `print(X[:10])` to print the first 10 elements, and so on. – Ashwin Geet D'Sa Aug 10 '19 at 16:26
See https://stackoverflow.com/q/41971587/461847 for converting the integers back to tokens. – aab Aug 10 '19 at 19:40
See this question as well: https://stackoverflow.com/questions/51956000/what-does-keras-tokenizer-method-exactly-do – Mehdi Aug 11 '19 at 17:25
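
The two suggestions in the comments (slicing the result, and turning the integers back into tokens) can be sketched roughly as follows. `texts_to_sequences` returns a plain Python list with one list of integers per text, so ordinary slicing works on it. To keep the snippet runnable without Keras, the `X` and `word_index` values below are made-up stand-ins for the real `X` and `tokenizer.word_index`, and `head` and `to_words` are hypothetical helper names, not part of any library.

```python
# Made-up stand-in for X = tokenizer.texts_to_sequences(...):
# one list of integer word indices per input text.
X = [[1, 2, 3], [4, 2, 5, 6], [1, 7], [8, 9, 2, 3], [4, 1, 10]]

# Made-up stand-in for tokenizer.word_index (word -> integer index).
word_index = {
    "i": 1, "love": 2, "this": 3, "we": 4, "that": 5,
    "movie": 6, "agree": 7, "they": 8, "really": 9, "too": 10,
}


def head(sequences, n=5):
    """Return the first n sequences, similar in spirit to DataFrame.head()."""
    return sequences[:n]


def to_words(sequence, index_to_word):
    """Map integer indices back to words; unknown indices become '<unk>'."""
    return [index_to_word.get(i, "<unk>") for i in sequence]


# Invert the word -> index mapping so integers can be looked up.
index_to_word = {index: word for word, index in word_index.items()}

# Print only the first three sequences, each next to its decoded words.
for seq in head(X, 3):
    print(seq, "->", " ".join(to_words(seq, index_to_word)))
```

With the real tokenizer, the same idea would be `print(X[:5])` for the raw integers, with `tokenizer.word_index` inverted in the same way to read them back as words. Depending on the installed Keras version, `tokenizer.sequences_to_texts(X[:5])` may do the reverse mapping directly; check that it is available before relying on it.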
