I am training a sentiment analysis neural network using Keras with TensorFlow as the backend. After tokenizing, I want to print and inspect the data.
I tried the print() function, but it dumps a very large list of numbers.
I only want to see the first few entries, the way head() does for a DataFrame. How can I achieve this?
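# Imports assumed by the code below (not shown in the original post)
import re
import pandas as pd
from keras.preprocessing.text import Tokenizer

data = pd.read_csv('training.csv', encoding='latin-1', usecols=[0, 5],
                   names=['sentiment', 'text'])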
# Lowercase the text, then drop everything except letters, digits and whitespace
data['text'] = data['text'].str.lower()
data['text'] = data['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))
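# Assigning to `row` inside iterrows() only changes a copy, so replace on the column instead.
# Note: this also replaces 'rt' inside longer words such as 'start'.
data['text'] = data['text'].str.replace('rt', ' ', regex=False)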
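# num_words caps the vocabulary at the most frequent words; despite its name,
# batch_size is not a training batch size here
batch_size = 3000
tokenizer = Tokenizer(num_words=batch_size, split=' ')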
tokenizer.fit_on_texts(data['text'].values)
X = tokenizer.texts_to_sequences(data['text'].values)
print(X)
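
For reference, texts_to_sequences returns a plain Python list with one list of word indices per text, so a head()-like preview can probably be obtained by slicing instead of printing everything. The sketch below uses a small made-up list in place of the real X, and the helper name head is hypothetical, not part of Keras:

```python
# Hypothetical stand-in for the output of tokenizer.texts_to_sequences(...):
# a plain Python list holding one list of word indices per input text.
X = [[12, 7, 301], [5, 44], [9, 9, 120, 3], [18], [2, 63, 7], [40, 1], [8, 15, 16]]


def head(sequences, n=5):
    """Return the first n sequences, similar in spirit to DataFrame.head()."""
    return sequences[:n]


# Print only the first three tokenized texts, one per line
preview = head(X, 3)
for i, seq in enumerate(preview):
    print(i, seq)
```

With the real data the same idea is simply print(X[:5]) in place of print(X).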