I'm trying to tokenize a 32 MB file. I'm doing this with Google Colab and Visual Studio Code. It worked with a smaller file, but I would like to know how to do it in a feasible way for a bigger file (it has been running for more than an hour).
My code in Google Colab:
import nltk
nltk.download('punkt')
from nltk import word_tokenize
from google.colab import drive

# Mount Google Drive so the file is accessible from the Colab runtime
drive.mount('/drive')

# Read the whole 32 MB file into memory as one string, then tokenize it
raw = open('../drive/MyDrive/NLTK/data.txt').read()
tokens = word_tokenize(raw)
Am I doing something wrong?
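For context, I was also considering a streaming variant that tokenizes one line at a time instead of loading the whole file into a single string. This is only a sketch (it assumes the same data.txt path as above and that the file is newline-delimited), and I haven't tested it on the full 32 MB file:

import nltk
nltk.download('punkt')
from nltk import word_tokenize
from google.colab import drive

drive.mount('/drive')

tokens = []
# Tokenize line by line so the whole file is never held as one string
with open('../drive/MyDrive/NLTK/data.txt') as f:
    for line in f:
        tokens.extend(word_tokenize(line))

Would something like this be the right direction, or is there a better approach?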