I have the following code, which runs nlp() on every value in every column to determine its entity type. However, this can take a long time depending on the size of my data. How can I apply the function to only a selected number of rows? For example, what if I only wanted to apply it to the first 100 rows of every column?
import spacy
import pandas as pd
import en_core_web_sm
import numpy
nlp = en_core_web_sm.load()
df = pd.read_csv('https://climate.weather.gc.ca/climate_data/bulk_data_e.html?format=csv&stationID=27211&Year=2019&Month=5&Day=1&timeframe=2&submit=Download+Data')
df['Station Name'] = df['Station Name'].str.title()
col_list = df.columns
for col in col_list:
    df[col] = df[col].apply(lambda x: [[w.label_] for w in list(nlp(str(x)).ents)])
df
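To make the goal concrete, here is a minimal sketch of one possible approach: take the first rows with `DataFrame.head()` and map the function over that slice only. The helper name `label_first_rows`, the stand-in function `fake_entity_labels`, and the tiny `demo` frame are hypothetical and exist only so the example runs without downloading the spaCy model or the CSV; in the real code, the function passed in would be the `nlp`-based lambda from the loop above.

```python
import pandas as pd


def label_first_rows(frame, func, n_rows=100):
    """Apply func to each value in the first n_rows of every column.

    Returns a new DataFrame containing only those rows; the input
    frame is left unchanged.
    """
    sample = frame.head(n_rows)  # first n_rows (or fewer, if the frame is shorter)
    return sample.apply(lambda column: column.map(func))


def fake_entity_labels(value):
    # Hypothetical stand-in for:
    #   [[w.label_] for w in nlp(str(value)).ents]
    # It tags purely numeric strings so the example needs no model.
    return [["CARDINAL"]] if str(value).isdigit() else []


# Tiny made-up frame, used only for illustration.
demo = pd.DataFrame({"a": ["1", "x", "3"], "b": ["y", "2", "z"]})
result = label_first_rows(demo, fake_entity_labels, n_rows=2)
```

With the real data, the call would presumably look like `label_first_rows(df, lambda x: [[w.label_] for w in nlp(str(x)).ents])`. Returning a separate frame, rather than assigning back into `df`, avoids mixing labeled and unlabeled rows in the same column.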