How to apply a function to a set amount of rows in a Dataframe?

Question

I have the following code that runs nlp() on every column to determine the entity type of each value. However, it can take a long time depending on the size of my data. How can I apply the function to only a selected number of rows? For example, what if I only wanted to apply it to the first 100 rows of every column?

import spacy
import pandas as pd
import en_core_web_sm
import numpy
nlp = en_core_web_sm.load()

df = pd.read_csv('https://climate.weather.gc.ca/climate_data/bulk_data_e.html?format=csv&stationID=27211&Year=2019&Month=5&Day=1&timeframe=2&submit=Download+Data')

df['Station Name'] = df['Station Name'].str.title()

col_list = df.columns 

for col in col_list:
    df[col] = df[col].apply(lambda x: [[w.label_] for w in list(nlp(str(x)).ents)])

df

I would [chunk](https://stackoverflow.com/a/44729807/6361531) your dataframe using a dictionary or list, and process each dictionary entry: — Scott Boston, Jul 05 '20 at 22:48
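One possible reading of the chunking suggestion in the comment above, as a minimal sketch. The toy frame, `chunk_size`, and the doubling function are made up for illustration, and the linked answer may split the frame differently:

```python
import pandas as pd

# Toy data standing in for the real frame (hypothetical values).
df = pd.DataFrame({"value": range(10)})

chunk_size = 4  # hypothetical chunk length

# Dictionary of chunks keyed by chunk number; each entry is a slice of rows.
chunks = {
    start // chunk_size: df.iloc[start:start + chunk_size]
    for start in range(0, len(df), chunk_size)
}

# Process each dictionary entry separately (here: a trivial doubling function).
processed = {
    key: chunk["value"].map(lambda v: v * 2)
    for key, chunk in chunks.items()
}
```

Processing chunk by chunk makes it easy to stop early or to work on only the first entry.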

score 1 · Accepted Answer · answered Aug 17 '20 at 02:52

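Use the applymap method to apply the function to every cell of all columns within a selected index range. (In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which takes the same function.)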

For the first 100 rows:

df.iloc[:100] = df.iloc[:100].applymap(lambda x: [[w.label_] for w in list(nlp(str(x)).ents)])
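A self-contained variant of the same idea, as a sketch only. `tag_value` is a hypothetical stand-in for the spaCy call so the snippet runs without a model, and the small frame is made-up data. It maps each column of the head slice with Series.map and keeps the result in a separate frame, which sidesteps the applymap/map naming difference between pandas versions and leaves the original data untouched:

```python
import pandas as pd

def tag_value(value):
    # Hypothetical stand-in for nlp(str(x)).ents: labels a cell by its text.
    text = str(value)
    return "NUMBER" if text.replace(".", "", 1).isdigit() else "TEXT"

# Made-up sample data, loosely shaped like the weather CSV in the question.
df = pd.DataFrame({
    "Station Name": ["Ottawa", "Toronto", "Calgary", "Halifax"],
    "Max Temp": [12.5, 14.0, 9.5, 11.0],
})

n_rows = 2  # use 100 for the case in the question

# Apply the function cell by cell, but only to the first n_rows of each column.
labels = df.head(n_rows).apply(lambda col: col.map(tag_value))
```

If the labels need to live alongside the original values, they can be joined back on the index afterwards rather than assigned over the source cells.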

thorntonc
