There are various ways to do that. Below we will go through at least three options.
In order to keep the original dataframe df
, we will be assigning the sliced dataframe to df_new
.
At the end, in section Time Comparison we will show, using a random dataframe, the various times of execution.
Option 1
df_new = df[:10] # Option 1.1
# or
df_new = df[0:10] # Option 1.2
Option 2
Using head
df_new = df.head(10)
For negative values of n, this function returns all rows except the last n rows, equivalent to df[:-n]
[Source].
Option 3
Using iloc
df_new = df.iloc[:10] # Option 3.1
# or
df_new = df.iloc[0:10] # Option 3.2
Time Comparison
For this specific case one has used time.perf_counter()
to measure the time of execution.
method time
0 Option 1.1 0.00000120000913739204
1 Option 1.2 0.00000149995321407914
2 Option 2 0.00000170001294463873
3 Option 3.1 0.00000120000913739204
4 Option 3.2 0.00000350002665072680

As there are various variables that might affect the time of execution, this might change depending on the dataframe used, and more.
Notes:
Instead of 10
one can replace the previous operations with the number of rows one wants. For example
df_new = df[:5]
will return a dataframe with the first 5
rows.
There are additional ways to measure the time of execution. For additional ways, read this: How do I get time of a Python program's execution?
One can also adjust the previous options to a lambda function, such as the following
df_new = df.apply(lambda x: x[:10])
# or
df_new = df.apply(lambda x: x.head(10))
Note, however, that there are strong opinions on the usage of .apply() and, for this case, it is far from being a required method.