I'm practicing statistical analysis on a data frame of La Liga records, and I'm trying to find the teams that started playing between 1930 and 1980. However, a few rows in the Debut column hold season ranges such as 1941-42 or 1975-76 instead of a single year.

I tried the following, but it raises an error:

dfnew = df[(df['Debut']>1930) & (df['Debut']<1980)]


Sreeram TP
Bhavishya
  • Please share a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) – yatu Apr 17 '19 at 10:21
  • copy paste a sample of the dataframe – Sreeram TP Apr 17 '19 at 10:24
  • Welcome to StackOverflow. To get better answers on your question, I recommend you add some example dataframe and some expected output so we can visually see what you want to do. Find more information [here](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Erfan Apr 17 '19 at 10:26

2 Answers


One way of handling this would be to use Series.str and slice the first 4 characters, then cast to int:

df['Debut'] = df['Debut'].astype(str).str[:4].astype(int)

Then filter with Series.between and boolean indexing:

df_new = df[df['Debut'].between(1930, 1980)]
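
As a minimal sketch of the two steps above, here is the same approach on a small made-up frame. The team labels and Debut values are illustrative assumptions, not rows from the actual La Liga file:

```python
import pandas as pd

# Made-up sample data; the real file's rows are not shown in the question.
df = pd.DataFrame({
    "Team": ["A", "B", "C", "D"],
    "Debut": ["1929", "1941-42", "1975-76", "1998-99"],
})

# Keep the first four characters (the starting year) and convert to int.
df["Debut"] = df["Debut"].astype(str).str[:4].astype(int)

# between() includes both endpoints by default.
df_new = df[df["Debut"].between(1930, 1980)]
print(df_new["Team"].tolist())  # ['B', 'C']
```

Slicing the first four characters works here because every value, whether a single year or a season range, is assumed to begin with a four-digit year.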
Chris Adams

After slicing, Series.between returns a boolean Series: True for each team whose debut year falls between 1930 and 1980, and False otherwise.

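import pandas as pd

Laliga = pd.read_csv('Laliga.csv')

# Keep the first four characters (the starting year) and convert to int
s = Laliga['Debut'].astype(str).str[:4].astype(int)

# Boolean Series: True where the debut year is within 1930-1980 (inclusive)
s.between(1930, 1980)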
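
One detail that may matter: between includes both endpoints by default, whereas the comparison in the question used strict inequalities. The sketch below, on made-up values, shows how the two masks differ at the boundaries. Recent pandas versions also accept inclusive="neither" in between, but plain comparisons are used here to avoid depending on the installed version:

```python
import pandas as pd

# Made-up values chosen to sit on and inside the boundaries.
raw = pd.Series(["1930", "1941-42", "1975-76", "1980"])
s = raw.str[:4].astype(int)

# Inclusive mask: 1930 and 1980 both count as matches.
mask_inclusive = s.between(1930, 1980)

# Strict mask, matching the original > / < comparison in the question.
mask_strict = (s > 1930) & (s < 1980)

print(mask_inclusive.tolist())  # [True, True, True, True]
print(mask_strict.tolist())     # [False, True, True, False]
```

Either mask can be passed to the original data frame for boolean indexing, as long as it shares the frame's index.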
Simon Notley