I have a dataframe which looks like this:


     ID      Unit  Semester  Note  BNF
0  3537  143066.0      4010   2.3    5
1  3537  143067.0      4010  m.E.    E
2    75  113142.0      4011   5.0    5
3  3726  113142.0      4011   3.3    5
4  5693  113142.0      4011   5.0    5

This dataframe contains three categories. These categories are based on the values in the "Semester" column: there are values which start with 113, 143 and 153.

Now I want to split the whole dataframe so that I get three new dataframes, one for each category.

I tried to convert the column to string and work with 'startswith'.

mi = df[df['Unit'].apply(str)]
mi = df[df['Unit'].startswith('143')]

But that didn't work.

I hope someone can help me. Thanks a lot!

Goldsucher

2 Answers


Isn't your target meant to be the Semester column rather than Unit, as in mi = df[df['Unit'].apply(str)]? If so, I would suggest creating a new column (or using a multi-level index) with the following approach:

df["Semester_Start"] = df["Semester"].apply(lambda x: str(x)[:3])

#Take sub-sections
df[df["Semester_Start"] == "143"]

https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html
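For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the sample rows from the question, with the 113/143 values sitting in the Unit column as floats; the column name Unit_Start and the variable df_143 are made up for illustration. It builds the prefix with the vectorized .str accessor rather than a lambda, which gives the same result here.

```python
import pandas as pd

# Sample data copied from the question; Unit is stored as a float
df = pd.DataFrame({
    "ID": [3537, 3537, 75, 3726, 5693],
    "Unit": [143066.0, 143067.0, 113142.0, 113142.0, 113142.0],
    "Semester": [4010, 4010, 4011, 4011, 4011],
    "Note": ["2.3", "m.E.", "5.0", "3.3", "5.0"],
    "BNF": ["5", "E", "5", "5", "5"],
})

# Convert to string, then slice the first three characters (hypothetical column name)
df["Unit_Start"] = df["Unit"].astype(str).str[:3]

# A boolean mask selects the rows of one category
df_143 = df[df["Unit_Start"] == "143"]
print(df_143)
```

Swapping "Unit" for "Semester" applies the same idea to the other column.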

Bugbeeb
  • No, I really need the Unit column, but I achieved that with df["Studiengang"] = df["Unit"].apply(lambda x: str(x)[:3]). Thanks to all for helping – Goldsucher Jan 14 '20 at 00:19
  • ```apply``` is not really a good idea here. Instead try: ```df["Studiengang"] = df["Unit"].astype("str").str[:3]``` – Grzegorz Skibinski Jan 14 '20 at 06:51
  • Avoid using apply whenever you can, see https://stackoverflow.com/questions/54432583/when-should-i-ever-want-to-use-pandas-apply-in-my-code – ansev Jan 14 '20 at 07:34

This should do the trick:

dfs=[df.loc[df.Unit.astype(str).str.startswith(el)] for el in df.groupby(df["Unit"].astype("str").str[:3]).groups]

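In short: the groupby on the first 3 characters of Unit (as a string) gives you every distinct prefix that occurs in the column.

Then you iterate over those prefixes in a list comprehension and, for each one, keep the rows whose Unit starts with it, using the .str.startswith(...) method. The result is a list with one dataframe per prefix.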
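If you would rather look the pieces up by prefix than by position in a list, a variation is to iterate over the groupby directly and collect the groups in a dict. This is only a sketch under the same assumptions as above (Unit holds floats such as 143066.0); the names prefix and dfs_by_prefix are made up for illustration.

```python
import pandas as pd

# Reduced sample from the question; only the columns needed for the split
df = pd.DataFrame({
    "ID": [3537, 3537, 75, 3726, 5693],
    "Unit": [143066.0, 143067.0, 113142.0, 113142.0, 113142.0],
})

# Group key: first three characters of Unit as a string
prefix = df["Unit"].astype(str).str[:3]

# One sub-dataframe per prefix, keyed by the prefix itself
dfs_by_prefix = {key: group for key, group in df.groupby(prefix)}

print(sorted(dfs_by_prefix))
```

After this, dfs_by_prefix["143"] holds the rows whose Unit starts with 143, and a prefix that never occurs in the data simply has no entry.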

Hope this helps!

Grzegorz Skibinski