In the grand scheme of things, I am still somewhat new to Python (using Jupyter Notebook).
As I gain experience, I'm using it more, so I'm looking to increase my efficiency. For many projects, I use the same code to group and analyze data. For example, to categorize age, I use this code:
# creating a new variable based on age
a1 = (TAT_v1["AgeY"] < .16)
a2 = (TAT_v1["AgeY"] >=.16) & (TAT_v1["AgeY"] <2)
a3 = (TAT_v1["AgeY"] >=2) & (TAT_v1["AgeY"] <13)
a4 = (TAT_v1["AgeY"] >=13)
TAT_v1['Age_Group'] = np.select([a1, a2, a3, a4], ['<4 Weeks', '1 to 24mos', '2 to 12yo', '13yo+'])
aga1 = (TAT_v1["Age_Group"] == '<4 Weeks')
aga2 = (TAT_v1["Age_Group"] == '1 to 24mos')
aga3 = (TAT_v1["Age_Group"] == '2 to 12yo')
aga4 = (TAT_v1["Age_Group"] == '13yo+')
TAT_v1['Age_Group_N'] = np.select([aga1, aga2, aga3, aga4], [0,1,2,3])
Questions:
1) Is it possible to save this piece of code as a function within my Jupyter Notebook so that I can call it whenever I need it for different projects? (I do not want to copy and paste the code above and then worry about changing the DataFrame name every time I want to create an age category.)
2) How would I code the function?
3) How would I call the function?
I've been doing some research, but I haven't yet found a resource that addresses my specific questions. This is the tutorial I have been reading:
https://www.datacamp.com/community/tutorials/functions-python-tutorial
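For reference, here is a rough sketch of the kind of function I have in mind. The names add_age_groups, age_col, and AGE_LABELS are made up for illustration, and the "Unknown" / -1 defaults for rows with a missing age are my own assumption, not something the code above does. It takes the DataFrame as a parameter, so the DataFrame name would no longer be hard-coded:

```python
import numpy as np
import pandas as pd

# Labels in the same order as the conditions below (hypothetical constant name).
AGE_LABELS = ['<4 Weeks', '1 to 24mos', '2 to 12yo', '13yo+']

def add_age_groups(df, age_col="AgeY"):
    """Return a copy of df with Age_Group and Age_Group_N columns added."""
    out = df.copy()
    age = out[age_col]
    conditions = [
        age < 0.16,
        (age >= 0.16) & (age < 2),
        (age >= 2) & (age < 13),
        age >= 13,
    ]
    # Assumption: rows with a missing age match no condition,
    # so they receive the explicit defaults "Unknown" and -1.
    out["Age_Group"] = np.select(conditions, AGE_LABELS, default="Unknown")
    out["Age_Group_N"] = np.select(conditions, [0, 1, 2, 3], default=-1)
    return out

# Usage with a small made-up DataFrame
demo = pd.DataFrame({"AgeY": [0.05, 1.0, 7.0, 30.0]})
demo = add_age_groups(demo)
```

With my data, the call would presumably be TAT_v1 = add_age_groups(TAT_v1). To reuse it across projects, I assume the function could be saved in a .py file next to the notebook (say, a hypothetical my_helpers.py) and loaded with from my_helpers import add_age_groups, but I am not sure whether that is the recommended approach.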