Overview: I am creating a recommendation system that compares a course a student has already taken against a catalog of available courses the student has not yet taken. For each course taken, the system returns 3 recommended courses.
Issue: I call a custom recommendation function, which returns 3 values, inside a for loop that iterates through a transcript of already-taken classes. The loop essentially finds the 3 classes that the student should take next. The problem is that all 3 classes end up in a single cell, and I have not found an easy way to break that cell into separate columns.
Deeper dive:
I have a function (c_recommend) that returns 3 recommendations as a pandas Series:
output: Series
INDEX | Program Title |
---|---|
123 | program 1 |
456 | program 2 |
789 | program 3 |
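To make the question reproducible, here is a minimal stand-in for c_recommend. It is only a sketch: the `catalog` frame, its index values, and the "first n other titles" logic are assumptions for illustration and are not my real similarity logic. The only property that matters is that it returns a 3-item Series of program titles.

```python
import pandas as pd

# Hypothetical catalog; the index values are made up for illustration.
catalog = pd.DataFrame(
    {"Program Title": ["program 1", "program 2", "program 3",
                       "program 4", "program 5"]},
    index=[123, 456, 789, 101, 112],
)


def c_recommend(course_title, n=3):
    """Stand-in only: return the first n catalog titles other than course_title."""
    others = catalog.loc[catalog["Program Title"] != course_title, "Program Title"]
    return others.head(n)


recs = c_recommend("program 1")
print(recs)
```

The return value is a single Series object, which is why it lands in one cell when appended to a list as-is.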
I then call this function (c_recommend) inside a for loop that iterates over the rows of a transcript, passing each course title so it can be compared against the catalog of classes.
## create an empty list
results = list()
## run through the transcript
for i in transcript.index:
    ## append the student, the course already taken, and the recommended courses (3 will appear)
    results.append([transcript['student'].loc[i], transcript['Course'].loc[i], c_recommend(transcript['Course'].loc[i])])
output: List
Student | Taken Class | Recommended Classes |
---|---|---|
111 | program 1 | program 2, program 3, program 4 |
222 | program 2 | program 5, program 1, program 3 |
333 | program 3 | program 2, program 1, program 4 |
The recommended classes are all bunched into one cell because c_recommend returns its three values as a single object. I need a way to separate those 3 values into their own columns, like so:
desired output:
Student | Taken Class | Recommended Classes | Reco Class 2 | Reco Class 3 |
---|---|---|---|---|
111 | program 1 | program 2 | program 3 | program 4 |
222 | program 2 | program 5 | program 1 | program 3 |
333 | program 3 | program 2 | program 1 | program 4 |
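For reference, one way a table of this shape might be built is to unpack the Series into plain values before appending, so each recommendation becomes its own list item. This is a sketch under assumptions: the `transcript` frame and the `c_recommend` stand-in below are hypothetical sample data, and the column names simply follow the desired output above.

```python
import pandas as pd

# Hypothetical sample data standing in for the real transcript.
transcript = pd.DataFrame({
    "student": [111, 222, 333],
    "Course": ["program 1", "program 2", "program 3"],
})

# Stand-in for c_recommend: any function returning a 3-item Series works here.
titles = pd.Series(["program 1", "program 2", "program 3", "program 4", "program 5"])


def c_recommend(course):
    return titles[titles != course].head(3)


rows = []
for i in transcript.index:
    recs = c_recommend(transcript.loc[i, "Course"])
    # .tolist() drops the Series index; * unpacks the 3 titles into separate items.
    rows.append([transcript.loc[i, "student"], transcript.loc[i, "Course"], *recs.tolist()])

result = pd.DataFrame(
    rows,
    columns=["Student", "Taken Class", "Recommended Classes", "Reco Class 2", "Reco Class 3"],
)
print(result)
```

This assumes c_recommend always returns exactly 3 items; if it can return fewer, the rows would have uneven lengths and pandas would fill the missing cells with NaN.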
I have tried converting the list to a pandas DataFrame and splitting the column, using regex to split on the commas, and using nested loops. Alas, I have failed and the columns do not separate :( Ideally, once this issue is fixed, I would like to convert the result to a pandas DataFrame. Maybe there is an easier way to handle this with pandas?
I would appreciate any and all insight, even if that means rewriting my function.
TIA!