I'm new to Python and a bit stuck. I have a dataframe of journal articles and their subject headings. The headings were returned from the API in a string where the subheadings modify the descriptor.
For example, one of the subject headings returned from the API is: "Cardiovascular Diseases/*drug therapy/epidemiology"
It describes an article primarily about drug therapy for cardiovascular diseases AND epidemiology for cardiovascular diseases. In this instance, I'd like to create a column in the dataframe for each of these. I'd like the column to include the initial term + the modifier. Some articles have only 1 term without a modifier, some have 1 term + many subheadings.
Current Dataframe:
+-----------------+------+----------------------------------------------------+
| Article Title   | ID   | Subject                                            |
+-----------------+------+----------------------------------------------------+
| an article      | 123  | Cardiovascular Diseases/*drug therapy/epidemiology |
| another article | 324  | Adult                                              |
| One more        | 234  | United Kingdom/epidemiology                        |
+-----------------+------+----------------------------------------------------+

What I want:

+-----------------+------+----------------------------------------------------+--------------------------------------+----------------------------------------+--------------+
| Article Title   | ID   | Subject                                            | Modifier 1                           | Modifier 2                             | Modifier 3   |
+-----------------+------+----------------------------------------------------+--------------------------------------+----------------------------------------+--------------+
| an article      | 123  | Cardiovascular Diseases/*drug therapy/epidemiology | Cardiovascular diseases/drug therapy | cardiovascular diseases/epidemiology   |              |
| another article | 324  | Adult                                              | Adult                                |                                        |              |
| One more        | 234  | United Kingdom/epidemiology                        | United Kingdom/epidemiology          |                                        |              |
+-----------------+------+----------------------------------------------------+--------------------------------------+----------------------------------------+--------------+
My initial attempt (below) only aimed to separate the initial heading from the modifiers. I'm having a hard time wrapping my head around this, and my code doesn't work for multiple subheadings:
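descriptor, qualifier = [], []
for term in df['Subject']:  # column name as shown in the table above
    # partition() splits at the first '/' only, so tail still holds
    # every subheading as one string, e.g. '*drug therapy/epidemiology'
    head, sep, tail = term.partition('/')
    descriptor.append(head)
    qualifier.append(tail)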
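
A minimal sketch of one possible way to get the per-subheading columns, not a definitive solution. It assumes the column is named `Subject` as in the table, that a leading `*` on a subheading is only a marker that should be dropped (as in the wanted output), and that the original capitalisation of the descriptor should be kept. The helper name `expand_heading` is made up for illustration.

```python
import pandas as pd

# Sample data copied from the tables in the question.
df = pd.DataFrame({
    "Article Title": ["an article", "another article", "One more"],
    "ID": [123, 324, 234],
    "Subject": [
        "Cardiovascular Diseases/*drug therapy/epidemiology",
        "Adult",
        "United Kingdom/epidemiology",
    ],
})


def expand_heading(subject):
    """Hypothetical helper: pair the descriptor with each subheading."""
    descriptor, *qualifiers = subject.split("/")
    if not qualifiers:
        # No subheading: the descriptor stands on its own.
        return [descriptor]
    # Assumption: a leading '*' is only a marker, so it is stripped.
    return [f"{descriptor}/{q.lstrip('*')}" for q in qualifiers]


# One list per row; shorter rows are padded with missing values.
expanded = pd.DataFrame(df["Subject"].map(expand_heading).tolist(), index=df.index)
expanded.columns = [f"Modifier {i + 1}" for i in range(expanded.shape[1])]

# Replace the padding with empty strings and attach the new columns.
result = df.join(expanded.fillna(""))
print(result)
```

This creates only as many `Modifier` columns as the longest heading needs, so the sample data yields `Modifier 1` and `Modifier 2` but no empty `Modifier 3`.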