I am very new to Python. I have a large dataframe (1,416,631 rows x 2 columns) whose data is stored as lists.

I am trying to extract the first element of each list to create another variable. However, my current code has been running for over an hour without finishing.

Here is a snippet of the dataframe MH with two variables, col and PMID (which is currently empty):

col                             PMID
[1, Aged, Adult, Child]
[53, Humans, Kidney Injury]
[22, Diagnostic Imaging, Aged]

This is what I want it to look like (2 variables: PMID and col):

col                             PMID
[Aged, Adult, Child]            1
[Humans, Kidney Injury]         53
[Diagnostic Imaging, Aged]      22

Here is my code:

# extract PMID
for index, row in MH.iterrows():
    MH["PMID"][index] = MH["col"][index][0]

This code works on a smaller dataframe, but doesn't stop running on my larger dataframe.

Please advise. Thanks

jpp
sweetmusicality

1 Answer

Here is one way:

import pandas as pd

df = pd.DataFrame({'col': [[1, 'Aged', 'Adult', 'Child'],
                           [53, 'Humans', 'Kidney Injury'],
                           [22, 'Diagnostic Imaging', 'Aged']]})

df['PMID'], df['col'] = list(zip(*df['col'].apply(lambda x: (x[:1][0], x[1:])).values))

#                           col  PMID
# 0        [Aged, Adult, Child]     1
# 1     [Humans, Kidney Injury]    53
# 2  [Diagnostic Imaging, Aged]    22
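
For comparison, here is a minimal sketch of the same split written with two plain list comprehensions instead of apply and zip. This is not part of the one-liner above, only an alternative that assumes every list in "col" has at least one element.

```python
import pandas as pd

df = pd.DataFrame({'col': [[1, 'Aged', 'Adult', 'Child'],
                           [53, 'Humans', 'Kidney Injury'],
                           [22, 'Diagnostic Imaging', 'Aged']]})

# Build PMID first, while 'col' still holds the full lists.
df['PMID'] = [x[0] for x in df['col']]

# Then overwrite 'col' with everything after the first element.
df['col'] = [x[1:] for x in df['col']]
```

Each column is built in a single pass and assigned all at once, rather than one cell at a time as in a loop over iterrows.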

Explanation

  • pd.Series.apply applies a function to each element of a series; here that function is an anonymous lambda.
  • The tuple (x[:1][0], x[1:]) splits each list into its first element and the remaining elements. Note that x[:1][0] is equivalent to x[0].
  • zip(*...) transposes the resulting sequence of tuples into two sequences, which are assigned to columns "PMID" and "col".
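
To make lambda, * and zip more concrete, here is a small sketch of the same steps in plain Python, without pandas. The names rows, split, pairs, pmids and rest are only illustrative and do not appear in the code above.

```python
rows = [[1, 'Aged', 'Adult', 'Child'],
        [53, 'Humans', 'Kidney Injury'],
        [22, 'Diagnostic Imaging', 'Aged']]

# A lambda is a small unnamed function. This one behaves like:
#     def split(x):
#         return (x[0], x[1:])
split = lambda x: (x[:1][0], x[1:])

# Apply it to every row, giving (first element, remaining elements) tuples.
pairs = [split(row) for row in rows]

# The * passes each tuple to zip as a separate argument. zip then groups
# all first items together and all second items together.
pmids, rest = zip(*pairs)

print(pmids)  # (1, 53, 22)
print(rest)   # (['Aged', 'Adult', 'Child'], ['Humans', 'Kidney Injury'], ['Diagnostic Imaging', 'Aged'])
```

In the pandas version, apply plays the role of the list comprehension here.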
jpp
  • wow. what an elegant solution and extremely fast! it ran within 1 minute! wow. may I bother you with what the syntax means? for instance, what do `zip` and `*` and `lambda x` do? – sweetmusicality Feb 21 '18 at 01:17
  • @sweetmusicality, good questions! I've added some links to my explanation to help you. – jpp Feb 21 '18 at 01:21