Currently I have the following:

a data file called "world_bank_projects.json"

  projects = json.load(open('data/world_bank_projects.json'))

From this I built a dataframe from the "mjtheme_namecode" column:

proj_norm = json_normalize(projects, 'mjtheme_namecode')

After which I removed the duplicated entries

proj_norm_no_dup = proj_norm.drop_duplicates()


However, when I tried to sort the dataframe by the 'code' column, it somehow doesn't work:

proj_norm_no_dup.sort_values(by = 'code')


My question is: why doesn't the sort put 10 and 11 at the bottom of the dataframe? It sorted everything else correctly.

Edit1: mjtheme_namecode is a list of dictionaries containing the keys 'code' and 'name'. Example: 'mjtheme_namecode': [{'code': '5', 'name': 'Trade and integration'}, {'code': '4', 'name': 'Financial and private sector development'}]

After normalization, the 'code' column is a pandas Series:

type(proj_norm_no_dup['code'])
pandas.core.series.Series
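
Note that `type(...)` only reports the container (a Series), not what the values inside it are. A minimal sketch of how the element type could be checked, assuming the codes are strings as in the Edit1 example; the small `df` below is a hypothetical stand-in for `proj_norm_no_dup`, not the real data:

```python
import pandas as pd

# Hypothetical stand-in for proj_norm_no_dup, using string codes like those in Edit1.
df = pd.DataFrame({'code': ['5', '4', '10', '11', '1']})

print(type(df['code']))          # the container: a pandas Series
print(df['code'].dtype)          # the dtype of the values held in the column
print(type(df['code'].iloc[0]))  # the Python type of a single value

# Strings are compared character by character, so '10' and '11' land right after '1'.
sorted_codes = df.sort_values(by='code')['code'].tolist()
print(sorted_codes)  # ['1', '10', '11', '4', '5']
```

If the real column behaves like this stand-in, the ordering seen above would be a string (lexicographic) sort rather than a numeric one.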
FateCoreUloom
  • Please provide a [mcve]. What is the dtype of the `"code"` column? – AMC Mar 14 '20 at 19:29
  • Hi, you are looking for [natural sorting](http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html). please see [this](https://stackoverflow.com/questions/5967500/how-to-correctly-sort-a-string-with-a-number-inside) answer – DavidDr90 Mar 14 '20 at 19:53
  • Does this answer your question? [How to correctly sort a string with a number inside?](https://stackoverflow.com/questions/5967500/how-to-correctly-sort-a-string-with-a-number-inside) – DavidDr90 Mar 14 '20 at 20:01
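
Following up on the comments, here is a rough sketch of two ways a numeric ordering could be obtained, assuming every value in 'code' is a string that parses as an integer (which the Edit1 example suggests but does not guarantee). The frame `df` and the names `df_int` and `df_key` are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for proj_norm_no_dup; names are taken from Edit1 where available.
df = pd.DataFrame({
    'code': ['5', '4', '10', '11', '1'],
    'name': ['Trade and integration',
             'Financial and private sector development',
             'theme 10', 'theme 11', 'theme 1'],  # placeholder names, not real data
})

# Option 1: convert the column to integers, then sort.
df_int = df.assign(code=df['code'].astype(int)).sort_values(by='code')

# Option 2: keep the strings but sort by their integer value
# (the `key` argument of sort_values needs pandas 1.1 or newer).
df_key = df.sort_values(by='code', key=lambda s: s.astype(int))

print(df_int['code'].tolist())  # [1, 4, 5, 10, 11]
print(df_key['code'].tolist())  # ['1', '4', '5', '10', '11']
```

Option 1 changes the column's type for good, which is probably what you want if the codes are used as numbers later on. Option 2 leaves the data untouched and only affects the ordering. Either way, `sort_values` returns a new dataframe, so the result has to be assigned to a name to be kept.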
