
The data:


I want to remove the duplicates from "Borough" without deleting any rows; I need only one value in each row.

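import pandas as pd

# Join all Borough / Neighbourhood values that share a Postcode
g2 = dfff.groupby(['Postcode'])["Borough"].agg(','.join)
g3 = dfff.groupby(['Postcode'])["Neighbourhood"].agg(','.join)
df2 = pd.DataFrame(g2)
df3 = pd.DataFrame(g3)
# Combine both results on the shared Postcode index
df4 = pd.merge(df2, df3, on='Postcode')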
  • Welcome to SO. Please review [ask] and create a [mcve]. How can you expect anyone to help you if you aren't willing to put any effort into even asking a question? – user3483203 Oct 04 '19 at 17:39
  • `df.drop_duplicates()`? This should do for your case. Or explain in detail what output you are expecting. – Poojan Oct 04 '19 at 17:41
  • I tried this, but it deletes whole rows. I need the rows that have ("Borough": newyork, newyork) to keep just one value (one place), not both of them. – Amr Mahmoud Oct 04 '19 at 17:44
  • [Stack Overflow Discourages Screenshots](https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors). The question will likely be downvoted for containing unnecessary screenshots. By using screenshots, you are discouraging anyone from assisting you. No one wants to retype your stuff from a screenshot, and screenshots are often not readable. – Trenton McKinney Oct 04 '19 at 17:53
  • Please [provide a reproducible copy of the DataFrame with `to_clipboard`](https://stackoverflow.com/questions/52413246/how-do-i-provide-a-reproducible-copy-of-my-existing-dataframe) – Trenton McKinney Oct 04 '19 at 17:54
  • I don't have the answer to type it. Copying the whole DataFrame isn't necessary; I just need the method. – Amr Mahmoud Oct 04 '19 at 17:56

1 Answer


Try this:

# setup
import pandas as pd

df = pd.DataFrame({
    "data": ['scarborough, scarborough, scarborough', 'london,london', 'north york, north york', 'test,test']
})

# logic
def custom_dedup(s):
    # split on commas, strip spaces, collect unique values, take the first one
    return [*set(part.strip() for part in s.split(','))][0]

df['data'].apply(custom_dedup)

How it works

  1. split(): split the string on commas, which yields a list
  2. strip(): remove leading and trailing spaces from each string in the list
  3. set(): get the unique elements from that list
  4. ...[0]: we assume there's only one element per set, so take the first element
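
Step 4 relies on every row holding exactly one distinct value, and a set has no guaranteed order. If a row might mix values, a variant along these lines may be safer. This is only a sketch, not part of the original answer: `dedup_keep_order` is a made-up helper name, and the third row is invented sample data that contains two different places.

```python
import pandas as pd

def dedup_keep_order(s, sep=","):
    """Return the distinct comma-separated parts of s, in first-seen order."""
    parts = [part.strip() for part in s.split(sep)]
    # dict.fromkeys drops repeats while keeping insertion order (Python 3.7+)
    return ", ".join(dict.fromkeys(parts))

# Invented sample data; the last row mixes two different values
df = pd.DataFrame({
    "data": [
        "scarborough, scarborough",
        "london,london",
        "north york, etobicoke, north york",
    ]
})

result = df["data"].apply(dedup_keep_order)
print(result.tolist())
```

Rows with a single repeated value collapse to that value, while a mixed row keeps each distinct value once instead of silently dropping one.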

Input:

    data
0   scarborough, scarborough, scarborough
1   london,london
2   north york, north york
3   test,test

Output:

0    scarborough
1         london
2     north york
3           test
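
Since the repeated values in the question come from joining everything inside `groupby`, another option is to drop the repeats during aggregation rather than cleaning the strings afterwards. The sketch below assumes a frame shaped like the one in the question; the rows are invented, because the real data was only posted as a screenshot, and `join_unique` is a hypothetical helper name.

```python
import pandas as pd

# Invented sample rows; only the column names come from the question
dfff = pd.DataFrame({
    "Postcode": ["M1B", "M1B", "M5A"],
    "Borough": ["Scarborough", "Scarborough", "Downtown Toronto"],
    "Neighbourhood": ["Rouge", "Malvern", "Harbourfront"],
})

def join_unique(values):
    # dict.fromkeys drops repeats while keeping first-seen order
    return ", ".join(dict.fromkeys(values))

# One pass over both columns, so no separate merge step is needed
df4 = (
    dfff.groupby("Postcode", as_index=False)
        .agg({"Borough": join_unique, "Neighbourhood": join_unique})
)
print(df4)
```

With this approach each Postcode keeps a single Borough value, while distinct Neighbourhood values are still joined together.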