
I have a DataFrame as follows:

import pandas as pd
df = pd.DataFrame({'Target': [0, 1, 2],
                   'Source': [1, 0, 3],
                   'Count': [1, 1, 1]})

I need to count how many times each pair of Source and Target occurs, ignoring order. (1, 0) and (0, 1) should be treated as the same pair, so the count for that pair is 2.

I need to do it several times as I have 79 nodes in total. Any help will be much appreciated.

Lilly96
  • Refrain from showing your dataframe as an image. Your question needs a minimal reproducible example consisting of sample input, expected output, actual output, and only the relevant code necessary to reproduce the problem. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. – itprorh66 Sep 23 '22 at 19:19

1 Answer

import pandas as pd
# re-create the DataFrame without the 'Count' column
In[1]: df = pd.DataFrame({'Target': [0, 1, 2], 
                          'Source': [1, 0, 3]})

Out[1]:      Target  Source
        0    0       1
        1    1       0
        2    2       3

To count pairs regardless of their order, convert the DataFrame to a numpy.ndarray and sort each row, so that (1, 0) and (0, 1) become identical:

In[1]: array = df.to_numpy(copy=True)  # explicit copy, so the in-place sort below does not modify df
In[2]: array.sort(axis=1)
In[3]: array

Out[3]: array([[0, 1],
               [0, 1],
               [2, 3]])

Then turn it back into a DataFrame and call .value_counts():

In[1]: df_sorted = pd.DataFrame(array, columns=['value1', 'value2'])
In[2]: df_sorted.value_counts()

Out[2]: value1  value2
        0       1         2
        2       3         1
        dtype: int64
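
Since this has to be repeated several times, the steps above can be wrapped into a small helper. This is only a sketch: the function name count_unordered_pairs, its parameters col_a and col_b, and the output column name n are made up for illustration and are not part of pandas. It uses np.sort, which returns a sorted copy, instead of sorting in place, and groupby(...).size() instead of .value_counts(), which should give the same counts in a flat table.

```python
import numpy as np
import pandas as pd


def count_unordered_pairs(df, col_a="Source", col_b="Target"):
    """Count rows per pair of values, treating (a, b) and (b, a) as the same pair.

    Hypothetical helper for illustration; the names are assumptions.
    """
    # np.sort returns a sorted copy, so the caller's DataFrame is left untouched.
    pairs = np.sort(df[[col_a, col_b]].to_numpy(), axis=1)
    sorted_df = pd.DataFrame(pairs, columns=["value1", "value2"])
    # size() counts the rows in each group; reset_index flattens the result.
    return sorted_df.groupby(["value1", "value2"]).size().reset_index(name="n")


df = pd.DataFrame({"Target": [0, 1, 2],
                   "Source": [1, 0, 3],
                   "Count": [1, 1, 1]})
result = count_unordered_pairs(df)
print(result)
```

Note that this counts rows and ignores the existing 'Count' column. If that column can hold values other than 1 and should be added up instead, that would need a sum over the groups rather than size().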
Nikita Shabankin
  • Glad to be of help! You may now select the answer as correct or wait for more answers as someone could come up with a better solution, and then choose the most suitable one — this is a common practice here and would benefit you and other users. – Nikita Shabankin Sep 24 '22 at 01:59