5

I'd like to plot Venn diagrams based on my pandas data frame. I understand that matplotlib_venn accepts sets as input. My dataset contains a client ID and two other columns indicating whether the client was in a campaign or not.

df_dataset = pd.read_csv('...path...',delimiter=',',decimal=',')
campaign_a = df_dataset[(df_dataset['CAM_A'] == 1)] 
campaign_b = df_dataset[(df_dataset['CAM_B'] == 1)]

plt.figure(figsize=(4,4))
set1 = set(campaign_a['CLI_ID'])
set2 = set(campaign_b['CLI_ID'])

venn3([set1, set2], ('Set1', 'Set2'))
plt.show()

However, I get an error:

File "C:\Python27\Lib\site-packages\matplotlib_venn\_venn3.py", line 44, in compute_venn3_areas
    areas = np.array(np.abs(diagram_areas), float)

TypeError: bad operand type for abs(): 'set'

MERose
HonzaB
  • There is probably no overlap between your sets. Can you check `len(set1 & set2)`, `len(set1 & set3)`, and `len(set2 & set3)`? – IanS Jun 09 '16 at 12:50
  • In the end, I found a different approach. Instead of passing the sets, I pass only the numbers (the subset sizes), following this example: http://matthiaseisen.com/pp/patterns/p0144/ – HonzaB Jun 09 '16 at 13:14
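
The subset-size approach from the last comment can be sketched roughly as follows. As far as I know, venn2 also accepts a 3-tuple of region sizes in the order (only first set, only second set, both). The helper name `venn2_region_sizes` and the client IDs are made up for illustration.

```python
def venn2_region_sizes(set_a, set_b):
    """Return (only in A, only in B, in both) for two sets."""
    return (len(set_a - set_b), len(set_b - set_a), len(set_a & set_b))

# Hypothetical client IDs standing in for campaign_a['CLI_ID'] and campaign_b['CLI_ID'].
set1 = {101, 102, 103, 104}
set2 = {103, 104, 105}

sizes = venn2_region_sizes(set1, set2)
print(sizes)  # (2, 1, 2)
```

The tuple can then be passed as `venn2(subsets=sizes, set_labels=('Set1', 'Set2'))`. A zero in the last position means the sets do not overlap, which is the check suggested in the first comment.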

3 Answers

4

This error is a result of trying to force 2 sets into venn3. You need to import venn2 from the same library.

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

df_dataset = pd.read_csv('...path...', delimiter=',', decimal=',')

# Keep only the clients that took part in each campaign.
campaign_a = df_dataset[df_dataset['CAM_A'] == 1]
campaign_b = df_dataset[df_dataset['CAM_B'] == 1]

plt.figure(figsize=(4, 4))
set1 = set(campaign_a['CLI_ID'])
set2 = set(campaign_b['CLI_ID'])

# Two sets, so use venn2 rather than venn3.
venn2([set1, set2], ('Set1', 'Set2'))
plt.show()
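
One way to avoid this mismatch is to pick the function from the number of sets. The wrapper below is only a sketch and not part of matplotlib_venn; `draw_venn` is a hypothetical name, and the import is deferred so that the argument checks run first.

```python
def draw_venn(sets, labels):
    """Draw a Venn diagram with venn2 or venn3, depending on how many sets are given."""
    sets = list(sets)
    labels = tuple(labels)
    if len(sets) not in (2, 3):
        raise ValueError("expected 2 or 3 sets, got %d" % len(sets))
    if len(labels) != len(sets):
        raise ValueError("expected one label per set")
    # Imported here so that the checks above do not require the plotting library.
    from matplotlib_venn import venn2, venn3
    venn = venn2 if len(sets) == 2 else venn3
    return venn(sets, labels)
```

With the data from the question, `draw_venn([set1, set2], ('Set1', 'Set2'))` followed by `plt.show()` should draw the two-circle diagram.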
2

Here is a simple way to create Venn diagrams for a small number of sets (two with venn2, three with venn3). Hope this helps.

import matplotlib.pyplot as plt
from matplotlib_venn import venn2, venn3

set1 = {'a', 'b'}
set2 = {'b', 'c'}
set3 = {'c', 'd'}
set4 = {'d', 'e'}
set_array = [set1, set2, set3, set4]
set_names = ['Set1', 'Set2', 'Set3', 'Set4']

# venn2([set1, set2], ('Set1', 'Set2')) # venn2 works for two sets
venn3(set_array[0:3], set_names[0:3])   # venn3 works for three sets
plt.show()

This generates the following output:

[image: sample]

MERose
Ankush
1

I believe you need to pass 3 sets. Based on the code here, if you pass three subsets then they are transformed into a tuple before being passed to compute_venn3_areas, where np.abs can handle them. The case when you pass only 2 sets looks like an unhandled error.
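
The failure can be reproduced without the library. Assuming, as the traceback suggests, that a two-element list falls through to the code path that treats each element as a region size and takes its absolute value, calling abs() on a set fails in the same way. This only mimics that one step; it is not the library's code.

```python
# What venn3 effectively received in the question: a list of two sets.
two_sets = [{1, 2, 3}, {3, 4}]

try:
    # Stand-in for np.abs(diagram_areas): absolute value of each supposed "size".
    areas = [abs(item) for item in two_sets]
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

The printed message should name abs() and the set type, matching the TypeError in the question.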

IanS