I found the UpSet plot code below here. The Orthogroups.GeneCount.tsv file can be found here.

import pandas as pd
from upsetplot import UpSet
from upsetplot import from_memberships
from matplotlib import pyplot as plt
dic={'group':[],'sp1':[],'sp2':[],'sp3':[],'sp4':[],'sp5':[],'sp6':[],'total':[]}
with open("/Downloads/Orthogroups.GeneCount.tsv", "r") as f:
    f.readline()  # skip the header line
    line = f.readline()
    while line:
        parts = line.strip().split("\t")
        dic['group'].append(parts[0])
        dic['sp1'].append(parts[1])
        dic['sp2'].append(parts[2])
        dic['sp3'].append(parts[3])
        dic['sp4'].append(parts[4])
        dic['sp5'].append(parts[5])
        dic['sp6'].append(parts[6])
        dic['total'].append(parts[7])
        line = f.readline()
df = pd.DataFrame(data=dic).set_index("group")
cols = df.columns[df.dtypes.eq('object')]  # based on https://stackoverflow.com/a/36814203/8508004
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')  # based on https://stackoverflow.com/a/36814203/8508004
group_dict = {}

for index, row in df.iterrows():
    for sp, count in row.items():
        if sp != "total" and count != 0:
            group_dict.setdefault(index, []).append(sp)


x = from_memberships(group_dict.values()).sort_values(ascending=False)
UpSet(x, subset_size='count', show_counts=True, element_size=10).plot()
plt.title("UpSet with catplots, for orientation='horizontal'")
plt.show()
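
To show what ends up in group_dict before it is passed to from_memberships, here is a minimal sketch of the same membership-building step on a tiny in-memory table. The three rows are invented for illustration and are not taken from Orthogroups.GeneCount.tsv, and the header names "Orthogroup" and "Total" are assumptions about the file layout. It uses only the standard library, so it is not the code I actually ran.

```python
import csv
import io

# Hypothetical sample in the assumed layout: group ID, per-species counts, total.
sample_tsv = (
    "Orthogroup\tsp1\tsp2\tsp3\tTotal\n"
    "OG0000000\t2\t0\t1\t3\n"
    "OG0000001\t0\t4\t0\t4\n"
    "OG0000002\t1\t1\t1\t3\n"
)

reader = csv.DictReader(io.StringIO(sample_tsv), delimiter="\t")

group_dict = {}
for row in reader:
    group = row.pop("Orthogroup")  # assumed name of the ID column
    row.pop("Total")               # the total is not a species, so drop it
    # Keep every species that has at least one gene in this orthogroup.
    members = [sp for sp, count in row.items() if int(count) != 0]
    if members:
        group_dict[group] = members

print(group_dict)
```

Each value is the list of species present in one orthogroup, which is the shape that from_memberships(group_dict.values()) receives in the code above.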

How can I prevent the labels from overlapping?

I appreciate any help you can provide.

Best wishes,

user3523406