0

I have a spreadsheet with hundreds of variables, that I need to visualize. To do so I'm using df.melt() to convert the dataset to long format. Which is working well to visualize all the variables in subplots.

The only problem is that it creates a lot of charts. I'm wondering if there is a way to "melt" only a certain number of variables at a time, something that would allow you to select which columns you want to melt in the dataframe. For example (pseudo code):

df.melt(id_vars=[x,y,z], var_name = 'Metric', cols = [4:15])

I know "cols =" is not a valid argument but is there something I could do similar to that to just visualize chunks of the metrics at a time?

Here is an example of my existing code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

pd.set_option("display.max_columns", 200)
data = {'PTID': [11111, 11111, 11111, 11111, 22222, 22222, 22222, 22222, 33333, 33333, 33333, 33333, 44444, 44444, 44444, 44444, 55555, 55555, 55555, 55555],
        'Period' : ['Baseline','p1','p2','p3','Baseline','p1','p2','p3','Baseline','p1','p2','p3',  'Baseline','p1','p2','p3', 'Baseline','p1','p2','p3']     
        'ALK PHOS': [46.0, 94.0, 21.0, 18.0, 46.0, 94.0, 21.0, 18.0, 46.0, 94.0, 21.0, 18.0, 46.0, 94.0, 21.0, 18.0, 46.0, 94.0, 21.0, 18.0],
        'AST (SGOT)': [23.0, 82.0, 13.0, 17.0, 23.0, 82.0, 13.0, 17.0, 23.0, 82.0, 13.0, 17.0, 23.0, 82.0, 13.0, 17.0, 23.0, 82.0, 13.0, 17.0],
        '% Saturation- Iron': [34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0]}

dfm = df.melt(id_vars=['PTID','Period'], var_name='Metric',value_name='Value')

dfg = dfm.groupby(['PTID','Name','Period', 'Metric'])['Value'].mean().reset_index()

p = sns.relplot(data=dfg, col='Metric', x='Period', y='Value', kind='scatter', hue = 'Name',col_wrap=5, marker='o', palette='tab10',facet_kws={'sharey': False, 'sharex': True},)
p.map(sns.lineplot, 'Period', 'Value',  linestyle='--', color='gray', ci = None)
John Conor
  • 722
  • 6
  • 20
  • Preselect the relevant columns with `.iloc`: `df.iloc[:, index of all columns to use].melt(...)` – Trenton McKinney Nov 23 '21 at 18:26
  • It's close, though I need to the initial columns (PTID, Period) to stay static so for example it would be df = df[0:2] + df:[5:10]... dfn = df[0:2] + df:[n:n]. Though now I think about it maybe something like that would work and I just need to create a for loop to make df's in a similar fashion and pass to the seaborn chart. – John Conor Nov 23 '21 at 18:39
  • `df[0:2]` is row slicing, not column slicing. You need to use `iloc`. Use this [answer](https://stackoverflow.com/a/38828498/7758804) for multiple slices. e.g. `df.iloc[:, np.r_[0:2, 5:10]]` – Trenton McKinney Nov 23 '21 at 18:46
  • It's close but it's combnding rows instead of columns... even when I set the axis = 1. cdf = pd.concat([df.iloc[0:3], df.iloc[9:20]],axis = 1) print(cdf.shape) cdf – John Conor Nov 23 '21 at 18:58
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/239507/discussion-between-john-conor-and-trenton-mckinney). – John Conor Nov 23 '21 at 19:05
  • 1
    Got it... needed to upate the slices so using columns not rows: Ah ha! Got it... cdf = pd.concat([df.iloc[:,0:3], df.iloc[:,9:20]],axis = 1,) – John Conor Nov 23 '21 at 19:11
  • 2
    I closed the question because **I'm wondering if there is a way to "melt" only a certain number of variables at a time, something that would allow you to select which columns you want to melt in the dataframe.** is answered by preselecting the columns with `pd.concat([df.iloc[:,0:3], df.iloc[:,9:20]],axis = 1,)`, or specifying `id_vars=` **and** `value_vars=`, which are both covered in the duplicates. A question should be about one and only one thing, not multiple things. If there is a different question to be answered, beyond the primary question, it should be a new question. – Trenton McKinney Nov 23 '21 at 20:18
  • 1
    Ok... well I got to where I needed to be so thanks again! – John Conor Nov 23 '21 at 20:22

0 Answers0