
I have multidimensional compositional data (the components of each sample sum to 1 or 100). I have learned how to use three of the variables to create a 2D ternary plot.

2d Ternary Plot

I would like to add a fourth dimension such that my plot looks like this.

Pyramid Ternary Plot

I am willing to use Python or R. Right now I am using rpy2 to create the ternary plots in Python through R, but only because that is an easy solution. If the compositional data could be transformed into 3D coordinates, a simple wire plot could be used. This post shows how three-component compositional data can be transformed into 2D data so that normal plotting methods can be used. One solution would be to do the same thing in 3D.
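To make the 2D case concrete, here is a minimal sketch of that kind of transformation, assuming the triangle is drawn with unit sides and its first vertex at the origin. The names `TRIANGLE` and `ternary_to_xy` are made up for illustration and are not taken from the linked post.

```python
import numpy as np

# Corners of an equilateral triangle with unit sides (an assumed layout).
# Row i is the Cartesian position of the pure i-th component.
TRIANGLE = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.5, np.sqrt(3) / 2]])

def ternary_to_xy(comp):
    """Map rows of 3-part compositions to 2D Cartesian points."""
    comp = np.asarray(comp, dtype=float)
    # Rescale each row to sum to 1, so data given in percent also works.
    comp = comp / comp.sum(axis=1, keepdims=True)
    # Each point is the composition-weighted average of the corners.
    return comp @ TRIANGLE

xy = ternary_to_xy([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]])
```

The pure components land on the corners and the equal mixture lands on the centroid, so the result can be passed to an ordinary scatter plot.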

Here is some sample Data:

          c1        c2        c3        c4
0   0.082337  0.097583  0.048608  0.771472
1   0.116490  0.065047  0.066202  0.752261
2   0.114884  0.135018  0.073870  0.676229
3   0.071027  0.097207  0.070959  0.760807
4   0.066284  0.079842  0.103915  0.749959
5   0.016074  0.074833  0.044532  0.864561
6   0.066277  0.077837  0.058364  0.797522
7   0.055549  0.057117  0.045633  0.841701
8   0.071129  0.077620  0.049066  0.802185
9   0.089790  0.086967  0.083101  0.740142
10  0.084430  0.094489  0.039989  0.781093
bart cubrich

3 Answers


Well, I solved this myself using a Wikipedia article, an SO post, and some brute force. Sorry for the wall of code, but you have to draw all the plot outlines, labels, and so forth yourself.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d, Axes3D 
from itertools import combinations
import pandas as pd

def plot_ax():               #plot tetrahedral outline
    verts=[[0,0,0],
     [1,0,0],
     [0.5,np.sqrt(3)/2,0],
     [0.5,0.28867513, 0.81649658]]
    lines=combinations(verts,2)
    for x in lines:
        line=np.transpose(np.array(x))
        ax.plot3D(line[0],line[1],line[2],c='0')

def label_points():  #label each vertex of the simplex
    a=(np.array([1,0,0,0])) # Barycentric coordinates of vertex A (c1)
    b=(np.array([0,1,0,0])) # Barycentric coordinates of vertex B (c2)
    c=(np.array([0,0,1,0])) # Barycentric coordinates of vertex C (c3)
    d=(np.array([0,0,0,1])) # Barycentric coordinates of vertex D (c4)
    labels=['a','b','c','d']
    cartesian_points=get_cartesian_array_from_barycentric([a,b,c,d])
    for point,label in zip(cartesian_points,labels):
        if 'a' in label:
            ax.text(point[0],point[1]-0.075,point[2], label, size=16)
        elif 'b' in label:
            ax.text(point[0]+0.02,point[1]-0.02,point[2], label, size=16)
        else:
            ax.text(point[0],point[1],point[2], label, size=16)

def get_cartesian_array_from_barycentric(b):      #transform from "barycentric" composition space to cartesian coordinates
    verts=[[0,0,0],
         [1,0,0],
         [0.5,np.sqrt(3)/2,0],
         [0.5,0.28867513, 0.81649658]]

    #create transformation array via https://en.wikipedia.org/wiki/Barycentric_coordinate_system
    t = np.transpose(np.array(verts))        
    t_array=np.array([t.dot(x) for x in b]) #apply transform to all points

    return t_array

def plot_3d_tern(df,c='1'): #use function "get_cartesian_array_from_barycentric" to plot the scatter points
    #args are df=dataframe to plot (columns c1..c4, rows summing to 1) and c=scatter point color
    bary_arr=df.values
    cartesian_points=get_cartesian_array_from_barycentric(bary_arr)
    ax.scatter(cartesian_points[:,0],cartesian_points[:,1],cartesian_points[:,2],c=c)





#Create Dataset 1
np.random.seed(123)
c1=np.random.normal(8,2.5,20)
c2=np.random.normal(8,2.5,20)
c3=np.random.normal(8,2.5,20)
c4=[100-x for x in c1+c2+c3]   #make sure components sum to 100

#df unnecessary, but that is the format of my real data
df1=pd.DataFrame(data=[c1,c2,c3,c4],index=['c1','c2','c3','c4']).T
df1=df1/100


#Create Dataset 2
np.random.seed(1234)
c1=np.random.normal(16,2.5,20)
c2=np.random.normal(16,2.5,20)
c3=np.random.normal(16,2.5,20)
c4=[100-x for x in c1+c2+c3]

df2=pd.DataFrame(data=[c1,c2,c3,c4],index=['c1','c2','c3','c4']).T
df2=df2/100


#Create Dataset 3
np.random.seed(12345)
c1=np.random.normal(25,2.5,20)
c2=np.random.normal(25,2.5,20)
c3=np.random.normal(25,2.5,20)
c4=[100-x for x in c1+c2+c3]

df3=pd.DataFrame(data=[c1,c2,c3,c4],index=['c1','c2','c3','c4']).T
df3=df3/100

fig = plt.figure()
ax = fig.add_subplot(projection='3d') #create 3D axes; Axes3D(fig) no longer attaches itself to the figure in recent matplotlib

plot_ax() #call function to draw tetrahedral outline

label_points() #label the vertices

plot_3d_tern(df1,'b') #call function to plot df1

plot_3d_tern(df2,'r') #...plot df2

plot_3d_tern(df3,'g') #...plot df3

plt.show() #display the figure when run as a script
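The transform itself can also be written as one matrix product. The sketch below is a variant rather than a replacement for the code above: it assumes the same tetrahedron layout, writes the rounded constants 0.28867513 and 0.81649658 in closed form as sqrt(3)/6 and sqrt(6)/3, and rescales each row so that data summing to 100 gives the same points as data summing to 1. The names `TETRA` and `quaternary_to_xyz` are made up for this example.

```python
import numpy as np

# Regular tetrahedron with unit edges, same vertex order as verts above.
# Row i is the Cartesian position of the pure i-th component (c1..c4).
TETRA = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.5, np.sqrt(3) / 2, 0.0],
                  [0.5, np.sqrt(3) / 6, np.sqrt(6) / 3]])

def quaternary_to_xyz(comp):
    """Map rows of 4-part compositions to 3D Cartesian points."""
    comp = np.asarray(comp, dtype=float)
    # Rescale each row to sum to 1 (handles fractions and percentages alike).
    comp = comp / comp.sum(axis=1, keepdims=True)
    # Composition-weighted average of the four vertices, for all rows at once.
    return comp @ TETRA

# First sample row from the question, once as fractions and once in percent.
sample = [[0.082337, 0.097583, 0.048608, 0.771472],
          [8.2337, 9.7583, 4.8608, 77.1472]]
xyz = quaternary_to_xyz(sample)
```

Passing `df.values` to `quaternary_to_xyz` should give the same points as `get_cartesian_array_from_barycentric` for rows that already sum to 1, up to the rounding of the two constants.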


bart cubrich

The accepted answer explains how to do this in Python, but the question also asked about R.

I've provided an answer in this thread on how to do this 'manually' in R.

Otherwise, you can use the klaR package directly for this:

df <- matrix(c(
  0.082337, 0.097583, 0.048608, 0.771472,
  0.116490, 0.065047, 0.066202, 0.752261,
  0.114884, 0.135018, 0.073870, 0.676229,
  0.071027, 0.097207, 0.070959, 0.760807,
  0.066284, 0.079842, 0.103915, 0.749959,
  0.016074, 0.074833, 0.044532, 0.864561,
  0.066277, 0.077837, 0.058364, 0.797522,
  0.055549, 0.057117, 0.045633, 0.841701,
  0.071129, 0.077620, 0.049066, 0.802185,
  0.089790, 0.086967, 0.083101, 0.740142,
  0.084430, 0.094489, 0.039989, 0.781093
), byrow = TRUE, nrow = 11, ncol = 4)

# install.packages(c("klaR", "scatterplot3d"))
library(klaR)
#> Loading required package: MASS

quadplot(df)

Created on 2020-08-14 by the reprex package (v0.3.0)

Droplet

I recently published a Python library called python-quaternary that does what you need. You can install it with pip; see https://github.com/sachour/python-quaternary. It is not 100% complete and I am still working on it, so we can work together to determine which features would make it more useful and user-friendly. Hope it helps, Sofiane