
I need some help with an error message returned by the chordDiagram() function from the circlize package.

I am working with fisheries landings. Fishing vessels start their trip in one port (the homeport, PORT_DE) and land their catch (scallops in this case) in another port (the landing port, PORT_LA). I am trying to draw a chord diagram using the circlize package to visualise the flow of landings between ports. I have 161 unique ports, and the port names are stored as character strings.

Before calling the chordDiagram() function to draw the chord diagram, I store the relevant columns in a dummy object (m).

# Store relevant column
m <- data.frame(PORT_DE = VMS_by_trips$PORT_DE_Label, 
            PORT_LA = VMS_by_trips$PORT_LA_Label, 
            SCALLOP_W = VMS_by_trips$Trip_SCALLOP_W)

head(m)
# PORT_DE  PORT_LA SCALLOP_W
# 1  Arbroath Arbroath  2.147143
# 2  Eyemouth Aberdeen  8.791970
# 3    Buckie Aberdeen  2.025833
# 4  Montrose Aberdeen  8.268540
# 5  Aberdeen Aberdeen  1.358286
# 6 Peterhead Aberdeen  0.797500

I then create an adjacency matrix using dcast() and rename rows.

require(reshape2)
m <- as.matrix(dcast(m, PORT_DE ~ PORT_LA, value.var = "SCALLOP_W", fun.aggregate = sum))
dim(m) # adjacency matrix represents port pairs
#[1] 153 138

row.names(m) <- m[,1]
m <- m[,2:dim(m)[2]]
class(m) <- "numeric"

Finally, I call the plot function chordDiagram().

library(circlize) 
chordDiagram(m) 

Unfortunately, this results in an error message.

Error in `[.data.frame`(df, c(1, 2, 5)) : undefined columns selected

If I replace the row and column names with numbers, the function runs, and the correct plot is returned.

row.names(m) <- 1:153
colnames(m) <- 1:137

Any ideas how to run the function with the actual port names?

I have already tried removing special characters, replacing spaces (" ") with underscores ("_"), truncating the names to fewer characters, and keeping only a few port pairs. Unfortunately, the same error keeps appearing. Any help appreciated.

Please note that since posting this question, I have managed to create the visualisation needed. Here is a link to another related question, which also includes the code to adjust various settings of a chord diagram.

Adjust highlight.sector() width and placement - Chord diagram (circlize package) in R

Andronikos K.
  • `chordDiagram` will work on a data frame as well as a matrix; see `?chordDiagramFromDataFrame` – guyabel Apr 11 '17 at 02:29
  • @gjabel Thank you for the prompt response. I will have a look and respond here if I manage to resolve it myself. – Andronikos K. Apr 11 '17 at 08:11
  • I tried your six-line data (from `head(m)`) with the code you attached, and no error occurs. Can you attach the full dataset? On the other hand, `chordDiagram()` can be applied directly to the data frame, so you don't need to convert it to a matrix. – Zuguang Gu Apr 11 '17 at 08:19
  • @ZuguangGu Thank you for responding to my questions via email. I have now posted the answer to my question below for the benefit of the stackoverflow community. – Andronikos K. May 26 '17 at 16:58
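The data-frame route mentioned in the comments could look roughly like the sketch below. It assumes a long table whose first two columns are the two ports and whose third column is the value. The `trips` data frame and its numbers are made up for illustration, and the plotting call is guarded so the snippet still runs where circlize is not installed.

```r
# Toy stand-in for the trip table (hypothetical values, one missing homeport)
trips <- data.frame(
  PORT_DE   = c("Eyemouth", "Buckie", "Eyemouth", NA),
  PORT_LA   = c("Aberdeen", "Aberdeen", "Aberdeen", "Aberdeen"),
  SCALLOP_W = c(8.8, 2.0, 1.2, 0.8),
  stringsAsFactors = FALSE
)

# One row per port pair. The formula method of aggregate() uses
# na.action = na.omit by default, so rows with a missing port are dropped.
flows <- aggregate(SCALLOP_W ~ PORT_DE + PORT_LA, data = trips, FUN = sum)

# Columns are read as from, to, value; no matrix conversion is needed
if (requireNamespace("circlize", quietly = TRUE)) {
  circlize::chordDiagram(flows)
}
```

Because no matrix is built, there are no row or column names to go wrong, which may make this the simpler option when the data are already in long form.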

1 Answer


With thanks to @ZuguangGu: the error message was caused by NA values in the port-name columns, which carried over into the names of the adjacency matrix. If you remove the rows containing them first, the chord diagram plots just fine. The code below follows the same notation as the question.

#create adjacency matrix
m <- data.frame(PORT_DE = VMS_by_trips$PORT_DE_Label, 
                PORT_LA = VMS_by_trips$PORT_LA_Label, 
                SCALLOP_W = VMS_by_trips$Trip_SCALLOP_W)


# Check for NA values in the two port columns
which(is.na(m[, 1]))
which(is.na(m[, 2]))

# Remove the rows where either port is NA; this is what makes the error go away
df = m
df = df[!(is.na(df[[1]]) | is.na(df[[2]])), ]

require(reshape2)
m <- dcast(df, PORT_DE ~ PORT_LA, value.var = "SCALLOP_W", fun.aggregate = sum)
row.names(m) <- m[,1]
m <- as.matrix(m[, -1])

# remove self-links
m2 = m
cn = intersect(rownames(m2), colnames(m2)) 
for(i in seq_along(cn)) {
  m2[cn[i], cn[i]] = 0
}

# Export 3 versions of the chord diagram in a PDF

library(circlize) 

pdf("test.pdf")

# Use all data
chordDiagram(m)
title("using all data")

#remove self-links
chordDiagram(m2)
title("remove self-links")

# reduce = 0.01 removes ports (sectors) whose share of the total is below 0.01 (1%)
chordDiagram(m2, reduce = 0.01)
title("remove self-links and small sectors")

dev.off()
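As a quick sanity check before plotting, the NA handling above can be tried on a small example using base R only. This is a minimal sketch: the `trips` data frame is a made-up stand-in for VMS_by_trips, and `tapply()` is used here in place of `dcast()` purely to avoid the reshape2 dependency.

```r
# Toy stand-in for VMS_by_trips (hypothetical values, two missing labels)
trips <- data.frame(
  PORT_DE   = c("Arbroath", "Eyemouth", NA, "Buckie", "Aberdeen"),
  PORT_LA   = c("Arbroath", "Aberdeen", "Aberdeen", NA, "Aberdeen"),
  SCALLOP_W = c(2.1, 8.8, 2.0, 8.3, 1.4),
  stringsAsFactors = FALSE
)

# Count missing labels per port column before reshaping
na_counts <- colSums(is.na(trips[, c("PORT_DE", "PORT_LA")]))

# Keep only trips where both ports are known
clean <- trips[complete.cases(trips[, c("PORT_DE", "PORT_LA")]), ]

# Adjacency matrix: rows = homeport, columns = landing port
m <- tapply(clean$SCALLOP_W, list(clean$PORT_DE, clean$PORT_LA), sum)
m[is.na(m)] <- 0  # port pairs with no trips

# Guard: stop early if an NA name would reach the plotting call
stopifnot(!anyNA(rownames(m)), !anyNA(colnames(m)))
```

If the guard passes, `m` can be handed to `chordDiagram()` in the same way as the matrices above.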
Andronikos K.