
I have this data:

landuse <- list(nodes = structure(list(name = c(NA, NA, "1.1.1. Formação Florestal", 
"1.1.2. Formação Savanica", NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, "3.1. Pastagem", NA, NA, NA, "3.2.1. Cultura Anual e Perene", 
NA, "3.3. Mosaico de Agricultura e Pastagem", NA, NA, "4.2. Infraestrutura Urbana", 
"4.5. Outra Área não Vegetada", NA, NA, NA, NA, NA, NA, NA, "5.1 Rio ou Lago ou Oceano"
)), class = "data.frame", row.names = c(NA, -33L)), links = structure(list(
    source = c(3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 15L, 15L, 
    15L, 15L, 15L, 15L, 15L, 19L, 19L, 19L, 19L, 21L, 21L, 21L, 
    21L, 21L, 21L, 24L, 25L, 25L, 25L, 33L), target = c(3L, 21L, 
    4L, 21L, 15L, 3L, 25L, 4L, 33L, 19L, 15L, 21L, 3L, 25L, 4L, 
    33L, 15L, 19L, 4L, 21L, 4L, 21L, 25L, 33L, 15L, 3L, 4L, 25L, 
    4L, 33L, 33L), value = c(0.544859347827813, 0.00354385993588971, 
    0.494359662221154, 4.67602736159475, 2.20248911690968, 0.501437742068369, 
    0.00354375594818463, 24.8427814053755, 0.439418727642527, 
    0.0079740332093807, 11.8060486886398, 2.76329829691466, 0.000886029792298199, 
    0.00177186270758855, 3.35504921147758, 0.14263144351167, 
    1.12170804870686, 0.0478454594554582, 0.217079959877658, 
    0.00620223918980076, 1.79754946594068, 9.02868098124075, 
    0.00442981113709027, 0.242743895018645, 0.498770814980772, 
    0.00265782877794886, 0.000885894856554407, 0.379188333632346, 
    0.00265793188317263, 0.00265771537700804, 0.39158027235054
    )), row.names = c(NA, -31L), class = "data.frame"))

and I'm trying to produce a Sankey diagram using the networkD3 package with this simple code:

sankeyNetwork(Links = landuse$links, Nodes = landuse$nodes, Source = "source",
              Target = "target", Value = "value", NodeID = "name",
              units = "km²", fontSize = 12, nodeWidth = 30)

I received this message:

Warning message:
It looks like Source/Target is not zero-indexed. This is required in JavaScript and so your plot may not render.

But even after I zero-indexed the source/target columns, nothing is rendered. My data is in the same format as in this example, so I would like to know what the problem might be.

EDIT:

My data contains auto-references and circular references. Is it possible to draw the diagram from this kind of data with this package?

CJ Yetman
Artur_Indio

2 Answers


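sankeyNetwork expects the source and target columns of your links to be zero-based row indices into the nodes data frame. Your indices are one-based row numbers, and because most rows of landuse$nodes are unused (NA), the smallest index in your links is 3.

You can keep only the nodes that are actually used and reindex the links to start from 0:

# keep only the nodes that appear in the links, then rewrite source/target as
# zero-based row positions in that reduced nodes data frame
used <- sort(unique(c(landuse$links$source, landuse$links$target)))
landuse$nodes <- landuse$nodes[used, , drop = FALSE]
landuse$links$source <- match(landuse$links$source, used) - 1
landuse$links$target <- match(landuse$links$target, used) - 1
sankeyNetwork(Links = landuse$links, Nodes = landuse$nodes, Source = "source",
               Target = "target", Value = "value", NodeID = "name",
               units = "km²", fontSize = 12, nodeWidth = 30)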

Admittedly, it does not look as pretty as the Sankey diagram you linked in your question. Why? Because of your data:

  1. You have "autoreferences": links where the source and the target are the same node. These generate the weird semicircles that start and end at the same node.
  2. You have "circular references": links where source 'X' goes to target 'Y', source 'Y' goes to target 'Z', and then source 'Z' goes back to target 'X'. These generate the weird curves.
  3. Some of your values are several orders of magnitude smaller than others, so the small ones are barely visible.

You may need to sanity-check your data:

  1. Are you really interested in the "autoreferences"? If not, delete them.
  2. Are you comfortable with circular references, or would you prefer to duplicate nodes to show a linear Sankey?
  3. Are you interested in showing very small values? If not, delete them.
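
The three checks above could be sketched in base R roughly as follows. This is only an illustration: the toy `links` table, its values, and the `min_share` threshold are made up and are not taken from `landuse`. The reciprocal-pair test catches only the simplest kind of cycle (A to B and B to A).

```r
# Toy links table in the same shape as landuse$links
# (indices and values are invented for illustration)
links <- data.frame(
  source = c(3L, 3L, 4L, 4L, 15L),
  target = c(3L, 4L, 3L, 15L, 4L),
  value  = c(0.5, 0.49, 0.50, 2.2, 0.0009)
)

# 1. "autoreferences": source and target are the same node
is_self <- links$source == links$target

# 2. reciprocal pairs (A -> B together with B -> A), the simplest cycles
key     <- paste(links$source, links$target)
rev_key <- paste(links$target, links$source)
is_reciprocal <- !is_self & rev_key %in% key

# 3. values far below the largest flow (min_share is an arbitrary choice)
min_share <- 0.001
is_tiny <- links$value < min_share * max(links$value)

# inspect what was flagged, then drop self-links and tiny flows
print(cbind(links, is_self, is_reciprocal, is_tiny))
cleaned <- links[!is_self & !is_tiny, ]
```

Whether to drop the flagged rows or keep them is a decision about the data, so the last line is just one possible choice.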
LocoGris
  • Thanks Jonny. So the problem is probably a limitation of the package. Look, I can make the diagram with another language [link](https://ibb.co/J7JP2tR). – Artur_Indio Mar 06 '19 at 15:30
  • Cool, could you share the code and the package as an answer? I would love to use that one! – LocoGris Mar 06 '19 at 15:31
  • Unfortunately not. In fact I meant to say "we can do it", not "I"; I can't share the code. But you can see that diagram on the website of the Brazilian project [MapBiomas](http://mapbiomas.org/#). – Artur_Indio Mar 06 '19 at 15:39
  • One question: how did you turn the RData file into code? Or did you type it out? I would like to know so I can improve my questions. – Artur_Indio Mar 06 '19 at 16:01
  • `dput(NAMEOFYOURVARIABLE)` Best! – LocoGris Mar 06 '19 at 16:03

Based on the example you provided a link to in one of your comments (here), you don't actually want auto and circular references, but instead what you want is two distinct nodes for each thing, one for the left column and one for the right column (e.g. "Formação Florestal" in the left/1985 column and "Formação Florestal" in the right/2017 column).

You can achieve that with the data you provided by distinguishing the source and target nodes that have the same index as separate nodes, like so...

landuse <- list(
  nodes = data.frame(
    name = c(
      NA, NA, "1.1.1. Formação Florestal", "1.1.2. Formação Savanica", NA, NA,
      NA, NA, NA, NA, NA, NA, NA, NA, "3.1. Pastagem", NA, NA, NA, 
      "3.2.1. Cultura Anual e Perene", NA, 
      "3.3. Mosaico de Agricultura e Pastagem", NA, NA, 
      "4.2. Infraestrutura Urbana", "4.5. Outra Área não Vegetada", NA, NA, NA,
      NA, NA, NA, NA,"5.1 Rio ou Lago ou Oceano"
    ),
    stringsAsFactors = FALSE
  ),
  links = data.frame(
    source = c(
      3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 
      19L, 19L, 19L, 19L, 21L, 21L, 21L, 21L, 21L, 21L, 24L, 25L, 25L, 25L, 33L
    ),
    target = c(
      3L, 21L, 4L, 21L, 15L, 3L, 25L, 4L, 33L, 19L, 15L, 21L, 3L, 25L, 4L, 33L,
      15L, 19L, 4L, 21L, 4L, 21L, 25L, 33L, 15L, 3L, 4L, 25L, 4L, 33L,33L
    ),
    value = c(
      0.544859347827813, 0.00354385993588971, 0.494359662221154, 
      4.67602736159475, 2.20248911690968, 0.501437742068369,
      0.00354375594818463, 24.8427814053755, 0.439418727642527,
      0.0079740332093807, 11.8060486886398, 2.76329829691466,
      0.000886029792298199, 0.00177186270758855, 3.35504921147758,
      0.14263144351167, 1.12170804870686, 0.0478454594554582,
      0.217079959877658, 0.00620223918980076, 1.79754946594068,
      9.02868098124075, 0.00442981113709027, 0.242743895018645,
      0.498770814980772, 0.00265782877794886, 0.000885894856554407,
      0.379188333632346, 0.00265793188317263, 0.00265771537700804,
      0.39158027235054
    ),
    stringsAsFactors = FALSE
  )
)

# create a links data frame where the right and left column versions of each node
# are distinguishable
links <- 
  data.frame(source = paste0(landuse$nodes$name[landuse$links$source], " (1985)"),
             target = paste0(landuse$nodes$name[landuse$links$target], " (2017)"),
             value = landuse$links$value,
             stringsAsFactors = FALSE)

# build a nodes data frame from the new links data frame
nodes <- data.frame(name = unique(c(links$source, links$target)), 
                    stringsAsFactors = FALSE)

# change the source and target variables to be the zero-indexed position of
# each node in the new nodes data frame
links$source <- match(links$source, nodes$name) - 1
links$target <- match(links$target, nodes$name) - 1

# remove the year indicator from the node names
nodes$name <- substring(nodes$name, 1, nchar(nodes$name) - 7)

# plot it
library(networkD3)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
              Target = "target", Value = "value", NodeID = "name",
              units = "km²", fontSize = 12, nodeWidth = 30)
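
As a quick sanity check on the result, something like the following could confirm that the links are zero-based, contain no self-links, and form two separate columns. `check_sankey_links` is a hypothetical helper written for this illustration, not part of networkD3, and the toy `nodes`/`links` below are made up to mirror the shape produced by the code above.

```r
# Hypothetical helper: basic structural checks for sankeyNetwork-style inputs
check_sankey_links <- function(links, nodes) {
  idx <- c(links$source, links$target)
  list(
    # every index must be a zero-based row position in `nodes`
    zero_based_in_range = all(idx >= 0 & idx < nrow(nodes)),
    # no link may start and end at the same node
    no_self_links       = !any(links$source == links$target),
    # no node is used both as a source and as a target
    two_columns         = length(intersect(links$source, links$target)) == 0
  )
}

# Toy input (names and values are invented): two "1985" nodes, two "2017" nodes
nodes <- data.frame(name = c("Forest", "Pasture", "Forest", "Pasture"),
                    stringsAsFactors = FALSE)
links <- data.frame(source = c(0, 0, 1), target = c(2, 3, 2), value = c(5, 1, 2))

str(check_sankey_links(links, nodes))
```

If any element comes back `FALSE`, the diagram is likely to show the loops or mislabeled nodes discussed in this thread.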


CJ Yetman