
I would like to plot my data in ggplot using points. It creates this plot: [image]

As you can see, it isn't good, so I decided to use a log scale to get a better result. My data contains 0s, which become infinite when logged. I used this script to convert the infinite values to 0:

test.data$d.log[is.infinite(test.data$d.log)] <- 0
test.data$f.log[is.infinite(test.data$f.log)] <- 0 
test.data=test.data[complete.cases(test.data), ]  

and my data (test.data) looks like this:

                friend_ratio degree_ratio       f.log    d.log
oncevatan81        0.7763884     23.66667 -0.25310235 3.164068
hatunkotu          0.4991004      0.00000 -0.69494803 0.000000
TwitineGeldim      0.9838102     45.00000 -0.01632226 3.806662
Kralice_Hanim      0.9278909      0.00000 -0.07484108 0.000000
buguzelmi          0.7362599   2302.00000 -0.30617214 7.741534
DogrulariYaziyo    0.8489903      0.00000 -0.16370754 0.000000

You can download sample data from here: https://drive.google.com/open?id=0B1HBIov_NABWWXRobmZwV0Z2Tmc

I use this script to plot:

p<-ggplot(data=test.data, aes(x=f.log, y=d.log)) +
        stat_binhex(aes(x= f.log, y=d.log,alpha=..count..),fill="#000000" )+ 
        guides(fill=FALSE,colour=FALSE) +
        geom_hline(yintercept = 0, size = 0.5,color="red",linetype = 2) +
        geom_vline(xintercept = 0, size = 0.5,color="red",linetype = 2) +
        theme_bw()

and it creates this plot: [image]

As you can see, it creates a hexagon for a single point in the upper-left corner, which is not a faithful representation of the data.

My question is: can I do the Inf cleaning inside the scale_x_log10() function in this code?

library(ggplot2)
library(scales)  # provides trans_breaks(), trans_format(), math_format()

p<-ggplot(data=test.data, aes(x=friend_ratio, y=degree_ratio)) +
        scale_x_log10(breaks = trans_breaks("log10", function(x) 10^x),
                      labels = trans_format("log10", math_format(10^.x)))+
        scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                      labels = trans_format("log10", math_format(10^.x)))+
        geom_hex(aes(x= friend_ratio, y=degree_ratio))+
        geom_hline(yintercept = 1, size = 0.5,color="red",linetype = 2)+
        geom_vline(xintercept = 1, size = 0.5,color="red",linetype = 2)+
        theme_bw() 
eabanoz
  • I'm very confused. I don't see what you don't like about the first plot. Is it the points on top and bottom? Setting infinite values to 0 seems like a very strange choice. Can you just subset your data? This can be done easily inside ggplot: `ggplot(data = subset(test.data, is.finite(d.log) & is.finite(f.log)), ...` – Gregor Thomas Oct 09 '15 at 23:45
  • And if you want the smallest bins to be completely transparent, you could add `scale_alpha_continuous(range = c(0, 1))` (it defaults to a minimum of 0.1). You could also log-transform that scale: `scale_alpha_continuous(range = c(0, 1), trans = "log")` – Gregor Thomas Oct 09 '15 at 23:53
  • Sorry, bad explanation. I need to aggregate the dots to see the data clearly, and yes, it is the points on the top and bottom, which is another reason to use a log scale. I tried to subset the data, but that removes almost 75% of it, so it doesn't give any idea about the data. – eabanoz Oct 09 '15 at 23:57
  • Thank you for the advice on scale_alpha_continuous(range = c(0, 1), trans = "log"). It looks better. – eabanoz Oct 10 '15 at 00:01
  • But surely changing infinity to 0 doesn't give you a good idea of your data either. And in your first figure you're apparently already using log scales for x and y axes. You're getting the same figure if you log your data and use a linear scale. – Gregor Thomas Oct 10 '15 at 00:01
  • You are right about it. I shouldn't convert inf to 0. Thanks again. – eabanoz Oct 10 '15 at 00:03
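
The subsetting suggested in the comments can be sketched in base R. This is only an illustration: `toy` is a hypothetical four-row data frame modelled on the sample rows in the question, and `finite_rows` and `dropped_share` are names made up for this example. It shows where the infinite values come from (the natural log of 0) and how to check how much data the `is.finite()` filter removes before deciding whether subsetting is acceptable.

```r
# Hypothetical toy data modelled on the sample rows in the question
toy <- data.frame(
  friend_ratio = c(0.7763884, 0.4991004, 0.9838102, 0.9278909),
  degree_ratio = c(23.66667, 0, 45, 0)
)

# Natural logs, as in the f.log and d.log columns; log(0) is -Inf
toy$f.log <- log(toy$friend_ratio)
toy$d.log <- log(toy$degree_ratio)

# Keep only rows where both logged columns are finite,
# instead of overwriting -Inf with 0
finite_rows <- subset(toy, is.finite(f.log) & is.finite(d.log))

# Share of rows removed by the filter
dropped_share <- 1 - nrow(finite_rows) / nrow(toy)
```

Replacing `-Inf` with 0 would instead place those rows at a logged value of 0, which corresponds to a ratio of 1 rather than a ratio of 0.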

1 Answer


Turning my comment into an answer: you can use a log scale for the fill transparency with

scale_alpha_continuous(range = c(0, 1), trans = "log")

Specifying that the range starts at 0 will make the smallest bin completely transparent, which means you won't see hexagons for small numbers of points.
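
A minimal sketch of how this might fit into the full plot follows. Several things here are assumptions: `sim` is simulated data standing in for `test.data`, `after_stat(count)` is used in place of `..count..` (this requires ggplot2 3.3.0 or later), and drawing the hexagons requires the hexbin package to be installed. In recent ggplot2 releases the `trans` argument has been renamed `transform`, so a deprecation message may appear.

```r
library(ggplot2)  # drawing the hexagons also needs the hexbin package

set.seed(1)
# Simulated stand-in for test.data: a dense cloud plus a few stray points
sim <- data.frame(
  f.log = c(rnorm(5000, mean = -0.3, sd = 0.2), runif(5, min = -3, max = 0)),
  d.log = c(rnorm(5000, mean = 3, sd = 1), runif(5, min = 8, max = 10))
)

p <- ggplot(sim, aes(x = f.log, y = d.log)) +
  # Map the bin count to transparency; the fill colour is fixed
  stat_binhex(aes(alpha = after_stat(count)), fill = "#000000") +
  # Log-transformed alpha with a range starting at 0:
  # the bins with the smallest count become fully transparent
  scale_alpha_continuous(range = c(0, 1), trans = "log") +
  theme_bw()
```

Printing `p` draws the plot. Hexagons that hold only a stray point or two should fade out, while the dense region stays dark.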

Gregor Thomas
  • I would like to ask: after applying scale_alpha_continuous(range = c(0, 1), trans = "log") to the plot, the count labels change to 1, 20.085, 403.428 and 8103.083 instead of 20K, 40K and 60K. Is there a way to keep the same alpha setting with the old count labels? – eabanoz Oct 10 '15 at 00:11
  • @eabanoz See csgillespie's answer here: http://stackoverflow.com/a/14258838/903061 – Gregor Thomas Oct 10 '15 at 00:17
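
The odd legend values in the comment above are roughly powers of e, because the default breaks are spaced evenly on the natural-log scale and then transformed back to counts. One common way to get round labels, which is not necessarily the approach taken in the linked answer, is to pass explicit `breaks` in count units. The sketch below assumes the counts actually reach the 20K to 60K range mentioned in the comment; `alpha_scale` is a name made up for this example.

```r
library(ggplot2)

# Breaks spaced evenly in natural-log space come back as powers of e,
# which is where labels near 20.09, 403.43 and 8103.08 come from
exp(c(0, 3, 6, 9))

# Assumed fix: keep the log-transformed alpha mapping, but state the
# legend breaks explicitly in count units
alpha_scale <- scale_alpha_continuous(
  range  = c(0, 1),
  trans  = "log",
  breaks = c(20000, 40000, 60000)
)
```

Adding `alpha_scale` to the plot in place of the earlier `scale_alpha_continuous()` call keeps the transparency mapping unchanged and only alters which values are labelled in the legend.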