I asked this question to find out how to plot many graphs in the same plot. Following the answer, which I liked and accepted, this can be done with the ggplot() function.

Now, using ggplot(), I receive the following message, which tells me that rows with missing values were removed from the plot:

Warning message:
Removed 33 row(s) containing missing values (geom_path).

Judging from the resulting plot, I am satisfied with the data after ggplot() removed the 33 rows.

I know how to delete rows containing NA, but I don't understand whether ggplot() deleted rows where at least one variable is NA or only rows where all variables are NA. I have 7 variables; in some rows all of them are NA, while many rows have NA for only some variables.

Question: The rows are already dropped for the plot, but how can I remove these 33 detected rows from the data itself?

zx8754

2 Answers

ggplot drops rows that have NA in the columns mapped as aesthetics in aes(). For example, if x and y are mapped and the data frame also has a column z, a row is dropped only when x or y is NA; an NA in z is ignored.

Here is an example:

library(ggplot2)

x <- head(mtcars)

# add NA to some column we don't use for ggplot
x$am[ 1 ] <- NA

ggplot(x, aes(cyl, mpg)) + geom_point()
# no warnings

# now add NA to column that we use for plotting
x$cyl[ 1 ] <- NA

ggplot(x, aes(cyl, mpg)) + geom_point()
# Warning message:
#   Removed 1 rows containing missing values (geom_point). 

# to avoid that warning, we can explicitly set it to remove NA
ggplot(x, aes(cyl, mpg)) + geom_point(na.rm = TRUE)
# no warnings

To remove rows from the data, check if the selected columns have NA:

x_clean <- x[ !(is.na(x$cyl) | is.na(x$mpg)), ]
ggplot(x_clean , aes(cyl, mpg)) + geom_point()
# no warnings
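
The same check can be written with complete.cases(), which also makes it easy to keep the removed rows for inspection. This is a minimal base R sketch; the names plot_cols, keep and x_removed are made up for illustration:

```r
x <- head(mtcars)
x$am[1]  <- NA   # NA in a column that is NOT used in the plot
x$cyl[2] <- NA   # NA in a column that IS used in the plot

# columns mapped in aes(); only these decide whether a row is dropped
plot_cols <- c("cyl", "mpg")

# TRUE for rows with no NA in any of the plotted columns
keep <- complete.cases(x[, plot_cols])

x_clean   <- x[keep, ]    # rows that can be plotted
x_removed <- x[!keep, ]   # rows that would be dropped

nrow(x_clean)
rownames(x_removed)
```

Note that x_clean still contains the row with NA in am, because am is not one of the plotted columns.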

Edit 1: To apply this to your data, based on the comments, try the code below; note the filter() step:

Data <- bind_rows(...)
Data %>%
  mutate(data = paste0('Data',data)) %>%
  pivot_longer(-c(data,Time)) %>%
  filter(!(is.na(Time) | is.na(value))) %>% 
  ggplot(aes(x = factor(Time), y = value, group = name, color = name)) +
  geom_line()+
  facet_wrap(.~data,scales = 'free', ncol = 1) +
  xlab('Time')

Edit 2: To "know" what data is going into ggplot, why not keep the filtered, clean data as a separate object instead of piping it straight into the plot:

Data <- bind_rows(...)
cleanData <- Data %>% 
  mutate(data = paste0('Data',data)) %>%
  pivot_longer(-c(data,Time)) %>%
  filter(!(is.na(Time) | is.na(value)))
  
ggplot(cleanData, aes(x = factor(Time), y = value, group = name, color = name)) +
  geom_line()+
  facet_wrap(.~data,scales = 'free', ncol = 1) +
  xlab('Time')
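
One caveat for line plots: according to the geom_path() documentation, geom_line() warns about and removes NA values only at the start or end of a line, while an NA in the middle just breaks the line. That may be why filter() drops more rows than the warning reports. The sketch below imitates this in base R on made-up data; keep_between_ends() is a hypothetical helper, not a ggplot2 function, and I am assuming the lines are grouped by name:

```r
# Hypothetical helper: TRUE for every position between the first and the
# last complete observation of one group (inclusive), FALSE outside
keep_between_ends <- function(ok) {
  if (!any(ok)) return(rep(FALSE, length(ok)))
  idx <- which(ok)
  seq_along(ok) >= min(idx) & seq_along(ok) <= max(idx)
}

# made-up long-format data with leading, interior and trailing NA values
long <- data.frame(
  name  = rep(c("a", "b"), each = 5),
  Time  = rep(1:5, times = 2),
  value = c(NA, 1, NA, 3, NA,   2, NA, NA, 4, 5)
)

ok   <- complete.cases(long[, c("Time", "value")])
kept <- as.logical(ave(ok, long$name, FUN = keep_between_ends))

trimmed  <- long[kept, ]  # only leading/trailing NA rows removed
filtered <- long[ok, ]    # every row with an NA removed

sum(!kept)       # rows removed at the ends of the lines
nrow(filtered)   # rows left after the stricter filter
```

Here trimmed keeps the interior NA rows, so it loses fewer rows than filtered, which matches the difference described in the comments.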
zx8754
  • Thanks a lot. I see that you understood my question which asks to remove only the columns detected by `ggplot()`. Now I used to plot my data the following: `Data <- bind_rows(d10[,-1],d11[,-1],d12[,-1],d13[,-1],d14[,-1],d15[,-1],d16[,-1],d17[,-1],d18[,-1], .id = 'data') Data %>% mutate(data=paste0('Data',data)) %>% pivot_longer(-c(data,Time)) %>% ggplot(aes(x=factor(Time),y=value,group=name,color=name))+ geom_line(na.rm=TRUE)+ facet_wrap(.~data,scales = 'free',ncol=1)+ xlab('Time')` – Sophie Allan Sep 29 '20 at 11:45
  • But I can't remove the rows of data as you suggested above. I mean to say `x_clean <- x[ !(is.na(x$cyl) | is.na(x$mpg)), ]` – Sophie Allan Sep 29 '20 at 11:45
  • @SophieAllan see Edit, we need to filter before plotting. – zx8754 Sep 29 '20 at 11:58
  • I will follow your first suggestion. So I will add `na.rm = TRUE` and this will delete the rows for the plot but how can I delete these rows additionally from data for future analysis. What can I use to delete the same rows of data which are removed using `na.rm = TRUE`. Thanks a lot in advance – Sophie Allan Sep 29 '20 at 12:22
  • @SophieAllan Is this question resolved? Accept if resolved. – zx8754 Sep 29 '20 at 20:43
  • not yet! I still don't know how to return the data plotted by ggplot(). I mean, I don't know how to return the data which ggplot() used after deleting the rows? – Sophie Allan Sep 29 '20 at 20:52
  • @SophieAllan then do not pipe into ggplot, keep data separate `x <- data %>% clean_up` then use x for ggplot: `ggplot(x, aes(...)) + geom_xxx` this way we know x. – zx8754 Sep 29 '20 at 20:54
  • Would you please clarify your last suggestion because I am trying it with no success! – Sophie Allan Sep 29 '20 at 21:06
  • The second edit you added gives the following error: `Error in ggplot.default(., cleanData, aes(x = factor(Time), y = value, : object 'cleanData' not found`. Additionally, there is an extra bracket `)` next to `y =value)` which must be deleted – Sophie Allan Sep 30 '20 at 13:58
  • @SophieAllan typos fixed, it was just an idea, have your plotting data prepared and stored in an object (cleanData) before plotting. – zx8754 Sep 30 '20 at 16:28
  • I am very grateful for your interest and patience but really this answer is not producing the data which I am looking for! What I am looking for is to **return and extract the data of ggplot()**. This data will not contain the deleted rows, whether missing or outside the axis limits, so there will be no need to filter. Filtering as in the edit above will **delete more rows _than_ the number of deleted rows (33) in ggplot()**. BTW I think my goal could be achieved as in the answer to this question (https://stackoverflow.com/questions/25378184/extract-data-from-a-ggplot) but I don't know how! – Sophie Allan Oct 01 '20 at 08:26
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/222348/discussion-between-zx8754-and-sophie-allan). – zx8754 Oct 01 '20 at 08:28

Those rows could have NA values, or they could be out of bounds of the axis limits you set. ggplot() generates the same warning in both cases. Here is an example of the latter.

This is the built-in mtcars data set. Notice that there are no missing values:

mtcars
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

If I build the following plot, I get the ggplot warning about rows with missing values.

library(ggplot2)
ggplot(mtcars, aes(x = wt, y = qsec)) + 
  geom_point() +
  scale_x_continuous(limits = c(2, 4)) +
  scale_y_continuous(limits = c(16, 22))
Warning message:
Removed 14 rows containing missing values (geom_point).

The 14 rows with "missing values" are the 14 rows with data out of bounds of the axis limits. Here they are.

library(dplyr)
mtcars %>%
  filter(wt < 2 | wt > 4 | qsec < 16 | qsec > 22)
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8

Before attempting to remove "missing values" from your data, check to see if your plotting parameters exclude some of the data.
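
If you then want the dropped rows themselves, one option is to look at the data ggplot built for the layer. This is a minimal sketch, assuming that layer_data() returns one row per input row in the original order (which I believe holds for a single unfaceted geom_point() layer, but check it for your own plot). The names built, mtcars_removed and mtcars_plotted are made up for illustration:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = qsec)) +
  geom_point() +
  scale_x_continuous(limits = c(2, 4)) +
  scale_y_continuous(limits = c(16, 22))

# data for layer 1 after the scales were applied;
# values outside the limits show up as NA here
built <- layer_data(p, 1)

# Assumption: row i of `built` corresponds to row i of mtcars
dropped_by_plot <- is.na(built$x) | is.na(built$y)

mtcars_removed <- mtcars[dropped_by_plot, ]   # rows the plot leaves out
mtcars_plotted <- mtcars[!dropped_by_plot, ]  # rows that are drawn

nrow(mtcars_removed)
```

For a faceted or grouped line plot the built data may be ordered differently, so match rows on the PANEL, group and x columns there rather than on row position.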

Ben Norris
  • Thanks for your answer. Assuming that the 33 rows were deleted by ggplot() due to the bounds of the axis limits, how it is possible to know them and delete them additionally from data in order to use them for further analysis? I don't want to delete any row! I need to delete the rows detected by ggplot() and I receive a warning because of them. BTW, filter() as suggested in the answer above didn't work! – Sophie Allan Sep 29 '20 at 13:02