I'm trying to save the equivalent of head() of an sf object in a list. When you call head on an sf object, it prints slightly misleading information to the console about the extent of the data. Note that the bbox differs from the original:

library(sf)
library(tidyverse)
nc <- st_read(system.file("shape/nc.shp", package = "sf"))
# Simple feature collection with 100 features and 14 fields
# geometry type:  MULTIPOLYGON
# dimension:      XY
# bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
# geographic CRS: NAD27


head(nc)
# Simple feature collection with 6 features and 14 fields
# geometry type:  MULTIPOLYGON
# dimension:      XY
# bbox:           xmin: -81.74107 ymin: 36.07282 xmax: -75.77316 ymax: 36.58965
# geographic CRS: NAD27
#    AREA PERIMETER CNTY_ CNTY_ID        NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79                       geometry
# 1 0.114     1.442  1825    1825        Ashe 37009  37009        5  1091     1      10  1364     0      19 MULTIPOLYGON (((-81.47276 3...
# 2 0.061     1.231  1827    1827   Alleghany 37005  37005        3   487     0      10   542     3      12 MULTIPOLYGON (((-81.23989 3...
# 3 0.143     1.630  1828    1828       Surry 37171  37171       86  3188     5     208  3616     6     260 MULTIPOLYGON (((-80.45634 3...
# 4 0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1     123   830     2     145 MULTIPOLYGON (((-76.00897 3...
# 5 0.153     2.206  1832    1832 Northampton 37131  37131       66  1421     9    1066  1606     3    1197 MULTIPOLYGON (((-77.21767 3...
# 6 0.097     1.670  1833    1833    Hertford 37091  37091       46  1452     7     954  1838     5    1237 MULTIPOLYGON (((-76.74506 3...

I think that's because it calculates the bbox for the first 6 observations only, not for the whole data frame. As an alternative, you can run the following print call (note that the bbox matches the full dataset):

print(nc, n = getOption("sf_max_print", default = 6))
# Simple feature collection with 100 features and 14 fields
# geometry type:  MULTIPOLYGON
# dimension:      XY
# bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
# geographic CRS: NAD27
# First 6 features:
#    AREA PERIMETER CNTY_ CNTY_ID        NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79                       geometry
# 1 0.114     1.442  1825    1825        Ashe 37009  37009        5  1091     1      10  1364     0      19 MULTIPOLYGON (((-81.47276 3...
# 2 0.061     1.231  1827    1827   Alleghany 37005  37005        3   487     0      10   542     3      12 MULTIPOLYGON (((-81.23989 3...
# 3 0.143     1.630  1828    1828       Surry 37171  37171       86  3188     5     208  3616     6     260 MULTIPOLYGON (((-80.45634 3...
# 4 0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1     123   830     2     145 MULTIPOLYGON (((-76.00897 3...
# 5 0.153     2.206  1832    1832 Northampton 37131  37131       66  1421     9    1066  1606     3    1197 MULTIPOLYGON (((-77.21767 3...
# 6 0.097     1.670  1833    1833    Hertford 37091  37091       46  1452     7     954  1838     5    1237 MULTIPOLYGON (((-76.74506 3...

How can I save this print object in a list (which is going to be used in a function afterwards) without printing it to the console?

When I crudely put it in a list without trying to suppress the output it prints as expected:

a <- lst(dim(nc), head_sf = print(nc, n = getOption("sf_max_print", default = 6)))
# Simple feature collection with 100 features and 14 fields
# geometry type:  MULTIPOLYGON
# dimension:      XY
# bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
# geographic CRS: NAD27
# First 6 features:
#    AREA PERIMETER CNTY_ CNTY_ID        NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79                       geometry
# 1 0.114     1.442  1825    1825        Ashe 37009  37009        5  1091     1      10  1364     0      19 MULTIPOLYGON (((-81.47276 3...
# 2 0.061     1.231  1827    1827   Alleghany 37005  37005        3   487     0      10   542     3      12 MULTIPOLYGON (((-81.23989 3...
# 3 0.143     1.630  1828    1828       Surry 37171  37171       86  3188     5     208  3616     6     260 MULTIPOLYGON (((-80.45634 3...
# 4 0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1     123   830     2     145 MULTIPOLYGON (((-76.00897 3...
# 5 0.153     2.206  1832    1832 Northampton 37131  37131       66  1421     9    1066  1606     3    1197 MULTIPOLYGON (((-77.21767 3...
# 6 0.097     1.670  1833    1833    Hertford 37091  37091       46  1452     7     954  1838     5    1237 MULTIPOLYGON (((-76.74506 3...

I thought invisible or sink would suppress the output, but I can't figure it out. None of these work:

a <- invisible(lst(dim(nc), head_sf = print(nc, n = getOption("sf_max_print", default = 6))))
a <- lst(dim(nc), head_sf = invisible(print(nc, n = getOption("sf_max_print", default = 6))))
a <- lst(dim(nc), invisible(head_sf = print(nc, n = getOption("sf_max_print", default = 6))))

Any suggestions? Thanks.

EDIT: Using invisible and capture.output gets pretty much what I wanted (the output wasn't correct when I originally posted this; mistake on my part):

a <- lst(dim(nc), invisible(capture.output(head_sf = print(nc, n = getOption("sf_max_print", default = 6)))))
a
$`dim(nc)`
[1] 100  15

$`invisible(...)`
 [1] "Simple feature collection with 100 features and 14 fields"                                                                               
 [2] "geometry type:  MULTIPOLYGON"                                                                                                            
 [3] "dimension:      XY"                                                                                                                      
 [4] "bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965"                                                           
 [5] "geographic CRS: NAD27"                                                                                                                   
 [6] "First 6 features:"                                                                                                                       
 [7] "   AREA PERIMETER CNTY_ CNTY_ID        NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79                       geometry"
 [8] "1 0.114     1.442  1825    1825        Ashe 37009  37009        5  1091     1      10  1364     0      19 MULTIPOLYGON (((-81.47276 3..."
 [9] "2 0.061     1.231  1827    1827   Alleghany 37005  37005        3   487     0      10   542     3      12 MULTIPOLYGON (((-81.23989 3..."
[10] "3 0.143     1.630  1828    1828       Surry 37171  37171       86  3188     5     208  3616     6     260 MULTIPOLYGON (((-80.45634 3..."
[11] "4 0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1     123   830     2     145 MULTIPOLYGON (((-76.00897 3..."
[12] "5 0.153     2.206  1832    1832 Northampton 37131  37131       66  1421     9    1066  1606     3    1197 MULTIPOLYGON (((-77.21767 3..."
[13] "6 0.097     1.670  1833    1833    Hertford 37091  37091       46  1452     7     954  1838     5    1237 MULTIPOLYGON (((-76.74506 3..." 
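
A minimal base R sketch of why the earlier attempts still printed, and why capture.output helps. invisible() only stops the top level from auto-printing a returned value; it cannot undo what an explicit print() call has already written, whereas capture.output() diverts that output into a character vector. The data frame `df` and the element names `dims` and `head_txt` are made up for illustration; the same pattern should carry over to `print(nc, n = 6)`.

```r
# Hypothetical stand-in for the sf object; any object with a print method works.
df <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))

# invisible() does not silence print(): the table is still written to the
# console, and print() hands back df itself, not the printed text.
shown <- invisible(print(df))

# capture.output() evaluates print(df) with the output diverted and returns
# the printed lines as a character vector; nothing reaches the console.
captured <- capture.output(print(df))

# Naming the element outside capture.output() gives a usable list name.
a <- list(dims = dim(df), head_txt = captured)

# Replay the stored text later, on demand.
cat(a$head_txt, sep = "\n")
```

Putting the name on the list element (rather than inside capture.output) avoids the `invisible(...)` element name seen in the output above.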
user63230
  • perhaps using `quote` that suppresses print to console, then `eval` on `a_quote$head_sf`: `a_quote <- quote(lst(dim(nc), head_sf = print(nc, n = getOption("sf_max_print", default = 6))))`; `eval(a_quote$head_sf)`? – Chris Jul 14 '20 at 22:17
  • @Chris thanks, that's useful, but when I try it within a function it's not working as expected: `mylist <- lst(nc); b <- lapply(mylist, function(x) { a_quote <- quote(lst(dims = dim(x), head_sf = print(x, n = getOption("sf_max_print", default = 6)))); eval(a_quote$dims); eval(a_quote$head_sf) }); b`. When I call `b` it automatically prints but doesn't evaluate `dims` or `head_sf`? – user63230 Jul 16 '20 at 10:38
  • Seems like a sort of scoping thing. Quote wraps everything up for later processing and suppresses the print to console, and preserves the proper bbox. Now, throwing all this inside `lapply` (head aches, eyes bleed), perhaps wants a `do.call` on `eval`(s), or perhaps this is `mapply` land. In sum, we're looking for something before or after (or between) `=c)))) ;` and `eval(a_quo` . OK, I'll play around, sorry this isn't more conclusive. – Chris Jul 17 '20 at 01:32

1 Answer

This is probably a bad approach, since I don't know what further processing the return values will go through (I get the sense that separate dims and head_sf elements are undesirable), and it is especially bad if you abhor loops. It uses a variant of your invisible(capture.output(...)) to suppress the automatic console output:

library(sf)
library(tidyverse)
nc <- st_read(system.file("shape/nc.shp", package = "sf"))
nc1 <- nc
nc2 <- nc
nc3 <- nc
nc_lst <- list(nc1, nc2, nc3)

prepare_sf_head_for_further_process_loop3 <- function(nc_lst) {
  a_quote <- list()
  for (i in seq_along(nc_lst)) {
    # capture.output() swallows what print() writes; the assignment still happens
    invisible(capture.output(
      a_quote[[i]] <- list(
        dims = dim(nc_lst[[i]]),
        head_sf = print(nc_lst[[i]], n = getOption("sf_max_print", default = 6))
      )
    ))
  }
  return(a_quote)
}


> b3 <- prepare_sf_head_for_further_process_loop3(nc_lst)
> b3
[[1]]
[[1]]$dims
[1] 100  15

[[1]]$head_sf
Simple feature collection with 100 features and 14 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
geographic CRS: NAD27
First 10 features:
    AREA PERIMETER CNTY_ CNTY_ID        NAME  FIPS FIPSNO CRESS_ID BIR74 SID74
1  0.114     1.442  1825    1825        Ashe 37009  37009        5  1091     1
2  0.061     1.231  1827    1827   Alleghany 37005  37005        3   487     0
3  0.143     1.630  1828    1828       Surry 37171  37171       86  3188     5
4  0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1
5  0.153     2.206  1832    1832 Northampton 37131  37131       66  1421     9
6  0.097     1.670  1833    1833    Hertford 37091  37091       46  1452     7
7  0.062     1.547  1834    1834      Camden 37029  37029       15   286     0
8  0.091     1.284  1835    1835       Gates 37073  37073       37   420     0
9  0.118     1.421  1836    1836      Warren 37185  37185       93   968     4
10 0.124     1.428  1837    1837      Stokes 37169  37169       85  1612     1
   NWBIR74 BIR79 SID79 NWBIR79                       geometry
1       10  1364     0      19 MULTIPOLYGON (((-81.47276 3...
2       10   542     3      12 MULTIPOLYGON (((-81.23989 3...
3      208  3616     6     260 MULTIPOLYGON (((-80.45634 3...
4      123   830     2     145 MULTIPOLYGON (((-76.00897 3...
5     1066  1606     3    1197 MULTIPOLYGON (((-77.21767 3...
6      954  1838     5    1237 MULTIPOLYGON (((-76.74506 3...
7      115   350     2     139 MULTIPOLYGON (((-76.00897 3...
8      254   594     2     371 MULTIPOLYGON (((-76.56251 3...
9      748  1190     2     844 MULTIPOLYGON (((-78.30876 3...
10     160  2038     5     176 MULTIPOLYGON (((-80.02567 3...

> class(nc_lst[[1]])
[1] "sf"         "data.frame"
> class(b3[[1]]$head_sf)
[1] "sf"         "data.frame"

Dang, I don't know why getOption('sf_max_print', default = 6) didn't restrict the output to 6 rows. dplyr has controlled quote/unquote, but I haven't worked out how to apply it here yet. These are my bad suggestions so far.
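
One possible explanation for the 10 rows, sketched with a toy class rather than sf itself: print methods conventionally return their argument invisibly, so head_sf would hold the object itself (consistent with the class() output above), not the six-row printout. Printing b3 later then calls print again without n, and the method's own default applies. The class `toy`, the method `print.toy`, and its default of n = 10 are assumptions made up for illustration, not sf's actual code.

```r
# Toy class standing in for sf; print.toy and its default n = 10 are
# illustrative assumptions only.
print.toy <- function(x, ..., n = 10) {
  cat("First", min(n, length(x$rows)), "rows\n")
  invisible(x)  # conventional: return the object, not the printed text
}
obj <- structure(list(rows = 1:100), class = "toy")

# n = 6 only affects what is written during this one call (captured here).
txt <- capture.output(stored <- print(obj, n = 6))

# What got stored is the full object, so printing it again uses the default n.
txt_again <- capture.output(print(stored))
```

If that is what happens here, storing the captured text instead of the return value of print() would keep the six-row version.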

Additionally, from the tidyverse, taking what Hadley suggests and swapping head and print:

glimpse_head2 <- function(x, n = 6) {
  head(print(x, n))
  invisible(x)
}

While this doesn't get the dim, it seems far simpler than my approach above and gives the desired bbox values. It still doesn't restrict the output to 6 rows, apparently because the head method for sfg is hardcoded to 10L:

> getS3method('head', 'sfg')
function (x, n = 10L, ...) 
{
    structure(head(unclass(x), n = n, ...), class = class(x))
}
<bytecode: 0x5646a6a38228>
<environment: namespace:sf>
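
Pulling the pieces together, here is a minimal sketch that works inside lapply, assuming it is acceptable to keep the printed header (with the full-data bbox) as text rather than as an sf object. The function name `summarise_sf` and the element name `head_txt` are hypothetical; `quiet = TRUE` just silences st_read's own message.

```r
library(sf)

# Hypothetical helper: returns the dimensions plus the printed summary as text.
summarise_sf <- function(x, n = 6) {
  list(
    dims = dim(x),
    # n is passed by name; nothing is written to the console here
    head_txt = capture.output(print(x, n = n))
  )
}

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
res <- lapply(list(nc = nc), summarise_sf)

# Show the stored summary only when it is actually wanted.
cat(res$nc$head_txt, sep = "\n")
```

Because head_txt is a plain character vector, printing res later does not trigger the sf print method again.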
Chris
  • your edit is very interesting, although it doesn't fully solve the problem like your solution does – user63230 Jul 20 '20 at 12:21
  • agreed, I was bedeviled by the 10L. So while sfg has a head method, it isn't exported, and strictly that's as far as I understand the implications. – Chris Jul 20 '20 at 13:24