I have a data frame with a string column in which I need to condense all equal values into one value. An example of my data set:

                      CommonName Month                      Site season period
23                Gambel's Quail   Oct McDowell Sonoran Preserve Autumn      4
24              American Kestrel   Nov McDowell Sonoran Preserve Autumn      4
25        Black-throated Sparrow   Nov McDowell Sonoran Preserve Autumn      4
26              Brewer's Sparrow   Nov McDowell Sonoran Preserve Autumn      4
27                  Common Raven   Nov McDowell Sonoran Preserve Autumn      4
28                Gilded Flicker   Nov McDowell Sonoran Preserve Autumn      4
29             Loggerhead Shrike   Nov McDowell Sonoran Preserve Autumn      4
30             Loggerhead Shrike   Nov McDowell Sonoran Preserve Autumn      4
31          Northern Mockingbird   Nov McDowell Sonoran Preserve Autumn      4
32               Red-tailed Hawk   Nov McDowell Sonoran Preserve Autumn      4
33         White-crowned Sparrow   Nov McDowell Sonoran Preserve Autumn      4
107             Acorn Woodpecker   Oct McDowell Sonoran Preserve Autumn      4
108                 Say's Phoebe   Nov McDowell Sonoran Preserve Autumn      4
236               Abert's Towhee   Nov        Brown's Ranch Wash Autumn      4
237                  Cactus Wren   Nov        Brown's Ranch Wash Autumn      4
238                Canyon Towhee   Nov        Brown's Ranch Wash Autumn      4
239        Curve-billed Thrasher   Nov        Brown's Ranch Wash Autumn      4
240               Gambel's Quail   Nov        Brown's Ranch Wash Autumn      4

This data spans multiple years so it is possible for a species to be listed multiple times. This is what I would like to avoid because I am only trying to determine the occurrence of the species within each site and season. So in this example, I would like to only have one data point for Loggerhead Shrike and Gambel's Quail while everything else would remain the same. I appreciate your help. I have been unsuccessfully looking for similar questions but I do not know exactly what this process would be called.

Jaap
Logan

1 Answer

To "determine the occurrence of the species within each site and season", try the following:

> with(ddf, table(CommonName, Site, Season))

, , Season = Autumn

                        Site
CommonName               BrownsRanch McDowell
  Aberts Towhee                    1        0
  Acorn Woodpecker                 0        1
  American Kestrel                 0        1
  Black-throated Sparrow           0        1
  Brewers Sparrow                  0        1
  Cactus Wren                      1        0
  Canyon Towhee                    1        0
  Common Raven                     0        1
  Curve-billed Thrasher            1        0
  Gambels Quail                    1        1
  Gilded Flicker                   0        1
  Loggerhead Shrike                0        2
  Northern Mockingbird             0        1
  Red-tailed Hawk                  0        1
  Says Phoebe                      0        1
  White-crowned Sparrow            0        1

Or:

> with(ddf, table(CommonName, Season, Site))
, , Site = BrownsRanch

                        Season
CommonName               Autumn
  Aberts Towhee               1
  Acorn Woodpecker            0
  American Kestrel            0
  Black-throated Sparrow      0
  Brewers Sparrow             0
  Cactus Wren                 1
  Canyon Towhee               1
  Common Raven                0
  Curve-billed Thrasher       1
  Gambels Quail               1
  Gilded Flicker              0
  Loggerhead Shrike           0
  Northern Mockingbird        0
  Red-tailed Hawk             0
  Says Phoebe                 0
  White-crowned Sparrow       0

, , Site = McDowell

                        Season
CommonName               Autumn
  Aberts Towhee               0
  Acorn Woodpecker            1
  American Kestrel            1
  Black-throated Sparrow      1
  Brewers Sparrow             1
  Cactus Wren                 0
  Canyon Towhee               0
  Common Raven                1
  Curve-billed Thrasher       0
  Gambels Quail               1
  Gilded Flicker              1
  Loggerhead Shrike           2
  Northern Mockingbird        1
  Red-tailed Hawk             1
  Says Phoebe                 1
  White-crowned Sparrow       1

To show how this looks with more than one season, I changed some of the Season entries to Spring:

> with(ddf, table(CommonName, Season, Site))
, , Site = BrownsRanch

                        Season
CommonName               Autumn Spring
  Aberts Towhee               0      1
  Acorn Woodpecker            0      0
  American Kestrel            0      0
  Black-throated Sparrow      0      0
  Brewers Sparrow             0      0
  Cactus Wren                 0      1
  Canyon Towhee               1      0
  Common Raven                0      0
  Curve-billed Thrasher       1      0
  Gambels Quail               1      0
  Gilded Flicker              0      0
  Loggerhead Shrike           0      0
  Northern Mockingbird        0      0
  Red-tailed Hawk             0      0
  Says Phoebe                 0      0
  White-crowned Sparrow       0      0

, , Site = McDowell

                        Season
CommonName               Autumn Spring
  Aberts Towhee               0      0
  Acorn Woodpecker            0      1
  American Kestrel            1      0
  Black-throated Sparrow      1      0
  Brewers Sparrow             1      0
  Cactus Wren                 0      0
  Canyon Towhee               0      0
  Common Raven                1      0
  Curve-billed Thrasher       0      0
  Gambels Quail               1      0
  Gilded Flicker              1      0
  Loggerhead Shrike           0      2
  Northern Mockingbird        0      1
  Red-tailed Hawk             0      1
  Says Phoebe                 0      1
  White-crowned Sparrow       0      1
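
The tables above hold counts, so Loggerhead Shrike shows 2 at McDowell. If only presence or absence is wanted, one option is to threshold the counts. This is a minimal sketch on a small made-up data frame that reuses the column names of ddf; the toy rows are an assumption, not the original data:

```r
# Hypothetical toy data with the same column names as ddf above
ddf <- data.frame(
  CommonName = c("Loggerhead Shrike", "Loggerhead Shrike",
                 "Gambels Quail", "Gambels Quail"),
  Site       = c("McDowell", "McDowell", "McDowell", "BrownsRanch"),
  Season     = c("Autumn", "Autumn", "Autumn", "Autumn"),
  stringsAsFactors = FALSE
)

# Three-way table of counts, as in the answer
counts <- with(ddf, table(CommonName, Site, Season))

# Any count above zero becomes 1 (present); zero stays 0 (absent).
# The comparison keeps the dimensions and dimnames of the table.
occurrence <- (counts > 0) * 1L

occurrence[, , "Autumn"]
```

Here `occurrence` can be indexed by name in the same way as `counts`, for example `occurrence["Loggerhead Shrike", "McDowell", "Autumn"]`.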

To remove the extra rows where name, site and season are the same (only the first occurrence of each combination is kept):

> ddf[!duplicated(paste(ddf$CommonName,ddf$Site,ddf$Season)),]
               CommonName Month        Site Season period
1           Gambels Quail   Oct    McDowell Autumn      4
2        American Kestrel   Nov    McDowell Autumn      4
3  Black-throated Sparrow   Nov    McDowell Autumn      4
4         Brewers Sparrow   Nov    McDowell Autumn      4
5            Common Raven   Nov    McDowell Autumn      4
6          Gilded Flicker   Nov    McDowell Autumn      4
7       Loggerhead Shrike   Nov    McDowell Autumn      4
9    Northern Mockingbird   Nov    McDowell Autumn      4
10        Red-tailed Hawk   Nov    McDowell Autumn      4
11  White-crowned Sparrow   Nov    McDowell Autumn      4
12       Acorn Woodpecker   Oct    McDowell Autumn      4
13            Says Phoebe   Nov    McDowell Autumn      4
14          Aberts Towhee   Nov BrownsRanch Autumn      4
15            Cactus Wren   Nov BrownsRanch Autumn      4
16          Canyon Towhee   Nov BrownsRanch Autumn      4
17  Curve-billed Thrasher   Nov BrownsRanch Autumn      4
18          Gambels Quail   Nov BrownsRanch Autumn      4
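
A possible variant of the same idea passes the key columns to duplicated() directly instead of pasting them into one string. This avoids the (unlikely) case where two different combinations paste to the same text. The sketch below uses made-up rows with the same column names as ddf; `key_cols`, `deduped` and `occ` are illustrative names only:

```r
# Hypothetical toy data with the same column names as ddf above
ddf <- data.frame(
  CommonName = c("Loggerhead Shrike", "Loggerhead Shrike",
                 "Gambels Quail", "Gambels Quail"),
  Month      = c("Nov", "Nov", "Oct", "Nov"),
  Site       = c("McDowell", "McDowell", "McDowell", "BrownsRanch"),
  Season     = rep("Autumn", 4),
  stringsAsFactors = FALSE
)

# Columns that together define one species/site/season occurrence
key_cols <- c("CommonName", "Site", "Season")

# Keep the first row of each combination, with all original columns
deduped <- ddf[!duplicated(ddf[key_cols]), ]

# Or keep only the key columns, one row per distinct combination
occ <- unique(ddf[key_cols])
```

Both results have one row per species, site and season; `deduped` also carries along Month from whichever row came first.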

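Note that apostrophes (as in Gambel's) can cause problems in string entries, for example when reading a text file, since read.table treats ' as a quote character by default. I removed them from the data above.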

rnso
  • I used your advice and ran with(ddf, table(CommonName, season, Site)). I need to convert the result to an xlsx file, and I cannot figure out how to do it without losing this new format. It squeezes all of the data back into the original format with common name, site, season, and frequency as column headings. Do you have any idea how to convert this to xlsx while keeping the new layout, with separate columns for the seasons and the site names as xlsx sheet names? Thanks for your previous insightful help. – Logan Oct 03 '14 at 21:38
  • Use library(XLConnect). See here: http://stackoverflow.com/questions/15676104/how-to-append-different-r-outputs-into-one-excel-spreadsheet/15676320#15676320 . You can loop through the data and save each part to a different sheet. There are also other posts that you can find by searching this site. Otherwise, start a new post to get detailed answers. – rnso Oct 04 '14 at 01:34
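
The loop suggested in the last comment might look roughly like the sketch below. It builds one species-by-season table per site and, if XLConnect is installed, writes each one to its own sheet. The toy rows, the `per_site` list and the output file name are assumptions for illustration; the XLConnect calls used are loadWorkbook, createSheet, writeWorksheet and saveWorkbook.

```r
# Hypothetical toy data with the same column names as ddf above
ddf <- data.frame(
  CommonName = c("Loggerhead Shrike", "Loggerhead Shrike",
                 "Gambels Quail", "Cactus Wren"),
  Site       = c("McDowell", "McDowell", "McDowell", "BrownsRanch"),
  Season     = c("Autumn", "Autumn", "Spring", "Autumn"),
  stringsAsFactors = FALSE
)

# Factors keep every species and season in each per-site table,
# so absent combinations show up as 0 rather than being dropped
ddf$CommonName <- factor(ddf$CommonName)
ddf$Season     <- factor(ddf$Season)

# One wide data frame per site: rows are species, columns are seasons
per_site <- lapply(split(ddf, ddf$Site), function(d) {
  wide <- as.data.frame.matrix(table(d$CommonName, d$Season))
  data.frame(CommonName = rownames(wide), wide,
             row.names = NULL, check.names = FALSE)
})

# Write each site to its own sheet (skipped if XLConnect is unavailable)
out_file <- file.path(tempdir(), "species_by_site.xlsx")  # assumed name
if (requireNamespace("XLConnect", quietly = TRUE)) {
  wb <- XLConnect::loadWorkbook(out_file, create = TRUE)
  for (site in names(per_site)) {
    XLConnect::createSheet(wb, name = site)
    XLConnect::writeWorksheet(wb, per_site[[site]], sheet = site)
  }
  XLConnect::saveWorkbook(wb)
}
```

Excel limits sheet names to 31 characters and forbids a few characters, so long or unusual site names may need shortening before being used as sheet names.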