0

I am having trouble accessing factors in R. I have a data frame with a factor column of tuples:

test1
#[1] (34.0467, -118.2470) (34.0637, -118.2440) (34.0438, -118.2547)
#[4] (34.0523, -118.2676) (34.0584, -118.2810) (34.0583, -118.2616)
#39497 Levels: (0, 0) (0.0000, 0.0000) ... (34.6837, -118.1853)

How do I access just the first number of each tuple?

Thanks!

dput(test1) ... "(34.3256, -118.4307)", "(34.3256, -118.4798)", "(34.3256, -118.5033)", "(34.3257, -118.4244)", "(34.3258, -118.4343)", "(34.3262, -118.4104)", "(34.3262, -118.4112)", "(34.3266, -118.4234)", "(34.3266, -118.4269)", "(34.3266, -118.4323)", "(34.3269, -118.4278)", "(34.3272, -118.4365)", "(34.3273, -118.4342)", "(34.3274, -118.4321)", "(34.3274, -118.4331)", "(34.3275, -118.4247)", "(34.3275, -118.4298)", "(34.3276, -118.4115)", "(34.3277, -118.4071)", "(34.3285, -118.4266)", "(34.3286, -118.4277)", "(34.3287, -118.4286)", "(34.3292, -118.5048)", "(34.3293, -118.4246)", "(34.3298, -118.4300)", "(34.3327, -118.5062)", "(34.3374, -118.5042)", "(34.3760, -118.5254)", "(34.3767, -118.5263)", "(34.3775, -118.5270)", "(34.3805, -118.5293)", "(34.4638, -118.1995)", "(34.5095, -117.9273)", "(34.5304, -118.1418)", "(34.5453, -118.0405)", "(34.5650, -118.0856)", "(34.5693, -118.0228)", "(34.5957, -118.1784)", "(34.6818, -118.0954)", "(34.6837, -118.1853)"), class = "factor")

I can't get the beginning of that output, though.

  • How exactly is `x=((1,2),(3,4),(5,6))` stored in R? Provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) (perhaps use `dput()`). Data.frames don't like to store non-atomic vectors as columns. – MrFlick Aug 11 '16 at 04:01
  • So you have literal strings that are coded as factors? What you shared is not a `dput` so it's not super helpful. Also you should edit your question to include that information so it can be properly formatted (rather than including as a comment). – MrFlick Aug 11 '16 at 04:04
  • The `dput` output is very large. Basically it's a longitude/latitude CSV file read into a data frame with `read.csv`. – bottledatthesource Aug 11 '16 at 04:06
  • dput(head(test)) "(34.5693, -118.0228)", "(34.5957, -118.1784)", "(34.6818, -118.0954)", "(34.6837, -118.1853)"), class = "factor") – bottledatthesource Aug 11 '16 at 04:31
  • I suppose you have read the data from a file, right? What was the exact command you used to do that? – Uwe Aug 11 '16 at 05:21
  • `read.csv()` is what I use. – bottledatthesource Aug 11 '16 at 05:28

3 Answers

1
test1 <- factor(c("(34.3242, -118.4494)", "(34.3242, -118.4914)", "(34.3243, -118.4167)"))

First, convert the factor vector to a character vector.

test1 <- as.character(test1)

Then, remove all `(` and `)` characters, and split the strings on `,`.

test1 <- gsub("\\(|\\)", "", test1)
test1 <- strsplit(test1, ",")

After that, convert the split strings from character to numeric.

test1 <- lapply(test1, as.numeric)

Finally, get the first coordinate of each point (change 1 to 2, if you want the second one).

test1 <- unlist(lapply(test1, '[[', 1))

Here is the output.

> test1
[1] 34.3242 34.3242 34.3243
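
Since the factor in the question has tens of thousands of levels, one possible variation is to parse each distinct level once and then map the results back through the factor's integer codes. The following is only a sketch under that assumption; the names `pts`, `lat_by_level`, and `lat` are made up for illustration and do not come from the question.

```r
# Same sample factor as at the top of this answer (hypothetical name `pts`)
pts <- factor(c("(34.3242, -118.4494)", "(34.3242, -118.4914)", "(34.3243, -118.4167)"))

# Parse each distinct level once: keep what sits between "(" and the first ","
lat_by_level <- as.numeric(sub("^\\(([^,]+),.*$", "\\1", levels(pts)))

# Map the parsed values back onto every element via the factor's integer codes
lat <- lat_by_level[as.integer(pts)]
lat
```

This should give the same values as the step-by-step version above, while running the regular expression only once per level rather than once per row.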
pe-perry
0

Just index again

x[1][1]
x[2][1]
Mir Henglin
0

Try this

as.numeric(unlist(strsplit(gsub("[\\(\\)]", "",as.character(test1)),","))[c(T,F)])

Explanation

gsub operates on character vectors, so as.character(test1) first converts test1 from a factor to character. Then I remove "(" and ")" like this:

gsub("[\\(\\)]", "",as.character(test1))
#[1] "34.5693, -118.0228" "34.5957, -118.1784" "34.6818, -118.0954" "34.6837, -118.1853"

Next, I split each string into two parts at the separator `,`:

strsplit(gsub("[\\(\\)]", "",as.character(test1)),",")
#[[1]]
#[1] "34.5693"    " -118.0228"

#[[2]]
#[1] "34.5957"    " -118.1784"

#[[3]]
#[1] "34.6818"    " -118.0954"

#[[4]]
#[1] "34.6837"    " -118.1853"

The previous output is a list; unlist flattens it into a single vector.

unlist(strsplit(gsub("[\\(\\)]", "",as.character(test1)),","))
#[1] "34.5693"    " -118.0228" "34.5957"    " -118.1784" "34.6818"    " -118.0954"
#[7] "34.6837"    " -118.1853"

The logical index [c(T,F)] (that is, c(TRUE, FALSE)) is recycled along the whole vector, so it keeps every odd-positioned element, which is the first element of each pair.
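
To make the recycling concrete, here is a small illustration on a hand-written vector; `parts`, `first`, and `second` are hypothetical names used only for this sketch. It spells out TRUE and FALSE because T and F are ordinary variables that can be reassigned.

```r
# A flattened vector of pairs, as unlist() would produce
parts <- c("34.5693", " -118.0228", "34.5957", " -118.1784")

# c(TRUE, FALSE) is recycled to TRUE FALSE TRUE FALSE: keeps positions 1, 3, ...
first <- parts[c(TRUE, FALSE)]

# c(FALSE, TRUE) is recycled to FALSE TRUE FALSE TRUE: keeps positions 2, 4, ...
second <- parts[c(FALSE, TRUE)]

first
second
```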

Finally, I convert the output to numeric using as.numeric.

Output

#[1] 34.5693 34.5957 34.6818 34.6837
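
If both numbers of each tuple are needed, the same flattened vector could be reshaped into two columns. This is a sketch only: `test1` is rebuilt here from the four values shown in the output above, and the names `flat`, `coords`, `lat`, and `lon` are assumptions for illustration (the question's comments describe the data as latitude/longitude).

```r
# Rebuild the four-element factor shown in the output above
test1 <- factor(c("(34.5693, -118.0228)", "(34.5957, -118.1784)",
                  "(34.6818, -118.0954)", "(34.6837, -118.1853)"))

# Strip parentheses, split on commas, flatten, and convert to numeric
flat <- as.numeric(unlist(strsplit(gsub("[()]", "", as.character(test1)), ",")))

# Fill a two-column matrix row by row so each tuple becomes one row
coords <- as.data.frame(matrix(flat, ncol = 2, byrow = TRUE))
names(coords) <- c("lat", "lon")  # hypothetical column names
coords
```

Filling by row keeps each pair together, so `coords$lat` matches the output above and `coords$lon` holds the second number of each tuple.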
user2100721