
I have a variable, and for some reason R has added an extra "X" to the beginning of each value. Is this a common occurrence that I could have avoided?

Anyhow, below is my data (currently the variable is stored in a list):

X1
X5
X33
X37
...

> str(rc1_output)
 chr [1:63, 1:3] "X1" "X5" "X33" "X37" "X52" "X645" "X646" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:63] "X1" "X5" "X33" "X37" ...
  ..$ : chr [1:3] "" "Entropy" "Subseq."

> dput(head(rc1_output))
structure(c("X1", "X5", "X33", "X37", "X52", "X645", "0", "0", 
"0", "0", "0", "0", "0.256010845762264", "0.071412419435563", 
"0.071412419435563", "0.071412419435563", "0.071412419435563", 
"0.071412419435563"), .Dim = c(6L, 3L), .Dimnames = list(c("X1", 
"X5", "X33", "X37", "X52", "X645"), c("", "Entropy", "Subseq."
)))

How can I loop through all rows of the variable and remove the X?

histelheim
  • When you say "below is my data" do you mean this is in a file? Or in an R object? – Spacedman Mar 04 '14 at 16:43
  • How is it stored in a list? Dump the list and paste the output, or a section of it. This is important. – Spacedman Mar 04 '14 at 16:45
  • Can you show us `str(x)`? A [*reproducible example*](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) would be helpful here. – Blue Magister Mar 04 '14 at 16:47
  • So, you have a 3 column by 63 row character matrix. Are all the values in "rc1_output" prefixed with an "X"? Or just those in the first column? How did you get this data into R in the first place? – A5C1D2H2I1M1N2O1R2T1 Mar 04 '14 at 17:03
  • How did you get the data into R to have it display in that format? Looks like a mess! – A5C1D2H2I1M1N2O1R2T1 Mar 04 '14 at 17:10
  • The data is output from the `TraMineR` package... – histelheim Mar 04 '14 at 18:05
  • At some point, either explicitly or implicitly, these data were probably read using `read.table` or `read.csv` with the `check.names` argument left at its default value of `TRUE`; the row names had `X` prepended (via `make.names()`), and were later converted to a column (possibly via `reshape2::melt()`). – Ben Bolker Mar 26 '14 at 19:23
  • @BenBolker, I've added a separate question for this here: http://stackoverflow.com/questions/22848403/why-is-an-extra-character-x-added-to-the-rownames-of-my-data-frame If you want, feel free to answer, and if not, I'll use the information you provided to answer the question myself. I think it's important to archive this. – histelheim Apr 03 '14 at 20:33

1 Answer


Try substr() or gsub() (in the pattern, ^ anchors the match to the start of the string):

x <- c("X1", "X354", "X234", "X2134")
substr(x, 2, nchar(x))
# [1] "1"    "354"  "234"  "2134"
gsub("^X", "", x)
# [1] "1"    "354"  "234"  "2134"
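
The two approaches differ when some values do not carry the prefix. The sketch below assumes a hypothetical vector `mixed` in which one element has no leading "X": substr() drops the first character unconditionally, while the anchored pattern only removes an actual "X". Since only one match per string is possible with `^`, sub() is sufficient here.

```r
# Hypothetical input: the last element has no "X" prefix
mixed <- c("X1", "X354", "12")

# substr() removes the first character regardless of what it is
by_position <- substr(mixed, 2, nchar(mixed))

# sub() with an anchored pattern removes a leading "X" only when present
by_pattern <- sub("^X", "", mixed)

# Once the prefix is gone, the values can be converted to integers
ids <- as.integer(by_pattern)

print(by_position)
print(by_pattern)
print(ids)
```

If every value is known to start with "X", either approach works; the pattern-based one is the safer default when that is not guaranteed.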

Update

It looks like just the first column (which is unnamed) and the rownames are affected. The same general approach applies:

> rc1_output[, 1] <- gsub("^X", "", rc1_output[, 1])
> rc1_output
           Entropy Subseq.            
X1   "1"   "0"     "0.256010845762264"
X5   "5"   "0"     "0.071412419435563"
X33  "33"  "0"     "0.071412419435563"
X37  "37"  "0"     "0.071412419435563"
X52  "52"  "0"     "0.071412419435563"
X645 "645" "0"     "0.071412419435563"

Repeat the process for rownames(rc1_output) if required, like this:

rownames(rc1_output) <- gsub("^X", "", rownames(rc1_output))
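
Putting both steps together, here is a minimal sketch that rebuilds the six rows from the `dput()` output in the question, strips the prefix from the first column and the row names, and then converts the character matrix to a data frame with numeric columns. The names `rc1_head`, `strip_x`, `rc1_df`, and the column name `id` are made up for illustration; the conversion to a data frame is an optional extra step, not something the question asked for.

```r
# First six rows, copied from the dput() output in the question
rc1_head <- structure(
  c("X1", "X5", "X33", "X37", "X52", "X645",
    "0", "0", "0", "0", "0", "0",
    "0.256010845762264", "0.071412419435563", "0.071412419435563",
    "0.071412419435563", "0.071412419435563", "0.071412419435563"),
  .Dim = c(6L, 3L),
  .Dimnames = list(c("X1", "X5", "X33", "X37", "X52", "X645"),
                   c("", "Entropy", "Subseq."))
)

# Hypothetical helper: remove a single leading "X" if present
strip_x <- function(v) sub("^X", "", v)

rc1_head[, 1] <- strip_x(rc1_head[, 1])
rownames(rc1_head) <- strip_x(rownames(rc1_head))

# Optional: a data frame with proper numeric types instead of strings
rc1_df <- data.frame(
  id      = as.integer(rc1_head[, 1]),
  Entropy = as.numeric(rc1_head[, "Entropy"]),
  Subseq  = as.numeric(rc1_head[, "Subseq."]),
  row.names = NULL
)

str(rc1_df)
```

No explicit loop over rows is needed, because sub() is vectorized over its input.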

My guess, however, is that this problem can be solved more effectively at an earlier stage in your code. Knowing how the data came to be in this form in the first place would make it much easier to diagnose.
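
One common source of such a prefix, suggested in the comments on the question, is the name checking done by `read.csv` / `read.table`. The sketch below assumes a made-up CSV string `csv_text` whose header consists of numbers; it is not confirmed that the TraMineR output in the question went through this path. With the default `check.names = TRUE`, the header is passed through `make.names()`, which prepends "X" to names that start with a digit.

```r
# Hypothetical CSV input whose column headers are numbers
csv_text <- "1,5,33\n0.25,0.07,0.07"

# Default behaviour: headers are made syntactically valid, so "X" is prepended
with_check <- read.csv(text = csv_text)

# check.names = FALSE keeps the headers exactly as they appear in the file
without_check <- read.csv(text = csv_text, check.names = FALSE)

# make.names() is the function that performs the renaming
renamed <- make.names(c("1", "5", "33"))

print(names(with_check))
print(names(without_check))
print(renamed)
```

If the data do pass through such a call, setting `check.names = FALSE` there avoids the need to strip the prefix afterwards, at the cost of having non-syntactic column names that must be quoted with backticks.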

A5C1D2H2I1M1N2O1R2T1