
In my dataset (which is on gun violence), each column has || in between the data points.

e.g., the Age column:

0::Male||1::Female||2::Male||

How do you separate the data points?

Thanks!

Zuoanqh
    Usually, this is done with something called a "split". I'm unfamiliar with R, but I think there should be such a feature to split a string by "||". A search on that might be helpful. – Zuoanqh May 06 '18 at 01:45
    I think you should show more of the appearance of the raw text file. – IRTFM May 06 '18 at 02:02

1 Answer


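read.table and read.delim let you define a single character by which the values in each line are separated (see the sep argument in ?read.table).

Your separator is the two-character ||, so splitting on a single | leaves an empty field between each pair of pipes, and those empty fields are read as all-NA columns. All we need to do is remove the resulting NA columns after reading the data.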

Here is an example:

# Sample data; strip.white = TRUE drops the indentation in front of rows 2-4
df <- read.table(text =
    "0::Male||1::Female||2::Male||
    0::Male||1::Female||2::Male||
    0::Male||1::Female||2::Male||
    0::Male||1::Female||2::Male||", sep = "|", strip.white = TRUE)

# Remove NA columns
df[, !sapply(df, function(x) all(is.na(x)))]
#           V1        V3      V5
#1     0::Male 1::Female 2::Male
#2     0::Male 1::Female 2::Male
#3     0::Male 1::Female 2::Male
#4     0::Male 1::Female 2::Male
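
If the goal is to pull apart the individual entries inside one cell rather than the columns of a file, one option is base R's strsplit, as the comment on the question suggests. Below is a minimal sketch of that idea. The age vector and the split_cell helper are hypothetical names made up for illustration, and the sketch assumes every entry has the form index::value.

```r
# Hypothetical example column, in the format shown in the question
age <- c("0::Male||1::Female||2::Male||", "0::Female||1::Male")

# Split one cell on the literal "||", then split each piece on "::".
# fixed = TRUE makes strsplit treat "|" as a plain character, not regex "or".
split_cell <- function(cell) {
  parts <- strsplit(cell, "||", fixed = TRUE)[[1]]
  parts <- parts[nzchar(parts)]  # drop any empty pieces
  kv <- strsplit(parts, "::", fixed = TRUE)
  data.frame(
    index = as.integer(vapply(kv, `[`, character(1), 1)),
    value = vapply(kv, `[`, character(1), 2),
    stringsAsFactors = FALSE
  )
}

# One data frame per cell
result <- lapply(age, split_cell)
```

Here result[[1]] is a three-row data frame with one row per entry in the first cell. An entry without a :: would get NA in the value column, so real data may need extra checks.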
Maurits Evers