I have a character vector of file paths:

> tail(paths)
[1] "/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz"
[2] "/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz"
[3] "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz"
[4] "/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz"
[5] "/home/username/data/dir/GCZ99/GCZ99_1999_12_27.asc.gz"
[6] "/home/username/data/dir/GCZ99/GCZ99_1999_12_28.asc.gz"

I would like to split this into a list of vectors, grouped by parent folder, i.e.:

> tail(desired)
$ "/home/username/data/dir/GCZ98"
[1] "/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz"
[2] "/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz"
$ "/home/username/data/dir/GCZ99"
[1] "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz"
[2] "/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz"
[3] "/home/username/data/dir/GCZ99/GCZ99_1999_12_27.asc.gz"
[4] "/home/username/data/dir/GCZ99/GCZ99_1999_12_28.asc.gz"

I have tried using split and strsplit with little success, and am struggling to find a regular expression that does what I need.

Thanks for any help

rwb

2 Answers

You could combine split and dirname:

path <- c("/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz",
          "/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz",
          "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz",
          "/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz",
          "/home/username/data/dir/GCZ99/GCZ99_1999_12_27.asc.gz",
          "/home/username/data/dir/GCZ99/GCZ99_1999_12_28.asc.gz")

## split by parent directory
split(path, dirname(path))

# $`/home/username/data/dir/GCZ98`
# [1] "/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz" "/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz"
# 
# $`/home/username/data/dir/GCZ99`
# [1] "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz" "/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz" "/home/username/data/dir/GCZ99/GCZ99_1999_12_27.asc.gz"
# [4] "/home/username/data/dir/GCZ99/GCZ99_1999_12_28.asc.gz"
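
If only the folder name (rather than the full directory path) is wanted as the list name, one possible variation is to wrap dirname in basename. This is a sketch and not part of the original answer; the name by_folder is made up for illustration, and it assumes folder names are unique across different parent directories.

```r
paths <- c("/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz",
           "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz",
           "/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz")

# dirname() drops the file name; basename() then keeps only the last
# directory component, which becomes the grouping key.
by_folder <- split(paths, basename(dirname(paths)))

names(by_folder)
```

Note that two folders with the same name under different parents would be merged into one group here, so the full dirname key is the safer choice when that can happen.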
sgibb
  • This is correct; however, I would like to give @BetaBariumBorate the credit for giving this answer first – rwb Aug 30 '13 at 15:22
  • I'm guessing if Beta Barium Borate cared about points s/he'd have answered himself, s/he was shooting a comment because s/he wanted to help guide but didn't want to take the time to write out an answer.. – Tyler Rinker Aug 30 '13 at 15:36

A regex approach:

> split(paths, gsub("(.*)/[^/]+$", "\\1", paths))
$`/home/username/data/dir/GCZ98`
[1] "/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz"
[2] "/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz"

$`/home/username/data/dir/GCZ99`
[1] "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz"
[2] "/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz"
[3] "/home/username/data/dir/GCZ99/GCZ99_1999_12_27.asc.gz"
[4] "/home/username/data/dir/GCZ99/GCZ99_1999_12_28.asc.gz"
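
For reference, the pattern captures everything up to the last / and drops the final path component. The check below is a sketch, assuming well-formed paths like the ones in the question; it suggests the regex and dirname give the same grouping key here, while the two differ for an input without any slash. The variable names are illustrative only.

```r
paths <- c("/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz",
           "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz")

# "(.*)/" greedily captures everything before the last slash;
# "[^/]+$" matches the file name, which is dropped from the result.
parent_regex <- sub("(.*)/[^/]+$", "\\1", paths)

# For these paths the regex and dirname() produce the same key.
same_as_dirname <- identical(parent_regex, dirname(paths))

# With no slash in the input the pattern does not match, so the string is
# returned unchanged, whereas dirname() returns ".".
no_slash_regex <- sub("(.*)/[^/]+$", "\\1", "file.asc.gz")
no_slash_dirname <- dirname("file.asc.gz")
```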
Ferdinand.kraft