I would like to concatenate words (strings) using a different separator after every 10th element: each word is followed by a comma and a space, except every 10th word, which is followed by a comma and a line break. The ultimate purpose is to print a list of words neatly in a table.

I can write a loop, but I am hoping for a more elegant solution like those proposed in these related questions using gsub and regular expressions: here and here. Those insert or replace a string after every n-th character, but in my case the words have variable lengths (in characters).

Edit: I am looking for a solution I can apply to any vector with a variable number of words.

For reproducible data, I generate a vector of 40 random words using code from this source:

MHmakeRandomString <- function(n, length) {
  randomString <- c(1:n)
  for (i in 1:n) {
    randomString[i] <- paste(sample(c(0:9, letters, LETTERS), length, replace=TRUE),
                             collapse="")}
  return(randomString)
}
set.seed(4)
word_vector <- MHmakeRandomString(n=40, length=5)
word_vector
# [1] "A0ihO" "gIUW4" "Kh6Xp" "sYAXL" "IZvuE" "PtQvw" "zeSEt" "YsCo0" "WfzbU" "5TTIz"
# [11] "oKTOO" "qaaTK" "y4QUd" "C4vNY" "lDplP" "Gjrg8" "UHzUT" "32ZcV" "c7xgl" "5Lr2H"
# [21] "fDgxt" "zFdYO" "hohuK" "vrNU4" "8oRg5" "IYcyl" "pblbO" "SHhq0" "yFjWa" "rzYLr"
# [31] "m2AXf" "QdhtM" "TWpkh" "4499K" "5Bcv8" "0DeqI" "6BdTy" "fJgKX" "tUZeh" "HPso5"

I usually collapse the vector with paste(x, collapse=...) and then print it as a table using gridExtra:

word_sep <- paste(word_vector, collapse=", ")
# [1] "A0ihO, gIUW4, Kh6Xp, sYAXL, IZvuE, PtQvw, zeSEt, YsCo0, WfzbU, 5TTIz, 
# oKTOO, qaaTK, y4QUd, C4vNY, lDplP, Gjrg8, UHzUT, 32ZcV, c7xgl, 5Lr2H, 
# fDgxt, zFdYO, hohuK, vrNU4, 8oRg5, IYcyl, pblbO, SHhq0, yFjWa, rzYLr, 
# m2AXf, QdhtM, TWpkh, 4499K, 5Bcv8, 0DeqI, 6BdTy, fJgKX, tUZeh, HPso5"

library(gridExtra)  # provides tableGrob()
library(cowplot)    # provides plot_grid()
plot_grid(tableGrob(word_sep))

Current table output: in this case I have a really long list of words and a specified table width, so I need line breaks. [image: current table output]

My desired output would look like this hacked version:

word_sep2 <- paste(c(paste(MHmakeRandomString(n=10, length=5), collapse=", "), ",\n",
               paste(MHmakeRandomString(n=10, length=5), collapse=", "), ",\n",
               paste(MHmakeRandomString(n=10, length=5), collapse=", "), ",\n",
               paste(MHmakeRandomString(n=10, length=5), collapse=", ")), collapse="")
word_sep2
# [1] "0ahiL, 2pA5c, dKWuR, 79sw5, MeL1I, KpB1w, UNLSo, LlDlN, jNOcI, tv8R5,
# \norf60, avKFo, jZFxE, U7RQW, SSmxD, czlMt, 75zEB, 2jLwG, 08dmN, H3sVW,
# \nCZwQt, ggumo, wHUpj, Z7WGR, BHYLE, eWksX, Lbt3D, P1Brf, OpEvk, 1WFVa,
# \nEeFd4, afX7B, nyBzF, vbNLz, U7MU0, H4rx4, AKgv8, Kbzri, KKajp, Yg6EW"

plot_grid(tableGrob(word_sep2))

Desired table output: [image: desired table output]

Djork

2 Answers

You may use

gsub("((?:[^,]*,){10}) ", "\\1\n", word_sep)

See the online regex demo.

Details

  • ((?:[^,]*,){10}) - Group 1 (referred to with \1 from the replacement pattern) that matches 10 consecutive occurrences of
    • [^,]* - any 0+ chars other than ,
    • , - a comma
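  • " " (a single space) - the space that follows the 10th comma; it sits outside Group 1, so the replacement \1\n swaps it for a line break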
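
The pattern described above hard-codes 10, but the count can be made a parameter. The following is only a sketch: the helper name break_every_n and its arguments are hypothetical, and perl = TRUE is an assumption added here so that the non-capturing group is handled by PCRE rather than the default engine.

```r
# Hypothetical helper (not part of the original answer): insert a line break
# after every n-th comma of a ", "-separated string.
break_every_n <- function(x, n = 10) {
  # Build the pattern "((?:[^,]*,){n}) " for the requested n
  pattern <- sprintf("((?:[^,]*,){%d}) ", n)
  # perl = TRUE is an assumption made for this sketch
  gsub(pattern, "\\1\n", x, perl = TRUE)
}

words <- paste(letters[1:7], collapse = ", ")
res <- break_every_n(words, n = 3)
cat(res, "\n", sep = "")
# a, b, c,
# d, e, f,
# g
```

Because the space is only replaced after a complete run of n commas, a final incomplete group (here the single "g") is left untouched.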

See the R demo:

MHmakeRandomString <- function(n, length) {
   randomString <- c(1:n)
   for (i in 1:n) {
     randomString[i] <- paste(sample(c(0:9, letters, LETTERS), length, replace=TRUE),
                              collapse="")}
   return(randomString)
}
set.seed(4)
word_vector <- MHmakeRandomString(n=40, length=5)
word_sep <- paste(word_vector, collapse=", ")
f <- gsub("((?:[^,]*,){10}) ", "\\1\n", word_sep)
cat(f, "\n", sep = "")  # print the string so the embedded line breaks are rendered
Wiktor Stribiżew

I guess you can do it with paste, relying on the separator vector being recycled along word_vector:

paste(word_vector, rep(c(", ", ",\n"), c(9,1)), collapse = "", sep = "")
# [1] "A0ihO, gIUW4, Kh6Xp, sYAXL, IZvuE, PtQvw, zeSEt, YsCo0, WfzbU, 5TTIz,\noKTOO, qaaTK, y4QUd, C4vNY, lDplP, Gjrg8, UHzUT, 32ZcV, c7xgl, 5Lr2H,\nfDgxt, zFdYO, hohuK, vrNU4, 8oRg5, IYcyl, pblbO, SHhq0, yFjWa, rzYLr,\nm2AXf, QdhtM, TWpkh, 4499K, 5Bcv8, 0DeqI, 6BdTy, fJgKX, tUZeh, HPso5,\n"

Here's what it looks like when printing it with cat:

res <- paste(word_vector, rep(c(", ", ",\n"), c(9,1)), collapse = "", sep = "")
cat(res)
# A0ihO, gIUW4, Kh6Xp, sYAXL, IZvuE, PtQvw, zeSEt, YsCo0, WfzbU, 5TTIz,
# oKTOO, qaaTK, y4QUd, C4vNY, lDplP, Gjrg8, UHzUT, 32ZcV, c7xgl, 5Lr2H,
# fDgxt, zFdYO, hohuK, vrNU4, 8oRg5, IYcyl, pblbO, SHhq0, yFjWa, rzYLr,
# m2AXf, QdhtM, TWpkh, 4499K, 5Bcv8, 0DeqI, 6BdTy, fJgKX, tUZeh, HPso5,
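
Note that the result above ends with a trailing separator after the last word. If that is unwanted, one possible variation is to split the vector into groups of n first and join in two steps. This is only a sketch; the helper name join_in_rows is hypothetical and not part of the answer.

```r
# Hypothetical helper: join words n per line, without a trailing separator.
join_in_rows <- function(x, n = 10) {
  # Group index: 1 for the first n words, 2 for the next n, and so on
  groups <- ceiling(seq_along(x) / n)
  # Join the words of each group with ", " ...
  rows <- vapply(split(x, groups), paste, character(1), collapse = ", ")
  # ... then join the groups with a comma and a line break
  paste(rows, collapse = ",\n")
}

res <- join_in_rows(c("a", "b", "c", "d", "e", "f", "g"), n = 3)
cat(res, "\n", sep = "")
# a, b, c,
# d, e, f,
# g
```

The same call works for vectors whose length is not a multiple of n; the last line simply holds fewer words.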
talat