Questions tagged [stringi]

stringi is THE R package for fast, correct, consistent and convenient string/text processing in each locale and any native character encoding. The use of the ICU library gives R users a platform-independent set of functions known to Java, Perl, Python, PHP, and Ruby programmers.

R's stringi package provides a platform-independent way of manipulating strings. It is built on the ICU library and has a syntax inspired by the stringr package.

298 questions
30
votes
7 answers

Error in R: (Package which is only available in source form, and may need compilation of C/C++/Fortran)

I'm trying to install the 'yaml' and 'stringi' packages in R-Studio, and it keeps giving me these errors: > install.packages("stringi") Package which is only available in source form, and may need compilation of C/C++/Fortran: ‘stringi’ These will…
wanax
  • 301
  • 1
  • 3
  • 4
26
votes
1 answer

gsub speed vs pattern length

I've been using gsub extensively lately, and I noticed that short patterns run faster than long ones, which is not surprising. Here's fully reproducible code: library(microbenchmark) set.seed(12345) n = 0 rpt = seq(20, 1461, 20) msecFF =…
Alexey Ferapontov
  • 5,029
  • 4
  • 22
  • 39
24
votes
6 answers

package 'stringi' does not work after updating to R3.2.1

I saw a version of this question posted, but still did not see the answer. I am trying to use ggplot2 but get the following errors (everything worked this morning using R3.0.2 'frisbee sailing' with RStudio version 0.98.1102. I updated both R and…
Kodiakflds
  • 603
  • 1
  • 4
  • 15
20
votes
5 answers

How to install stringi from local file (ABSOLUTELY no Internet Access)

I am working on a remote server using RStudio. This server has no access to the Internet. I would like to install the package "stringi." I have looked at this stackoverflow article, but whenever I use the…
Katya Willard
  • 2,152
  • 4
  • 22
  • 43
19
votes
2 answers

R/regex with stringi/ICU: why is a '+' considered a non-[:punct:] character?

I'm trying to remove non-alphabet characters from a vector of strings. I thought the [:punct:] grouping would cover it, but it seems to ignore the +. Does this belong to another group of characters? library(stringi) string1 <- c( "this is a…
screechOwl
  • 27,310
  • 61
  • 158
  • 267
18
votes
5 answers

Subset string by counting specific characters

I have the following strings: strings <- c("ABBSDGNHNGA", "AABSDGDRY", "AGNAFG", "GGGDSRTYHG") I want to cut off each string as soon as the number of occurrences of A, G and N reaches a certain value, say 3. In that case, the result should…
Nivel
  • 629
  • 4
  • 12
17
votes
2 answers

Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called 'stringi'

When I use library(Hmisc) I get the following error Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called 'stringi' Error: package 'ggplot2' could not be loaded As well, if I use…
Marta
  • 171
  • 1
  • 1
  • 3
15
votes
2 answers

Filter by multiple patterns with filter() and str_detect()

I would like to filter a dataframe using filter() and str_detect() matching for multiple patterns without multiple str_detect() function calls. In the example below I would like to filter the dataframe df to show only rows containing the letters a f…
user6571411
  • 2,749
  • 4
  • 16
  • 29
14
votes
6 answers

Overlapping matches in R

I have searched and was able to find this forum discussion for achieving the effect of overlapping matches. I also found the following SO question speaking of finding indexes to perform this task, but was not able to find anything concise about…
hwnd
  • 69,796
  • 4
  • 95
  • 132
12
votes
5 answers

Installation of packages ‘stringr’ and ‘stringi’ had non-zero exit status

Please help me install the stringr and stringi packages in R. The result is: install.packages("stringi") Installing package into ‘C:/Users/kozlovpy/Documents/R/win-library/3.2’ (as ‘lib’ is unspecified) trying URL…
Pavel Kozlov
  • 131
  • 1
  • 1
  • 4
12
votes
2 answers

How to detect sentence boundaries with OpenNLP and stringi?

I want to break the following string into sentences: library(NLP) # NLP_0.1-7 string <- as.String("Mr. Brown comes. He says hello. i give him coffee.") I want to demonstrate two different ways. One comes from package openNLP: library(openNLP) #…
SRRussel
  • 121
  • 4
12
votes
2 answers

Split keep repeated delimiter

I'm trying to use the stringi package to split on a delimiter (potentially the delimiter is repeated) yet keep the delimiter. This is similar to this question I asked moons ago: R split on delimiter (split) keep the delimiter (split) but the…
Tyler Rinker
  • 108,132
  • 65
  • 322
  • 519
11
votes
6 answers

How to install stringi library from archive and install the local icu52l.zip

We're bumbling through making some R code work in a production environment and as part of that we're installing some R packages as follows: # Default directories and mirrors WORKING_DIR <- "/srv/foo/bar/baz" LIB_DIR <- paste( WORKING_DIR,…
Adam Taylor
  • 7,534
  • 8
  • 44
  • 54
10
votes
2 answers

Extract last word in a string after comma if there are multiple words else the first word

I have data where the words are as follows: location<- c("xyz, sss, New Zealand", "USA", "Pris,France") id<- c(1,2,3) df<-data.frame(location,id) I would like to extract the country name from the data. The tricky part is if I extract just the last…
user3570187
  • 1,743
  • 3
  • 17
  • 34
9
votes
2 answers

dplyr filter condition to distinguish between unicode symbol and its unicode representation

I am trying to filter the Symbol column based on whether it's of the form \uxxxx. This is easy visually, that is, some look like $, ¢, £, and others like \u058f, \u060b, \u07fe. But I cannot seem to figure it out using stringi /…
stevec
  • 41,291
  • 27
  • 223
  • 311