regex - Confused with the locale settings in R


I just answered the question "Removing characters after a euro symbol in R". The R code does not work for me, although it works for others on Ubuntu.

This is the code:

    x <- "services defined in sow @ price of € 15,896.80 (if executed fro"
    euro <- "\u20ac"
    # paste() joins with a space, so the pattern is "€ (\\S+)|."
    gsub(paste(euro , "(\\S+)|."), "\\1", x)
    # [1] ""

I think changing the locale settings might help, but I don't know how to do that.

I'm running RStudio on Windows 8.

    > sessionInfo()
    R version 3.2.0 (2015-04-16)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 8 x64 (build 9200)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods
    [7] base

    loaded via namespace (and not attached):
    [1] tools_3.2.0

Following @anada's answer, I need to set the encoding every time I use Unicode characters in a regex. Is there a way to change the default encoding to UTF-8 on Windows?

It seems to be a problem with encoding.

Consider:

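    x <- "services defined in sow @ price of € 15,896.80 (if executed fro"
    euro <- "\u20ac"  # as defined in the question

    # Without declaring an encoding, nothing is captured
    gsub(paste(euro , "(\\S+)|."), "\\1", x)
    # [1] ""

    # With the encoding of x set explicitly, the amount is extracted
    gsub(paste(euro , "(\\S+)|."), "\\1", `Encoding<-`(x, "utf8"))
    # [1] "15,896.80"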
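To avoid repeating the encoding step at every call, one option is to wrap it in a small helper. The sketch below is only an illustration: `extract_after_euro` is a hypothetical name, not part of the original answer, and it assumes that converting the input with base R's `enc2utf8()` is enough to make the subject string agree with the UTF-8 pattern. The euro sign is written as `\u20ac` inside the string so that the example does not depend on the encoding of the source file.

```r
# Hypothetical helper (not from the original answer): normalise the input
# to UTF-8 before matching, so pattern and subject share one encoding.
euro <- "\u20ac"

extract_after_euro <- function(x) {
  x_utf8  <- enc2utf8(x)              # convert to UTF-8 and mark it as such
  pattern <- paste(euro, "(\\S+)|.")  # same pattern as above: "€ (\\S+)|."
  gsub(pattern, "\\1", x_utf8)        # keep the captured amount, drop the rest
}

x <- "services defined in sow @ price of \u20ac 15,896.80 (if executed fro"

Encoding(x)             # shows the declared encoding of the string
Sys.getlocale("LC_CTYPE")  # shows the character-type locale of the session
result <- extract_after_euro(x)
print(result)
```

Checking `Encoding(x)` and `Sys.getlocale("LC_CTYPE")` first can help confirm whether a mismatch between the string and the pattern is really the cause on a given machine.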
