regex - Confused with the locale settings in R
I just answered the "Removing characters after a euro symbol in R" question. The R code is not working for me, although it works for others who are on Ubuntu.
This is the code:
x <- "services defined in sow @ price of € 15,896.80 (if executed fro"
euro <- "\u20ac"
# keep the non-space run after the euro sign, drop every other character
gsub(paste(euro , "(\\S+)|."), "\\1", x)
# ""
I think this is about changing the locale settings, but I don't know how to do that.
I'm running RStudio on Windows 8.
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

loaded via namespace (and not attached):
[1] tools_3.2.0
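Before changing any settings, it may help to check what the session itself reports about its locale and about how individual strings are marked. The following is a minimal sketch using only base R (`Sys.getlocale()`, `l10n_info()`, `Encoding()`); the variable names `ctype` and `is_utf8_locale` are illustrative and not part of the original post.

```r
# Which character-type locale is the session using?
ctype <- Sys.getlocale("LC_CTYPE")
print(ctype)

# Does R consider the native encoding to be UTF-8?
is_utf8_locale <- isTRUE(l10n_info()[["UTF-8"]])
print(is_utf8_locale)

# A string built from a \u escape is marked as UTF-8 ...
euro <- "\u20ac"
print(Encoding(euro))

# ... while a pure-ASCII string carries no encoding mark
print(Encoding("price"))
```

With a code page 1252 locale like the one in the session info above, the second check would be expected to print `FALSE`, which is consistent with the pattern and the input string ending up in different encodings.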
With @anada's answer, I need to add the encoding parameter every time I use Unicode characters in a regex. Is there a way to change the default encoding to UTF-8 on Windows?
It seems to be a problem with encoding.
Consider:
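x <- "services defined in sow @ price of € 15,896.80 (if executed fro"
euro <- "\u20ac"  # as defined in the question

# without adjusting the encoding of x
gsub(paste(euro , "(\\S+)|."), "\\1", x)
# [1] ""

# after setting the declared encoding of x
gsub(paste(euro , "(\\S+)|."), "\\1", `Encoding<-`(x, "UTF8"))
# [1] "15,896.80"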
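To avoid repeating the encoding step at every call, one option is to wrap it once. This is only a sketch: `gsub_utf8` is a hypothetical helper, not a base R function, and it uses `enc2utf8()` to convert both the pattern and the input to UTF-8 instead of setting the declared encoding by hand. It has not been checked on the Windows setup from the question, and the input is built from the `\u20ac` escape so the example does not depend on the encoding of the source file.

```r
euro <- "\u20ac"
x <- paste("services defined in sow @ price of", euro,
           "15,896.80 (if executed fro")

# Hypothetical helper: convert pattern and input to UTF-8 before matching,
# then delegate to gsub() with any extra arguments passed through.
gsub_utf8 <- function(pattern, replacement, x, ...) {
  gsub(enc2utf8(pattern), replacement, enc2utf8(x), ...)
}

# Keep the non-space run after the euro sign, drop every other character
result <- gsub_utf8(paste(euro, "(\\S+)|."), "\\1", x)
print(result)
```

This does not change R's default encoding. It only makes sure both sides of the match are in the same encoding at the point where the regex is applied.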