r - Why is rbindXts's dup parameter not exposed?
I want to rbind() a bunch of xts objects. They should not overlap, but if they do overlap I don't want the row added twice: just choose one or the other. (I could call duplicated(index(x)) and then delete the duplicates.)
(Example code showing the problem, and the desired output, is below.)
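The delete-afterwards fallback mentioned above can be sketched roughly as follows. This is a minimal illustration, not anything from the xts internals: the objects `a` and `b` and the fixed dates are made up for the example, and it assumes that keeping the first row for each repeated timestamp is acceptable.

```r
library(xts)

# Two hypothetical series that overlap on 2015-07-09 and 2015-07-10
a <- xts(1:3,   as.Date("2015-07-09") + 0:2)
b <- xts(20:22, as.Date("2015-07-08") + 0:2)

# Plain rbind() keeps both rows for each overlapping timestamp
combined <- rbind(a, b)

# duplicated() flags the second and later occurrences of each index value,
# so negating it keeps exactly one row per timestamp
deduped <- combined[!duplicated(index(combined)), ]
```

This makes a second pass over the combined object, which is the extra work the dup parameter would avoid.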
Poking around, I found that the C source has a dup parameter. It defaults to FALSE, and when I set it to TRUE I get the behaviour I wanted:
.External("rbindXts", dup = TRUE, x, y, z, PACKAGE = "xts")
Is there a good reason this wasn't exposed in the rbind() interface? (By "good" reason I mean something along the lines of it being known to be buggy, or having bad performance on large data, or something like that.) Or is it a more practical reason, such as no-one having had the time to write the tests and documentation yet?
UPDATE: I went back to my code and found I wasn't using rbind.xts, but instead the do.call.rbind() function described here: https://stackoverflow.com/a/9729804/841830, due to a memory issue in rbind.xts.
(I also found my own (!) question from 3 years ago, which describes how to delete the duplicates: How to remove a row from zoo/xts object, given a timestamp)
UPDATE #2:
do.call.rbind can be modified to use the .External call:
do.call.rbind.xts.no_dup <- function(lst) {
  # Merge neighbouring pairs until only one object is left
  while (length(lst) > 1) {
    idxlst <- seq(from = 1, to = length(lst), by = 2)
    lst <- lapply(idxlst, function(i) {
      # An odd element at the end has no partner; pass it through unchanged
      if (i == length(lst)) {
        return(lst[[i]])
      }
      return(.External("rbindXts", dup = TRUE, lst[[i]], lst[[i + 1]], PACKAGE = "xts"))
    })
  }
  lst[[1]]
}
I tested it on the same test data shown here: https://stackoverflow.com/a/12029366/841830. It has the same performance as do.call.rbind and produces the same 2.8 million row xts object (which is good). Of course, that test data has no duplicates, so maybe it is not a fair test?
library(xts)
x <- xts(1:10, Sys.Date() + 1:10)
y <- xts(50:55, Sys.Date() + (-1:-6))
z <- xts(20:25, Sys.Date() + (-2:+3))
rbind(x, y, z)
This gives the following output (with *** showing the undesired lines):
2015-07-02   55
2015-07-03   54
2015-07-04   53
2015-07-05   52
2015-07-06   51
2015-07-06   20  ***
2015-07-07   50
2015-07-07   21  ***
2015-07-08   22
2015-07-09    1
2015-07-09   23  ***
2015-07-10    2
2015-07-10   24  ***
2015-07-11    3
2015-07-11   25  ***
2015-07-12    4
2015-07-13    5
2015-07-14    6
2015-07-15    7
2015-07-16    8
2015-07-17    9
2015-07-18   10
Whereas .External("rbindXts", dup = TRUE, x, y, z, PACKAGE = "xts") gives:
2015-07-02   55
2015-07-03   54
2015-07-04   53
2015-07-05   52
2015-07-06   51
2015-07-07   50
2015-07-08   22
2015-07-09    1
2015-07-10    2
2015-07-11    3
2015-07-12    4
2015-07-13    5
2015-07-14    6
2015-07-15    7
2015-07-16    8
2015-07-17    9
2015-07-18   10
Looking at the commit where it was added, it seems to be experimental. It's not exposed for practical reasons: it's untested and undocumented.