R - Strange error when expanding data.table
We stumbled upon strange behaviour when trying to expand a data.table. The following code works all right:
library(data.table)
dt <- data.table(var1=1:2e3, var2=1:2e3, freq=1:2e3)
system.time(dt.expanded <- dt[ ,list(freq=rep(1,freq)),by=c("var1","var2")])
##    user  system elapsed
##    0.05    0.01    0.06
But using the following data.table
set.seed(1)
dt <- data.table(var1=sample(letters,1000,replace=TRUE), var2=sample(letters,1000,replace=TRUE), freq=sample(1:10,1000,replace=TRUE))
with the same code gives
Error in rep(1, freq) : invalid 'times' argument
My question: might this be a bug in data.table?
(I got the syntax of the example from "R Machine Learning Essentials".)
Edit
The problem seems to be with rep, not with data.table. The help page for rep says this about the parameter times:
an integer vector giving the (non-negative) number of times to repeat each element if of length length(x), or to repeat the whole vector if of length 1.
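The second data.table creates a times whose length differs from that of x, which throws the error. With 1000 rows and only 26 x 26 = 676 possible (var1, var2) pairs, at least one group must contain several rows, so freq has length greater than 1 in that group while x = 1 has length 1.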
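The three cases can be checked in base R, without data.table. This is a minimal sketch; the variable names whole, each and msg are made up here for illustration.

```r
# times of length 1: the whole vector is repeated
whole <- rep(1, 3)

# times of length length(x): each element is repeated its own number of times
each <- rep(c(1, 2), c(2, 3))

# times of any other length: rep() stops with an error instead of recycling x
msg <- tryCatch(rep(1, c(2, 3)), error = function(e) conditionMessage(e))

print(whole)
print(each)
print(msg)
```

The last call mirrors what happens inside a group that has two rows: x is the single value 1 and times is a vector of length 2.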
My guess: when rep(x, times) is given a vector times, it insists that x have the same length (instead of doing the natural thing in R, recycling). Manual recycling works:
dt[ ,.(rep(rep(1,.N),freq)), by=.(var1,var2)]
So this seems to be a problem in base R (or maybe it's deliberate?), not in data.table. The OP didn't hit the problem in the first example because by=.(var1,var2) ensured that each group had only one row, so the times argument was a scalar.
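To check this on the second data.table, one can count the rows per group and then compare the manual-recycling call with an alternative that avoids a per-group rep altogether. This is only a sketch, assuming the data.table package is installed; the names multi and dt.expanded2, and the row-index approach itself, are not from the posts above.

```r
library(data.table)

set.seed(1)
dt <- data.table(var1 = sample(letters, 1000, replace = TRUE),
                 var2 = sample(letters, 1000, replace = TRUE),
                 freq = sample(1:10, 1000, replace = TRUE))

# Groups with more than one row hand rep() a 'times' vector of length > 1
multi <- dt[, .N, by = .(var1, var2)][N > 1]
print(nrow(multi) > 0)

# The manual-recycling call from above, with a named output column
dt.expanded <- dt[, .(freq = rep(rep(1, .N), freq)), by = .(var1, var2)]

# Alternative (assumption, not from the original posts):
# repeat each row index freq times and subset by those indices
dt.expanded2 <- dt[rep(seq_len(nrow(dt)), dt$freq)]

# Both results should contain sum(freq) rows
print(nrow(dt.expanded) == sum(dt$freq))
print(nrow(dt.expanded2) == sum(dt$freq))
```

The row-index version keeps the original freq column on every repeated row, whereas the grouped version returns a column of ones, so the two are equivalent only in row count per group.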