R - Strange error when expanding data.table


We stumbled upon strange behaviour when trying to expand a data.table. The following code works all right:

library(data.table)

dt <- data.table(var1=1:2e3, var2=1:2e3, freq=1:2e3)
system.time(dt.expanded <- dt[ ,list(freq=rep(1,freq)),by=c("var1","var2")])
##    user  system elapsed
##    0.05    0.01    0.06

But using the following data.table

set.seed(1)
dt <- data.table(var1=sample(letters,1000,replace=TRUE),
                 var2=sample(letters,1000,replace=TRUE),
                 freq=sample(1:10,1000,replace=TRUE))

with the same code gives

Error in rep(1, freq) : invalid 'times' argument

My question
Might this be a bug in data.table?

(I got the syntax of this example from R Machine Learning Essentials.)

Edit
The problem seems to be with rep, not data.table. The help page for rep says this about the parameter times:

An integer vector giving the (non-negative) number of times to repeat each element if of length length(x), or to repeat the whole vector if of length 1.

The second data.table produces a times vector whose length differs from that of x, which throws the error.
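The three cases described in the help text can be checked directly in base R. This is a minimal sketch; the variable names a, b and err are only illustrative.

```r
# times of length 1: the whole vector is repeated
a <- rep(1, 3)                   # 1 1 1

# times of length length(x): each element is repeated its own number of times
b <- rep(c("a", "b"), c(2, 3))   # "a" "a" "b" "b" "b"

# times of any other length: rep() stops instead of recycling x
err <- tryCatch(rep(1, c(2, 3)),
                error = function(e) conditionMessage(e))
print(err)                       # invalid 'times' argument
```

The last call mirrors what happens inside a group that holds more than one row: x is the single value 1, while freq has one entry per row.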

My guess: when rep(x, times) is given a vector for times, it insists that x be of the same length (instead of doing the natural thing in R, recycling). Manual recycling works:

dt[ ,.(rep(rep(1,.N),freq)), by=.(var1,var2)]

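So this seems to be a problem in base R (or maybe it's deliberate?), not in data.table. The OP didn't hit the problem in the first example because every (var1, var2) pair there is unique, so by=.(var1,var2) ensured that each group contained exactly one row and the times argument was a scalar.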
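One way to confirm this explanation is to count the rows per group in the second table. The sketch below also shows two alternative ways to do the expansion that are not from the original post and are offered only as possibilities: summing freq within each group so that times is a scalar, and repeating row indices without grouping at all. The names sizes, expanded_a and expanded_b are illustrative.

```r
library(data.table)

set.seed(1)
dt <- data.table(var1 = sample(letters, 1000, replace = TRUE),
                 var2 = sample(letters, 1000, replace = TRUE),
                 freq = sample(1:10, 1000, replace = TRUE))

# Rows per (var1, var2) group; N > 1 means freq is a vector inside that group
sizes <- dt[, .N, by = .(var1, var2)]
print(any(sizes$N > 1))

# Alternative A (assumption): collapse each group's counts to one scalar
expanded_a <- dt[, .(freq = rep(1, sum(freq))), by = .(var1, var2)]

# Alternative B (assumption): repeat each row index freq times, no grouping
expanded_b <- dt[rep(seq_len(nrow(dt)), dt$freq)]

# Both results have one row per counted observation
print(c(nrow(expanded_a), nrow(expanded_b), sum(dt$freq)))
```

With 1000 rows drawn from only 26 x 26 = 676 possible pairs, some groups must contain more than one row, which is exactly the situation in which rep(1, freq) fails.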

