r - Get the (t-1) data within groups -
apologies if has been asked before, couldn't find question answers exactly. have data this:
project date price 30/3/2013 2082 b 19/3/2013 1567 b 22/2/2013 1642 c 12/4/2013 1575 c 5/6/2013 1582
i want have column last-instance prices group. example, row 2, last instance price same group 1642. final data this:
project date price lastprice 30/3/2013 2082 0 b 19/3/2013 1567 1642 b 22/2/2013 1642 0 c 12/4/2013 1575 0 c 5/6/2013 1582 1575
how this? main issue i'm facing data may not ordered date not if can take last cell.
here's option. i'd recommend use na
s instead if 0
because 0
actual price.
library(dplyr) df %>% arrange(as.date(date, format = "%d/%m/%y")) %>% group_by(project) %>% mutate(lastprice = lag(price)) # source: local data frame [5 x 4] # groups: project # # project date price lastprice # 1 b 22/2/2013 1642 na # 2 b 19/3/2013 1567 1642 # 3 30/3/2013 2082 na # 4 c 12/4/2013 1575 na # 5 c 5/6/2013 1582 1575
another option use shift
the devel version of data.table
library(data.table) ## v >= 1.9.5 setdt(df)[order(as.date(date, format = "%d/%m/%y")), lastprice := shift(price), = project] # project date price lastprice # 1: 30/3/2013 2082 na # 2: b 19/3/2013 1567 1642 # 3: b 22/2/2013 1642 na # 4: c 12/4/2013 1575 na # 5: c 5/6/2013 1582 1575
or base r
df <- df[order(df$project, as.date(df$date, format = "%d/%m/%y")), ] within(df, lastprice <- ave(price, project, fun = function(x) c(na, x[-length(x)]))) # project date price lastprice # 1 30/3/2013 2082 na # 3 b 22/2/2013 1642 na # 2 b 19/3/2013 1567 1642 # 4 c 12/4/2013 1575 na # 5 c 5/6/2013 1582 1575
as side note, better keep date column in date
class in first place, i'd recommend doing df$date <- as.date(df$date, format = "%d/%m/%y")
once , all.
Comments
Post a Comment