ggplot2 - R Decile percentage plotting
I have a data frame from a home income poll that looks like this:
    id    income   expense
    001   2389.9   1400.5
    003   5499.3   2309.2
    ..    ..       ..
*This is just an example; the actual one has over 5k observations.
I would like to be able to:
- plot the decile distribution by income only.
- create a variable which assigns the tenth of the income distribution that each home is in.
1) What I tried is not what I want; I'd like to know the percentage of homes in each tenth:
    > deciles <- quantile(df$income, prob = seq(0, 1, length = 11), type = 5)
    > deciles
            0%        10%        20%        30%        40%        50%        60%
        231.89    9024.48   13308.24   16945.15   21071.38   25661.58   31607.07
           70%        80%        90%       100%
      40360.98   52927.98   77926.47 1634433.60
2) For the second part, I'm looking for something like this:
    id    income   expense   decile
    001   2389.9   1400.5    3
    003   5499.3   2309.2    5
    009   2245.0   1789.2    3
    ..    ..       ..        ..
Thanks!
I think you are asking whether there is a function that is the inverse of quantile, scaled and ceilinged so that it returns the decile number (1-10) of each observation in the distribution. You could use ecdf, or write your own. Mine looks like this:
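    # Using the convention that decile 1 holds the highest values.
    # Swap -x for x if you want to change that.
    # ties.method = "random" splits tied values at random, so results can vary between runs.
    get_decile <- function(x) ceiling(10 * rank(-x, ties.method = "random") / length(x))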
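For the ecdf route mentioned above, a minimal sketch might look like the following. The name get_decile_ecdf is made up for illustration, and it uses the opposite convention from get_decile: decile 1 holds the lowest values here. Tied incomes all land in the same decile instead of being split at random.

```r
# Sketch only: decile number via the empirical CDF (base R, stats::ecdf).
# Assumption: decile 1 = lowest incomes, decile 10 = highest.
get_decile_ecdf <- function(x) {
  n  <- length(x)
  Fn <- ecdf(x)              # Fn(v) = share of observations <= v
  k  <- round(Fn(x) * n)     # count of observations <= each value, as a whole number
  as.integer(ceiling(10 * k / n))  # (10 * k) / n keeps exact multiples of n/10 exact
}

set.seed(1)
income <- rnorm(1000, 5e4, 2e4)   # simulated incomes, not real poll data
dec <- get_decile_ecdf(income)
table(dec)                        # number of observations per decile
```

With no ties and a sample size divisible by 10, each decile should hold the same number of observations.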
And you can plot mean income by decile like this:
    # Reproducible example!
    your_df <- data.frame(id = 1:1e3, income = rnorm(1e3, 5e4, 2e4), expense = rnorm(1e3, 3e4, 1e4))
    your_df$income_decile <- get_decile(your_df$income)
    library(ggplot2)
    # Note: ggplot2 >= 3.3.0 takes `fun`; older versions used `fun.y`
    ggplot(your_df, aes(x = income_decile, y = income)) +
      stat_summary(fun = mean, geom = "line") +
      scale_x_reverse(breaks = 1:10)
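If you would rather start from the quantile() breaks in the question, one possible sketch is to feed them to cut() and then tabulate percentages per decile. The data frame below is simulated, the names pct_homes and pct_income are made up for illustration, and decile 1 is the lowest tenth here. Note that cut() will complain if the breaks are not unique, which can happen with heavily tied data.

```r
# Sketch only: assign deciles from quantile breaks, then summarise percentages.
set.seed(42)
df <- data.frame(id = 1:1000,
                 income = rlnorm(1000, meanlog = 10, sdlog = 1))  # simulated, skewed incomes

# Same breaks as in the question: 0%, 10%, ..., 100%
breaks <- quantile(df$income, probs = seq(0, 1, length.out = 11), type = 5)

# labels = FALSE returns the bin number (1-10); include.lowest keeps the minimum
df$decile <- cut(df$income, breaks = breaks, labels = FALSE, include.lowest = TRUE)

# Percentage of homes in each decile (roughly 10% each by construction)
pct_homes <- 100 * prop.table(table(df$decile))

# Percentage of total income held by each decile
pct_income <- 100 * tapply(df$income, df$decile, sum) / sum(df$income)

round(pct_income, 1)
```

The share of homes per decile is flat by definition, so the share of total income per decile is probably the more informative thing to plot.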