R - Generating a stacked bar chart corresponding to the percentage count of categories with overlapping dots in ggplot
I have the following data set:

    require(ggplot2)   # graphs
    require(plyr)      # join data frames in list
    require(stringr)   # clean strings
    require(reshape2)  # melt data
    require(scales)    # handle percentage formats
    require(grid)      # handle units

    set.seed(1)
    df <- data.frame(
      group = c("group a", "group a", "group a", "group a",
                "group b", "group b", "group b", "group b",
                "group b", "group b", "group b",
                "group c", "group c", "group c", "group c",
                "group c", "group c", "group c", "group c"),
      observation = c("obs00001", "obs00002", "obs00003", "obs00004",
                      "obs00005", "obs00006", "obs00007", "obs00008",
                      "obs00009", "obs00010", "obs00011", "obs00012",
                      "obs00013", "obs00014", "obs00015", "obs00016",
                      "obs00017", "obs00018", "obs00019"),
      important_value = sample(1:3, 19, replace = TRUE),
      second_value = runif(n = 19),
      some_random_stuff = runif(n = 19),
      other_indicator = runif(n = 19))

and I would like to generate the following plot:

Ideally, the plot would:

- provide percentage stacked bars derived from the counts of values in important_value
- overlay the bars with a dot that corresponds to an arbitrary value, in this case second_value, placed on the bar, with its values provided on a separate y-axis

My initial code looks like this:
    # melt data frame
    df_mlt <- melt(data = df[, c("observation", "group", "important_value")],
                   id.vars = c("observation", "group"))

    # sort chart
    df_mlt <- df_mlt[order(df_mlt$value, df_mlt$group), ]

    # average population density
    df_avg <- aggregate(x = df$second_value, by = list(df$group),
                        FUN = mean, na.rm = TRUE)

    # graph
    ggplot(df_mlt, aes(x = group, y = value, fill = factor(value))) +
      geom_bar(stat = "identity", position = "fill") +
      scale_y_continuous(labels = percent_format()) +
      geom_point(data = df_avg, aes(x = as.numeric(Group.1), y = x)) +
      ggtitle("some title") +
      theme(plot.title = element_text(lineheight = .8, face = "bold"),
            panel.grid.major = element_blank(),
            panel.grid.minor = element_blank(),
            panel.background = element_blank(),
            panel.margin = unit(c(0, 0, 0, 0), "cm"),
            plot.margin = unit(c(0, 0, 0, 0), "cm"),
            axis.line = element_blank(),
            axis.text.x = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks = element_blank(),
            axis.title.x = element_blank(),
            axis.title.y = element_blank(),
            legend.key = element_rect(linetype = 'blank'))

Naturally, my attempt to run this code does not work. In the context of this particular example, the error is:
    Error in factor(value) : object 'value' not found
Edit

As requested in the comments below, I have added the packages that I use. By way of explanation, some of these packages do not have immediate relevance to the code below. I'm using them to do transformations on the original data that I use. In any case, I decided to include the full list. For example, the package stringr should not be relevant here.
Edit 2

It appears that this particular error is concerned with:

    scale_y_continuous(labels = percent_format()) +

when it is applied to a graph definition in which multiple data sets are used. Consequently, my final question would be: how can I force scale_y... to use the original data and still have the remaining plots defined as in the example?
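For what it's worth, the message `object 'value' not found` looks more like an aesthetic-inheritance problem than a scale problem: `fill = factor(value)` is set in the top-level `aes()`, so `geom_point()` also tries to evaluate it against `df_avg`, which has no `value` column. Below is a minimal sketch under that assumption, not a definitive solution. It uses a cut-down version of the data above, lets `geom_bar()` count the rows itself instead of melting, and uses the column names that the formula interface of `aggregate()` produces. The secondary axis is just a copy of the primary one (an assumption that works here only because `second_value` already lies between 0 and 1) and requires a ggplot2 version that provides `sec_axis()`.

```r
library(ggplot2)
library(scales)

set.seed(1)
# Simplified stand-in for the data frame in the question
df <- data.frame(
  group = rep(c("group a", "group b", "group c"), times = c(4, 7, 8)),
  important_value = sample(1:3, 19, replace = TRUE),
  second_value = runif(19)
)

# Mean of second_value per group; columns are named group and second_value
df_avg <- aggregate(second_value ~ group, data = df, FUN = mean)

p <- ggplot(df, aes(x = group)) +
  # fill is mapped in this layer only, so other layers do not inherit it;
  # the default count stat plus position = "fill" gives shares of counts
  geom_bar(aes(fill = factor(important_value)), position = "fill") +
  # inherit.aes = FALSE makes the point layer use only its own mappings
  geom_point(data = df_avg, aes(x = group, y = second_value),
             inherit.aes = FALSE) +
  scale_y_continuous(labels = percent_format(),
                     sec.axis = sec_axis(~ ., name = "mean second_value"))

print(p)
```

If the dots had a range other than 0 to 1, the point values would need to be rescaled onto the primary axis and the inverse transformation passed to `sec_axis()`.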