Scatter plot ggplot2 point size
1/21/2024

To start, the code below loads several libraries and sets scipen = 999 so I don't get scientific notation in my graphs:

    library(ggplot2)
    library(ggrepel)
    library(dplyr)
    options(scipen = 999)

Here is the data structure for the ma_data data frame:

    head(ma_data)

The next group of code creates a ggplot scatter plot with that data, including sizing points by total county population and coloring them by region. geom_smooth() adds a linear regression line, and I also tweak a couple of ggplot design defaults. The graph is stored in a variable called ma_graph.

    ma_graph <- ggplot(ma_data, aes(x = PctBachelors, y = CovidPer100K, size = AdultPop, color = Region)) +
      geom_point() +
      scale_x_continuous(labels = scales::percent) +
      geom_smooth(method = "lm", se = FALSE, color = "#0072B2", linetype = "dotted") +
      theme_minimal() +
      guides(size = "none")

That creates a basic scatter plot.

[Image: basic scatter plot. Credit: Sharon Machlis, IDG]

However, it's currently impossible to know which points represent which counties. ggplot's geom_text() function adds labels to all the points:

    ma_graph + geom_text(aes(label = Place))

[Image: ggplot scatter plot with default text labels. Credit: Sharon Machlis]

geom_text() uses the same color and size aesthetics as the graph by default. But sizing the text based on point size makes the small points' labels hard to read. I can stop that behavior by setting size = NULL inside the layer's aes().

It can also be a bit difficult to read labels when they're right on top of the points. geom_text() lets you "nudge" them a bit higher with the nudge_y argument.

There's another built-in ggplot labeling function called geom_label(), which is similar to geom_text() but adds a box around the text. The following code uses geom_label():

    ma_graph + geom_label(aes(label = Place, size = NULL), nudge_y = 0.7)

[Image: scatter plot labeled with geom_label(). Credit: Sharon Machlis, IDG]

These functions work well when points are spaced out. But if data points are closer together, labels can end up on top of each other, especially in a smaller graph. I added a fake data point close to Middlesex County in the Massachusetts data. If I re-run the code with the new data, Fake blocks part of the Middlesex label.

    ma_graph2 <- ggplot(ma_data_fake, aes(x = PctBachelors, y = CovidPer100K, size = AdultPop, color = Region)) +
      geom_point() +
      scale_x_continuous(labels = scales::percent) +
      geom_smooth(method = "lm", se = FALSE, color = "#0072B2", linetype = "dotted") +
      theme_minimal() +
      guides(size = "none")

    ma_graph2

    ma_graph2 + geom_label(aes(label = Place, size = NULL, color = NULL), nudge_y = 0.75)

[Image: ggplot2 scatter plot with default geom_label() labels on top of each other. Credit: Sharon Machlis, IDG]

Enter ggrepel.

Creating non-overlapping labels with ggrepel
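The overlapping-label problem described above is what ggrepel's geom_label_repel() is designed to handle: it is a drop-in replacement for geom_label() that pushes labels away from each other and from the points. The following is only a minimal sketch, not the code from the original post. The toy_data data frame and its values are made up for illustration (they are not the ma_data or ma_data_fake data), and the seed value is an arbitrary choice that just makes the label placement repeatable.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical toy data with the same column names as the post's data frames.
# All values are invented; "County D" sits almost on top of "County C" so the
# default geom_label() boxes would collide.
toy_data <- data.frame(
  Place        = c("County A", "County B", "County C", "County D", "County E"),
  PctBachelors = c(0.30, 0.45, 0.55, 0.56, 0.25),
  CovidPer100K = c(20, 15, 10, 10.2, 22),
  AdultPop     = c(100000, 500000, 1200000, 50000, 80000),
  Region       = c("West", "Central", "East", "East", "West")
)

toy_graph <- ggplot(toy_data, aes(x = PctBachelors, y = CovidPer100K,
                                  size = AdultPop, color = Region)) +
  geom_point() +
  scale_x_continuous(labels = scales::percent) +
  theme_minimal() +
  guides(size = "none")

# geom_label_repel() takes the same label aesthetic as geom_label().
# size = NULL and color = NULL stop the labels from inheriting the point
# size and region color; seed fixes the otherwise random label layout.
toy_repel <- toy_graph +
  geom_label_repel(aes(label = Place, size = NULL, color = NULL), seed = 42)

toy_repel
```

Swapping in geom_text_repel() gives the same repelling behavior without the box around each label.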