After the niceViolin() function, here’s how to make nice scatter plots easily!
Let’s first load the demo data. This data set comes with base R
(meaning you have it too and can directly type this command into your R
console).
data("mtcars")
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
Source the function from my GitHub:
source("https://raw.githubusercontent.com/RemPsyc/niceplots/master/niceScatterFunction.R")
Make the basic plot
*Warning:* running the function below for the first time will install and load the following
package (if it is not already installed and loaded on your machine): ggplot2.
Note: This will run many lines of code on your console and could take a few minutes.
niceScatter(data = mtcars,
predictor = wt,
response = mpg)
Save a high-resolution image file to a specified directory
ggsave('nicescatterplothere.tiff', width = 7, height = 7, units = 'in', dpi = 300,
path = "D:/R treasures/")
# Change the path to where you would like to save it.
# If you copy-paste your path name, remember to use "R" slashes ('/' rather than '\').
# Also remember to specify the .tiff extension of the file.
Pro tip: Swap .tiff for .pdf or .eps to get scalable vector graphics for high-resolution submissions to scientific journals!
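As a minimal sketch of the vector-format tip above, assuming ggplot2 is installed: the plot object `p` and the temporary output path are placeholders for illustration, and a plain ggplot2 scatter plot stands in for the `niceScatter()` output. Because PDF is a vector format, the `dpi` argument only matters for raster formats such as .tiff and is left out here.

```r
library(ggplot2)

# Stand-in plot for illustration; in practice this would be your niceScatter() plot
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Placeholder location (a temporary folder); replace with your own path
out_file <- file.path(tempdir(), "nicescatterplothere.pdf")

# The .pdf extension tells ggsave() to use a vector (PDF) device
ggsave(out_file, plot = p, width = 7, height = 7, units = "in")

# Confirm that the file was written
file.exists(out_file)
```

Passing `plot = p` explicitly avoids relying on whichever plot happened to be displayed last.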
Change x- and y- axis labels
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
ytitle = "Miles/(US) gallon",
xtitle = "Weight (1000 lbs)")
Have points “jittered”
Jittering moves the points randomly by a small amount to prevent overplotting (when two or more points overlap, thus hiding information).
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
has.jitter = TRUE)
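To see why jittering helps, here is a small sketch using base R’s `jitter()` on two coarse variables from the same data set. This only illustrates the idea; it is not necessarily how `niceScatter()` implements `has.jitter`, and the `amount` of 0.2 is an arbitrary choice for the example.

```r
# cyl and gear only take a few distinct values,
# so many cars share exactly the same (x, y) position
n_overlapping <- sum(duplicated(mtcars[, c("cyl", "gear")]))
n_overlapping  # number of points hidden behind another point

set.seed(42)  # jitter is random; a seed makes the result reproducible

# Shift each value by a random amount between -0.2 and +0.2
cyl_jittered  <- jitter(mtcars$cyl,  amount = 0.2)
gear_jittered <- jitter(mtcars$gear, amount = 0.2)

# Count the points that still sit exactly on top of another one
n_still_overlapping <- sum(duplicated(data.frame(cyl_jittered, gear_jittered)))
n_still_overlapping
```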
Change the transparency of the points
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
alpha = 1) # default is 0.7
Remove points
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
has.points = FALSE,
has.jitter = FALSE)
Add confidence band
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
has.confband = TRUE)
Set x- and y- scales manually
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
xmin = 1,
xmax = 6,
xby = 1,
ymin = 10,
ymax = 35,
yby = 5)
Change plot colour
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
colours = "blueviolet")
Add correlation coefficient and p-value to plot
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
has.r = TRUE,
has.p = TRUE)
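If you want to check the values shown on the plot, here is a small sketch with base R’s `cor.test()`. It assumes a Pearson correlation (the `cor.test()` default); I am not asserting that this is exactly what `niceScatter()` computes internally.

```r
# Pearson correlation between weight and fuel consumption
result <- cor.test(mtcars$wt, mtcars$mpg)

r <- unname(result$estimate)  # correlation coefficient
p <- result$p.value           # p-value of the test

round(r, 2)                   # strong negative correlation, roughly -0.87
format.pval(p, eps = 0.001)   # reported as "<0.001" when very small
```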
Change location of correlation coefficient or p-value
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
has.r = TRUE,
r.x = 4,
r.y = 25,
has.p = TRUE,
p.x = 5,
p.y = 20)
Plot by group
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl)) # here we need to specify 'cyl' as a factor because it is numeric by default
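A quick base R check shows why the conversion is needed: `cyl` is stored as a number, and `factor()` turns it into a categorical variable with one level per group.

```r
class(mtcars$cyl)   # "numeric"

cyl_factor <- factor(mtcars$cyl)
class(cyl_factor)   # "factor"
levels(cyl_factor)  # "4" "6" "8"
```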
Use full range on the slope/confidence band
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.fullrange = TRUE)
Add a legend
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.legend = TRUE)
Change order of labels on the legend
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.legend = TRUE,
groups.order = c(8,4,6)) # These are the levels of 'mtcars$cyl', so we place lvl 8 first, then lvl 4, etc.
Change legend labels
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.legend = TRUE,
groups.names = c("Weak","Average","Powerful")) # Warning: this applies after changing the order of the levels
**Warning**: This only changes the labels, and it applies after the order of the levels has been changed!
If you also need `groups.names`, always set `groups.order` first and then write the names in that new order!
This makes sure that the right labels end up on the right groups!
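The same order-then-label logic can be sketched with a plain base R factor. This is only an analogy for how `groups.order` and `groups.names` interact, not the function’s actual internals: labels are matched to levels by position, so they must follow the reordered levels.

```r
# Default level order: 4, 6, 8
levels(factor(mtcars$cyl))

# Step 1: reorder the levels (8 first, then 4, then 6)
cyl_reordered <- factor(mtcars$cyl, levels = c(8, 4, 6))
levels(cyl_reordered)  # "8" "4" "6"

# Step 2: labels are applied by position, so they follow the NEW order
cyl_named <- factor(mtcars$cyl,
                    levels = c(8, 4, 6),
                    labels = c("Powerful", "Weak", "Average"))

# 14 cars have 8 cylinders, 11 have 4, and 7 have 6
table(cyl_named)
```

Had the labels been left as c("Weak","Average","Powerful") after reordering, the 8-cylinder cars would have been labelled "Weak".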
Add a title to legend
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.legend = TRUE,
legend.title = "Cylinders")
Plot by group + manually specify colours
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
colours = c("burlywood","darkgoldenrod","chocolate"))
Plot by group + use different line types for each group
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.linetype = TRUE)
Plot by group + use different point shapes for each group
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.shape = TRUE)
Plot by group, point shapes, line types, legend + no colours (black and white)
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
group.variable = factor(mtcars$cyl),
has.legend = TRUE,
legend.title = "Cylinders",
has.linetype = TRUE,
has.shape = TRUE,
colours = rep("black",3))
Putting it all together
If you’d like to see all available options at once (a bit long):
niceScatter(data = mtcars,
predictor = wt,
response = mpg,
ytitle = "Miles/(US) gallon",
xtitle = "Weight (1000 lbs)",
has.points = FALSE,
has.jitter = TRUE,
alpha = 1,
has.confband = TRUE,
has.fullrange = FALSE,
group.variable = factor(mtcars$cyl),
has.linetype = TRUE,
has.shape = TRUE,
xmin = 1,
xmax = 6,
xby = 1,
ymin = 10,
ymax = 35,
yby = 5,
has.r = TRUE,
has.p = TRUE,
r.x = 5.5,
r.y = 25,
colours = c("burlywood","darkgoldenrod","chocolate"),
has.legend = TRUE,
legend.title = "Cylinders",
groups.names = c("Weak","Average","Powerful"))
Special situation: Add group average
There’s no straightforward way to add a group average, so here’s a hack to do it. We first have to create a second data set with another “group” that will be used as the average.
new.Data <- mtcars # This simply copies the 'mtcars' dataset
new.Data$cyl <- "Average" # That would be your "Group" variable normally
# And this operation fills all cells of that column with the word "Average" to identify our new 'group'
XData <- rbind(mtcars,new.Data) # This adds the new "Average" group rows to the original data rows
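Here is a quick, self-contained check of what the combined data set looks like (it rebuilds `new.Data` and `XData` from the steps above so it can run on its own). Note that mixing the numeric `cyl` values with the word "Average" turns the whole column into text, which is why the group levels are later written as "4", "6" and "8" in quotes.

```r
# Rebuild the combined data set from the steps above
new.Data <- mtcars
new.Data$cyl <- "Average"
XData <- rbind(mtcars, new.Data)

nrow(XData)        # 64: the 32 original rows plus 32 "Average" rows
class(XData$cyl)   # "character"
table(XData$cyl)   # 11 x "4", 7 x "6", 14 x "8", 32 x "Average"
```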
Then we need to create a FIRST layer of just the slopes. To emphasize the group average, we add transparency to all group lines except the average one (with the new argument manual.slope.alpha).
(p <- niceScatter(data = XData,
predictor = wt,
response = mpg,
has.points = FALSE,
has.legend = TRUE,
group.variable = XData$cyl,
colours = c("black", "#00BA38", "#619CFF", "#F8766D"), # We add colours manually because we want average to be black to stand out
groups.order = c("Average","4","6","8"), # We do this to have average on top since it's the most important
manual.slope.alpha = c(1,0.5,0.5,0.5))) # This adds 50% transparency to all lines except the first one (Average) which is 100%
Finally, we are ready to add a SECOND layer of just the points on top of our previous layer. We use standard ggplot syntax for this.
p + geom_point(data = mtcars,
size = 2,
alpha = 0.5,
shape = 16, # We use shape 16 because the default shape 19 sometimes causes problems when exporting to PDF
mapping = aes(x = wt,
y = mpg,
colour = factor(cyl),
fill = factor(cyl)))
If you’d like instead to still show the group points but only the black average line, you can do the following as first layer:
(p <- niceScatter(data = mtcars,
predictor = wt,
response = mpg,
has.points = FALSE,
has.legend = TRUE, # This argument is important else the next legend won't appear on the second layer!
colours = "black"))
Then to add the points as second layer we do the same as before:
p + geom_point(data = mtcars,
size = 2,
alpha = 0.5,
shape = 16,
mapping = aes(x = wt,
y = mpg,
colour = factor(cyl)))
Concluding Statement
Make sure to check this page again if you come back to the code after a while or if you encounter errors, as I periodically update and improve the code.
You can always edit the function to suit your purposes, or contact me for questions or requests to modify this function at https://remi-theriault.com/contact! Thanks for reading my guide! :)
Updated 2020-11-02 (added: arguments has.r & has.p)