Introduction to ggpointless

ggpointless is a small extension of the ggplot2 package that provides additional layers:

  1. geom_pointless() & stat_pointless()
  2. geom_lexis() & stat_lexis()
  3. geom_chaikin() & stat_chaikin()
  4. geom_catenary() & stat_catenary()

geom_pointless

geom_pointless() is a layer to easily add minimal emphasis to your plots. The function takes it’s power from stat_pointless(), which does all the work, but is not usually in the spotlight.

library(ggplot2)
library(ggpointless)

x <- seq(-pi, pi, length.out = 100)
y <- outer(x, 1:5, function(x, y) sin(x * y))

df1 <- data.frame(
  var1 = x,
  var2 = rowSums(y)
)

p <- ggplot(df1, aes(x = var1, y = var2))
p + geom_pointless(location = c("first", "last", "minimum", "maximum"))

As you see, just adding geom_pointless() to ggplot(...) is not terribly useful on its own but when it teams up with geom_line() and friends, hopefully.

p <- p + geom_line()
p + geom_pointless(location = "all", size = 3)

geom_pointless() behaves like geom_point() does with the addition of a location argument. You can set it to "first", "last" (the default), "minimum", "maximum", and "all"; where "all" is just shorthand to select "first", "last", "minimum" and "maximum".

In addition, you can use the computed variable location and map it to an aesthetic, e.g. color.

p + geom_pointless(aes(color = after_stat(location)),
  location = "all",
  size = 3
) +
  theme(legend.position = "bottom")

Order and orientation

The locations are determined in the order in which they appear in the data, like geom_path() does compared to geom_line(). This can be seen in the next example, with sample data kindly taken from the geomtextpath package:

x <- seq(5, -1, length.out = 1000) * pi
spiral <- data.frame(
  var1 = sin(x) * 1:1000,
  var2 = cos(x) * 1:1000
)

p <- ggplot(spiral) +
  geom_path() +
  coord_equal(xlim = c(-1000, 1000), ylim = c(-1000, 1000)) +
  theme(legend.position = "none")

p + aes(x = var1, y = var2) +
  geom_pointless(aes(color = after_stat(location)),
    location = "all",
    size = 3
  ) +
  labs(subtitle = "orientation = 'x'")

p + aes(y = var1, x = var2) +
  geom_pointless(aes(color = after_stat(location)),
    location = "all",
    size = 3
  ) +
  labs(subtitle = "orientation = 'y'")

As you see from the first of the last two examples "first" and "minimum" overlap, and "first" wins over "minimum". If location is set to "all", then the order in which points are plotted from top to bottom is: "first" > "last" > "minimum" > "maximum".

Otherwise, the order is determined as specified in the location argument, which also applies to the order of the legend key labels.

cols <- c(
  "first" = "#f8766d",
  "last" = "#7cae00",
  "minimum" = "#00bfc4",
  "maximum" = "#c77cff"
)

df2 <- data.frame(
  var1 = 1:2,
  var2 = 1:2
)

p <- ggplot(df2, aes(x = var1, y = var2)) +
  geom_path() +
  coord_equal() +
  scale_color_manual(values = cols)

# same as location = 'all'
p + geom_pointless(aes(color = after_stat(location)),
  location = c("first", "last", "minimum", "maximum"),
  size = 3
) +
  labs(subtitle = "same as location = 'all'")

# reversed order
p + geom_pointless(aes(color = after_stat(location)),
  location = c("maximum", "minimum", "last", "first"),
  size = 3
) +
  labs(subtitle = "custom order")

# same as location = 'all' again
p + geom_pointless(aes(color = after_stat(location)),
  location = c("maximum", "minimum", "last", "first", "all"),
  size = 3
) +
  labs(subtitle = "same as location = 'all' again")

Pick a different geom

Just like all stat_* functions, stat_pointless() has a default geom, which is "point". This means in reverse that you can highlight e.g. minimum and maximum in another way, for example with a horizontal line.

set.seed(42)
ggplot(data.frame(x = 1:10, y = sample(1:10)), aes(x, y)) +
  geom_line() +
  stat_pointless(
    aes(yintercept = y, color = after_stat(location)),
    location = c("minimum", "maximum"),
    geom = "hline"
  ) +
  guides(color = guide_legend(reverse = TRUE))

Plotting lifelines with geom_lexis

geom_lexis() draws a lifeline for an event from it’s start to it’s end. The required aesthetics are x and xend. Here is an example:

df1 <- data.frame(
  key = c("A", "B", "B", "C", "D"),
  x = c(0, 1, 6, 5, 6),
  y = c(5, 4, 10, 8, 10)
)

p <- ggplot(df1, aes(x = x, xend = y, color = key)) +
  coord_equal()
p + geom_lexis()

Also, if there is a gap in an event a horizontal line is drawn, which can be hidden setting gap_filler = FALSE.

p + geom_lexis(gap_filler = FALSE)

You can further style the appearance of your plot using the additional arguments. If you e.g. want to make a visual distinction between the ascending lines and the connecting lines, use after_stat() to map the type variable to the linetype aesthetic (or any other aesthetic). The variable type is created by geom_lexis() and takes two values: “solid” and “dotted”; so you might also want to call scale_linettype_identity.

p + geom_lexis(
  aes(linetype = after_stat(type)),
  point_show = FALSE
) +
  scale_linetype_identity()

Date and POSIXct classes

You see the coordinates on the vertical y-axis show the difference between x and xend aesthetics. The “magic” of geom_lexis() happens in stat_lexis() when the input data is transformed and the calculations are performed.


df1 <- data.frame(
  start = c(2019, 2021),
  end = c(2022, 2022),
  key = c("A", "B")
)

ggplot(df1, aes(x = start, xend = end, group = key)) +
  geom_lexis() +
  coord_fixed()

Keeping in mind that dates are internally represented as the number of days, and the POSIXct class in turn represents seconds since some origin, the y-scale values in the next plots should come as no surprise.

# Date
fun <- function(i, class) as.Date(paste0(i, "-01-01"))
df1[, c("start", "end")] <- lapply(df1[, c("start", "end")], fun)
p1 <- ggplot(df1, aes(x = start, xend = end, group = key)) +
  geom_lexis() +
  labs(y = "days") +
  coord_fixed()

# POSIXct
df2 <- df1
df2[, c("start", "end")] <- lapply(df2[, c("start", "end")], as.POSIXct)
p2 <- ggplot(df2, aes(x = start, xend = end, group = key)) +
  geom_lexis() +
  labs(y = "seconds") +
  coord_fixed()

p1
p2

In order to change the breaks and labels of the vertical scale to, say, years, we make the assumption that 1 year has 365.25 days, or 365.25 * 86400 seconds.

# years, roughly
p1 +
  scale_y_continuous(
    breaks = 0:3 * 365.25, # or for p2: 0:3*365.25*86400
    labels = function(i) floor(i / 365.25) # floor(i / 365.25*86400)
  ) +
  labs(y = "years")

geom_chaikin

The algorithm of the geom_chaikin() function iteratively cuts off the ragged corners, as you can see in the example below.

set.seed(42)
dat <- data.frame(
  x = seq.int(10),
  y = sample(15:30, 10)
)

p1 <- ggplot(dat, aes(x, y)) +
  geom_line(linetype = "12")

p1 +
  geom_chaikin()

geom_catenary

A catenary is the curve of an hanging chain or cable under its own weight when supported only at its ends (see Wikipedia for more information).

ggplot(data.frame(x = c(0, 1), y = c(1, 1)), aes(x, y)) +
  geom_catenary() +
  ylim(0, 1)
#> Set chainLength to 2

geom_catenary() calculates a default value for the length of the chain that is set to 2 times the Euclidean distance of all x/y coordinates. Use the `chainLength argument to overwrite the default.

ggplot(data.frame(x = c(0, 1), y = c(1, 1)), aes(x, y)) +
  geom_catenary(chainLength = 1.5)  +
  ylim(0, 1)

Multiple x/y coordinates are supported too.

ggplot(data.frame(x = c(0, 1, 4), y = c(1, 1, 1)), aes(x, y)) +
  geom_catenary()
#> Set chainLength to 8

If you set chainLength to a value that is too short to connect two points, a straight line is drawn and a message is shown.

ggplot(data.frame(x = c(0, 1, 4), y = c(1, 1, 1)), aes(x, y)) +
  geom_catenary(chainLength = 4)
#> Minimum chain length from '(1, 1)' to '(4, 1)' is 3, drawing a straight line.

If you want to draw a chain where each piece has a different chainLength value, remember you can add a list to a ggplot object:

ggplot(data.frame(x = c(0, 1), y = 1),
       aes(x, y)) + 
  lapply(2:10, function(chainLength) {
    geom_catenary(chainLength = chainLength)
  }
)

Data

The following data sets are shipped with the ggpointless package:

  1. co2_ml : CO2 records taken at Mauna Loa
  2. covid_vac : COVID-19 Cases and Deaths by Vaccination Status
  3. female_leaders : Elected and appointed female heads of state and government

See the vignette("examples") for possible use cases.