Making sense of changes in ordinal variables can be hard.

A few weeks ago I tweeted out a plot that was my attempt to visualize the relationship between a covariate or predictor and an ordinal outcome (a challenge I often struggle with!). Since some people asked, I thought it’d be a good opportunity walk through the modeling and plotting process, with a focus on the latter.

But first, acknowledgments are in order: I already mentioned this in the tweet, but all of this is heavily inspired by Richard McElreath’s excellent Statistical Rethinking, quite possible the best introduction to Bayes and statistical reasoning out there. If you’re more familiar with the `tidyverse`

, Solomon Kurz’s translation of McElreath is a must. I relied on it heavily as an introduction to cumulative models in `brms`

. I also make use of the `tidybayes`

/ `ggdist`

packages by Matthew Kay. I encourage everyone to check the vignettes for extra thoughts on visualizing ordinal outcomes.

Ok, let’s get going!

My sense is that we often don’t know what to do with ordinal variables, partly because they don’t fit neatly into traditional analysis routines. Ordinal variables are not numeric, they’re not categorical, they’re *strange*. As Torrin Liddell and John Kruschke pointed out in a great review paper, the most common way of analyzing ordinal data is to just assume it’s metric. In fact, they found that 100% (!) of the papers they surveyed analyzed ordinal data with metric models. And this is problematic.

Why? That’s out of the scope of this blog post, but treating these variables as metric can lead us to find effects where there are none, to find no effects where we should find them, and to even flip the sign of an effect. Treating ordinal variables like they’re numeric assumes that they represent regular intervals. Imagine we’re using a Likert scale that looks like this:

**Either something is true or it is false; there is nothing in-between.**

- Strongly disagree
- Disagree
- Neither agree nor disagree
- Agree
- Strongly agree

If we code this variable as metric (on a 1-5 scale), we are assuming that the difference between **Strongly disagree** and **Disagree** (1 and 2) is the same as the difference between **Neither agree nor disagree** and **Agree** (3 and 4). Now, this *could* potentially be true, but there’s no reason to believe it is, since the numbers we assigned to the scale are just markers of ordering, nothing more. Assuming they’re intervals is a very strong assumption! To see why, imagine we asked people the following question:

**How often do you feel sad?**

- Never
- Sometimes
- Often or very often

Should we believe the change between **Never** and **Sometimes** is the same as the change between **Sometimes** and **Often or very often**? Probably not. This is a contrived example, sure. But it shows that we should be careful about systematically assuming that ordinal data is equivalent to numeric data.

A more appropriate tool for this sort of problem is an ordinal models, of which there are quite a few available; the most common are the cumulative,sequential, and adjacent-category models We won’t get into this, but if this interests you, you can take a look at this paper by Paul Bürkner and Matti Vuorre, which is a great overview from a Bayesian perspective.

But enough of modeling assumptions. I find that ordinal data is also somewhat complicated to visualize whenever there’s any sort of *change* involved. Levels are straightforward to visualize. At the end of the day, we can use a simple bar chart to show the count or percentage of respondents who selected each of the options, and then order the bars accordingly:

```
library(dplyr)
library(ggplot2)
library(hrbrthemes)
my_labels <- c("Not common at all",
"Not very common",
"Somewhat common",
"Very common")
level_df <- tibble(choice = c(1, 2, 3, 4),
count = c(55, 30, 50, 80)) %>%
mutate(choice = factor(choice, labels = my_labels))
level_df %>%
ggplot(aes(y = choice,
x = count)) +
geom_col(fill = "#00A08A") +
labs(title = "How common is fraud in social assistance programs?",
y = NULL,
x = "Count") +
theme_ipsum_ps() +
theme(plot.title.position = "plot")
```

But imagine we wanted to show the effect of a treatment. A simple option is to just use a dodged bar chart:

```
change_df <- tibble(choice = rep(c(1, 2, 3, 4), 2),
count = c(55, 30, 50, 80, 65, 35, 45, 70),
treatment = c(rep("Control",4), rep("Treatment",4))) %>%
mutate(choice = factor(choice, labels = my_labels))
change_df %>%
ggplot(aes(y = choice,
x = count,
fill = treatment)) +
geom_col(position = position_dodge()) +
labs(title = "How common is fraud in social assistance programs?",
y = NULL,
x = "Count") +
theme_ipsum_ps() +
theme(plot.title.position = "plot")
```

It’s not *bad*, but it’s not super intuitive either. It also becomes problematic when working with continuous predictors, or with lots of different groups. Let’s walk through a simulated example to explore other options.

My team at ideas42 has been running a bunch of surveys on attitudes towards poverty in the United States. This almost always involves ordinal outcomes, often in the form of Likert scales. One of our research questions is whether ideology (and specifically, authoritarianism and dominance orientation) is predictive of overall fraud perceptions across a bunch of social assistance programs (think SNAP, Social Security, and Medicare).

Since that was the topic of the original plot, we’ll stick to it, except this time we’ll simulate some fake data to make the results more reproducible.

First, let’s load the extra packages we’ll need. Note we have already loaded `dplyr`

, `ggplot2`

, and `hrbrthemes`

.

Let’s use the `simstudy`

package to simulate some ordinal data. There’s a bunch of ways to do this (including by hand), but this package makes things significantly easier.

Let’s create 4 different levels, which we’ll assume to be a Likert scale ranging from **Strongly Agree** to **Strongly Disagree**. You can follow the code below, or skip it if you’re more interested in the modeling and plotting part of it.

To keep with our examples, we’ll also assume that there’s 5 different cases. We can assume these are government programs, and people respond differently to them. Some cause, on average, more positive reactions that others.

We’ll also assume (as we found in practice) that authoritarianism is predictive of general fraud perceptions. In our simulation, we’ll give it a coefficient of 0.3, so that every unit increase in authoritarianism is associated with a 0.3 unit increase in the latent fraud perception variable.

```
# Simulating our ordinal categorical data
## Define our authoritarianism variable (mean = 0, sd = 1)
def1 <- defData(
varname = "auth",
formula = 0,
dist = "normal",
id = "id",
variance = 1)
# Define our 5 different programs with equal probability
def1 <- defData(def1,
varname = "program",
formula = "0.2;0.2;0.2;0.2;0.2",
dist = "categorical")
# Define the continuous variable that sit below the ordinal outcomes
def1 <- defData(def1,
varname = "z",
formula = "0.3*auth + 0.2*program",
dist = "nonrandom")
```

Now let’s generate the categorical response variable (`choice`

) and the rest of the data.

We won’t dwell on the specifics of the `simstudy`

package, but if you’re interested you should check out the vignettes available. There are also a number of case studies you may find helpful, including one on how to visualize ordinal treatment effects, or how to think about generating ordinal data.

Ok, we have our data. Let’s take a look at it to make sure that everything worked fine.

id | auth | program | z | choice |
---|---|---|---|---|

1 | 1.37 | 1 | 0.61 | 1 |

2 | -0.56 | 3 | 0.43 | 2 |

3 | 0.36 | 1 | 0.31 | 2 |

4 | 0.63 | 3 | 0.79 | 4 |

5 | 0.40 | 5 | 1.12 | 2 |

6 | -0.11 | 4 | 0.77 | 2 |

All good. To assess whether the distributions are representative of what we would see in real life, let’s plot our fake data.

```
d %>%
ggplot(aes(choice)) +
geom_bar() +
facet_wrap(~program)
```

Looks fairly realistic, but let’s make it a bit nicer. We can flip the axes so that it’s easier to compare the levels, and also add the labels for each of the ordinal levels.

```
df <- d %>%
# Generate labels for each program
mutate(program = factor(program, labels = c("Program 1",
"Program 2",
"Program 3",
"Program 4",
"Program 5")),
# Generate labels for each Likert category
choice_lab = factor(as.numeric(choice), labels = my_labels, ordered = TRUE),
# Generate a numeric version of the variable for our model
choice_num = as.numeric(choice))
df %>%
ggplot(aes(y = choice_lab)) +
geom_bar(fill = "#00A08A") +
facet_wrap(~program, nrow = 1) +
theme_ipsum_ps() +
labs(title = "How common do you think fraud is?",
y = NULL,
x = "Number of respondents") +
theme(plot.title.position = "plot")
```

Ok, that looks good for now! As specified in our data-generating process, every program has a slightly different distribution. As we move towards the right, people are more likely to see fraud. Now that we have our data, let’s move on to modeling.

We’ll use Bayesian methods because they’re far more flexible once you move beyond the basics. To do it, we’ll need the `brms`

and `tidybayes`

/ `ggdist`

packages. If you’re not used to Bayesian stats, there are bunch of Frequentist alternatives out there (for example, the `ordinal`

package).

Note that the model may take a couple of minutes to compile and run. If this is your first time using `brms`

, head to the great documentation and vignettes.

Here’s the model summary. Things look pretty good! You’ll see that the model managed to capture the right `authoritarianism`

parameter, and it got pretty close to the real (simulated) `program`

parameters too:

```
print(fit_brms)
```

```
Family: cumulative
Links: mu = logit; disc = identity
Formula: choice_lab ~ 1 + auth + program
Data: df (Number of observations: 1000)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept[1] -0.71 0.13 -0.97 -0.46 1.00 2825
Intercept[2] 0.50 0.13 0.25 0.75 1.00 2541
Intercept[3] 1.28 0.13 1.03 1.54 1.00 2518
auth 0.21 0.06 0.10 0.33 1.00 4845
programProgram2 0.23 0.18 -0.13 0.59 1.00 3107
programProgram3 0.39 0.18 0.05 0.74 1.00 2739
programProgram4 0.69 0.17 0.34 1.03 1.00 2686
programProgram5 0.97 0.19 0.60 1.34 1.00 3354
Tail_ESS
Intercept[1] 2773
Intercept[2] 2945
Intercept[3] 3253
auth 3062
programProgram2 3212
programProgram3 3075
programProgram4 3059
programProgram5 3248
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 1.00 4000 4000
Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
```

Now let’s extract the model predictions so we can plot them. We’ll get 100 fitted draws. Note you’ll need an index variable so that `ggplot`

can keep track of what dots it should (literally) connect!

```
df_fitted <- df %>%
data_grid(auth = c(-3,-2,-1,0, 1,2, 3),
program = unique(df$program)) %>%
add_fitted_draws(fit_brms, n=100) %>%
ungroup() %>%
mutate(choice = factor(as.numeric(.category), labels = my_labels))
# This is a crucial step if you want to be able to plot lines
df_fitted <- df_fitted %>%
group_by(.draw, choice) %>%
mutate(indices = cur_group_id()) %>%
ungroup()
```

Now let’s reproduce the graph. There’s quite a lot of stuff going on, so we’ll do it step by step.

First, I’ll create the color palette. The `viridis`

collection is always a good option, for a number of reasons, including robustness to colorblindness and also perceptual uniformity. Let’s use plasma, and get 4 different colors. You can play around with the `begin`

and `end`

values to pick a range that suits your needs. To make this cleaner, let’s also creating character vectors for the title and subtitle. Note that there’s some `ggtext`

(see vignettes and documentation here code inserted *within* the character vector. This allows us to add Markdown notation to **bold** and *italicize* text. More importantly, it also makes it relatively easy to change colors!

```
# Defining our color palette. One color for every category.
colors <- viridis(option = "plasma",
begin = 0.15,
end = 0.8,
direction = -1,
n = 4)
# Defining our title and subtitle
my_title <- c("People high in authoritarianism see more fraud across the board")
my_subtitle <- c("Lines are the predicted fraction of people saying fraud is <span style = 'color:#5601A4FF;'>**very common**</span>, <span style = 'color:#A62098FF;'>**somewhat common**</span>, <span style = 'color:#DE5F65FF;'>**not very common**</span>, and <span style = 'color:#FCA636FF;'>**not common at all**</span>")
# Plotting the fitted draws
p <- df_fitted %>%
ggplot(aes(x = auth,
y = .value,
color = choice,
# Don't forget the indices!
group = indices)) +
facet_wrap(~program, nrow = 1) +
geom_line(alpha = 0.2) +
scale_color_manual(values = colors) +
# We won't need these
guides(color = FALSE,
label = FALSE) +
theme_ipsum_ps()
p
```

Ok, we’re almost there. Time for the bells and whistles. Let’s add our title and subtitle using `ggtext`

. Note that to use the extra formatting, you’ll have to add an `element_textbox_simple()`

call wherever you plan to use it (in this case, the subtitle). We’ll also expand the plot margins to make space for our labels, and remove the limits using `coord_cartesian(clip = "off")`

. Then we’ll add the labels, filtering the original data to make sure it only shows up once, towards the right hand side of the plot.

```
p +
labs(title = my_title,
subtitle = my_subtitle,
x = "Authoritarianism",
y = NULL) +
# Note we extend the right margin to make space for our labels(the order is top, right, bottom, left)
theme(plot.margin = margin(30, 100, 30, 100),
# You'll have to add this element_textbox_simple call to make the formatting work
plot.subtitle = element_textbox_simple(margin = ggplot2::margin(10,0,20,0),
family = "IBM Plex Sans")) +
# This allows any labels or data to go past the grid
coord_cartesian(clip = "off") +
# Finally, our labels. We filter the data to avoid having a million of them
geom_text_repel(data = df_fitted
%>% filter(program == "Program 5" & auth == 3) %>%
# Keep just one of each
distinct(auth, program, choice, .keep_all = TRUE),
aes(label = choice),
direction = "y",
hjust = 0,
segment.size = 0.2,
# Move the labels to the right
nudge_x = 0.4,
na.rm = TRUE,
# Expand limits so that the label doesn't get stuck
xlim = c(-10, 10),
# Use the theme_ipsum_ps() font family
family = "IBM Plex Sans Medium",
# Adjust size as needed!
size = 3.5)
```

And there we go! Lots of wrangling later, here’s our final result. It’s not perfect, but I think it’s an interesting way to show both the **uncertainty of the predictions** as well as the **effect of a continuous variable on the ordinal outcome** (though note that this could easily work with a binary or ordered categorical variable too).