Using colors in R

Load dplyr and gapminder
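A minimal sketch of the loading step this heading names, assuming both packages are already installed (for example via install.packages()):

```r
# Load the data-manipulation tools and the Gapminder data excerpt
library(dplyr)
library(gapminder)
```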

Change the default plotting symbol to a solid circle

The color demos below will be more effective if the default plotting symbol is a solid circle. We limit ourselves to base R graphics in this tutorial, therefore we use par(), the function that queries and sets base R graphical parameters. In an interactive session or in a plain R script, do this:
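A minimal sketch of that call, reconstructed from the explanation that follows (the name opar and the symbol code 19 both come from the text):

```r
# Switch the default plotting symbol to a filled circle (code 19).
# par() returns the previous values of whatever it changes, so the
# assignment keeps them around for restoring later.
opar <- par(pch = 19)
```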

Technically, you don't need to make the assignment, but it's a good practice. We're killing two birds with one stone:

  1. Changing the default plotting symbol to a filled circle, which has code 19 in R. (Below I link to some samplers showing all the plotting symbols, FYI.)
  2. Storing the pre-existing and, in this case, default graphical parameters in opar.

When you change a graphical parameter via par(), the original values are returned and we're capturing them via assignment to opar. At the very bottom of this tutorial, we use opar to restore the original state.

Big picture, it is best practice to restore the original, default state of hidden things that affect an R session. This is polite if you plan to inflict your code on others. Even if you live on an R desert island, this practice will prevent you from creating maddening little puzzles for yourself to solve in the middle of the night before a deadline.

Because of the way figures are handled by knitr, it is more complicated to change the default plotting symbol throughout an R Markdown document. To see how I've done it, check out a hidden chunk around here in the source of this page.

Basic color specification and the default palette

I need a small well-behaved excerpt from the Gapminder data for demonstration purposes. I randomly draw 8 countries, keep their data from 2007, and sort the rows based on GDP per capita. See jdat.

A simple scatterplot, using plot() from the base package graphics.
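A sketch of the data excerpt and the basic scatterplot described above. The object names jdat, n_c, j_xlim, and j_ylim are the ones used by the plot calls later in this tutorial; the seed, the padding used for the axis limits, and the plot title are assumptions made here for illustration, so the countries drawn will not necessarily match the original figures.

```r
library(dplyr)
library(gapminder)

n_c <- 8          # number of countries, as stated in the text
j_year <- 2007    # year to keep, as stated in the text
set.seed(1903)    # assumed seed, only so the random draw is repeatable

# Randomly draw n_c countries
countries_to_keep <- sample(levels(gapminder$country), size = n_c)

# Keep their rows for j_year and sort by GDP per capita
jdat <- gapminder %>%
  filter(country %in% countries_to_keep, year == j_year) %>%
  droplevels() %>%
  arrange(gdpPercap)
jdat

# Assumed axis limits: the observed ranges plus some padding
j_xlim <- range(jdat$gdpPercap) * c(0.8, 1.25)
j_ylim <- range(jdat$lifeExp) + c(-3, 3)

plot(lifeExp ~ gdpPercap, jdat, log = 'x', xlim = j_xlim, ylim = j_ylim,
     main = 'a simple scatterplot')
```

Storing the limits once and reusing them keeps every later figure on the same axes, which makes the color changes easier to compare.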

You can specify color explicitly by name by supplying a character vector with one or more color names (more on those soon). If you need a color for 8 points and you input fewer, recycling will kick in. Here's what happens when you specify one or two colors via the col = argument of plot().

    plot(lifeExp ~ gdpPercap, jdat, log = 'x', xlim = j_xlim, ylim = j_ylim,
         col = "red", main = 'col = "red"')
    plot(lifeExp ~ gdpPercap, jdat, log = 'x', xlim = j_xlim, ylim = j_ylim,
         col = c("blue", "orange"), main = 'col = c("blue", "orange")')

You can specify color explicitly with a small positive integer, which is interpreted as indexing into the current palette, which can be inspected via palette(). I've added these integers and the color names as labels to the figures below. The default palette contains 8 colors, which is why we're looking at data from eight countries. The default palette is ugly.

    plot(lifeExp ~ gdpPercap, jdat, log = 'x', xlim = j_xlim, ylim = j_ylim,
         col = 1:n_c, main = paste0('col = 1:', n_c))
    with(jdat, text(x = gdpPercap, y = lifeExp, pos = 1))
    plot(lifeExp ~ gdpPercap, jdat, log = 'x', xlim = j_xlim, ylim = j_ylim,
         col = 1:n_c, main = 'the default palette()')
    with(jdat, text(x = gdpPercap, y = lifeExp, labels = palette(),
                    pos = rep(c(1, 3, 1), c(5, 1, 2))))
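To look at the palette itself rather than a plot, a short sketch using only base R. The exact colors printed depend on your R version (R 4.0.0 introduced a new default palette), so no output is shown here.

```r
# The current palette is just a character vector of colors
palette()

# Integers supplied to col = index into this vector
length(palette())
palette()[2]
```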

You can provide your own vector of colors instead. I am intentionally modelling best practice here too: if you're going to use custom colors, store them as an object in exactly one place, and use that object in plot calls, legend-making, etc. This makes it much easier to fiddle with your custom colors, which few of us can resist.

    j_colors <- c('chartreuse3', 'cornflowerblue', 'darkgoldenrod1', 'peachpuff3',
                  'mediumorchid2', 'turquoise3', 'wheat4', 'slategray2')
    plot(lifeExp ~ gdpPercap, jdat, log = 'x', xlim = j_xlim, ylim = j_ylim,
         col = j_colors, main = 'custom colors!')
    with(jdat, text(x = gdpPercap, y = lifeExp, labels = j_colors,
                    pos = rep(c(1, 3, 1), c(5, 1, 2))))

What colors are available? Ditto for symbols and line types

Who would have guessed that R knows about "peachpuff3"? To see the names of all 657 built-in colors, use colors().
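A quick way to poke around the built-in names, sketched with base R only:

```r
# How many named colors does R know about?
length(colors())

# The first few names
head(colors(), 10)

# Every member of the peachpuff family
grep("peachpuff", colors(), value = TRUE)
```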

But it's much more exciting to see the colors displayed! Lots of people have tackled this – for colors, plotting symbols, line types – and put their work on the internet. Some examples:

  • I put color names on a white background and on black (sorry, these are PDFs)
  • I printed the first 30 plotting symbols (presumably using code found elsewhere or in documentation? can't recall whom to credit) (sorry, it's a PDF)
  • In Chapter 3 of R Graphics 1st edition (2005), Paul Murrell shows predefined and custom line types in Figure 3.6 and plotting symbols in Figure 3.10.

RColorBrewer

Most of us are pretty lousy at choosing colors and it's easy to spend too much time fiddling with them. Cynthia Brewer, a geographer and color specialist, has created sets of colors for print and the web and they are available in the add-on package RColorBrewer. You will need to install and load this package to use it.

Let's look at all the associated palettes.
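A sketch of how to draw that overview, assuming RColorBrewer is installed. The brewer.pal.info data frame ships with the package and records each palette's class and maximum size.

```r
library(RColorBrewer)

# One figure showing every palette in the package
display.brewer.all()

# How many palettes fall into each class (seq, qual, div)?
table(brewer.pal.info$category)
```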

They fall into three classes. From top to bottom, they are

  • sequential: great for low-to-high things where one extreme is exciting and the other is boring, like (transformations of) p-values and correlations (caveat: here I'm assuming the only exciting correlations you're likely to see are positive, i.e. near 1)
  • qualitative: great for non-ordered categorical things, such as your typical factor, like country or continent. Note the special case "Paired" palette; example where that's useful: a non-experimental factor (e.g. type of wheat) and a binary experimental factor (e.g. untreated vs. treated).
  • diverging: great for things that range from "extreme and negative" to "extreme and positive", going through "not extreme and boring" along the way, such as t-statistics and z-scores and signed correlations

You can view a single RColorBrewer palette by specifying its name:
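A minimal sketch, assuming RColorBrewer is installed. Dark2 is used because it is the palette this tutorial works with next; n = 8 is the largest size that palette offers.

```r
library(RColorBrewer)

# Show one palette; both the size n and the name are required
display.brewer.pal(n = 8, name = "Dark2")
```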

The package is, frankly, rather clunky, as evidenced by the requirement to specify n above. Sorry folks, you'll just have to cope.

Here we revisit specifying custom colors as we did above, but using a palette from RColorBrewer instead of our artisanal "peachpuff3" work of art. As before, I display the colors themselves but you'll see we're not getting the friendly names you've seen before, which brings us to our next topic.

    j_brew_colors <- brewer.pal(n = 8, name = "Dark2")
    plot(lifeExp ~ gdpPercap, jdat, log = 'x', xlim = j_xlim, ylim = j_ylim,
         col = j_brew_colors, main = 'Dark2 qualitative palette from RColorBrewer')
    with(jdat, text(x = gdpPercap, y = lifeExp, labels = j_brew_colors,
                    pos = rep(c(1, 3, 1), c(5, 1, 2))))

viridis

In 2015 Stéfan van der Walt and Nathaniel Smith designed new color maps for matplotlib and presented them in a talk at SciPy 2015. The viridis R package provides four new palettes for use in R: on CRAN with development on GitHub. From Description:

These color maps are designed in such a way that they will analytically be perfectly perceptually-uniform, both in regular form and also when converted to black-and-white. They are also designed to be perceived by readers with the most common form of color blindness (all color maps in this package) and color vision deficiency ('cividis' only).

I encourage you to install viridis and read the vignette. It is easy to use these palettes in ggplot2 via scale_color_viridis() and scale_fill_viridis(). Taking control of color palettes in ggplot2 is covered elsewhere (see Chapter 26).

Here are two examples that show the viridis palettes:
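The original figures are not reproduced here. As a stand-in, this sketch uses base graphics and the built-in volcano matrix instead of ggplot2, which is a substitution made for this example; it assumes the viridis package is installed and shows the default palette next to the "magma" option.

```r
library(viridis)

# Two panels side by side; remember the old layout so it can be restored
op <- par(mfrow = c(1, 2))
image(volcano, col = viridis(256), main = "viridis (default)")
image(volcano, col = viridis(256, option = "magma"), main = "magma")
par(op)
```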

Hexadecimal RGB color specification

Instead of small positive integers and Crayola-style names, a more general and machine-readable approach to color specification is as hexadecimal triplets. Here is how the RColorBrewer Dark2 palette is actually stored:
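A minimal sketch of that inspection, assuming RColorBrewer is installed. The result is a plain character vector of hex strings.

```r
library(RColorBrewer)

# The palette is stored as hexadecimal triplets, not color names
brewer.pal(n = 8, name = "Dark2")
```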

The leading # is simply there by convention. Parse the hexadecimal string like so: #rrggbb, where rr, gg, and bb refer to color intensity in the red, green, and blue channels, respectively. Each is specified as a two-digit base 16 number, which is the meaning of "hexadecimal" (or "hex" for short).

Here's a table relating base 16 numbers to the beloved base 10 system.

hex      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
decimal  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Example: the first color in the palette is specified as "#1B9E77", so the intensity in the green channel is 9E. What does that mean? \[ 9E = 9 * 16^1 + 14 * 16^0 = 9 * 16 + 14 = 158 \] Note that the lowest possible channel intensity is 00 = 0 and the highest is FF = 255.
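The same arithmetic can be checked in base R. This sketch uses strtoi() to convert a single hex pair and col2rgb() to split a whole color into its three channels.

```r
# One channel: hexadecimal 9E as a base 10 integer
strtoi("9E", base = 16L)

# All three channels of the first Dark2 color at once
col2rgb("#1B9E77")

# The extremes of a channel
strtoi(c("00", "FF"), base = 16L)
```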

Important special cases that help you stay oriented. Here are the saturated RGB colors, red, blue, and green:

color_name  hex      red  green  blue
blue        #0000FF    0      0   255
green       #00FF00    0    255     0
red         #FF0000  255      0     0

Here are shades of gray:

color_name      hex      red  green  blue
white, gray100  #FFFFFF  255    255   255
gray67          #ABABAB  171    171   171
gray33          #545454   84     84    84
black, gray0    #000000    0      0     0

Note that everywhere you see "gray" above, you will get the same results if you substitute "grey". We see that white corresponds to maximum intensity in all channels and black to the minimum.
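The gray table and the spelling note can be verified with col2rgb() from base R, as in this short sketch:

```r
# Channel intensities for the shades in the table
col2rgb(c("white", "gray67", "gray33", "black"))

# Both spellings resolve to the same color
identical(col2rgb("gray67"), col2rgb("grey67"))
```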

To review, here are the ways to specify colors in R:

  • a positive integer, used to index into the current color palette (queried or manipulated via palette())
  • a color name among those found in colors()
  • a hexadecimal string; in addition to a hexadecimal triple, in some contexts this can be extended to a hexadecimal quadruple with the fourth channel referring to alpha transparency

Here are some functions to read up on if you want to learn more. Don't forget to mine the "See Also" section of the help to expand your horizons: rgb(), col2rgb(), convertColor().
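A small sketch that puts the three kinds of specification side by side and exercises rgb() and col2rgb(). Everything is base R; the alpha value 80 (hex) is just an example of a semi-transparent fourth channel.

```r
# Four points colored by: palette index, name, hex triple, hex quadruple
plot(1:4, rep(1, 4), pch = 19, cex = 5, xlim = c(0, 5),
     col = c(palette()[2], "red", "#FF0000", "#FF000080"))

# From decimal channel intensities to a hex string
rgb(27, 158, 119, maxColorValue = 255)

# From a name or hex string back to channel intensities
col2rgb("peachpuff3")
col2rgb("#FF000080", alpha = TRUE)
```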

Alternatives to the RGB color model, especially HCL

The RGB color space or model is by no means the only or best one. It's natural for describing colors for display on a computer screen but some really important color picking tasks are hard to execute in this model. For example, it's not obvious how to construct a qualitative palette where the colors are easy for humans to distinguish, but are also perceptually comparable to one another. Appreciate this: we can use RGB to describe colors to the computer but we don't have to use it as the space where we construct color systems.

Color models generally have three dimensions, as RGB does, due to the physiological reality that humans have three different receptors in the retina. (Here is an informative blog post on RGB and the human visual system.) The closer a color model's dimensions correspond to distinct qualities people can perceive, the more useful it is. This correspondence facilitates the deliberate construction of palettes and paths through color space with specific properties. RGB lacks this concordance with human perception. Just because you have photoreceptors that detect red, green, and blue light, it doesn't mean that your perceptual experience of color breaks down that way. Do you experience the color yellow as a mix of red and green light? No, of course not, but that's the physiological reality. An RGB alternative you may have encountered is the Hue-Saturation-Value (HSV) model. Unfortunately, it is also quite problematic for color picking, due to its dimensions being confounded with each other.

What are the good perceptually-based color models? CIELUV and CIELAB are two well-known examples. We will focus on a variant of CIELUV, namely the Hue-Chroma-Luminance (HCL) model. It is written up nicely for an R audience in Zeileis et al.'s "Escaping RGBland: Selecting Colors for Statistical Graphs" in Computational Statistics & Data Analysis (2009). There is a companion R package colorspace, which will help you to explore and exploit the HCL color model. Finally, this color model is fully embraced in ggplot2 (as are the RColorBrewer palettes).

Here's what I can tell you about the HCL model's three dimensions:

  • Hue is what you usually think of when you think "what color is that?" It's the easy one! It is given as an angle, going from 0 to 360, so imagine a rainbow donut.
  • Chroma refers to colorfulness, i.e. how pure or vivid a color is. The more something seems mixed with gray, the lower its chromaticity. The lowest possible value is 0, which corresponds to actual gray. The maximum value varies with luminance.
  • Luminance is related to brightness, lightness, intensity, and value. Low luminance means dark and indeed black has luminance 0. High luminance means light and white has luminance 100.
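To get a feel for the three dimensions, here is a sketch using hcl() from base R (grDevices), which takes hue in degrees and luminance on a 0 to 100 scale. The particular chroma and luminance values are arbitrary choices for illustration.

```r
# Six hues at equal steps around the circle, chroma and luminance held fixed
j_hcl <- hcl(h = seq(0, 300, by = 60), c = 50, l = 70)
j_hcl
barplot(rep(1, 6), col = j_hcl, border = NA, axes = FALSE,
        main = "hue varies, chroma = 50, luminance = 70")

# Chroma 0 gives grays; luminance then runs from dark to light
j_grays <- hcl(h = 0, c = 0, l = seq(0, 100, by = 25))
j_grays
```

Because only hue changes in the first vector, the six colors should look about equally bright and equally vivid, which is hard to arrange by hand in RGB.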

Full disclosure: I have a hard time really grasping and distinguishing chroma and luminance. As we point out above, they are not entirely independent, which speaks to the weird shape of the 3 dimensional HCL space.

This figure in Wickham's ggplot2: Elegant Graphics for Data Analysis (2009) book is helpful for understanding the HCL color space:

Paraphrasing Wickham: Each facet or panel depicts a slice through HCL space for a specific luminance, going from low to high. The extreme luminance values of 0 and 100 are omitted because they would, respectively, be a single black point and a single white point. Within a slice, the center has chroma 0, which corresponds to a shade of gray. As you move toward the slice's edge, chroma increases and the color gets more pure and intense. Hue is mapped to angle.

A valuable contribution of the colorspace package is that it provides functions to create color palettes traversing color space in a rational way. In contrast, the palettes offered by RColorBrewer, though well-crafted, are unfortunately fixed.
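A brief sketch of three such palette functions, assuming the colorspace package is installed. Unlike a fixed palette, each call accepts whatever number of colors you need; 8 is used here only to match the examples above.

```r
library(colorspace)

j_qual <- rainbow_hcl(8)      # hue varies, chroma and luminance fixed
j_seq  <- sequential_hcl(8)   # a single-hue ramp from dark to light
j_div  <- diverge_hcl(8)      # two hues meeting at a neutral midpoint

# Show the three palettes as stacked color strips
op <- par(mfrow = c(3, 1), mar = c(1, 1, 2, 1))
barplot(rep(1, 8), col = j_qual, border = NA, axes = FALSE, main = "rainbow_hcl")
barplot(rep(1, 8), col = j_seq, border = NA, axes = FALSE, main = "sequential_hcl")
barplot(rep(1, 8), col = j_div, border = NA, axes = FALSE, main = "diverge_hcl")
par(op)
```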

Here is an article that uses compelling examples to advocate for perceptually based color systems and to demonstrate the importance of signalling where zero is in color space:

  • "Why Should Engineers and Scientists Be Worried About Color?" (Rogowitz and Treinish 1996)

Accommodating color blindness

The dichromat package (on CRAN) will help you select a color scheme that will be effective for color blind readers.

This colorschemes list contains length(colorschemes) color schemes "suitable for people with deficient or anomalous red-green vision":

Figure 25.3: Color schemes "suitable for people with deficient or anomalous red-green vision"

What else does the dichromat package offer? The dichromat() function transforms colors to approximate the effect of different forms of color blindness, allowing you to assess the performance of a candidate scheme. The command data("dalton") will make two objects available which represent a 256-color palette as it would appear with normal vision, with two types of red-green color blindness, and with green-blue color blindness.
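A sketch of assessing a candidate scheme with dichromat(), assuming the dichromat package is installed. The four candidate colors are arbitrary choices for illustration.

```r
library(dichromat)

# A hypothetical candidate scheme
j_candidates <- c("red", "green3", "blue", "orange")

# Approximate appearance under each supported type of color blindness
j_sim <- sapply(c("deutan", "protan", "tritan"),
                function(type) dichromat(j_candidates, type = type))
j_sim

# Compare the original scheme with one simulation
op <- par(mfrow = c(2, 1), mar = c(1, 1, 2, 1))
barplot(rep(1, 4), col = j_candidates, border = NA, axes = FALSE, main = "original")
barplot(rep(1, 4), col = j_sim[, "deutan"], border = NA, axes = FALSE, main = "deutan")
par(op)
```

If two bars become hard to tell apart in a simulated strip, that pair is a poor choice for encoding different categories.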

Clean up
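A minimal sketch of the restore step, based on the opar object described at the top of this tutorial. The first line repeats that earlier assignment only so the snippet runs on its own.

```r
# Repeated from the top of the tutorial so this snippet is self-contained
opar <- par(pch = 19)

# ... all the plotting happens here ...

# Put the graphical parameters back the way we found them
par(opar)
par("pch")
```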

Resources

  • Zeileis et al.'s "Escaping RGBland: Selecting Colors for Statistical Graphs" in Computational Statistics & Data Analysis (2009).
  • Vignette for the colorspace package.
  • Earl F. Glynn (Stowers Institute for Medical Research):
    • Excellent resources for named colors, i.e. the ones available via colors().
    • Informative talk "Using Color in R", though features some questionable use of color itself.
  • Blog post My favorite RGB color on the Many World Theory blog.
  • Wickham's ggplot2: Elegant Graphics for Data Analysis (2009).
    • Online docs (nice!)
    • Package webpage
    • ggplot2 on CRAN and GitHub
    • Section 6.4.3 Colour
  • "Why Should Engineers and Scientists Be Worried About Color?" by Bernice E. Rogowitz and Lloyd A. Treinish of IBM Research (1996), h/t @EdwardTufte.