Format numbers

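| Prefix | Symbol | Value | Prefix | Symbol | Value |
|---|---|---|---|---|---|
| peta | P | $10^{15}$ | milli | m | $10^{-3}$ |
| tera | T | $10^{12}$ | micro | μ | $10^{-6}$ |
| giga | G | $10^{9}$ | nano | n | $10^{-9}$ |
| mega | M | $10^{6}$ | pico | p | $10^{-12}$ |
| kilo | k | $10^{3}$ | femto | f | $10^{-15}$ |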

The rationale for the formatdown package is formatting numbers in power-of-ten notation in inline R code or tabulated columns of data frames. Other features of the package provide tools for typesetting non-power-of-ten columns to match. In this vignette, we discuss the primary formatting function format_numbers() and its convenience wrappers for scientific, engineering, and decimal notation.

Types of notation

Notation to represent large and small numbers depends on the mode of communication. In a computer script, for example, we might encode the Avogadro constant as N_A = 6.0221*10^23. The asterisk (*) and caret (^) in this expression, however, communicate instructions to a computer, not syntactical mathematics. And while scientific E-notation (6.0221E+23) has currency in some discourse communities, power-of-ten notation, e.g., $N_A = 6.0221 \times 10^{23}$, is the conventional format for professional technical communication.

Power-of-ten notation is expressed,

$a \times 10^{n}$,

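where $a$ is the coefficient in decimal form and the exponent $n$ is an integer. Two formats are in common use (Chase, 2021, pp. 63–67):

  - Scientific. $n$ is any integer and $1 \le |a| < 10$, e.g., $6.0221 \times 10^{23}$.
  - Engineering. $n$ is a multiple of 3 and $1 \le |a| < 1000$, e.g., $602.21 \times 10^{21}$.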

The utility of the engineering form follows from the SI prefixes for physical units such as “mega-”, “kilo-”, “milli-”, etc., corresponding to powers of 10 that are integer multiples of three.
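As a worked illustration of the two forms (using the Avogadro value quoted above and the elementary charge), a scientific-form number can be rewritten in engineering form by lowering the exponent to the nearest multiple of three at or below it and scaling the coefficient to compensate. The floor expression is one convenient way to state that step; it is offered here as an illustration, not as the package's stated algorithm.

```latex
\begin{aligned}
6.0221 \times 10^{23} &= \left(6.0221 \times 10^{2}\right) \times 10^{21}
  = 602.21 \times 10^{21},
  & 21 &= 3 \left\lfloor 23/3 \right\rfloor, \\
1.602 \times 10^{-19} &= \left(1.602 \times 10^{2}\right) \times 10^{-21}
  = 160.2 \times 10^{-21},
  & -21 &= 3 \left\lfloor -19/3 \right\rfloor.
\end{aligned}
```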


Notes on syntax.   Programming symbols are not necessarily mathematical symbols:


Decimal subsets.   In a vector of numbers formatted in power-of-ten form, the decimal form may be preferred for any subset of values with exponents near zero, e.g., $n \in \{-1, 0, 1, 2\}$.

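Decimal form may be preferred for a subset

| scientific notation | subset in decimal form |
|---:|---:|
| $3.12 \times 10^{-3}$ | $3.12 \times 10^{-3}$ |
| $3.12 \times 10^{-2}$ | $3.12 \times 10^{-2}$ |
| $3.12 \times 10^{-1}$ | $0.312$ |
| $3.12 \times 10^{0}$ | $3.12$ |
| $3.12 \times 10^{1}$ | $31.2$ |
| $3.12 \times 10^{2}$ | $312$ |
| $3.12 \times 10^{3}$ | $3.12 \times 10^{3}$ |
| $3.12 \times 10^{4}$ | $3.12 \times 10^{4}$ |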


Decimal columns.   A table of numeric information can include columns formatted in both power-of-ten notation and decimal notation. For example, a table of atmospheric properties shown below has altitude in integer form, temperature in decimal form, and density in power-of-ten engineering notation (except for those values with exponents near zero).

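Properties of the atmosphere

| Altitude (km) | Temperature (K) | Density (kg/m$^{3}$) |
|---:|---:|---:|
| 0 | 288.15 | $1.23$ |
| 10 | 223.25 | $0.414$ |
| 20 | 216.65 | $88.9 \times 10^{-3}$ |
| 30 | 226.51 | $18.4 \times 10^{-3}$ |
| 40 | 250.35 | $4.00 \times 10^{-3}$ |
| 50 | 270.65 | $1.03 \times 10^{-3}$ |
| 60 | 247.02 | $310 \times 10^{-6}$ |
| 70 | 219.59 | $82.8 \times 10^{-6}$ |
| 80 | 198.64 | $18.5 \times 10^{-6}$ |
| 90 | 186.87 | $3.43 \times 10^{-6}$ |
| 100 | 195.08 | $560 \times 10^{-9}$ |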

The purpose of the decimal format in formatdown is to match the font face and size of decimal columns to those of the power-of-ten columns. If no power-of-ten columns are used, of course, decimal columns can be displayed as-is or formatted using other R tools.


Packages.   To follow along in your own script, load the packages used in this vignette. Data frame operations are performed with data.table syntax. Some users may wish to translate the examples to use base R or dplyr syntax.

library("formatdown")
library("data.table")
library("knitr")

Markup

We format numbers as inline math expressions delimited by $ ... $ or the optional \( ... \). For example, the Avogadro constant is marked up as

    $6.0221 \times 10^{23}$, 

where the \times macro creates the multiplication symbol (×). This math markup, as an inline equation in an R markdown document, renders as: $6.0221 \times 10^{23}$. To program the markup, however, we enclose it in quote marks as a character string, that is,

    "$6.0221 \\times 10^{23}$", 

which requires us to “escape” the backslash in \times by adding an extra backslash. When the optional font size argument is assigned, formatdown adds a LaTeX-style sizing macro such as \small or \large, for example,

    "$\\small 6.0221 \\times 10^{23}$", 

where again the markup includes an extra backslash.
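The escaping rule above can be checked in base R. The following is a minimal sketch, assuming R 4.0.0 or later for the raw-string syntax; the variable names `markup` and `raw_markup` are illustrative only.

```r
# Markup as it must be typed in R source: the backslash is doubled.
markup <- "$6.0221 \\times 10^{23}$"

# print() shows the escaped form; cat() shows the characters actually stored.
print(markup)
#> [1] "$6.0221 \\times 10^{23}$"
cat(markup, "\n")
#> $6.0221 \times 10^{23}$

# A raw string (R >= 4.0.0) needs no extra backslash and holds the same value.
raw_markup <- r"($6.0221 \times 10^{23}$)"
identical(markup, raw_markup)
#> [1] TRUE
```

The doubled backslash is therefore only a feature of how the string is written and printed; the stored string contains a single backslash, which is what the math renderer receives.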

format_sci()

Converts numbers to character strings in power-of-ten form,

    "$a \\times 10^{n}$" 

where $a$ is the coefficient and $n$ is the exponent. format_sci() is a wrapper for the more general function format_numbers(). For a subset of values with exponents near zero, e.g., $n \in \{-1, 0, 1, 2\}$, the output is in decimal form,

    "$a$"


Usage.  

format_sci(x,
           digits = 4,
           ...,
           omit_power = c(-1, 2),
           set_power = NULL,
           delim          = formatdown_options("delim"),
           size           = formatdown_options("size"),
           decimal_mark   = formatdown_options("decimal_mark"),
           small_mark     = formatdown_options("small_mark"),
           small_interval = formatdown_options("small_interval"), 
           whitespace     = formatdown_options("whitespace"))


Examples.   These early examples are shown with default arguments. Arguments are explored more fully starting with the Numeric input section.

# 1. Avogadro constant
L <- 6.0221e+23
format_sci(L)
#> [1] "$6.022 \\times 10^{23}$"

# 2. Elementary charge
e <- 1.602176634e-19
format_sci(e)
#> [1] "$1.602 \\times 10^{-19}$"

Examples 1 and 2 (in inline code chunks) render as,

  1. The Avogadro constant is $L = 6.022 \times 10^{23}$ $\mathrm{mol}^{-1}$.
  2. The elementary charge constant is $e = 1.602 \times 10^{-19}$ C.

format_engr()

Similar to format_sci() except using engineering notation, i.e., exponents are multiples of 3.


Usage.

format_engr(x,
            digits = 4,
            ...,
            omit_power = c(-1, 2),
            set_power = NULL,
            delim          = formatdown_options("delim"),
            size           = formatdown_options("size"),
            decimal_mark   = formatdown_options("decimal_mark"),
            small_mark     = formatdown_options("small_mark"),
            small_interval = formatdown_options("small_interval"), 
            whitespace     = formatdown_options("whitespace"))


Examples.   (with default arguments)

# 3. Avogadro constant
format_engr(L)
#> [1] "$602.2 \\times 10^{21}$"

# 4. Elementary charge
format_engr(e)
#> [1] "$160.2 \\times 10^{-21}$"

Examples 3 and 4 render as,

  1. The Avogadro constant is $L = 602.2 \times 10^{21}$ $\mathrm{mol}^{-1}$.
  2. The elementary charge constant is $e = 160.2 \times 10^{-21}$ C.

format_dcml()

A wrapper for the more general function format_numbers(); converts numbers to character strings in decimal form,

    "$a$"

where a is the decimal value.


Usage.

format_dcml(x,
            digits = 4,
            ...,
            size           = formatdown_options("size"),
            delim          = formatdown_options("delim"),
            decimal_mark   = formatdown_options("decimal_mark"),
            big_mark       = formatdown_options("big_mark"),
            big_interval   = formatdown_options("big_interval"),
            small_mark     = formatdown_options("small_mark"),
            small_interval = formatdown_options("small_interval"), 
            whitespace     = formatdown_options("whitespace"))


Examples.   (with default arguments)

# 5. Speed of light in a vacuum
c <- 299792458
format_dcml(c)
#> [1] "$299800000$"

# 6. Molar gas constant
R <- 8.31446261815324
format_dcml(R)
#> [1] "$8.314$"

Examples 5 and 6 render as,

  1. The speed of light in a vacuum is $c = 299800000$ m/s.
  2. The molar gas constant is $R = 8.314$ $\mathrm{J\,K^{-1}\,mol^{-1}}$.

format_numbers()

format_numbers() is the general-purpose formatting function called by format_sci(), format_engr(), and format_dcml(). The general function can be used instead of the convenience functions simply by setting its format argument to "sci", "engr" (default), or "dcml".


Usage.

format_numbers(x,
               digits = 4,
               format = "engr",
               ...,
               omit_power = c(-1, 2),
               set_power = NULL,
               delim          = formatdown_options("delim"),
               size           = formatdown_options("size"),
               decimal_mark   = formatdown_options("decimal_mark"),
               big_mark       = formatdown_options("big_mark"),
               small_mark     = formatdown_options("small_mark"),
               big_interval   = formatdown_options("big_interval"),
               small_interval = formatdown_options("small_interval"), 
               whitespace     = formatdown_options("whitespace"))

Examples.   Reproducing some of the earlier examples using format_numbers().

# 7. Scientific
format_numbers(L, format = "sci")
#> [1] "$6.022 \\times 10^{23}$"

# 8. Engineering
format_numbers(e, format = "engr")
#> [1] "$160.2 \\times 10^{-21}$"

# 9. Decimal
format_numbers(R, format = "dcml")
#> [1] "$8.314$"

Examples 7–9 render as,

  1. The Avogadro constant is $L = 6.022 \times 10^{23}$ $\mathrm{mol}^{-1}$.
  2. The elementary charge constant is $e = 160.2 \times 10^{-21}$ C.
  3. The molar gas constant is $R = 8.314$ $\mathrm{J\,K^{-1}\,mol^{-1}}$.
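The relationship between a general formatter and its convenience wrappers can be sketched in base R. This is a toy stand-in written for illustration, not the formatdown implementation: the `toy_*` names are hypothetical, only a single nonzero number is handled, and arguments such as omit_power, set_power, and the formatdown options are ignored.

```r
# Toy general-purpose formatter (hypothetical; NOT the formatdown source).
toy_format_numbers <- function(x, digits = 4, format = c("engr", "sci", "dcml")) {
  format <- match.arg(format)
  stopifnot(length(x) == 1, is.finite(x), x != 0)
  x <- signif(x, digits)                       # round first, then format
  if (format == "dcml") {
    return(paste0("$", base::format(x, scientific = FALSE, trim = TRUE), "$"))
  }
  sci_exp <- floor(log10(abs(x)))              # exponent in scientific form
  n <- if (format == "engr") 3 * floor(sci_exp / 3) else sci_exp
  decimals <- max(digits - 1 - (sci_exp - n), 0)  # keep `digits` significant digits
  a <- sprintf("%.*f", as.integer(decimals), x / 10^n)
  paste0("$", a, " \\times 10^{", n, "}$")
}

# Convenience wrappers only preset the format argument.
toy_format_sci  <- function(x, digits = 4) toy_format_numbers(x, digits, format = "sci")
toy_format_engr <- function(x, digits = 4) toy_format_numbers(x, digits, format = "engr")
toy_format_dcml <- function(x, digits = 4) toy_format_numbers(x, digits, format = "dcml")

toy_format_sci(6.0221e+23)
#> [1] "$6.022 \\times 10^{23}$"
toy_format_engr(1.602176634e-19)
#> [1] "$160.2 \\times 10^{-21}$"
toy_format_dcml(8.31446261815324)
#> [1] "$8.314$"
```

Keeping the logic in one function and presetting a single argument in each wrapper means the three notations cannot drift apart in how they round or delimit the result.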

Numeric input

This section begins our detailed discussion of arguments.

Scalar input.   Generally used with inline R code. For example, the following R markdown sentence, which includes some math markup and some inline R code,

    The Avogadro constant is $L = $ `r format_sci(L)` $\mathit{mol}^{-1}$. 

renders as: The Avogadro constant is $L = 6.022 \times 10^{23}$ $\mathit{mol}^{-1}$.


Vector.   A vector of numbers (or a data frame column) is marked up as follows,

# 10. Sample vector
x <- c(2.3333e-05, 0.00034444, 0.052222, 0.63333, 81.111, 922.22, 24444, 311110,
    4222200)
format_engr(x)
#> [1] "$23.33 \\times 10^{-6}$" "$344.4 \\times 10^{-6}$"
#> [3] "$52.22 \\times 10^{-3}$" "$0.6333$"               
#> [5] "$81.11$"                 "$922.2$"                
#> [7] "$24.44 \\times 10^{3}$"  "$311.1 \\times 10^{3}$" 
#> [9] "$4.222 \\times 10^{6}$"

In a table, the output renders as,

DT <- data.table(x, format_engr(x))
knitr::kable(DT, align = "r", col.names = c("Unformatted", "Engr notation"), caption = "Example 10.")
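
Example 10.

| Unformatted | Engr notation |
|---:|---:|
| 2.3300e-05 | $23.33 \times 10^{-6}$ |
| 3.4440e-04 | $344.4 \times 10^{-6}$ |
| 5.2222e-02 | $52.22 \times 10^{-3}$ |
| 6.3333e-01 | $0.6333$ |
| 8.1111e+01 | $81.11$ |
| 9.2222e+02 | $922.2$ |
| 2.4444e+04 | $24.44 \times 10^{3}$ |
| 3.1111e+05 | $311.1 \times 10^{3}$ |
| 4.2222e+06 | $4.222 \times 10^{6}$ |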

For values with exponents $n \in \{-1, 0, 1, 2\}$, the default format is decimal; see Excluding a range of exponents.

Units input

The units R package (website: Measurement Units for R) provides measurement units for R vectors, converting vectors of class “numeric” to class “units” (Pebesma et al., 2016). For example

# Number
x <- 10320
class(x)
#> [1] "numeric"

# Convert to units class
units(x) <- "m"
x
#> 10320 [m]
class(x)
#> [1] "units"

# Operations are reflected in the value and its units
y <- x^2
y
#> 106502400 [m^2]

# Unit conversion is supported
z <- y
z
#> 106502400 [m^2]
units(z) <- "ft^2"
z
#> 1146382293 [ft^2]

If an input argument to format_numbers() (or its convenience functions) is of class “units”, formatdown attempts to extract the units character string, format the number in the expected way, and append a units character string to the result. For example,

# 11. Units-class inputs
format_sci(x)
#> [1] "$1.032 \\times 10^{4}\\>\\mathrm{m}$"
format_sci(y)
#> [1] "$1.065 \\times 10^{8}\\>\\mathrm{m^{2}}$"
format_sci(z)
#> [1] "$1.146 \\times 10^{9}\\>\\mathrm{ft^{2}}$"

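Example 11 renders as,

  1. $1.032 \times 10^{4}\>\mathrm{m}$
  2. $1.065 \times 10^{8}\>\mathrm{m^{2}}$
  3. $1.146 \times 10^{9}\>\mathrm{ft^{2}}$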

More complicated units can be managed. For example, the Newtonian gravitational constant could be formatted as follows, where the exponents in the units definition are given in “implicit” form, that is, where $\mathrm{m^{3}\,kg^{-1}\,s^{-2}}$ is represented by "m3 kg-1 s-2".

    G        <- 6.6743e-11
    units(G) <- "m3 kg-1 s-2"
    format_sci(G)

Applying a similar procedure to several physical constants and collecting the results in a data frame yields,

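| symbol | quantity | formatted_value |
|---|---|---:|
| $c$ | speed of light in a vacuum | $2.998 \times 10^{8}\>\mathrm{m\,s^{-1}}$ |
| $h$ | Planck constant | $6.626 \times 10^{-34}\>\mathrm{J\,Hz^{-1}}$ |
| $\mu_0$ | vacuum magnetic permeability | $1.257 \times 10^{-6}\>\mathrm{N\,A^{-2}}$ |
| $G$ | Newtonian gravitational constant | $6.674 \times 10^{-11}\>\mathrm{m^{3}\,kg^{-1}\,s^{-2}}$ |
| $k_e$ | Coulomb constant | $8.988 \times 10^{9}\>\mathrm{N\,m^{2}\,C^{-2}}$ |
| $\sigma$ | Stefan-Boltzmann constant | $5.670 \times 10^{-8}\>\mathrm{W\,K^{-4}\,m^{-2}}$ |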

This table is constructed simply to illustrate how formatdown returns a variety of units-class values with units appended to the formatted number.

In a typical application, however, the numbers in a column have the same physical units and are formatted as a vector. For example,

# Example 12
DT <- air_meas[, .(temp, pres, sp_gas, dens)]

# Examine data
DT[]
#>     temp   pres sp_gas  dens
#>    <num>  <num>  <int> <num>
#> 1: 294.1 101100    287 1.198
#> 2: 294.1 101000    287 1.196
#> 3: 294.6 101100    287 1.196
#> 4: 293.4 101000    287 1.200
#> 5: 293.9 101100    287 1.199

# Assign units
units(DT$temp) <- "K"
units(DT$pres) <- "Pa"
units(DT$sp_gas) <- "J kg-1 K-1"
units(DT$dens) <- "kg m-3"

# Format one column at a time
DT$temp <- format_dcml(DT$temp)
DT$pres <- format_engr(DT$pres)

# Or format multiple columns in one pass
cols <- c("sp_gas", "dens")
DT[, (cols) := lapply(.SD, format_dcml), .SDcols = cols]

knitr::kable(DT, align = "r", caption = "Example 12.")
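
Example 12.

| temp | pres | sp_gas | dens |
|---:|---:|---:|---:|
| $294.1\>\mathrm{K}$ | $101.1 \times 10^{3}\>\mathrm{Pa}$ | $287.0\>\mathrm{J\,K^{-1}\,kg^{-1}}$ | $1.198\>\mathrm{kg\,m^{-3}}$ |
| $294.1\>\mathrm{K}$ | $101.0 \times 10^{3}\>\mathrm{Pa}$ | $287.0\>\mathrm{J\,K^{-1}\,kg^{-1}}$ | $1.196\>\mathrm{kg\,m^{-3}}$ |
| $294.6\>\mathrm{K}$ | $101.1 \times 10^{3}\>\mathrm{Pa}$ | $287.0\>\mathrm{J\,K^{-1}\,kg^{-1}}$ | $1.196\>\mathrm{kg\,m^{-3}}$ |
| $293.4\>\mathrm{K}$ | $101.0 \times 10^{3}\>\mathrm{Pa}$ | $287.0\>\mathrm{J\,K^{-1}\,kg^{-1}}$ | $1.200\>\mathrm{kg\,m^{-3}}$ |
| $293.9\>\mathrm{K}$ | $101.1 \times 10^{3}\>\mathrm{Pa}$ | $287.0\>\mathrm{J\,K^{-1}\,kg^{-1}}$ | $1.199\>\mathrm{kg\,m^{-3}}$ |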

Significant digits

Significant digits are applied to the input argument using the base R function signif() before additional formatting is applied. For example,

# 13. Significant digits
format_sci(e, digits = 5)
#> [1] "$1.6022 \\times 10^{-19}$"
format_sci(e, digits = 4)
#> [1] "$1.602 \\times 10^{-19}$"
format_sci(e, digits = 3)
#> [1] "$1.60 \\times 10^{-19}$"

Example 13 renders as,

  1. $1.6022 \times 10^{-19}$
  2. $1.602 \times 10^{-19}$
  3. $1.60 \times 10^{-19}$
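The rounding step can be examined on its own in base R. The sketch below uses only signif() and sprintf(); the division by 1e-19 and the variable `coef3` are illustrative choices made here, not part of formatdown.

```r
e <- 1.602176634e-19

# signif() rounds to the requested number of significant digits.
signif(e, 5)
#> [1] 1.6022e-19

# R's default printing drops the trailing zero of a 3-digit result ...
signif(e, 3)
#> [1] 1.6e-19

# ... so a fixed number of decimals is needed to display "1.60".
coef3 <- sprintf("%.2f", signif(e, 3) / 1e-19)
coef3
#> [1] "1.60"
```

This is why the formatted strings in Example 13 retain a trailing zero that the bare numeric value does not show when printed.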

Formats

The format argument appears in format_numbers() only. The default is “engr”. The format is preset in the format_dcml(), format_engr(), and format_sci() convenience functions.

To compare the effects across many orders of magnitude, we display the same vector in different formats.

# 14. Comparing formats
x <- c(2.3333e-05, 0.00034444, 0.052222, 0.63333, 81.111, 922.22, 24444, 311110,
    4222200)
dcml <- format_numbers(x, 3, format = "dcml")
sci <- format_numbers(x, 3, format = "sci")
engr <- format_numbers(x, 3, format = "engr")
DT <- data.table(dcml, sci, engr)
knitr::kable(DT, align = "r", col.names = c("decimal", "scientific", "engineering"),
    caption = "Example 14.")
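
Example 14.

| decimal | scientific | engineering |
|---:|---:|---:|
| $0.0000233$ | $2.33 \times 10^{-5}$ | $23.3 \times 10^{-6}$ |
| $0.000344$ | $3.44 \times 10^{-4}$ | $344 \times 10^{-6}$ |
| $0.0522$ | $5.22 \times 10^{-2}$ | $52.2 \times 10^{-3}$ |
| $0.633$ | $0.633$ | $0.633$ |
| $81.1$ | $81.1$ | $81.1$ |
| $922$ | $922$ | $922$ |
| $24400$ | $2.44 \times 10^{4}$ | $24.4 \times 10^{3}$ |
| $311000$ | $3.11 \times 10^{5}$ | $311 \times 10^{3}$ |
| $4220000$ | $4.22 \times 10^{6}$ | $4.22 \times 10^{6}$ |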

The values displayed without power-of-ten notation in the scientific and engineering columns are determined by the omit_power argument discussed next.

Excluding a range of exponents

When specifying power-of-ten notation, numbers with exponents lying within the range of the omit_power argument are typeset in decimal form. In engineering notation, the exponent is checked against the range both before and after conversion to a multiple-of-3 exponent; as the examples in this section show, a value is typeset in decimal form if either exponent lies within the range.
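The range check can be sketched in base R. The helper `use_decimal()` is hypothetical, and the rule it encodes (decimal form when either the scientific or the multiple-of-3 exponent falls in the range) is an assumption inferred from the tables in this section rather than taken from the package source. The input values are rounded pressures from the examples in this section.

```r
# Hypothetical helper: TRUE where a value would be shown in decimal form.
# Assumed rule, inferred from this section's tables (not the package source).
use_decimal <- function(x, omit_power, format = c("sci", "engr")) {
  format <- match.arg(format)
  if (is.null(omit_power)) return(rep(FALSE, length(x)))  # NULL: never decimal
  rng <- range(omit_power)                 # a single value k acts as c(k, k)
  in_range <- function(n) n >= rng[1] & n <= rng[2]
  sci_exp  <- floor(log10(abs(x)))         # exponent in scientific form
  engr_exp <- 3 * floor(sci_exp / 3)       # exponent as a multiple of 3
  if (format == "sci") in_range(sci_exp) else in_range(sci_exp) | in_range(engr_exp)
}

pres <- c(287, 80, 22, 5.22, 1.05, 0.184, 0.032)
use_decimal(pres, c(-1, 0), "sci")
#> [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE
use_decimal(pres, c(-1, 0), "engr")
#> [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
```

Under this assumed rule, 287 is decimal in engineering form because its multiple-of-3 exponent is 0, while 0.184 is decimal because its scientific exponent is -1.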

To illustrate, we compare two omit_power settings in both scientific and engineering formats. In some columns, we set omit_power = NULL, which imposes power-of-ten notation on the entire vector.

# 15. Effects of omit_power
DT <- atmos[3:12, .(pres)]
DT[, sci_all := format_sci(pres, 3, omit_power = NULL)]
DT[, sci_omit := format_sci(pres, 3, omit_power = c(-1, 0))]
DT[, engr_all := format_engr(pres, 3, omit_power = NULL)]
DT[, engr_omit := format_engr(pres, 3, omit_power = c(-1, 0))]
knitr::kable(DT, align = "r", col.names = c("Unformatted", "all scientific", "scientific w/ omit",
    "all engineering", "engineering w/ omit"), caption = "Example 15.")
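
Example 15.

| Unformatted | all scientific | scientific w/ omit | all engineering | engineering w/ omit |
|---:|---:|---:|---:|---:|
| 5.529e+03 | $5.53 \times 10^{3}$ | $5.53 \times 10^{3}$ | $5.53 \times 10^{3}$ | $5.53 \times 10^{3}$ |
| 1.197e+03 | $1.20 \times 10^{3}$ | $1.20 \times 10^{3}$ | $1.20 \times 10^{3}$ | $1.20 \times 10^{3}$ |
| 2.870e+02 | $2.87 \times 10^{2}$ | $2.87 \times 10^{2}$ | $287 \times 10^{0}$ | $287$ |
| 8.000e+01 | $8.00 \times 10^{1}$ | $8.00 \times 10^{1}$ | $80.0 \times 10^{0}$ | $80.0$ |
| 2.200e+01 | $2.20 \times 10^{1}$ | $2.20 \times 10^{1}$ | $22.0 \times 10^{0}$ | $22.0$ |
| 5.220e+00 | $5.22 \times 10^{0}$ | $5.22$ | $5.22 \times 10^{0}$ | $5.22$ |
| 1.050e+00 | $1.05 \times 10^{0}$ | $1.05$ | $1.05 \times 10^{0}$ | $1.05$ |
| 1.840e-01 | $1.84 \times 10^{-1}$ | $0.184$ | $184 \times 10^{-3}$ | $0.184$ |
| 3.200e-02 | $3.20 \times 10^{-2}$ | $3.20 \times 10^{-2}$ | $32.0 \times 10^{-3}$ | $32.0 \times 10^{-3}$ |
| 4.540e-04 | $4.54 \times 10^{-4}$ | $4.54 \times 10^{-4}$ | $454 \times 10^{-6}$ | $454 \times 10^{-6}$ |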

Comments:


If a single value is assigned, e.g., omit_power = 0, the argument is interpreted as c(0, 0).

# 16. Omit power used for a single value of exponent
DT <- atmos[3:12, .(pres)]
DT[, sci_all := format_sci(pres, 3, omit_power = NULL)]
DT[, sci_omit := format_sci(pres, 3, omit_power = 0)]
DT[, engr_all := format_engr(pres, 3, omit_power = NULL)]
DT[, engr_omit := format_engr(pres, 3, omit_power = 0)]
knitr::kable(DT, align = "r", col.names = c("Unformatted", "all scientific", "scientific w/ omit",
    "all engineering", "engineering w/ omit"), caption = "Example 16.")
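
Example 16.

| Unformatted | all scientific | scientific w/ omit | all engineering | engineering w/ omit |
|---:|---:|---:|---:|---:|
| 5.529e+03 | $5.53 \times 10^{3}$ | $5.53 \times 10^{3}$ | $5.53 \times 10^{3}$ | $5.53 \times 10^{3}$ |
| 1.197e+03 | $1.20 \times 10^{3}$ | $1.20 \times 10^{3}$ | $1.20 \times 10^{3}$ | $1.20 \times 10^{3}$ |
| 2.870e+02 | $2.87 \times 10^{2}$ | $2.87 \times 10^{2}$ | $287 \times 10^{0}$ | $287$ |
| 8.000e+01 | $8.00 \times 10^{1}$ | $8.00 \times 10^{1}$ | $80.0 \times 10^{0}$ | $80.0$ |
| 2.200e+01 | $2.20 \times 10^{1}$ | $2.20 \times 10^{1}$ | $22.0 \times 10^{0}$ | $22.0$ |
| 5.220e+00 | $5.22 \times 10^{0}$ | $5.22$ | $5.22 \times 10^{0}$ | $5.22$ |
| 1.050e+00 | $1.05 \times 10^{0}$ | $1.05$ | $1.05 \times 10^{0}$ | $1.05$ |
| 1.840e-01 | $1.84 \times 10^{-1}$ | $1.84 \times 10^{-1}$ | $184 \times 10^{-3}$ | $184 \times 10^{-3}$ |
| 3.200e-02 | $3.20 \times 10^{-2}$ | $3.20 \times 10^{-2}$ | $32.0 \times 10^{-3}$ | $32.0 \times 10^{-3}$ |
| 4.540e-04 | $4.54 \times 10^{-4}$ | $4.54 \times 10^{-4}$ | $454 \times 10^{-6}$ | $454 \times 10^{-6}$ |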


Setting omit_power = c(-Inf, Inf) yields the same decimal result as format = "dcml" and overrides any other format setting. For example,

# 17. Different ways of creating a decimal format
(y <- 0.00678)
#> [1] 0.00678

(p <- format_numbers(y, 3, "sci", omit_power = c(-Inf, Inf)))
#> [1] "$0.00678$"

(q <- format_numbers(y, 3, "dcml"))
#> [1] "$0.00678$"

(r <- format_dcml(y, 3))
#> [1] "$0.00678$"

all.equal(p, q)
#> [1] TRUE
all.equal(p, r)
#> [1] TRUE

Example 17 (all cases) renders as $0.00678$.

Enforcing a specific exponent

When values in a table column span only a few orders of magnitude, an audience is sometimes better served by setting the notation to a constant power of ten. For example, here we show numbers in scientific format and compare to columns in which the exponents are set to fixed values. Assigning a value to set_power overrides omit_power and format.

# 18. set_power argument
DT <- atmos[alt <= 40, .(alt, pres, dens)]
DT[, sci_pres := format_sci(pres, 3, omit_power = c(-1, 2))]
DT[, set_pres := format_sci(pres, 3, omit_power = c(-1, 2), set_power = 3)]
DT[, sci_dens := format_engr(dens, 3, omit_power = c(-1, 2))]
DT[, set_dens := format_engr(dens, 3, omit_power = c(-1, 2), set_power = -2)]
DT[, pres := NULL]
DT[, dens := NULL]
knitr::kable(DT, align = "r", col.names = c("Altitude (km)", "Pressure (Pa)", "with set_power",
    "Density (kg/m$^{3}$)", "with set_power"), caption = "Example 18.")
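
Example 18.

| Altitude (km) | Pressure (Pa) | with set_power | Density (kg/m$^{3}$) | with set_power |
|---:|---:|---:|---:|---:|
| 0 | $1.01 \times 10^{5}$ | $101 \times 10^{3}$ | $1.23$ | $123 \times 10^{-2}$ |
| 10 | $2.65 \times 10^{4}$ | $26.5 \times 10^{3}$ | $0.414$ | $41.4 \times 10^{-2}$ |
| 20 | $5.53 \times 10^{3}$ | $5.53 \times 10^{3}$ | $88.9 \times 10^{-3}$ | $8.89 \times 10^{-2}$ |
| 30 | $1.20 \times 10^{3}$ | $1.20 \times 10^{3}$ | $18.4 \times 10^{-3}$ | $1.84 \times 10^{-2}$ |
| 40 | $287$ | $0.287 \times 10^{3}$ | $4.00 \times 10^{-3}$ | $0.400 \times 10^{-2}$ |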
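The arithmetic behind a fixed exponent is simple enough to sketch in base R: round to the requested significant digits, then divide by the chosen power of ten. The helper `fixed_power_coef()` is hypothetical, and the inputs are the rounded pressures shown in Example 18 rather than the underlying data.

```r
# Hypothetical helper: coefficient of x when the exponent is fixed at `power`.
fixed_power_coef <- function(x, digits, power) signif(x, digits) / 10^power

# Pressures in Pa, as rounded in the Example 18 table
pres <- c(1.01e5, 2.65e4, 5.53e3, 1.20e3, 287)
coef <- fixed_power_coef(pres, digits = 3, power = 3)
coef
#> [1] 101.000  26.500   5.530   1.200   0.287
```

Each coefficient is understood to be multiplied by $10^{3}$, which matches the "with set_power" pressure column, including the last row where the coefficient drops below 1.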

Options

Arguments assigned using formatdown_options() are described in the Global settings article.

References

Chase, M. (2021). Technical Mathematics. https://openoregon.pressbooks.pub/techmath/chapter/module-11-scientific-notation/
Pebesma, E., Mailund, T., & Hiebert, J. (2016). Measurement units in R. R Journal, 8(2), 486–494. https://doi.org/10.32614/RJ-2016-061