6 Making tables in Markdown
When writing reports you may, from time to time, need to include a table. You should probably make a chart instead, but every so often a table actually is a nice thing to have. This chapter focuses on two key aspects: creating the table itself, and formatting it for a report or presentation. We here cover only generic tables; for guidance on creating regression tables (where you show a bunch of different regression models together), see Chapter 7.
Many table-making packages and functions in R produce basic tables that display nicely in a monospace font on the screen. This is a good starting point, but you’ll often need additional formatting to make the table publication-ready. R offers several excellent packages to help with this, catering to different needs. Some are particularly suited for HTML documents (such as websites), while others are better for PDF documents (such as reports and papers). Finding the right package for your specific use case can take some trial and error.
To illustrate these concepts, let’s start with some fake data.
library( tidyverse )
dat = tibble( G = sample( LETTERS[1:5], 100, replace=TRUE ),
              X = rnorm( 100 ),
              rp = sample( letters[1:3], 100, replace=TRUE ),
              Z = sample( c("tx","co"), 100, replace=TRUE ),
              Y = rnorm( 100 ) )
We can make a summary of it by our grouping variable:
sdat <- dat %>% group_by( G ) %>%
  summarise( EY = mean( Y ),
             pT = mean( Z == "tx" ),
             sdY = sd( Y ) )
Our intermediate results:
sdat
# A tibble: 5 × 4
G EY pT sdY
<chr> <dbl> <dbl> <dbl>
1 A 0.464 0.542 0.953
2 B -0.522 0.562 0.800
3 C 0.185 0.5 0.850
4 D 0.184 0.565 1.07
5 E 0.0220 0.565 1.16
We can print this out in a much cleaner form using the kable() function from the knitr package:
knitr::kable( sdat, digits = 2 )
G | EY | pT | sdY |
---|---|---|---|
A | 0.46 | 0.54 | 0.95 |
B | -0.52 | 0.56 | 0.80 |
C | 0.18 | 0.50 | 0.85 |
D | 0.18 | 0.57 | 1.07 |
E | 0.02 | 0.57 | 1.16 |
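kable() takes a few further arguments that cover most everyday formatting needs. The sketch below uses a small made-up data frame (sdat_demo, a hypothetical stand-in for the summary table) to show col.names, align, and caption. The column labels and caption text are illustrative choices, not anything required by knitr.

```r
library(knitr)

# Hypothetical stand-in for the grouped summary table
sdat_demo <- data.frame( G   = c("A", "B", "C"),
                         EY  = c(0.464, -0.522, 0.185),
                         pT  = c(0.542, 0.562, 0.500),
                         sdY = c(0.953, 0.800, 0.850) )

tab <- kable( sdat_demo,
              digits    = 2,                      # round numeric columns
              col.names = c("Group", "Mean Y",    # reader-friendly headers
                            "Prop. treated", "SD of Y"),
              align     = "lrrr",                 # left-align labels, right-align numbers
              caption   = "Summary of Y by group" )
tab
```

Renaming columns inside kable() leaves the underlying data frame untouched, so the short names stay available for further computation.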
Say our grouping variable is a set of codes for something more special. We can merge in better names by first making a small “cross-walk” of the ID codes to the full names, and then merging them to our results:
names = tribble( ~ G, ~ name,
                 "A", "fred",
                 "B", "doug",
                 "C", "xiao",
                 "D", "lily",
                 "E", "unknown" )
names
# A tibble: 5 × 2
G name
<chr> <chr>
1 A fred
2 B doug
3 C xiao
4 D lily
5 E unknown
sdat = left_join( sdat, names ) %>%
  relocate( name )
Joining with `by = join_by(G)`
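The message above appears because left_join() had to guess the key. A minimal sketch of two habits that can help when merging a cross-walk: naming the key with by, and using anti_join() to list codes that have no match. The data frames sdat_demo and names_demo are hypothetical stand-ins, with a deliberately unmatched code "Q".

```r
library(tidyverse)

# Hypothetical stand-ins for the summary table and the cross-walk
sdat_demo  <- tibble( G = c("A", "B", "Q"), EY = c(0.46, -0.52, 0.18) )
names_demo <- tribble( ~ G, ~ name,
                       "A", "fred",
                       "B", "doug" )

# Naming the key explicitly avoids the "Joining with" message
joined <- left_join( sdat_demo, names_demo, by = "G" ) %>%
  relocate( name )

# Rows of the results whose code is missing from the cross-walk
unmatched <- anti_join( sdat_demo, names_demo, by = "G" )
unmatched
```

Unmatched codes get NA for name after a left join, so checking them before making the table avoids blank labels in the final output.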
Again, the easiest way to make a nice clean table is with the kable() function:
knitr::kable( sdat, digits=2 )
name | G | EY | pT | sdY |
---|---|---|---|---|
fred | A | 0.46 | 0.54 | 0.95 |
doug | B | -0.52 | 0.56 | 0.80 |
xiao | C | 0.18 | 0.50 | 0.85 |
lily | D | 0.18 | 0.57 | 1.07 |
unknown | E | 0.02 | 0.57 | 1.16 |
This is a great workhorse table-making tool! There are also extension packages, such as kableExtra, that add many more formatting options.
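As a rough illustration of what kableExtra adds, the sketch below styles an HTML table and puts a spanning header over two columns. The data frame sdat_demo and the header label "Outcome" are made up for the example, and the styling options shown are only one possible choice.

```r
library(knitr)
library(kableExtra)

# Hypothetical stand-in for the summary table
sdat_demo <- data.frame( G   = c("A", "B", "C"),
                         EY  = c(0.464, -0.522, 0.185),
                         sdY = c(0.953, 0.800, 0.850) )

styled <- kable( sdat_demo, format = "html", digits = 2 ) %>%
  kable_styling( bootstrap_options = c("striped", "condensed"),
                 full_width = FALSE ) %>%            # keep the table compact
  add_header_above( c(" " = 1, "Outcome" = 2) )      # span the two numeric columns
styled
```

For a PDF report you would typically use format = "latex" instead; the bootstrap options apply only to HTML output.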
6.1 Making a “table one”
The “table one” is the first table in a lot of papers; it shows the means of different variables for different groups. Perhaps not surprisingly, the tableone package is useful for making such tables:
library(tableone)
# sample mean
CreateTableOne(data = dat,
vars = c("G", "Z", "X"))
Overall
n 100
G (%)
A 24 (24.0)
B 16 (16.0)
C 14 (14.0)
D 23 (23.0)
E 23 (23.0)
Z = tx (%) 55 (55.0)
X (mean (SD)) 0.02 (0.95)
# you can also stratify by a variable of interest
tb <- CreateTableOne(data = dat,
                     vars = c("X", "G", "Y"),
                     strata = c("Z"))
tb
Stratified by Z
co tx p test
n 45 55
X (mean (SD)) 0.21 (0.94) -0.14 (0.93) 0.065
G (%) 0.995
A 11 (24.4) 13 (23.6)
B 7 (15.6) 9 (16.4)
C 7 (15.6) 7 (12.7)
D 10 (22.2) 13 (23.6)
E 10 (22.2) 13 (23.6)
Y (mean (SD)) 0.00 (1.02) 0.19 (1.04) 0.365
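You can then use kable() on your table like so. Here we print only the continuous-variable part of the table (tb$ContTable), which is why G does not appear:
print(tb$ContTable, printToggle = FALSE) %>%
  knitr::kable()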
| | co | tx | p | test |
|---|---|---|---|---|
| n | 45 | 55 | | |
| X (mean (SD)) | 0.21 (0.94) | -0.14 (0.93) | 0.065 | |
| Y (mean (SD)) | 0.00 (1.02) | 0.19 (1.04) | 0.365 | |
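If you want the categorical rows as well, one option is to print the whole TableOne object rather than only its ContTable component. The sketch below assumes a small simulated data set (dat_demo, with an arbitrary seed) and drops the p-value columns with test = FALSE; both choices are for illustration only.

```r
library(tableone)
library(knitr)

set.seed(1)   # arbitrary seed so the fake data are reproducible
dat_demo <- data.frame( G = sample( LETTERS[1:3], 60, replace = TRUE ),
                        Z = sample( c("tx", "co"), 60, replace = TRUE ),
                        X = rnorm( 60 ) )

tb_demo <- CreateTableOne( data = dat_demo,
                           vars = c("X", "G"),
                           strata = "Z" )

# With printToggle = FALSE, print() returns a character matrix
# instead of writing to the console; test = FALSE hides p-values
tab_mat <- print( tb_demo, printToggle = FALSE, test = FALSE )

kable( tab_mat )
```

Because tab_mat is an ordinary character matrix, you can also edit its row names or drop rows before handing it to kable().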
6.2 The stargazer package
You can easily make pretty tables using the stargazer package. You need to ensure the data is a data.frame, not a tibble, because stargazer is old school. It also appears to summarize only continuous variables. Stargazer is probably best known for making regression tables (see the next chapter), but it can make other kinds of tables as well, such as data summaries.
When using stargazer to summarize a dataset, you can specify that it should include only some of the variables and you can omit stats that are not of interest:
library(stargazer)
# stargazer needs a plain data.frame, so convert the tibble first
stargazer(as.data.frame(dat), header=FALSE,
          # omit the percentiles, min, and max
          omit.summary.stat = c("p25", "p75", "min", "max"),
          title = "Table 1: Descriptive statistics",
          type = "text")
Table 1: Descriptive statistics
============================
Statistic N Mean St. Dev.
----------------------------
X 100 0.018 0.948
Y 100 0.101 1.027
----------------------------
See the stargazer help file for how to set or change more of the options: https://cran.r-project.org/web/packages/stargazer/stargazer.pdf
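One simple way to restrict the summary to particular variables is to select the columns before calling stargazer. The sketch below does this on simulated data (dat_demo, with an arbitrary seed and a made-up extra column W) and uses summary.stat to request specific statistics rather than omitting the unwanted ones.

```r
library(tidyverse)
library(stargazer)

set.seed(2)   # arbitrary seed so the fake data are reproducible
dat_demo <- tibble( X = rnorm( 50 ), Y = rnorm( 50 ), W = rnorm( 50 ) )

# Keep only the variables of interest, then convert to a data.frame
sub <- as.data.frame( select( dat_demo, X, Y ) )

# summary.stat lists the statistics to show, in order
out <- stargazer( sub, header = FALSE, type = "text",
                  summary.stat = c("n", "mean", "sd", "median") )
```

stargazer prints the table and also returns its lines invisibly as a character vector, which is what out holds here.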
Warning: stargazer does not work well with tibbles (the data frames you get from tidyverse commands), so you have to “cast” your data to a data.frame before using it:
library(stargazer)
# to include all variables
stargazer( as.data.frame(dat), header = FALSE, type="text")
To use stargazer in a PDF or HTML report, you will want the report to format the table so it doesn’t look like raw output. To do so, set type="latex" or type="html" rather than type="text", and add results='asis' to the header of the code chunk (the line that opens the chunk containing your R code), for example {r, results='asis'}.
This will ensure the output of stargazer gets formatted properly in your R Markdown.
Unfortunately, it is hard to dynamically make a report that can render to either HTML or PDF, so you will have to choose one or the other. If you are making a PDF, you will want to use type="latex", and if you are making an HTML report, you will want to use type="html".
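One possible workaround, offered here as a sketch rather than a tested recipe, is to ask knitr which format it is rendering to and pick the type from that. The helper table_type() below is hypothetical (it is not part of knitr or stargazer); it relies on knitr's is_latex_output() and is_html_output() and falls back to "text" when run outside of knitting.

```r
library(knitr)

# Hypothetical helper: choose a stargazer `type` from the current knit target
table_type <- function() {
  if ( is_latex_output() ) {
    "latex"      # rendering to PDF via LaTeX
  } else if ( is_html_output() ) {
    "html"       # rendering to an HTML document
  } else {
    "text"       # interactive session or other formats
  }
}

table_type()

# Inside a chunk with results='asis', you could then write, for example:
# stargazer( as.data.frame(dat), header = FALSE, type = table_type() )
```

This only switches the markup that stargazer emits; any formatting that is specific to one output format would still need to be handled separately.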
6.3 The xtable package
The xtable package is another great package for making tables. It is particularly good for LaTeX documents. It is a bit more complicated to use than stargazer, but it is very powerful. Here is an example of how to use it:
library(xtable)
xtable(sdat, caption = "A table of fake data" )
Here again, set results='asis' in the chunk header so the table renders properly in your R Markdown document.
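Most of the control in xtable comes from the arguments to its print method. The sketch below, using a made-up data frame (sdat_demo), drops the row numbers and the generated comment line; print.results = FALSE is used only so the LaTeX can be captured in a variable and inspected.

```r
library(xtable)

# Hypothetical stand-in for the summary table
sdat_demo <- data.frame( G   = c("A", "B", "C"),
                         EY  = c(0.464, -0.522, 0.185),
                         sdY = c(0.953, 0.800, 0.850) )

xt <- xtable( sdat_demo, caption = "A table of fake data", digits = 2 )

# Build the LaTeX without row names or the "% latex table generated..." comment
tex <- print( xt, type = "latex",
              include.rownames = FALSE,
              comment = FALSE,
              print.results = FALSE )

cat( tex )   # in a results='asis' chunk, this writes the table into the document
```

Switching to type = "html" produces an HTML table from the same xtable object.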