R Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com. R Markdown works in RStudio, and the packages rmarkdown and knitr must be installed. In this handout we will discuss the basics of R Markdown. If you are able to duplicate this handout, you have covered the basics!
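Before starting, it may help to confirm that the two packages mentioned above are available. The following is a minimal sketch; the variable names `required` and `installed` are only illustrative.

```r
# Packages this handout relies on.
required <- c("rmarkdown", "knitr")

# TRUE for each package that can be loaded, FALSE otherwise.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Uncomment to install anything that is missing:
# install.packages(required[!installed])
```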

1. Creating an R Markdown Document

Starting a new R Markdown document is very easy. Simply follow the path File -> New File -> R Markdown, or use the drop-down menu in the top-left corner of RStudio. A dialog box will appear where you can enter the document title and author name and select the output format: HTML, PDF, or Word. Choose HTML for this exercise. The default document comes with some basic code. You may edit this code and start creating your own document. When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks.
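The same steps can also be carried out from the R console. The sketch below writes a small R Markdown file and knits it with `rmarkdown::render()`, which is what the Knit button calls. The file name, title, and author are placeholders, and the render step is skipped if rmarkdown or pandoc is not available.

```r
# Three backticks, built indirectly so this example can itself sit in a chunk.
fence <- strrep("`", 3)

# Minimal R Markdown source: a header, some text, and one R code chunk.
rmd_lines <- c(
  "---",
  'title: "My First Document"',
  'author: "Your Name"',
  "output: html_document",
  "---",
  "",
  "Some introductory text.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)

# Save the source to a temporary file (placeholder name).
rmd_path <- file.path(tempdir(), "first_document.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit from the console instead of clicking the Knit button.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, output_format = "html_document", quiet = TRUE)
  print(html_path)
}
```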

2. General Typesetting

Let’s first discuss how to add web links to a document. If you simply type the http address, it will appear as a link in the final document. For example, http://rmarkdown.rstudio.com. More often, we want a linked phrase, which is created by typing the phrase in square brackets, followed immediately by the http address in regular parentheses. For example, `[R Markdown](http://rmarkdown.rstudio.com)` produces the linked phrase R Markdown.

The R Markdown website has all the resources you need; just follow the Get Started tab to browse them. The site contains an excellent Cheat Sheet that you may consult anytime you use R Markdown. Another very useful introductory resource is the Markdown Basics.

As you may see in the Cheat Sheet (part 3, syntax) you may create an unordered list by simply putting a star at the beginning of a new line. Let us create an unordered list which includes the Basic Components of a Professional Document.

Basic Components of a Professional Document

  • Title, author, date (created automatically in R Markdown)
  • Headers for sections and subsections
  • Typesetting basics such as boldface, italic and color
  • Ordered and unordered lists
  • Web links (covered above)
  • Mathematical expressions
  • Tables
  • Graphs, images
  • Code

In what follows, we will address each component briefly.

Verbatim Mode Using the Backtick

A very important keystroke in R Markdown is the backtick: `. While knitting, R Markdown does not compile the code between two backticks; it only includes it in the text in a typewriter font. In general typesetting practice, this is known as verbatim mode. In this document we use verbatim mode to show a piece of syntax without triggering its usual effect. If you cannot find the backtick on your keyboard, simply copy it from here and paste.

Headers for Sections and Subsections

As you may see in the Cheat Sheet (part 3, syntax) or the Markdown Basics, headers must start with # signs. More # signs produce smaller headers, indicating a subsection. Depending on your taste, a section or subsection may start with a number.

Typesetting Basics

You may use **boldface** for boldface and *italic* for italic text. If you want to change the color of some text, use <span style="color:red">your text here</span>. This is HTML markup, so it takes effect only in HTML output.

Ordered and Unordered Lists

In ordered lists, each item starts on a new line with a number followed by a dot. In unordered lists, each item starts on a new line with *. See the Cheat Sheet for details and sub-items. Here is an example:

  1. Stuff
  2. More stuff
      • sub-item 1
      • sub-item 2
  3. Even more stuff

Mathematical Expressions

The leading typesetting tool in the scientific community is LaTeX (read as lay-tech), which is known for its beautiful mathematical expressions. LaTeX-style expressions can easily be included in an R Markdown document. To produce an inline expression, write the LaTeX expression between $ signs, which puts you in math mode. For example, $y=x^2$ will produce \(y=x^2\). See Section 6 below for more details.

Simple Tables

Simple tables may be created by separating columns with the | sign and adding -|- below the first row to indicate that this is a table. See the Cheat Sheet for a simple example, and produce the following.

Column 1 | Column 2
---------|---------
Cell a   | Cell b
Cell c   | Cell d

R data frames may be displayed as tables in R Markdown. Details are in Section 4 below.
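As a recap of this section, the sketch below collects the typesetting syntax discussed above into one piece of Markdown source and saves it to a file. The file name is a placeholder; the lines can be pasted below the header of any R Markdown document and knitted to see their effect.

```r
# Markdown source for the typesetting elements covered in this section.
typesetting_lines <- c(
  "# Section header",
  "## Subsection header",
  "",
  "Text with **boldface**, *italic*, and `verbatim` words.",
  "A linked phrase: [R Markdown](http://rmarkdown.rstudio.com).",
  "Inline math: $y=x^2$.",
  "",
  "1. Stuff",
  "2. More stuff",
  "    * sub-item 1",
  "    * sub-item 2",
  "3. Even more stuff",
  "",
  "Column 1 | Column 2",
  "---------|---------",
  "Cell a   | Cell b",
  "Cell c   | Cell d"
)

# Save the lines to a temporary file (placeholder name) and show them.
sample_path <- file.path(tempdir(), "typesetting_sample.Rmd")
writeLines(typesetting_lines, sample_path)
cat(readLines(sample_path), sep = "\n")
```

Note that the sub-items are indented by four spaces, which is what makes them part of the second list item.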

3. Code Chunks in R Markdown

The biggest advantage of using R Markdown is that R code may be embedded in the source, and the output of this code may be included in the final document. The R code must be written in so-called code chunks. You can quickly insert a code chunk into your file with one of these three methods:

  • the keyboard shortcut Ctrl + Alt + I (OS X: Cmd + Option + I)
  • the Add Chunk command in the editor toolbar
  • typing the chunk delimiters ```{r} and ```

For example, let us examine the mtcars data set available in R. In this simple code, we will display the first six rows of data using head() and get a simple numerical summary using summary().

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000

As you see, the final document displays the original code and the R output generated from this code. Sometimes we want the output in the final document but not the code, other times the code but not the output, and sometimes both. To accommodate all these needs, use chunk options, which are set in the chunk header and customize the chunk output. This handout uses echo (show or hide the code) and fig.height and fig.width (plot size). See the R Markdown website (http://rmarkdown.rstudio.com) for more details on code chunks.
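Besides setting options in each chunk header, knitr also lets you set defaults for all chunks at once with `opts_chunk$set()`, typically in a first chunk of the document. The following is a minimal sketch; the option values are only illustrative.

```r
library(knitr)

# Set defaults for every chunk that follows; a chunk header can still override them.
opts_chunk$set(echo = FALSE, fig.height = 3, fig.width = 5)

# Inspect the current defaults.
opts_chunk$get("echo")
opts_chunk$get(c("fig.height", "fig.width"))
```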

As practice, here is a short piece of code that calculates the mean of the mpg column in the mtcars data. I set the echo option to FALSE by typing ```{r,echo=FALSE} in the chunk header, so you will see only the output, not the code.

## [1] 20.09062
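The hidden chunk itself is not shown in this handout. A call like the one below would produce the output above, so treat it as an assumed reconstruction rather than the exact code used.

```r
# Mean of the mpg column; with echo=FALSE only the result would appear in the knitted document.
mean(mtcars$mpg)
## [1] 20.09062
```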

Hiding the code is especially useful when adding R graphics to your documents. Here is a simple example producing a histogram of the mpg column in the mtcars data. The code is hidden.
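Since the code is hidden, the exact call is not available here. A chunk along the following lines, placed under a header with echo=FALSE, would draw such a histogram; the title, axis label, and color are assumptions made for illustration.

```r
# Histogram of miles per gallon; title, label, and color are illustrative choices.
hist(mtcars$mpg,
     main = "Histogram of mpg",
     xlab = "Miles per gallon",
     col = "steelblue")
```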

4. Tables in R Markdown

We covered how to create simple tables above, but for larger tables this is not practical. Moreover, the way we displayed the mtcars data above is not very professional. Since we have the data available as an R data frame, we may use the kable() function of the knitr package to display the data nicely. Again, we will display only the first six rows to save space. As you see, this way of presenting data is much more professional.

library(knitr)
kable(mtcars[1:6,])

|                  |  mpg| cyl| disp|  hp| drat|    wt|  qsec| vs| am| gear| carb|
|:-----------------|----:|---:|----:|---:|----:|-----:|-----:|--:|--:|----:|----:|
|Mazda RX4         | 21.0|   6|  160| 110| 3.90| 2.620| 16.46|  0|  1|    4|    4|
|Mazda RX4 Wag     | 21.0|   6|  160| 110| 3.90| 2.875| 17.02|  0|  1|    4|    4|
|Datsun 710        | 22.8|   4|  108|  93| 3.85| 2.320| 18.61|  1|  1|    4|    1|
|Hornet 4 Drive    | 21.4|   6|  258| 110| 3.08| 3.215| 19.44|  1|  0|    3|    1|
|Hornet Sportabout | 18.7|   8|  360| 175| 3.15| 3.440| 17.02|  0|  0|    3|    2|
|Valiant           | 18.1|   6|  225| 105| 2.76| 3.460| 20.22|  1|  0|    3|    1|
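kable() also accepts arguments such as `digits` and `caption` for rounding and labeling a table. The sketch below uses them on a few columns of the same data; the column selection and caption text are only illustrative.

```r
library(knitr)

# First six rows, three columns, one decimal place, with a caption.
kable(head(mtcars[, c("mpg", "cyl", "wt")]),
      digits = 1,
      caption = "First six rows of mtcars (selected columns)")
```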

5. Including Images

Sometimes we may want to include images in our document. In R Markdown, this can be done with the include_graphics() function of the knitr package. You have to specify the path to your image file, which should have a .pdf, .jpg, or .png extension. Here is one example.

include_graphics("jamaica.jpg")
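If the image file is not where R expects it, knitting may stop with an error. The sketch below checks for the file first; it assumes the image sits in the same folder as the R Markdown file, and the variable name `img_path` is only illustrative.

```r
library(knitr)

# File name taken from the example above; the file must exist in the working directory.
img_path <- "jamaica.jpg"

if (file.exists(img_path)) {
  # In a chunk, the displayed size can be controlled with the out.width chunk option.
  include_graphics(img_path)
} else {
  message("Image file not found: ", img_path)
}
```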

6. More on LaTeX Equations

Beautiful mathematical typesetting is a very valuable skill. As mentioned above, R Markdown can use LaTeX commands to produce high-quality expressions. LaTeX is not easy to learn, and teaching it is not the purpose of this course. However, using it inside R Markdown is easy. Here is a nice web page summarizing some LaTeX functions. Let us practice some basics now.

If you want to write a mathematical expression on its own line (a displayed equation), write the LaTeX expression between $$ signs. For example, $$y=\frac{x^2}{2}.$$ will produce \[y=\frac{x^2}{2}.\]

Let us write the simple linear regression model: \[ Y=\beta_0+\beta_1 X + \epsilon, \] where \(Y\) is the dependent variable, \(X\) is the independent variable, \(\beta_0\) and \(\beta_1\) are the regression coefficients, and \(\epsilon\) is the random error term with \(\epsilon\sim N(0,\sigma)\).

Let us write the probability density function (p.d.f.) of a normal distribution with mean \(\mu\) and standard deviation \(\sigma\):

\[ f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \] where \(-\infty<x<\infty\), \(-\infty<\mu<\infty\), and \(\sigma>0\). Just for fun, let us plot the p.d.f. of a standard normal distribution. You may control the size of your plot using the fig.height and fig.width parameters by typing ```{r,fig.height=3,fig.width=5} in the code chunk header.

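# Evaluate the standard normal density on a grid from -4 to 4.
x <- seq(-4, 4, by = 0.1)
y <- dnorm(x, mean = 0, sd = 1)
# Draw the density as a line.
plot(x, y, type = "l", lwd = 2, col = "steelblue",
     main = "Standard Normal Distribution", xlab = "", ylab = "")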
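As a quick check, the density formula above can be typed in directly and compared with R's built-in dnorm(). This is a minimal sketch; the function name `normal_pdf` is hypothetical and not part of any package.

```r
# Normal density written out from the formula, with mean mu and standard deviation sigma.
normal_pdf <- function(x, mu = 0, sigma = 1) {
  1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
}

x <- seq(-4, 4, by = 0.1)

# Compare with R's built-in density; all.equal() allows for floating-point rounding.
all.equal(normal_pdf(x), dnorm(x, mean = 0, sd = 1))
all.equal(normal_pdf(x, mu = 2, sigma = 3), dnorm(x, mean = 2, sd = 3))
```

Both comparisons should return TRUE, which confirms that dnorm() takes the standard deviation, not the variance, as its sd argument.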