Quarto 📖

MATH/COSC 3570 Introduction to Data Science

Dr. Cheng-Han Yu
Department of Mathematical and Statistical Sciences
Marquette University

Hello Quarto

Sharing Data Science Products using Quarto

What and Why – Full Reproducibility

  • [What] data science publishing system
  • Use a single Quarto file (.qmd) to
    • weave together narrative text and code
    • produce elegantly formatted outputs: word/pdf, webpages, blogs, books, etc.
  • [Why] Fully reproducible reports
    • Have code, results, and text in the same document
    • Results are generated from the source code, and your documents are automatically updated if your data or code change.

Why Quarto – Various Types of Output

  • Supports dozens of static and dynamic/interactive output formats!

Moving Between Formats Is Straightforward

HTML Document

datascience.qmd

title: "Data Science"
format: html

Presentation Slides

datascience.qmd

title: "Data Science"
format: revealjs

Website

_quarto.yml

project:
  type: website

website: 
  navbar: 
    left:
      - datascience.qmd

Why Quarto – Integrate Multiple Languages

Quarto Is Built on Pandoc

  • R code is evaluated by knitr, and Python/Julia code by Jupyter. The engine turns the .qmd file into a Markdown (.md) file, which Pandoc then converts to the output format.
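The engine step can be pictured with a toy sketch: each executable chunk is run and replaced by plain Markdown holding the code and its result, which is what Pandoc then receives. The Python below is only an illustration under strong assumptions. `weave` and `CHUNK` are hypothetical names, only `{python}` chunks containing a single expression are handled, and this is not how knitr or Jupyter are actually implemented.

```python
import re

FENCE = "`" * 3  # three backticks, built here so the sketch nests cleanly

# Matches an executable chunk: a fence with {python}, the code, a closing fence
CHUNK = re.compile(
    FENCE + r"\{python\}\n(.*?)\n" + FENCE,
    flags=re.DOTALL,
)

def weave(qmd_text: str) -> str:
    """Toy 'engine' step: evaluate each chunk and emit plain Markdown."""
    env = {}  # shared namespace, so later chunks can see earlier results

    def run(match: re.Match) -> str:
        code = match.group(1)
        result = eval(code, env)  # toy assumption: each chunk is one expression
        return (
            f"{FENCE}python\n{code}\n{FENCE}\n\n"
            f"{FENCE}\n{result}\n{FENCE}"
        )

    return CHUNK.sub(run, qmd_text)

qmd = f"## Sum\n\n{FENCE}{{python}}\n1 + 1\n{FENCE}\n"
md = weave(qmd)
print(md)
```

The returned text no longer contains an executable `{python}` chunk, only ordinary Markdown code blocks, which is the kind of input a Markdown converter can handle.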

Why Quarto – Comfort of Your Own Workspace

A screenshot of a Quarto document rendered inside RStudio

A screenshot of a Quarto document rendered inside JupyterLab

A screenshot of a Quarto document rendered inside VSCode

Why Quarto – Simple Markdown Syntax for Text

To generate a PDF report, would you prefer writing this with 24 lines…

Or this with 250 lines!?

Markdown

  • Quarto is based on markdown, a markup language that is widely used to generate HTML pages.
  • Markdown is a lightweight and easy-to-use syntax for styling the writing on the GitHub platform.
  • We go through basic (Pandoc’s) Markdown syntax together, and you can learn more at:


Quarto file = plain text file with extension .qmd

---
title: "ggplot2 demo"
date: "1/25/2024"
format: html
---

## Cars
There is a relationship between *miles per gallon* and *displacement*.

```{r}
library(ggplot2)  # provides mpg, ggplot(), aes(), geom_point()
mpg |> ggplot(aes(x = displ, y = hwy)) +
  geom_point()
```

YAML Header (“YAML Ain’t Markup Language”)

---
key: value
---

Markdown Text

Code Chunk

```{r}
## code right here
```
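As a rough illustration of how these parts are told apart, the sketch below splits a .qmd string into its YAML header and its body using only the Python standard library. `split_qmd` is a hypothetical helper, and the flat `key: value` parsing is a deliberate simplification: real YAML (nested keys, lists) needs a proper YAML parser.

```python
def split_qmd(text: str) -> tuple[dict[str, str], str]:
    """Split a .qmd string into (header, body).

    Only flat `key: value` lines are understood; this is a toy, not YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no YAML header present

    # The header ends at the next line that is exactly '---'
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"),
        None,
    )
    if end is None:
        return {}, text  # unterminated header: treat everything as body

    header = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            header[key.strip()] = value.strip().strip('"')

    body = "\n".join(lines[end + 1:]).strip()
    return header, body

doc = """---
title: "ggplot2 demo"
date: "1/25/2024"
format: html
---

## Cars
There is a relationship between *miles per gallon* and *displacement*.
"""

header, body = split_qmd(doc)
print(header["title"], "|", header["format"])
print(body.splitlines()[0])
```

Everything between the two `---` lines ends up in `header`; the Markdown text and any code chunks stay in `body`.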

02-Quarto File

  • Go to your GitHub repo lab-yourusername. Clone it to your Posit Cloud as a project in the 2024-Spring-Math-3570 workspace.

  • Open the file lab.qmd.

  • Change author in YAML.

  • Click on Render or press Ctrl/Cmd + Shift + K to produce an HTML document.

  • How can we show the current date every time we compile the file? [Hint:] Check your hw00. Compile your document and make sure the date shows up.

  • How do we produce a pdf? Describe it in ## Lab 2: Quarto

  • Once done, commit with message “02-Quarto File” and push your version to GitHub.

Text Formatting

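| Markdown Syntax | Output |
|:----------------|:-------|
| `*italics* and **bold**` | *italics* and **bold** |
| `superscript^2^ / subscript~2~` | superscript^2^ / subscript~2~ |
| `~~strikethrough~~` | ~~strikethrough~~ |
| `` `verbatim code` `` | `verbatim code` |
| `> here is the quote` | here is the quote (rendered as a blockquote) |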

Headings

| Markdown Syntax | Output |
|:----------------|:-------|
| `# Header 1` | Header 1 |
| `## Header 2` | Header 2 |
| `### Header 3` | Header 3 |
| `#### Header 4` | Header 4 |
| `##### Header 5` | Header 5 |
| `###### Header 6` | Header 6 |

Lists

Markdown Syntax:

* unordered list
    + sub-item 1
    + sub-item 2
        - sub-sub-item 1

Output:

  • unordered list
    • sub-item 1
    • sub-item 2
      • sub-sub-item 1

Markdown Syntax:

*   item 2
    <new line>
    Continued (indent 4 spaces)

Output:

  • item 2

    Continued (indent 4 spaces)

Markdown Syntax:

1. ordered list
2. item 2
    i) sub-item 1
         A.  sub-sub-item 1

Output:

  1. ordered list
  2. item 2
    1. sub-item 1
      1. sub-sub-item 1

Tables

| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|   12  |  12  |    12   |    12  |
|  123  |  123 |   123   |   123  |
|    1  |    1 |     1   |     1  |
Right Left Default Center
12 12 12 12
123 123 123 123
1 1 1 1

Source vs. Visual Mode

Source Mode

Visual Mode (What You See Is What You Mean (WYSIWYM))

Data as Table

knitr::kable() can turn dataframes into tables.

head(mtcars) |> 
    knitr::kable()
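|                   |  mpg | cyl | disp |  hp | drat |   wt | qsec | vs | am | gear | carb |
|:------------------|-----:|----:|-----:|----:|-----:|-----:|-----:|---:|---:|-----:|-----:|
| Mazda RX4         | 21.0 |   6 |  160 | 110 | 3.90 | 2.62 | 16.5 |  0 |  1 |    4 |    4 |
| Mazda RX4 Wag     | 21.0 |   6 |  160 | 110 | 3.90 | 2.88 | 17.0 |  0 |  1 |    4 |    4 |
| Datsun 710        | 22.8 |   4 |  108 |  93 | 3.85 | 2.32 | 18.6 |  1 |  1 |    4 |    1 |
| Hornet 4 Drive    | 21.4 |   6 |  258 | 110 | 3.08 | 3.21 | 19.4 |  1 |  0 |    3 |    1 |
| Hornet Sportabout | 18.7 |   8 |  360 | 175 | 3.15 | 3.44 | 17.0 |  0 |  0 |    3 |    2 |
| Valiant           | 18.1 |   6 |  225 | 105 | 2.76 | 3.46 | 20.2 |  1 |  1 |    3 |    1 |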
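To see what a table helper has to produce, here is a minimal Python sketch that turns rows into a Markdown pipe table like the one on the Tables slide. `to_pipe_table` is a hypothetical helper written for illustration. It is not how knitr::kable() is implemented, and the two rows are copied from the output above.

```python
def to_pipe_table(header: list[str], rows: list[list], align: str) -> str:
    """Build a Markdown pipe table.

    `align` has one letter per column: 'l' (left), 'r' (right), 'c' (center).
    """
    markers = {"l": ":---", "r": "---:", "c": ":---:"}
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join(markers[a] for a in align) + "|",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

table = to_pipe_table(
    header=["car", "mpg", "cyl"],
    rows=[["Mazda RX4", 21.0, 6], ["Datsun 710", 22.8, 4]],
    align="lrr",
)
print(table)
```

The second line of the result carries the alignment colons, the same markers used in the Right/Left/Default/Center example.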

03-Markdown

  • Back to your lab.qmd. In ## Lab 3: Markdown section, add a self-introduction paragraph containing a header, bold and italic text.

  • Add another paragraph that contains

    • listed items
    • a hyperlink
    • a blockquote
    • math expression
  • Once done, commit with message “03-Markdown” and push your updated work to GitHub.

Code

Anatomy of a Code Chunk

```{r}
#| label: car-stuff
#| echo: false
mtcars |>
  dplyr::distinct(cyl)
```


```{python}
#| label: string
#| eval: false
x = 'hello, python world!'
print(x.split(' '))
```
  • Has 3x backticks ``` on each end

  • To insert a code chunk, use

    • Alt + Ctrl + I (Win)

    • Option + Cmd + I (Mac)

  • Indicate engine (r) between curly braces {r}

  • Place options underneath, behind the #| (hashpipe): #| option1: value

  • To set a shortcut for inserting a Python chunk: Tools > Modify Keyboard Shortcuts > Filter… > Insert Chunk Python > Cmd + P

Option echo

  • If you simply want code highlighting, you can use 3x backticks + the language ```r
```r
head(mtcars)
```

This displays the code below with highlighting but does not execute it, since there are no {} around the language:

head(mtcars)


  • If you instead want to show the source code and evaluate it, use echo: true. In contrast, echo: false hides the code but still evaluates it.
```{r}
#| echo: true
1 + 1
```
1 + 1
[1] 2

Chunk Options

  • The following table summarises which types of output each option suppresses:
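| Option           | Run code | Show code | Output | Plots | Messages | Warnings |
|:-----------------|:--------:|:---------:|:------:|:-----:|:--------:|:--------:|
| eval: false      |    x     |           |   x    |   x   |    x     |    x     |
| include: false   |          |     x     |   x    |   x   |    x     |    x     |
| echo: false      |          |     x     |        |       |          |          |
| results: "hide"  |          |           |   x    |       |          |          |
| fig-show: "hide" |          |           |        |   x   |          |          |
| message: false   |          |           |        |       |    x     |          |
| warning: false   |          |           |        |       |          |    x     |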
  • Check knitr for more chunk options.
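One way to read the table is as a lookup from option to the parts it suppresses. The Python sketch below encodes that reading. The dictionary is my transcription of the table following the usual knitr convention, and `SUPPRESSES`, `ALL_PARTS` and `remaining_parts` are hypothetical names, not part of knitr or Quarto.

```python
# What each option suppresses, transcribed from the table above (assumed reading).
SUPPRESSES = {
    ("eval", False): {"run code", "output", "plots", "messages", "warnings"},
    ("include", False): {"show code", "output", "plots", "messages", "warnings"},
    ("echo", False): {"show code"},
    ("results", "hide"): {"output"},
    ("fig-show", "hide"): {"plots"},
    ("message", False): {"messages"},
    ("warning", False): {"warnings"},
}

ALL_PARTS = {"run code", "show code", "output", "plots", "messages", "warnings"}

def remaining_parts(options: dict) -> set[str]:
    """Return what a chunk still does or shows under the given options."""
    suppressed = set()
    for item in options.items():  # each item is an (option, value) pair
        suppressed |= SUPPRESSES.get(item, set())
    return ALL_PARTS - suppressed

print(sorted(remaining_parts({"echo": False, "warning": False})))
print(sorted(remaining_parts({"eval": False})))
```

With `echo: false` and `warning: false` the code still runs and its output, plots and messages remain; with `eval: false` only the source code is left to show.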

Global Options: execute

  • Global options apply to every chunk and should be specified under the execute key in the YAML header.
---
execute: 
  echo: false
  eval: false
---
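Options set on an individual chunk with #| take precedence over these document-wide defaults. A small Python sketch of that precedence, where `effective_options` is a hypothetical helper and not Quarto's actual code:

```python
def effective_options(global_execute: dict, chunk_options: dict) -> dict:
    """Chunk-level options override the document-wide `execute` defaults."""
    return {**global_execute, **chunk_options}

global_execute = {"echo": False, "eval": False}  # from the YAML header above
chunk_options = {"eval": True}                   # e.g. `#| eval: true` in one chunk

print(effective_options(global_execute, chunk_options))
```

Here the one chunk is evaluated while its code stays hidden, because only `eval` was overridden.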

Images

Basic markdown syntax:

![Maru](images/05-quarto/maru1.jpg)

Maru

Figures w/ code

```{r}
#| out-width: 40%
#| fig-align: right

knitr::include_graphics("images/05-quarto/maru1.jpg")
```

Including Plots

  • Many chunk options for figures and images start with fig-, for example fig-width, fig-height, fig-show, etc.


```{r}
#| eval: false
#| fig-cap: "Fig. 1: Car stuff"
plot(x = cars$speed, y = cars$dist)
```

Fig. 1: Car stuff

Divs and Spans

This is text with [special]{style="color:red"} formatting.

This is text with special formatting.


::: {style="color:red"}
This content can be styled with a border
:::

This content can be styled with a border

  • Think of a ::: div as an HTML <div>, though it can also apply to content in PDF in specific situations

  • [text]{.class} spans can be thought of as <span class="class">Text</span>

    ::: {.border}
    This content can be styled with a border
    :::
    
    <div class="border">
      <p>This content can be styled with a border</p>
    </div>

Subfigures Fenced div Class

::: {#fig-maru layout-ncol=2}

![Loaf](images/05-quarto/maru2.jpg){#fig-loaf width="250px"}

![Lick](images/05-quarto/maru3.jpg){#fig-lick width="250px"}

Two states of Maru

:::
(a) Loaf
(b) Lick
Figure 1: Two states of Maru

Inline Code

  • Inside your text you can include code with the syntax `r your-r-code`.
  • For example, `r 4 + 5` would output 9 in your text.
head(cars)
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10
num_cars <- nrow(cars)

Code in Quarto

There are `r num_cars` rows in the cars dataset. Four plus five is `r 4 + 5`

Output

There are 50 rows in the cars dataset. Four plus five is 9
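The substitution above can be mimicked with a toy sketch: find each inline expression, evaluate it, and paste the value into the sentence. In the Python below, `render_inline` and `INLINE` are hypothetical names, and Python's `eval` merely stands in for the R evaluator. It only works here because `num_cars` and `4 + 5` happen to be valid in both languages, with `num_cars` set by hand to the value of nrow(cars).

```python
import re

# Matches inline code of the form `r <expression>`
INLINE = re.compile(r"`r ([^`]+)`")

def render_inline(text: str, env: dict) -> str:
    """Replace each inline expression with its evaluated value."""
    return INLINE.sub(lambda m: str(eval(m.group(1), {}, env)), text)

env = {"num_cars": 50}  # stands in for num_cars <- nrow(cars)
source = "There are `r num_cars` rows in the cars dataset. Four plus five is `r 4 + 5`"
rendered = render_inline(source, env)
print(rendered)
```

Because the values are computed at render time, the sentence stays correct if the data behind `num_cars` change.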

04-Code Chunk

  • In lab.qmd ## Lab 4: Code Chunk, use code chunks to

  • Add option fig-height: 4, fig-width: 6 and fig-align: right to your plot. What are the changes?

  • How do we set global chunk options to hide and run code in every chunk?

  • Once done, commit with message “04-Code Chunk” and push your work to GitHub.

Quarto Skills to the Next Level