import os
import sys
itemlist = ["PWD", "SHELL", "USER", "PYTHONIOENCODING", "VIRTUAL_ENV", "RETICULATE_PYTHON", "R_HOME", "R_PLATFORM", "LD_LIBRARY_PATH", "R_LIBS_USER", "R_LIBS_SITE", "RENV_PROJECT", "RSTUDIO_PANDOC", "RMARKDOWN_MATHJAX_PATH", "R_SESSION_INITIALIZED", "PYTHONPATH"]
# Keep only the variables that are actually set in this session
itemlist = list(set(itemlist) & set(os.environ))
itemlist.sort()
for item in itemlist:
    print(f'{item} : {os.environ[item]}')
## LD_LIBRARY_PATH : /home/susan/.virtualenvs/book/lib:/usr/lib:/usr/lib/R/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/bin/java/lib/server:/home/susan/.virtualenvs/book/lib:/usr/lib:/usr/lib/R/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/lib/server
## PWD : /home/susan/Projects/Class/stat-computing-r-python
## PYTHONIOENCODING : utf-8
## PYTHONPATH : /usr/lib/python311.zip:/usr/lib/python3.11:/usr/lib/python3.11/lib-dynload:/home/susan/.virtualenvs/book/lib/python3.11/site-packages:/home/susan/.cache/R/renv/cache/v5/linux-debian-bookworm/R-4.5/x86_64-pc-linux-gnu/reticulate/1.43.0/0b3db378d9940f6846a626b24352530a/reticulate/python
## RENV_PROJECT : /home/susan/Projects/Class/stat-computing-r-python
## RETICULATE_PYTHON : ~/.virtualenvs/book/bin/python
## RMARKDOWN_MATHJAX_PATH : /usr/lib/rstudio/resources/app/resources/mathjax-27
## RSTUDIO_PANDOC : /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64
## R_HOME : /usr/lib/R
## R_LIBS_SITE : /usr/local/lib/R/site-library/:/usr/local/lib/R/site-library/:/usr/local/lib/R/site-library:/usr/lib/R/site-library:/usr/lib/R/library:/usr/lib/R/library
## R_LIBS_USER : /home/susan/Projects/Class/stat-computing-r-python/renv/library/linux-debian-bookworm/R-4.5/x86_64-pc-linux-gnu
## R_PLATFORM : x86_64-pc-linux-gnu
## R_SESSION_INITIALIZED : PID=1137411:NAME="reticulate"
## SHELL : /bin/bash
## USER : susan
## VIRTUAL_ENV : /home/susan/.virtualenvs/book
print(sys.path)
## ['', '/usr/bin', '/usr/lib/python311.zip', '/usr/lib/python3.11', '/usr/lib/python3.11/lib-dynload', '/home/susan/.virtualenvs/book/lib/python3.11/site-packages', '/home/susan/.cache/R/renv/cache/v5/linux-debian-bookworm/R-4.5/x86_64-pc-linux-gnu/reticulate/1.43.0/0b3db378d9940f6846a626b24352530a/reticulate/python', '/home/susan/.virtualenvs/book/lib/python311.zip', '/home/susan/.virtualenvs/book/lib/python3.11', '/home/susan/.virtualenvs/book/lib/python3.11/lib-dynload']
Statistical Computing using R and Python
Preface
Content Overload!
This book is designed to explain and demonstrate statistical programming concepts and techniques. I started this project in Summer 2020 as a less-tedious way to learn programming compared to hours and hours of video lectures. I’ve always found that watching someone code and talk about code is not usually the best way to learn how to code. It’s far better to learn how to code by … coding, but it’s hard to start from nothing, too.
Because this book was begun as an alternative to recorded lectures with slides, I have included all of the comics, snark, and gifs that I would normally have put in lecture slides. I’ve also supplemented this with other things that you can’t usually put in slide presentations: YouTube videos, extra resources, links to other textbooks that are more specific than this one. My goal is to make this a collection of the best information I can find on data science and statistical programming.
There is a downside to this approach: in most cases, this book includes way more information than you need. Everyone starts with a different level of computing experience, so I’ve attempted to make this book comprehensive. Unfortunately, that means some sections will seem like they are stating the obvious, and some sections will have more detail than you ever wanted to know. Use this book in the way that works best for you - skip over the stuff you know already, ignore the stuff that seems too complex until you understand the basics. Come back to the scary stuff later and see if it makes more sense to you.
Book Format Guide
I’ve made an effort to use some specific formatting and enable certain features that make this book a useful tool for this class.
Operating System Instructions
Some instructions depend on your operating system. Where it’s shorter, I will use tabs to provide you with OS-specific instructions. Here are the icons I will use:
Windows-specific instructions
Mac-specific instructions
Linux-specific instructions. I will usually try to make these generic, but if they are GUI-based, my instructions will usually be for KDE.
These sections contain things you may want to look out for: common errors, mistakes, and unfortunate situations that may arise when programming.
These sections contain basic demonstrations of how functions or concepts work. They are slightly less interactive than examples.
Examples are intended to be interactive - you should attempt them before looking at the code solutions. Often, examples will have tabs which provide the problem, a sketch (sometimes), and then solutions in R and Python.
The problem will be in the first tab for you to start with.
A solution will be provided in R, potentially with an explanation.
A solution will be provided in Python as well, with an explanation of that code.
In some cases, the problem will be more open-ended and may not adhere to this format, but most example sections in this book will have solutions provided. I highly recommend that you attempt to solve the problem yourself before you look at the solutions - this is the best way to learn. Passively reading code does not result in information retention.
These sections will direct you to additional resources that may be helpful to consult as you learn about a topic. You do not have to use these sections unless you are 1) bored, or 2) hopelessly lost. They’re provided to help but are not expected reading (unlike the essential reading sections in red).
Advanced
These sections are intended to apply to more advanced courses. If you are taking an introductory course, feel free to skip that content for now.
Expandable Sections
These are expandable sections, with additional information when you click on the line.
This additional information may be information that is helpful but not essential, or it may be that an example just takes a LOT of space and I want to make sure you can skim the book without having to scroll through a ton of output.
Answers or punchlines may be hidden in this type of expandable section as well.
Analytics
I have enabled Google Analytics on this site for the purposes of measuring this work’s impact and use both in my own classes and elsewhere. I’m not using the individual tracking/ad-targeting settings (to the best of my knowledge) - my only purpose in using Google Analytics is to assess how often this site is used, and where its users are located at a rough (state/regional) level.
If you are using this site and aren’t affiliated with the University of Nebraska-Lincoln, or if you have found it useful, please let me know by leaving a comment in Giscus (below) or sending me an email! These affirmations help me make the case that spending time on this resource is actually a good investment.
Acknowledgements
The cover of this book is an amalgam of different images by the lovely @allison_horst, which are released under the CC BY 4.0 license. I have modified them to remove most of the R package references and arranged them to represent the topics covered in this book.
Laptop icon used in the tab/logo created by Good Ware - Flaticon
Throughout this book, I have borrowed liberally from other online tutorials, published books, and blog posts. I have tried to ensure that I link to the source material throughout the book and provide appropriate credit to anyone whose examples I have used, modified, or repurposed. Special thanks to the tutorials provided by Posit/RStudio and the tidyverse project.
I don’t have official editors, but thank you to those who make use of the Giscus comment box to let me know about issues and typos. So far, you’ve helped me fix at least 3 issues!
This book was built with the following parameters/settings/library versions:
library(devtools)
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.5.1 (2025-06-13)
## os Debian GNU/Linux 12 (bookworm)
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/Chicago
## date 2025-09-03
## pandoc 3.4 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
## quarto 1.7.31 @ /usr/local/bin/quarto
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## ! package * version date (UTC) lib source
## P cachem 1.1.0 2024-05-16 [?] CRAN (R 4.5.1)
## P cli 3.6.5 2025-04-23 [?] CRAN (R 4.5.1)
## P devtools * 2.4.5 2022-10-11 [?] CRAN (R 4.5.1)
## P digest 0.6.37 2024-08-19 [?] CRAN (R 4.5.1)
## P ellipsis 0.3.2 2021-04-29 [?] CRAN (R 4.5.1)
## P evaluate 1.0.4 2025-06-18 [?] CRAN (R 4.5.1)
## P fastmap 1.2.0 2024-05-15 [?] CRAN (R 4.5.1)
## P fontawesome * 0.5.3 2024-11-16 [?] CRAN (R 4.5.1)
## P fs 1.6.6 2025-04-12 [?] CRAN (R 4.5.1)
## P glue 1.8.0 2024-09-30 [?] CRAN (R 4.5.1)
## P htmltools 0.5.8.1 2024-04-04 [?] CRAN (R 4.5.1)
## P htmlwidgets 1.6.4 2023-12-06 [?] CRAN (R 4.5.1)
## P httpuv 1.6.16 2025-04-16 [?] CRAN (R 4.5.1)
## P jsonlite 2.0.0 2025-03-27 [?] CRAN (R 4.5.1)
## P knitr 1.50 2025-03-16 [?] CRAN (R 4.5.1)
## P later 1.4.2 2025-04-08 [?] CRAN (R 4.5.1)
## P lattice 0.22-7 2025-04-02 [?] CRAN (R 4.5.1)
## P lifecycle 1.0.4 2023-11-07 [?] CRAN (R 4.5.1)
## P magrittr 2.0.3 2022-03-30 [?] CRAN (R 4.5.1)
## P Matrix 1.7-3 2025-03-11 [?] CRAN (R 4.5.1)
## P memoise 2.0.1 2021-11-26 [?] CRAN (R 4.5.1)
## P mime 0.13 2025-03-17 [?] CRAN (R 4.5.1)
## P miniUI 0.1.2 2025-04-17 [?] CRAN (R 4.5.1)
## P pkgbuild 1.4.8 2025-05-26 [?] CRAN (R 4.5.1)
## P pkgload 1.4.0 2024-06-28 [?] CRAN (R 4.5.1)
## P png 0.1-8 2022-11-29 [?] CRAN (R 4.5.1)
## P profvis 0.4.0 2024-09-20 [?] CRAN (R 4.5.1)
## P promises 1.3.3 2025-05-29 [?] CRAN (R 4.5.1)
## P purrr 1.1.0 2025-07-10 [?] CRAN (R 4.5.1)
## P R6 2.6.1 2025-02-15 [?] CRAN (R 4.5.1)
## P Rcpp 1.1.0 2025-07-02 [?] CRAN (R 4.5.1)
## P remotes 2.5.0 2024-03-17 [?] CRAN (R 4.5.1)
## renv 1.1.5 2025-07-24 [1] CRAN (R 4.5.1)
## P reticulate 1.43.0 2025-07-21 [?] CRAN (R 4.5.1)
## P rlang 1.1.6 2025-04-11 [?] CRAN (R 4.5.1)
## P rmarkdown 2.29 2024-11-04 [?] CRAN (R 4.5.1)
## P rstudioapi 0.17.1 2024-10-22 [?] CRAN (R 4.5.1)
## P sessioninfo 1.2.3 2025-02-05 [?] CRAN (R 4.5.1)
## P shiny 1.11.1 2025-07-03 [?] CRAN (R 4.5.1)
## P urlchecker 1.0.1 2021-11-30 [?] CRAN (R 4.5.1)
## P usethis * 3.1.0 2024-11-26 [?] CRAN (R 4.5.1)
## P vctrs 0.6.5 2023-12-01 [?] CRAN (R 4.5.1)
## P xfun 0.52 2025-04-02 [?] CRAN (R 4.5.1)
## P xtable 1.8-4 2019-04-21 [?] CRAN (R 4.5.1)
## P yaml 2.3.10 2024-07-26 [?] CRAN (R 4.5.1)
##
## [1] /home/susan/Projects/Class/stat-computing-r-python/renv/library/linux-debian-bookworm/R-4.5/x86_64-pc-linux-gnu
## [2] /home/susan/.cache/R/renv/sandbox/linux-debian-bookworm/R-4.5/x86_64-pc-linux-gnu/9a444a72
##
## * ── Packages attached to the search path.
## P ── Loaded and on-disk path mismatch.
##
## ─ Python configuration ───────────────────────────────────────────────────────
## python: /home/susan/.virtualenvs/book/bin/python
## libpython: /usr/lib/python3.11/config-3.11-x86_64-linux-gnu/libpython3.11.so
## pythonhome: /home/susan/.virtualenvs/book:/home/susan/.virtualenvs/book
## version: 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0]
## numpy: /home/susan/.virtualenvs/book/lib/python3.11/site-packages/numpy
## numpy_version: 2.2.6
##
## NOTE: Python version was forced by RETICULATE_PYTHON
##
## ──────────────────────────────────────────────────────────────────────────────
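If you want to compare the Python side of your own setup with the configuration listed above, the following is a minimal sketch using only the standard library. The function name `python_summary` is made up for this illustration, and the virtual-environment check (comparing `sys.prefix` with `sys.base_prefix`) is an assumption about how your environment was created; when Python is embedded in R through reticulate, the reported interpreter path may differ from what you see in a plain terminal.

```python
import sys


def python_summary():
    """Collect basic facts about the running interpreter (standard library only)."""
    return {
        "python": sys.executable,            # path to the interpreter binary
        "version": sys.version.split()[0],   # version number without build details
        "prefix": sys.prefix,                # root directory of the active environment
        # Inside a venv/virtualenv, sys.prefix points at the environment,
        # while sys.base_prefix points at the underlying installation.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


for key, value in python_summary().items():
    print(f"{key} : {value}")
```

If the reported interpreter path or version is not the one you expected, checking which environment is active is usually a good first step.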