R (BGU course)
Chapter 18
Convex Optimization
TODO
18.1
Theoretical Background
18.2
Optimizing with R
18.2.1
The optim Function
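As a minimal sketch of what this section's title refers to, the following uses base R's `optim` to minimize a convex quadratic and compares the result with the closed-form minimizer. The matrix `A`, the vector `b`, and the helper names `f` and `grad_f` are illustrative assumptions, not taken from the course material.

```r
# Convex quadratic: f(x) = 0.5 * x'Ax - b'x, with A symmetric positive definite.
# A, b, f and grad_f are hypothetical names chosen for this illustration.
A <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)
b <- c(1, -1)

f <- function(x) 0.5 * drop(crossprod(x, A %*% x)) - sum(b * x)
grad_f <- function(x) drop(A %*% x) - b   # analytic gradient: Ax - b

# BFGS is a quasi-Newton method; supplying `gr` avoids finite-difference gradients.
fit <- optim(par = c(0, 0), fn = f, gr = grad_f, method = "BFGS")

fit$par          # numerical minimizer
fit$value        # objective value at the minimizer
fit$convergence  # 0 indicates successful convergence

# For this objective the minimizer solves the linear system Ax = b.
x_star <- solve(A, b)
max(abs(fit$par - x_star))   # discrepancy between numerical and exact solution
```

Omitting `gr` makes `optim` approximate the gradient numerically, which is convenient but usually slower and less accurate.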
18.2.2
The nloptr Package
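A hedged sketch of calling `nloptr::nloptr` on a small box-constrained convex problem is given below. It assumes the `nloptr` package is installed; the objective `obj`, its gradient `obj_grad`, the bounds, and the choice of algorithm are all illustrative assumptions.

```r
# Runs only if the nloptr package is available (assumed installed by the reader).
if (requireNamespace("nloptr", quietly = TRUE)) {
  # Hypothetical objective: squared distance to the point (1, -2).
  obj      <- function(x) (x[1] - 1)^2 + (x[2] + 2)^2
  obj_grad <- function(x) c(2 * (x[1] - 1), 2 * (x[2] + 2))

  res <- nloptr::nloptr(
    x0          = c(0.5, 0.5),   # starting point inside the box
    eval_f      = obj,
    eval_grad_f = obj_grad,
    lb          = c(0, 0),       # lower bounds
    ub          = c(5, 5),       # upper bounds
    opts        = list(algorithm = "NLOPT_LD_LBFGS", xtol_rel = 1e-8)
  )

  # The objective is separable, so the constrained minimizer is the
  # unconstrained one, (1, -2), clipped to the box: (1, 0).
  res$solution
  res$objective
  res$status      # positive values indicate successful termination
}
```

The `algorithm` string selects the NLopt solver; names with `LD` are local and gradient-based, while names with `LN` are local and derivative-free.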
18.2.3
minqa Package
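The `minqa` package provides derivative-free optimizers based on quadratic approximations, such as `bobyqa`. The sketch below assumes `minqa` is installed; the objective `obj` and the bounds are illustrative assumptions rather than an example from the course.

```r
# Runs only if the minqa package is available (assumed installed by the reader).
if (requireNamespace("minqa", quietly = TRUE)) {
  # Hypothetical objective: squared distance to the point (1, -2).
  obj <- function(x) (x[1] - 1)^2 + (x[2] + 2)^2

  # bobyqa needs only function values (no gradient) and supports box constraints.
  fit_bq <- minqa::bobyqa(
    par   = c(0.5, 0.5),   # starting point inside the box
    fn    = obj,
    lower = c(0, 0),
    upper = c(5, 5)
  )

  fit_bq$par    # estimated minimizer; the exact constrained minimizer is (1, 0)
  fit_bq$fval   # objective value at the estimate
  fit_bq$ierr   # 0 indicates normal termination
}
```

Derivative-free methods of this kind are useful when a gradient is unavailable or unreliable, at the cost of more function evaluations.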
18.3
Bibliographic Notes
Task views
18.4
Practice Yourself