
Basics

R is a powerful language used for data analysis, statistical computing, and graphical representation. This guide will introduce the basic syntax, data structures, control structures, and functions in R, providing a solid foundation for beginners.

Basic Syntax

Elements of a Program

  • Data Structures: Information carriers (vectors, matrices, data frames, lists)
  • Algorithms: Steps to complete tasks

Data Structures

Basic Data Types

  1. Numeric: Real numbers, stored as double-precision floating point by default (integers are a separate storage type, written with an L suffix such as 3L)
  2. Character: Represents textual information
  3. Logical: Boolean values (TRUE, FALSE)

Vectors

A vector is a sequence of elements of the same type. A scalar is a vector of length 1.

Creating Numeric Vectors

c(1.70, 1.72, 1.80, 1.66, 1.65, 1.88)

Checking Data Types

typeof(3.14)
class(3.14)
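
As a quick check of the basic types listed above, the following sketch calls typeof() and class() on a few literals. The L suffix for integer literals is not used elsewhere in this guide and is included here only as an aside.

```r
# Numeric literals are stored as double unless marked with the L suffix
typeof(3.14)     # "double"
class(3.14)      # "numeric"
typeof(3L)       # "integer"
is.numeric(3L)   # TRUE: integers also count as numeric

# Character and logical values
typeof("R")      # "character"
typeof(TRUE)     # "logical"
```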

Operations on Vectors

heights <- c(1.70, 1.72, 1.80, 1.66, 1.65, 1.88)
mean(heights)
sd(heights)
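
Beyond mean() and sd(), most arithmetic in R is applied element by element. A minimal sketch using the same heights vector (the names heights_cm and centered are made up for illustration):

```r
heights <- c(1.70, 1.72, 1.80, 1.66, 1.65, 1.88)

# Arithmetic is applied to every element
heights_cm <- heights * 100

# Summary functions reduce the vector to one or two values
length(heights)   # 6
max(heights)      # 1.88
range(heights)    # 1.65 1.88

# Deviation of each height from the mean
centered <- heights - mean(heights)
```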

Character Vectors

c("Male", "Female", "Female", "Male")

Factors

Factors are used to represent categorical data with levels.

sex <- factor(c("Male", "Female", "Female", "Male"))
levels(sex)
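
To see what a factor stores internally, the sketch below inspects the integer codes and shows how an explicit levels argument changes the level order. The variable sex2 is introduced only for this illustration.

```r
sex <- factor(c("Male", "Female", "Female", "Male"))

levels(sex)       # "Female" "Male" (sorted alphabetically by default)
as.integer(sex)   # 2 1 1 2: position of each value in levels(sex)
table(sex)        # counts per level

# An explicit levels argument controls the order
sex2 <- factor(c("Male", "Female", "Female", "Male"),
               levels = c("Male", "Female"))
levels(sex2)      # "Male" "Female"
```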

Logical Vectors

heights > 1.7
heights[heights > 1.7]
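
A comparison returns one logical value per element, and that logical vector can be counted, located, or combined with other conditions. A short sketch (the name tall and the 1.85 cutoff are chosen here for illustration):

```r
heights <- c(1.70, 1.72, 1.80, 1.66, 1.65, 1.88)

tall <- heights > 1.7   # FALSE TRUE TRUE FALSE FALSE TRUE
which(tall)             # 2 3 6: positions of the TRUE values
sum(tall)               # 3: TRUE counts as 1
mean(tall)              # 0.5: proportion of TRUE values

# Combine conditions element-wise with & (and) or | (or)
heights[heights > 1.7 & heights < 1.85]   # 1.72 1.80
```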

Arrays and Matrices

A matrix is a two-dimensional array; arrays generalize this to three or more dimensions. All elements must be of the same type.

matrix(1:12, nrow = 4, ncol = 3)
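
The following sketch shows how the matrix above is filled and indexed, and builds a small three-dimensional array with array(). The names m, m_row, and a are illustrative.

```r
m <- matrix(1:12, nrow = 4, ncol = 3)   # filled column by column

dim(m)      # 4 3
m[2, 3]     # 10: row 2, column 3
m[, 1]      # 1 2 3 4: the whole first column

# byrow = TRUE fills row by row instead
m_row <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE)
m_row[1, ]  # 1 2 3

# A three-dimensional array: 2 rows, 3 columns, 2 layers
a <- array(1:12, dim = c(2, 3, 2))
a[1, 2, 2]  # 9
```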

Data Frames

A data frame is a table in which each column is a vector. Columns can have different types, but all must have the same length.

df <- data.frame(
  sex = c("F", "M", "M", "F"),
  age = c(17, 29, 20, 33),
  heights = c(1.66, 1.84, 1.83, 1.56)
)
str(df)
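
Typical first steps with a data frame are selecting a column, filtering rows, and adding a derived column. A minimal sketch on the same data (the column name height_cm and the variable adults are made up here):

```r
df <- data.frame(
  sex = c("F", "M", "M", "F"),
  age = c(17, 29, 20, 33),
  heights = c(1.66, 1.84, 1.83, 1.56)
)

df$age                             # one column as a vector: 17 29 20 33
adults <- df[df$age >= 18, ]       # rows where age is at least 18
df$height_cm <- df$heights * 100   # add a derived column
nrow(df)                           # 4
ncol(df)                           # 4 after adding height_cm
```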

Lists

Lists can hold elements of different types and, unlike data frames, of different lengths.

l <- list(
  sex = c("F", "M"),
  age = c(17, 29, 20),
  heights = c(1.66, 1.84, 1.83, 1.56)
)
l$sex
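
List elements can be extracted in several ways, and the difference between [[ ]] and [ ] is a common source of confusion. A short sketch using the same list:

```r
l <- list(
  sex = c("F", "M"),
  age = c(17, 29, 20),
  heights = c(1.66, 1.84, 1.83, 1.56)
)

l[["age"]]     # the element itself, same as l$age
l["age"]       # a list of length 1 that contains the element
names(l)       # "sex" "age" "heights"
lengths(l)     # 2 3 4: elements can differ in length
```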

Control Structures

Conditional Statements

If-else

age <- 16
if (age >= 18) {
  message("You are an adult!")
} else {
  message("You are a child!")
}
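
The condition in if () must be a single TRUE or FALSE. For whole vectors, base R provides ifelse(), and several branches can be chained with else if. The sketch below is illustrative; the age groups and cutoffs are made up.

```r
ages <- c(16, 18, 35)

# ifelse() evaluates the condition for every element
ifelse(ages >= 18, "adult", "child")   # "child" "adult" "adult"

# Several branches can be chained with else if
age <- 70
if (age < 18) {
  group <- "child"
} else if (age < 65) {
  group <- "adult"
} else {
  group <- "senior"
}
group   # "senior"
```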

Switch

ch <- "b"
switch(EXPR = ch, a = 1, b = 2:3)
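
The call above returns 2:3 because ch matches the name b. When nothing matches, switch() falls back to an unnamed final argument if one is given. A small sketch (describe is a hypothetical helper written for this example):

```r
# An unnamed final argument acts as the default
describe <- function(ch) {
  switch(ch,
    a = "first",
    b = "second",
    "unknown"
  )
}

describe("a")   # "first"
describe("z")   # "unknown"
```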

Looping Structures

For Loop

for (i in 1:10) {
  print(i)
}
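
Loops over the positions of a vector are commonly written with seq_along(), and next and break control the flow inside the loop body. A minimal sketch (the variable total is illustrative):

```r
heights <- c(1.70, 1.72, 1.80)

# seq_along() yields 1, 2, ..., length(heights) and is safe for empty vectors
for (i in seq_along(heights)) {
  message("Element ", i, " is ", heights[i])
}

# next skips an iteration; break leaves the loop
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next   # skip even numbers
  if (i > 7) break        # stop once i exceeds 7
  total <- total + i
}
total   # 16 (1 + 3 + 5 + 7)
```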

While Loop

v <- 10
while (v > 2) {
  print(v)
  v <- v - 1.1
}

Repeat Loop

i <- 1
repeat {
  print(i)
  i <- i * 2
  if (i > 100) break
}

Functions and Functional Programming

Creating Functions

customMean <- function(x) {
  # s accumulates the sum, i counts the elements
  s <- i <- 0
  for (j in x) {
    s <- s + j
    i <- i + 1
  }
  return(s / i)
}
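
A quick way to check a hand-written function is to compare it with the built-in equivalent. The sketch below repeats the definition so that it runs on its own:

```r
customMean <- function(x) {
  s <- i <- 0
  for (j in x) {
    s <- s + j
    i <- i + 1
  }
  return(s / i)
}

heights <- c(1.70, 1.72, 1.80, 1.66, 1.65, 1.88)
all.equal(customMean(heights), mean(heights))   # TRUE
customMean(1:100)                               # 50.5

# With an empty vector the loop never runs, so the result is 0 / 0
customMean(numeric(0))                          # NaN
```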

Scope

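Variables created inside a function are local to it. A name that is not defined inside the function, such as a below, is looked up in the environment where the function was defined (lexical scoping), so Sum(10) returns 13.

a <- 3
Sum <- function(b) {
  a + b
}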
Sum(10)
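
The other side of scoping is that an assignment inside a function does not change a variable of the same name outside it. A minimal sketch (shadow is a hypothetical function name):

```r
a <- 3

shadow <- function(b) {
  a <- 100   # creates a local a; the global a is untouched
  a + b
}

shadow(10)   # 110
a            # still 3
```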

Passing Functions

Functions can be passed as arguments.

f <- function(x, fun) {
  fun(x)
}
f(1:10, mean)
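
Any function can be passed this way, including an anonymous function written inline. Base R's apply family, for example sapply(), follows the same pattern. A short sketch:

```r
f <- function(x, fun) {
  fun(x)
}

f(1:10, sum)                           # 55
f(1:10, function(v) max(v) - min(v))   # 9

# sapply() applies a function to each element and simplifies the result
sapply(1:3, function(k) k^2)           # 1 4 9
```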

Packages

Installing Packages

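install.packages("ggplot2")                    # from CRAN
BiocManager::install("maftools")               # from Bioconductor; requires the BiocManager package
remotes::install_github("tidyverse/ggplot2")   # from GitHub; requires the remotes package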

Loading Packages

library(ggplot2)
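
library() stops with an error when the package is not installed. One common pattern, sketched here with base R's requireNamespace(), is to check for the package first:

```r
# requireNamespace() returns TRUE or FALSE without attaching the package
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
} else {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\")")
}
```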

Practical Example: ROC Curve Calculation and Plotting

Background and Goal

Implement a function that computes the ROC curve and the AUC for a binary classifier.

Implementation

Define a function to calculate true positive rate, false positive rate, and plot the ROC curve.
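
The guide does not include the implementation itself, so the following is only one possible sketch in base R. The function names compute_roc and compute_auc, the input format (numeric scores plus 0/1 labels), and the example data are all assumptions made for illustration.

```r
# scores: numeric predictions (higher means more likely positive)
# labels: 0/1 vector of true classes (1 = positive)
compute_roc <- function(scores, labels) {
  # Candidate thresholds: every distinct score, highest first
  thresholds <- sort(unique(scores), decreasing = TRUE)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)

  # Rates when predicting positive for scores >= threshold
  tpr <- sapply(thresholds, function(th) sum(scores >= th & labels == 1) / n_pos)
  fpr <- sapply(thresholds, function(th) sum(scores >= th & labels == 0) / n_neg)

  # Start the curve at the origin (nothing predicted positive)
  data.frame(fpr = c(0, fpr), tpr = c(0, tpr))
}

# Area under the curve by the trapezoidal rule
compute_auc <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Made-up example data
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(1, 1, 0, 1, 0, 1, 0, 0)

roc <- compute_roc(scores, labels)
auc <- compute_auc(roc)
auc   # 0.8125 for this example

# Plot the curve with the diagonal of a random classifier for reference
plot(roc$fpr, roc$tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate",
     main = "ROC curve")
abline(0, 1, lty = 2)
```

The sketch uses vectors, a data frame, functions passed to sapply(), and base graphics, all of which appear earlier in this guide. It assumes both classes are present in labels; otherwise the rates would involve a division by zero.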

Explanation

Discuss the steps, code logic, and how it integrates various R concepts.

Common Problems and Solutions

Complex Numbers

Complex numbers are written by appending i to the imaginary part.

1 + 2i
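
Base R provides functions for taking a complex number apart. The sketch below also shows that sqrt() only returns a complex result when its input is already complex:

```r
z <- 1 + 2i

typeof(z)   # "complex"
Re(z)       # 1: real part
Im(z)       # 2: imaginary part
Mod(z)      # sqrt(5), about 2.236
Conj(z)     # 1-2i

# sqrt(-1) gives NaN with a warning; convert to complex first
sqrt(as.complex(-1))   # 0+1i
```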

Differences between = and <-

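At the top level, = and <- both assign, and <- is the preferred style. Inside a function call they differ: = binds a value to a named argument, while <- creates or overwrites a variable in the calling environment and then passes the value. In the example below, x holds 1:100 after the call because <- was used.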

x <- NULL
customMean(x <- 1:100)
x
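
To make the contrast explicit, the sketch below calls the same function once with = and once with <-. The helper first and the argument name val are hypothetical.

```r
first <- function(val) val[1]

# = inside a call only names the argument
first(val = 5:7)                  # 5
exists("val", inherits = FALSE)   # FALSE if val was not defined before

# <- inside a call also assigns val in the calling environment
first(val <- 5:7)                 # 5
val                               # 5 6 7
```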

Using :: and :::

:: accesses exported functions, while ::: accesses internal functions of a package.

xfun:::base_pkgs()
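
The :: form works without calling library() first, which is handy for one-off calls and for resolving name clashes between packages. A small sketch using packages that ship with R:

```r
# Call an exported function without attaching its package
stats::sd(c(1, 2, 3))    # 1
base::mean(c(1, 2, 3))   # 2

# getNamespaceExports() lists what a package exports
"sd" %in% getNamespaceExports("stats")   # TRUE
```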

Factor Reordering

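A factor only accepts values that are among its levels. To append values that introduce a new level, convert the factor to character, append the values, and rebuild the factor.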

sex <- factor(c(as.character(sex), "M", "M"))
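
The sketch below shows what goes wrong without the conversion, and an alternative that extends the levels first. The variables bad, sex1, sex2 and the level "Other" are made up for illustration.

```r
sex <- factor(c("Male", "Female"))

# Assigning a value that is not an existing level produces NA
# (R also warns "invalid factor level"; suppressed here)
bad <- sex
suppressWarnings(bad[3] <- "Other")
is.na(bad[3])            # TRUE

# Option 1: rebuild the factor from character values
sex1 <- factor(c(as.character(sex), "Other"))
levels(sex1)             # "Female" "Male" "Other"

# Option 2: extend the levels first, then assign
sex2 <- sex
levels(sex2) <- c(levels(sex2), "Other")
sex2[3] <- "Other"
as.character(sex2)       # "Male" "Female" "Other"
```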