R Programming Nuts and Bolts — Comprehensive Tutorial

Author

CEU, AIIMS Bhopal

Published

November 1, 2025

Introduction

Welcome to R Programming Nuts and Bolts — a foundational tutorial designed for beginners and classroom teaching.
This guide covers the basics of R programming in a structured, example-driven format with visible outputs for immediate understanding.

title: “r-nut_bolt” author: “cfm” date: ‘2022-10-06’ output: html_document: df_print: paged

What we will be looking at:

History of R programming

Atomic data types

Basic arithmetic operations

Subsetting R objects

Explicit coercion in R

Missing (NA) values

knitr::opts_chunk$set(echo = TRUE)

R

A dialect of the S language.

S began in 1976 as an internal statistical analysis environment at Bell Labs.

1988: S was rewritten in C.

2004 and 2008: the commercial S-PLUS years (Insightful bought the S language in 2004 and was acquired by TIBCO in 2008).

R is an implementation of the S language.

The S philosophy: an interactive environment where programming is not required at first, so you can still do analysis. As your needs become clearer and more sophisticated, you can move gradually into programming, where the language and system aspects become more important.

Ross Ihaka and Robert Gentleman created R in 1991 in New Zealand.

The R system has two conceptual parts: the base R system, which you download from CRAN (the Comprehensive R Archive Network), and everything else.

Packages in R are collections of R functions, compiled code, and sample data. Installed packages are stored in a directory called “library” within the R environment.

The base package contains the low-level, fundamental functions needed to run the R system.

Other packages included in the base system (for example utils, stats, datasets, and graphics) are fundamental packages that nearly everyone uses.

On top of these there are 18,000+ add-on packages that make your life easier.
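As a rough illustration of the points above, the following sketch lists the packages attached to a session and the library directories where packages are stored. It uses only functions that ship with base R. Most of the output depends on your installation, so only the last result is shown.

```r
# Packages currently attached to the session (base is always among them)
attached <- search()
attached

# Directories (the "library" folders) where installed packages are stored
.libPaths()

# Attach a package that ships with R, then use one of its functions
library(stats)
median(c(3, 1, 2))
# [1] 2
```

Add-on packages from CRAN are installed once with install.packages() and then attached in each session with library().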

R console inputs and evaluation

The assignment operator is <- (a less-than symbol followed by a hyphen).

x <- 1         # assign 1 to x (an expression)
print(x)       # print the value of x
[1] 1
y <- 2
x + y          # add the values of x and y
[1] 3
# In the output "[1] 3", the [1] is the index of the first element shown on that line
SHAKTI<-x+y
z <- SHAKTI
z
[1] 3
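The outline lists basic arithmetic operations, so here is a minimal sketch of the arithmetic operators in base R. The values 7 and 2 are arbitrary and chosen only for illustration.

```r
x <- 7
y <- 2

x + y     # addition:          9
x - y     # subtraction:       5
x * y     # multiplication:    14
x / y     # division:          3.5
x ^ y     # exponentiation:    49
x %% y    # remainder (modulo): 1
x %/% y   # integer division:  3
```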

Objects: everything we encounter in R is an object.

R has five basic atomic classes of objects:

character, numeric (real numbers), integer, logical, and complex

x<-"welcome to workshop"
x
[1] "welcome to workshop"
class(x)
[1] "character"
y<-c("a","b","c","d")
class(y)
[1] "character"
3.5->x
y<-2.5
x-y->z
z
[1] 1
class(z)
[1] "numeric"
z*2->z1
z1
[1] 2
class(z1)
[1] "numeric"
## An integer can be written explicitly with the capital L suffix
2L+3L->p
## integer: a whole number without a fractional part
class(p)
[1] "integer"
2.3L+3.5L->p   # 2.3L is not a whole number, so R warns and falls back to numeric
p
[1] 5.8
class(p)
[1] "numeric"
## Logical 
a<-c(2,3,NA ,6,1,NA)
is.na(a)->b
b
[1] FALSE FALSE  TRUE FALSE FALSE  TRUE
class(b)
[1] "logical"
a<-c(2,3,5,6,7,3)
length(a)
[1] 6
b<- a>4
b
[1] FALSE FALSE  TRUE  TRUE  TRUE FALSE
class(b)
[1] "logical"
## Complex: a complex value in R is written using the imaginary unit i.

z <- 1 + 2i    # create a complex number
z              # print the value of z
[1] 1+2i
class(z)       # print the class name of z
[1] "complex"
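To complement the class() calls above, this short sketch checks atomic types with typeof() and the is.*() family, all from base R. The variable names are made up for illustration.

```r
x_dbl <- 3.5          # numeric, stored internally as a double
x_int <- 2L           # integer, thanks to the L suffix
x_chr <- "welcome"    # character
x_lgl <- TRUE         # logical
x_cpl <- 1 + 2i       # complex

class(x_dbl)          # "numeric"
typeof(x_dbl)         # "double"
class(x_int)          # "integer"

is.integer(5)         # FALSE: without the L suffix, 5 is stored as a double
is.integer(5L)        # TRUE
is.numeric(x_int)     # TRUE: integers count as numeric
is.character(x_chr)   # TRUE
is.logical(x_lgl)     # TRUE
is.complex(x_cpl)     # TRUE
```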

Data Types

Vector

A vector holds multiple elements of a single type.

v1<-c("ram","shyam","rahul","radha")
class(v1)
[1] "character"
v2<-c(1:49)
class(v2)
[1] "integer"
v3<-c(v2<10)
v3
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE
class(v3)
[1] "logical"
### What if I mix different atomic classes of objects?
## Implicit coercion: R creates a vector whose mode can accommodate all the elements it contains. Converting between storage modes is called “coercion”; when R does this automatically, based on the vector's contents, it is called “implicit coercion”. Every element of the vector ends up with the same class.
o4<-c(v1,v2,v3)
class(o4)
[1] "character"
str(o4)
 chr [1:102] "ram" "shyam" "rahul" "radha" "1" "2" "3" "4" "5" "6" "7" "8" ...
## Create a vector that mixes two different types of objects (numbers and strings)
o5<-c(1,2,3,"ram","shyam")
sir<-c(1,2,3,8.9,9.23)
o6<-c(o5,sir)
length(o6)
[1] 10
## explicit coercion
a<-3
a<-as.character(a)
a
[1] "3"
class(a)
[1] "character"
b<-as.logical(sir<5)
b
[1]  TRUE  TRUE  TRUE FALSE FALSE
class(b)
[1] "logical"
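The coercion rules described above can be sketched as follows. Implicit coercion moves toward the most flexible type (logical, then integer, then numeric, then complex, then character), and explicit coercion yields NA when a value cannot be converted. The vector names are illustrative, and suppressWarnings() is used only to keep the output clean.

```r
# Implicit coercion: the result takes the most flexible type present
class(c(TRUE, 2L))        # "integer"
class(c(2L, 3.5))         # "numeric"
class(c(3.5, 1 + 2i))     # "complex"
class(c(3.5, "ram"))      # "character"

# Explicit coercion: strings that are not numbers become NA (with a warning)
mixed <- c("1", "2", "3", "ram", "shyam")
nums <- suppressWarnings(as.numeric(mixed))
nums                      # 1 2 3 NA NA
sum(nums, na.rm = TRUE)   # 6
```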

Lists: combine vectors of different types

mylist<-list(v1,v2,v3,o4,o5)
str(mylist)
List of 5
 $ : chr [1:4] "ram" "shyam" "rahul" "radha"
 $ : int [1:49] 1 2 3 4 5 6 7 8 9 10 ...
 $ : logi [1:49] TRUE TRUE TRUE TRUE TRUE TRUE ...
 $ : chr [1:102] "ram" "shyam" "rahul" "radha" ...
 $ : chr [1:5] "1" "2" "3" "ram" ...
mylist
[[1]]
[1] "ram"   "shyam" "rahul" "radha"

[[2]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49

[[3]]
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE

[[4]]
  [1] "ram"   "shyam" "rahul" "radha" "1"     "2"     "3"     "4"     "5"    
 [10] "6"     "7"     "8"     "9"     "10"    "11"    "12"    "13"    "14"   
 [19] "15"    "16"    "17"    "18"    "19"    "20"    "21"    "22"    "23"   
 [28] "24"    "25"    "26"    "27"    "28"    "29"    "30"    "31"    "32"   
 [37] "33"    "34"    "35"    "36"    "37"    "38"    "39"    "40"    "41"   
 [46] "42"    "43"    "44"    "45"    "46"    "47"    "48"    "49"    "TRUE" 
 [55] "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "FALSE"
 [64] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
 [73] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
 [82] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
 [91] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
[100] "FALSE" "FALSE" "FALSE"

[[5]]
[1] "1"     "2"     "3"     "ram"   "shyam"
mylist[[1]]
[1] "ram"   "shyam" "rahul" "radha"
mylist[4]
[[1]]
  [1] "ram"   "shyam" "rahul" "radha" "1"     "2"     "3"     "4"     "5"    
 [10] "6"     "7"     "8"     "9"     "10"    "11"    "12"    "13"    "14"   
 [19] "15"    "16"    "17"    "18"    "19"    "20"    "21"    "22"    "23"   
 [28] "24"    "25"    "26"    "27"    "28"    "29"    "30"    "31"    "32"   
 [37] "33"    "34"    "35"    "36"    "37"    "38"    "39"    "40"    "41"   
 [46] "42"    "43"    "44"    "45"    "46"    "47"    "48"    "49"    "TRUE" 
 [55] "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "TRUE"  "FALSE"
 [64] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
 [73] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
 [82] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
 [91] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "FALSE"
[100] "FALSE" "FALSE" "FALSE"
## Compare [ ] (returns a sub-list) with [[ ]] (returns the element itself)
mylist[1]->l1
l1
[[1]]
[1] "ram"   "shyam" "rahul" "radha"
class(l1)
[1] "list"
str(l1)
List of 1
 $ : chr [1:4] "ram" "shyam" "rahul" "radha"
length(l1)
[1] 1
l1[[1]]->l2
l2
[1] "ram"   "shyam" "rahul" "radha"
class(l2)
[1] "character"
str(l2)
 chr [1:4] "ram" "shyam" "rahul" "radha"
length(l2)
[1] 4
another_list<-list(v1=1:10,v2=TRUE,v3=c("a","b","c"))
str(another_list)
List of 3
 $ v1: int [1:10] 1 2 3 4 5 6 7 8 9 10
 $ v2: logi TRUE
 $ v3: chr [1:3] "a" "b" "c"
another_list[1]
$v1
 [1]  1  2  3  4  5  6  7  8  9 10
another_list$v1  ### the $ (dollar) operator extracts an element by name
 [1]  1  2  3  4  5  6  7  8  9 10
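As a compact recap of the three access operators, here is a small sketch on a named list. The list patient and its element names are hypothetical and exist only for this example.

```r
# A hypothetical named list with elements of different types
patient <- list(id = 1:3, smoker = TRUE, labs = c("hba1c", "fbs"))

class(patient["labs"])      # "list": single brackets keep the list wrapper
class(patient[["labs"]])    # "character": double brackets return the element
patient[["labs"]][2]        # "fbs"
patient$labs[2]             # "fbs": $ is shorthand for [[ ]] by name

# Adding a new element by name
patient$age <- c(32, 29, 23)
names(patient)              # "id" "smoker" "labs" "age"
length(patient)             # 4
```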

Matrix: a vector with a dimension attribute

The dimension attribute is an integer vector of length 2 (nrow, ncol).

## Matrices are filled column-wise by default
mymatrix<-matrix(1:6,nrow = 2,ncol = 3)
mymatrix
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
dim(mymatrix)
[1] 2 3
length(mymatrix)
[1] 6
mymatrix*2
     [,1] [,2] [,3]
[1,]    2    6   10
[2,]    4    8   12
mymatrix+2
     [,1] [,2] [,3]
[1,]    3    5    7
[2,]    4    6    8
attributes(mymatrix)
$dim
[1] 2 3
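The following sketch extends the matrix example with three related ideas: filling by row with byrow = TRUE, creating a matrix by setting dim() on a vector, and matrix multiplication with %*% (as opposed to the element-wise * shown above). The object names are illustrative.

```r
by_col <- matrix(1:6, nrow = 2, ncol = 3)                 # default: column-wise
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # fill row by row
by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# A matrix is just a vector with a dim attribute
v <- 1:6
dim(v) <- c(2, 3)
all(v == by_col)        # TRUE

# Transpose and matrix multiplication
dim(t(by_col))          # 3 2
by_col %*% t(by_col)
#      [,1] [,2]
# [1,]   35   44
# [2,]   44   56
```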

Data frame: a special type of list in which every element (column) must have the same length

name<-c("A","B","C","D","H")
age<-c(32,29,23,25,21)
sex<-c("M","M","F","F","M")
hba1c<-c(6.4,7.8,9.1,6.4,10)
df<-data.frame(name,age,sex,hba1c)
attributes(df)
$names
[1] "name"  "age"   "sex"   "hba1c"

$class
[1] "data.frame"

$row.names
[1] 1 2 3 4 5
dim(df)
[1] 5 4
class(df)
[1] "data.frame"
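To make the "special type of list" idea concrete, this sketch treats a small data frame as a list of columns. The data frame df_small is a shortened, illustrative version of the one above.

```r
df_small <- data.frame(name = c("A", "B", "C"), age = c(32, 29, 23))

is.list(df_small)       # TRUE: a data frame is a list of columns
length(df_small)        # 2: the number of columns, not rows
sapply(df_small, class) # class of each column

# List-style access works on columns
identical(df_small$age, df_small[["age"]])   # TRUE

# Columns of unequal length are rejected
result <- try(data.frame(x = 1:3, y = 1:2), silent = TRUE)
inherits(result, "try-error")                # TRUE
```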

We now have an understanding of vectors, lists, matrices, and data frames.

It is time to learn about factors.

Factor: an integer vector with a levels attribute (each integer has a label)

f1<-c(1,0,1,0,0,1)
f2<-as.factor(f1)
levels(f2)
[1] "0" "1"
attributes(f2)
$levels
[1] "0" "1"

$class
[1] "factor"
f2<-c("mild","mild","moderate","moderate","severe","mild","moderate")
f2<-as.factor(f2)
attributes(f2)
$levels
[1] "mild"     "moderate" "severe"  

$class
[1] "factor"
unclass(f2)
[1] 1 1 2 2 3 1 2
attr(,"levels")
[1] "mild"     "moderate" "severe"  
## The levels argument of factor() lets you set the order of the levels explicitly
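A minimal sketch of setting the level order explicitly with factor(). By default the levels are sorted alphabetically; the reversed order chosen here is arbitrary and only shows how the integer codes follow the levels.

```r
severity <- c("mild", "mild", "moderate", "moderate", "severe", "mild", "moderate")

f_default <- factor(severity)
levels(f_default)          # "mild" "moderate" "severe" (alphabetical)

# Explicit order: the first level listed gets integer code 1
f_custom <- factor(severity, levels = c("severe", "moderate", "mild"))
levels(f_custom)           # "severe" "moderate" "mild"
as.integer(f_custom)       # 3 3 2 2 1 3 2

# Counts are reported in level order
table(f_custom)
```

Passing ordered = TRUE to factor() additionally makes comparisons such as < meaningful between levels.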

Special values: NaN, Inf, NA, and NULL

# 0/0 evaluates to NaN
# 1/0 evaluates to Inf
sv1<-c(1,0,1,0,3,1)
sv2<-c(1,3,1,2,NA,1)
s<-sv1+sv2
s
[1]  2  3  2  2 NA  2
# Note: this works because sv1 and sv2 are numeric. Arithmetic on factors is not meaningful; it returns NA with the warning: ‘+’ not meaningful for factors
# Q: What is the difference between NA and NaN?
# NA = Not Available (can occur in character, numeric, or logical vectors)
# NaN = Not a Number (occurs only in numeric vectors)
# is.na() is TRUE for both NA and NaN, but is.nan() is TRUE only for NaN
f3<- c(1,2,3,NA,4,5,NA)
is.na(f3)
[1] FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE
is.nan(f3)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
f4 <- c(5, 9, NaN, 3, 8, NA, NaN) 

is.na(f4)
[1] FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE
is.nan(f4)
[1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
# NaN ("Not a Number") marks the result of a numeric calculation that is undefined, such as 0/0 or Inf - Inf.

# NULL represents an empty object.
a<-c()
length(a)
[1] 0
dim(a)
NULL
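The special values above can be tried directly at the console. The sketch below also shows the common na.rm = TRUE argument for skipping NA in summaries; the vector sv2 repeats the values used earlier.

```r
0 / 0                            # NaN
1 / 0                            # Inf
-1 / 0                           # -Inf
Inf - Inf                        # NaN
is.finite(c(1, Inf, NA, NaN))    # TRUE FALSE FALSE FALSE

# NA propagates through calculations unless it is removed
sv2 <- c(1, 3, 1, 2, NA, 1)
mean(sv2)                        # NA
mean(sv2, na.rm = TRUE)          # 1.6
sum(is.na(sv2))                  # 1 missing value

# NULL is the empty object
is.null(NULL)                    # TRUE
length(NULL)                     # 0
```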

Attributes can be thought of as the nameplates and identifiers of R objects.

Objects can have attributes. Attributes are part of the object. These include:

names

dimnames

dim

class

other user-defined attributes (metadata)

a<-c(2,3,5,6,7,3)
class(a)
[1] "numeric"
length(a)
[1] 6
dim(a)
NULL
attributes(df)
$names
[1] "name"  "age"   "sex"   "hba1c"

$class
[1] "data.frame"

$row.names
[1] 1 2 3 4 5
attributes(mymatrix)
$dim
[1] 2 3
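Attributes can also be set, not just inspected. The sketch below assigns names, dimnames, and a user-defined attribute; the attribute name "units" and its value are hypothetical and carry no special meaning in R.

```r
# names: label the elements of a vector
a <- c(2, 3, 5)
names(a) <- c("first", "second", "third")
a["second"]
# second 
#      3 

# A user-defined attribute (metadata)
attr(a, "units") <- "mmol/L"
names(attributes(a))        # "names" "units"

# dimnames: label the rows and columns of a matrix
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
m["r2", "c3"]               # 6
```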

Subsetting in R

name<-c("A","B","C","D","H")
age<-c(32,36,29,25,21)
sex<-c("M","M","F","F","M")
hba1c<-c(6.4,7.8,9.1,6.4,10)
df<-data.frame(name,age,sex,hba1c)
attributes(df)
$names
[1] "name"  "age"   "sex"   "hba1c"

$class
[1] "data.frame"

$row.names
[1] 1 2 3 4 5
dim(df)
[1] 5 4
class(df)
[1] "data.frame"
df
  name age sex hba1c
1    A  32   M   6.4
2    B  36   M   7.8
3    C  29   F   9.1
4    D  25   F   6.4
5    H  21   M  10.0
## I want the rows where hba1c > 6.5.
df6.5<-df[df$hba1c>6.5,]   ## by logical indexing
df6.5
  name age sex hba1c
2    B  36   M   7.8
3    C  29   F   9.1
5    H  21   M  10.0
df6.5<-subset(df,hba1c>6.5)   # with the base R subset() function
df6.5
  name age sex hba1c
2    B  36   M   7.8
3    C  29   F   9.1
5    H  21   M  10.0
library(dplyr) # by dplyr

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
df %>% select(1:4) %>% filter(hba1c>6.5)->df6.5
df6.5
  name age sex hba1c
1    B  36   M   7.8
2    C  29   F   9.1
3    H  21   M  10.0
df6.5<-df[which(df$hba1c>6.5),] ## using which 
df6.5
  name age sex hba1c
2    B  36   M   7.8
3    C  29   F   9.1
5    H  21   M  10.0
## To extract only the age of participants with hba1c > 6.5
df$age[which(df$hba1c>6.5)]
[1] 36 29 21
## To extract only the sex of participants with hba1c > 6.5
df$sex[which(df$hba1c>6.5)]
[1] "M" "F" "M"
## To extract the 5th row
df[5,]
  name age sex hba1c
5    H  21   M    10
## To extract the 3rd column
df[,3]
[1] "M" "M" "F" "F" "M"
## To extract the value in the 5th row and 4th column
df[5,4]
[1] 10
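Two further subsetting patterns, sketched under the same data: combining conditions with & while selecting columns by name, and the reason which() can be safer when the condition contains NA. The vector hb is made up for the NA illustration.

```r
name  <- c("A", "B", "C", "D", "H")
age   <- c(32, 36, 29, 25, 21)
sex   <- c("M", "M", "F", "F", "M")
hba1c <- c(6.4, 7.8, 9.1, 6.4, 10)
df <- data.frame(name, age, sex, hba1c)

# Rows meeting two conditions, keeping only selected columns
men_high <- df[df$hba1c > 6.5 & df$sex == "M", c("name", "hba1c")]
men_high
#   name hba1c
# 2    B   7.8
# 5    H  10.0

# With NA in the condition, logical indexing returns NA entries,
# while which() keeps only the positions that are TRUE
hb <- c(6.4, NA, 9.1)
hb[hb > 6.5]            # NA 9.1
hb[which(hb > 6.5)]     # 9.1
```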

Handling missing values

m1<-c(3,4,6,NA,7,4)
m2<-c("a","b","c", NA,"e","f")
m3<-c(2.6,2.9,5.6,NA,NA,4.9)
df.m<-data.frame(m1,m2,m3)
str(df.m)
'data.frame':   6 obs. of  3 variables:
 $ m1: num  3 4 6 NA 7 4
 $ m2: chr  "a" "b" "c" NA ...
 $ m3: num  2.6 2.9 5.6 NA NA 4.9
summary(df.m)
       m1           m2                  m3       
 Min.   :3.0   Length:6           Min.   :2.600  
 1st Qu.:4.0   Class :character   1st Qu.:2.825  
 Median :4.0   Mode  :character   Median :3.900  
 Mean   :4.8                      Mean   :4.000  
 3rd Qu.:6.0                      3rd Qu.:5.075  
 Max.   :7.0                      Max.   :5.600  
 NA's   :1                        NA's   :2      
df.c<- df.m[complete.cases(df.m),]
df.m
  m1   m2  m3
1  3    a 2.6
2  4    b 2.9
3  6    c 5.6
4 NA <NA>  NA
5  7    e  NA
6  4    f 4.9
df.c
  m1 m2  m3
1  3  a 2.6
2  4  b 2.9
3  6  c 5.6
6  4  f 4.9
## Remove rows that have NA in a single column only
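One way to drop rows with NA in a single column, sketched with base R, is to index with !is.na() on that column. The name df.m1 is introduced here for illustration; the data repeat the df.m example above.

```r
m1 <- c(3, 4, 6, NA, 7, 4)
m2 <- c("a", "b", "c", NA, "e", "f")
m3 <- c(2.6, 2.9, 5.6, NA, NA, 4.9)
df.m <- data.frame(m1, m2, m3)

# Count the missing values in each column
colSums(is.na(df.m))
# m1 m2 m3 
#  1  1  2 

# Keep rows where m1 is present; NA in other columns is allowed
df.m1 <- df.m[!is.na(df.m$m1), ]
nrow(df.m1)             # 5 (row 5 stays even though m3 is NA there)
```

Compare this with complete.cases(), which keeps only the 4 rows that have no NA in any column.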

Vectors

Vectors are the most basic R data structures — they store elements of the same type.

# Creating vectors
num_vec <- c(1, 2, 3, 4, 5)
char_vec <- c("apple", "banana", "cherry")

num_vec
# [1] 1 2 3 4 5

char_vec
# [1] "apple" "banana" "cherry"

Vector operations:

num_vec * 2
# [1]  2  4  6  8 10

num_vec + c(10, 20, 30, 40, 50)
# [1] 11 22 33 44 55
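Vector operations also work when the lengths differ, through recycling: the shorter vector is repeated to match the longer one. A small sketch with illustrative values:

```r
num_vec <- c(1, 2, 3, 4)

num_vec + 100           # the single value is recycled: 101 102 103 104
num_vec + c(10, 20)     # c(10, 20) is recycled twice:   11 22 13 24

# Comparisons are vectorised too, and TRUE counts as 1 in sums
num_vec > 2             # FALSE FALSE TRUE TRUE
sum(num_vec > 2)        # 2
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.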

Data Frames

Data frames are two-dimensional tables that can hold different data types in each column.

# Creating a data frame
students <- data.frame(
  Name = c("Amit", "Bhavna", "Chirag"),
  Age = c(21, 22, 20),
  Marks = c(89, 76, 92)
)

students
#     Name Age Marks
# 1   Amit  21    89
# 2 Bhavna  22    76
# 3 Chirag  20    92
# Accessing columns
students$Marks
# [1] 89 76 92
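Building on the students example, this sketch adds a derived column, filters on it, and sorts the rows. The pass mark of 80 and the column name Passed are assumptions made purely for illustration.

```r
students <- data.frame(
  Name = c("Amit", "Bhavna", "Chirag"),
  Age = c(21, 22, 20),
  Marks = c(89, 76, 92)
)

# Add a logical column derived from an existing one (assumed pass mark: 80)
students$Passed <- students$Marks >= 80

students$Name[students$Passed]     # "Amit" "Chirag"
nrow(students)                     # 3
ncol(students)                     # 4

# Sort rows by Marks, highest first
ranked <- students[order(students$Marks, decreasing = TRUE), ]
ranked$Name                        # "Chirag" "Amit" "Bhavna"
```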

Functions in R

Functions are reusable blocks of code that perform a specific task.

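# Defining a function that returns the sum of its two arguments
add_numbers <- function(a, b) {
  total <- a + b   # avoid naming a variable "sum", which would mask the built-in sum()
  return(total)
}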

# Calling the function
add_numbers(5, 3)
# [1] 8

Built-in functions:

mean(c(10, 20, 30))
# [1] 20

sqrt(16)
# [1] 4
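Functions can also have default argument values and basic input checks. The function power() below is a hypothetical example written for this tutorial, not a built-in.

```r
# A hypothetical function with a default value for its second argument
power <- function(x, exponent = 2) {
  stopifnot(is.numeric(x), is.numeric(exponent))   # fail early on bad input
  x^exponent    # the last evaluated expression is returned
}

power(4)                   # 16 (uses the default exponent of 2)
power(2, exponent = 3)     # 8
power(c(1, 2, 3))          # 1 4 9 (works element-wise on vectors)
```

An explicit return() is optional in R; a function returns the value of its last expression.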

The Help System

R provides in-built documentation for almost every function.

?mean
help("sum")

You can also search more generally using:

help.search("regression")

Or browse the vignettes (long-form guides) of your installed packages:

browseVignettes()
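A few more base R helpers are handy when you only half remember a function. This sketch shows args() and apropos(); the results of apropos() depend on which packages are attached, so they are not listed in full.

```r
# Show the arguments a function accepts
args(sd)

# Find objects whose names match a pattern (a regular expression)
matches <- apropos("^mean")
"mean" %in% matches        # TRUE

# Check whether a name is defined at all
exists("median")           # TRUE
```

The shortcut ??regression is equivalent to help.search("regression").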

Control Structures

R includes standard control structures: if, else, for, and while.

x <- 10
if (x > 5) {
  print("x is greater than 5")
} else {
  print("x is 5 or less")
}
# [1] "x is greater than 5"
# For loop example
for (i in 1:3) {
  print(paste("Iteration", i))
}
# [1] "Iteration 1"
# [1] "Iteration 2"
# [1] "Iteration 3"
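The while loop mentioned above is not shown in the examples, so here is a minimal sketch, together with the vectorised ifelse(), which often replaces a loop in R. The labels "high" and "normal" and the 6.5 cut-off are used only for illustration.

```r
# while: repeat as long as the condition holds
count <- 1
while (count <= 3) {
  print(paste("Count", count))
  count <- count + 1
}
# [1] "Count 1"
# [1] "Count 2"
# [1] "Count 3"

# ifelse: apply a condition to every element of a vector at once
hba1c <- c(6.4, 7.8, 9.1)
status <- ifelse(hba1c > 6.5, "high", "normal")
status
# [1] "normal" "high"   "high"
```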

Summary and Quick Reference

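| Concept          | Function     | Example                 | Description                        |
|------------------|--------------|-------------------------|------------------------------------|
| Assign value     | <-           | x <- 5                  | Assigns 5 to x                     |
| Combine elements | c()          | c(1,2,3)                | Creates a vector                   |
| Create sequence  | :            | 1:5                     | Generates 1,2,3,4,5                |
| Mean             | mean()       | mean(c(2,4,6))          | Computes the average               |
| Conditional      | if()         | if(x>0) print("yes")    | Executes code if condition is true |
| Data frame       | data.frame() | data.frame(Name, Age)   | Creates a table-like structure     |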
