Foundational Concepts of R Programming (Sample Notebook just for preview)
A tutorial on the basics of R programming.
- Data Types
- Data Structures
- Operators
- Control Structures & Loops
- Functions
- Rounding of Numbers
- Apply Functions Over Array Margins
- Packages/Modules/Libraries
i1 = 100
i2= 2.34
i3 = 5/2
i4 = -90
i5 = 5L # Here, L tells R to store the value as an integer
i6 = 3+6i
print(class(i1))
print(typeof(i1))
print(class(i2))
print(class(i3))
print(class(i4))
print(class(i5))
print(class(i6))
# these are to test for numerical data types
#real numbers (real) and complex numbers
print(typeof(i1))
print(typeof(i2))
print(typeof(i3))
print(typeof(i4))
print(typeof(i5))
i5 = 2+7i
print(class(i5))
c1 = "c"
c2 = "data"
c3 = 'R-Programming'
print(class(c1))
print(class(c2))
print(class(c3))
v1 <- c("data","science","R-Programming")
print(class(v1))
Factors
A factor encodes a data vector as a set of unique values (levels), storing each element as a reference to its level.
v1 <- c("data","science","R-Programming")
f1 =factor(v1)
print(class(v1))
print(v1)
print(f1)
v2 <- c("medium","High","low")
f2 =factor(v2)
print(f2)
v3 <- c("medium","high","low")
f3 =factor(v3, ordered = TRUE)
print(f3)
v4 <- c("medium","high","low")
f4 =factor(v4, ordered = TRUE, levels = c("low", "medium", "high"))
print(f4)
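As a small sketch of why the level order matters, the ordered factor f4 from above can be compared and converted to its underlying integer codes. The values in the comments assume the level order low < medium < high given in the example.

```r
v4 <- c("medium", "high", "low")
f4 <- factor(v4, ordered = TRUE, levels = c("low", "medium", "high"))

print(levels(f4))      # "low" "medium" "high"
print(as.integer(f4))  # 2 3 1: position of each element in the level order
print(f4 < "high")     # TRUE FALSE TRUE
print(max(f4))         # high (the largest level)
```

Without ordered = TRUE, comparisons such as f4 < "high" are not meaningful and return NA with a warning.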
l1 = TRUE
l2 = FALSE
print(class(l1))
print(class(l2))
c1 <- '2021-6-16'
d1 <- as.Date('2021-6-16')
print(c1)
print(d1)
print(class(c1))
print(class(d1))
as.Date('1/15/2001',format='%m/%d/%Y')
# "2001-01-15"
as.Date('April 26, 2001',format='%B %d, %Y')
# "2001-04-26"
as.Date('22JUN01',format='%d%b%y') # %y is system-specific; use with caution
# "2001-06-22"
as.POSIXct: date-time conversion
Sys.Date()
Sys.time()
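A minimal sketch of as.POSIXct in use, since the calls above only show the current date and time. The timestamp string and the choice of tz = "UTC" are assumptions made here to keep the result reproducible.

```r
# Parse a date-time string into a POSIXct object (seconds since 1970-01-01)
dt <- as.POSIXct("2021-06-16 14:30:00", tz = "UTC")

print(class(dt))                   # "POSIXct" "POSIXt"
print(format(dt + 3600, "%H:%M"))  # adding 3600 seconds gives "15:30"
print(as.Date(dt))                 # "2021-06-16"
```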
raw<- charToRaw("R programming")
print(raw)
Raw is a very unusual data type. For instance, you can transform a character object or an integer value into a raw object with the charToRaw and intToBits functions, respectively.
a <- charToRaw("data science")
print(a) # [1] 64 61 74 61 20 73 63 69 65 6e 63 65
class(a) # "raw"
print(is.vector(a))
b <- intToBits(6)
print(b)
class(b) # "raw"
rawToChar(a, multiple = TRUE)
Objects:
In every computer language, variables provide a means of accessing the data stored in memory. R does not provide direct access to the computer's memory; instead it provides a number of specialized data structures, which we will refer to as objects. These objects are referred to through symbols or variables.
Everything in the R language is an object. Objects are further categorized into the data structures covered in this chapter: atomic vectors, matrices, arrays, data frames, and lists.
In this chapter we provide preliminary descriptions of the various data structures provided in R, along with how to extract elements from them and operate on them.
Atomic vectors are one of the basic types of objects in R programming. An atomic vector stores elements of a single type: character, double, integer, raw, logical, or complex. A single value is also a vector, of length one.
To create a vector we use the c() function, with the values/elements separated by commas.
Eg:
- x <- c(2,7,1,7,1,6,80,0,1)
- z <- c("Alec", "Dan", "Rob", "Rich")
- y <- c(TRUE, TRUE, FALSE, FALSE)
x <- c(2,7,1,7,1,6,80,0,1)
y <- c(TRUE, TRUE, FALSE, FALSE)
z <- c("Alec", "Dan", "Rob", "Rich")
print(class(x)) # numeric vector
print(class(y)) # logical vector
print(class(z)) # character vector
Indexing
Positive
print(x) #entire vector
print(x[1])
print(x[2])
# to get range of elements
print(x) #entire vector
print(x[1:5])
print(x[3:8])
Negative
print(x) #entire vector
print(x[-1]) #ignore/exclude first element
print(x[-4]) #ignore/exclude fourth value
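Besides positive and negative positions, a vector can also be indexed with a logical condition. The following is a short sketch using the same x as above; the particular conditions are chosen only for illustration.

```r
x <- c(2, 7, 1, 7, 1, 6, 80, 0, 1)

print(x[x > 5])       # keep elements greater than 5: 7 7 6 80
print(x[-(1:3)])      # exclude the first three elements: 7 1 6 80 0 1
print(which(x == 1))  # positions where the value is 1: 3 5 9
print(x[c(1, 9)])     # first and last element: 2 1
```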
Matrices
Matrices store values as a 2-dimensional array in R. The data and the number of rows and columns are passed to the matrix() function.
To create a matrix we use the matrix() function with the nrow and ncol arguments; here nine values fill a 3 x 3 matrix column by column.
Eg:
- m1 <- matrix(c(2,7,1,7,1,6,80,0,1),nrow=3,ncol=3)
m1 <- matrix(c(2,7,1,7,1,6,80,0,1),nrow=3,ncol=3)
print(m1)
Indexing
print(m1[1,1]) #first row first column element
print(m1[2,3]) #second row third column element
print(m1[3,3]) #third row third column value
print(m1[2,]) #entire second row
print(m1[,3]) #entire third column
print(m1[,3,drop=FALSE]) #entire third column, kept as a one-column matrix
thismatrix <- matrix(c("apple", "banana", "cherry", "orange", "mango", "pineapple"), nrow = 3, ncol =2)
print(thismatrix)
#Remove the first row and the first column
thismatrix <- thismatrix[-c(1), -c(1)]
print(thismatrix)
Create a matrix using cbind and rbind
m <- cbind(Index = c(1:3), Age = c(30, 45, 34), Salary = c(500, 600, 550))
class(m)
print(m)
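The example above binds columns with cbind. As a sketch of the row-wise counterpart, rbind can build the same matrix one row at a time; the name m2 is introduced here only for illustration.

```r
# Each vector becomes one row: Index, Age, Salary
m2 <- rbind(c(1, 30, 500), c(2, 45, 600), c(3, 34, 550))
colnames(m2) <- c("Index", "Age", "Salary")

print(m2)
print(dim(m2))       # 3 3
print(m2[2, "Age"])  # 45
```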
Arrays
The array() function creates an n-dimensional array. It takes the dim attribute as an argument and creates each dimension with the length specified there.
a <- array(data = 1:27, dim=c(3,3,3))
print(class(a))
print(a)
Indexing
print(a[2,3,2]) # extract single element
print(a[,,2]) # A two-dimensional array is the same thing as a matrix.
print(a[1,,2]) # extract one row
print(a[, 3, 2, drop = FALSE]) # extract one column
Data Frames
Data frames are 2-dimensional tabular data objects in R programming. A data frame consists of multiple columns, each of which is a vector of the same length. Unlike in a matrix, the columns of a data frame can hold different modes of data.
Name <- c("Ramesh", "Tarun", "Shekar")
age <- c(23, 54, 32)
height <- c(4.6,5.4,6.2)
df <- data.frame(Name, age, height)
df
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
print(df)
Create a data frame using the cbind and rbind methods
df1 <- cbind(1, df)
print(class(df1))
print(df1)
# Using rbind
df <- data.frame(a = c(1:5), b = (1:5)^2)
df
df2 = rbind(df, c(2, 3), c(5, 6))
print(df2)
Indexing and Extracting Elements:
Name <- c("Ramesh", "Tarun", "Shekar")
age <- c(23, 54, 32)
height <- c(4.6,5.4,6.2)
df <- data.frame(Name, age, height)
df
df[1,1]
print(df$Name)
print(df[,'Name'])
print(df[,c('Name','age')])
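Rows can be selected with a logical condition in the same way as columns are selected by name. This is a small sketch on the same data; stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version, and the name older is hypothetical.

```r
df <- data.frame(Name = c("Ramesh", "Tarun", "Shekar"),
                 age = c(23, 54, 32),
                 height = c(4.6, 5.4, 6.2),
                 stringsAsFactors = FALSE)

older <- df[df$age > 30, ]      # rows where age exceeds 30, all columns
print(older)
print(df[df$age > 30, "Name"])  # "Tarun" "Shekar"
print(nrow(older))              # 2
```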
Create a data frame using the expand.grid() method
eg <- expand.grid(pants = c("blue", "black"), shirt = c("white", "grey", "plaid"))
print(class(eg))
print(eg)
Lists
A list is another type of object in R programming. A list can contain heterogeneous data types, including vectors and other lists.
- A list is a special vector. Each element can be of a different class.
- Lists act as containers.
- Unlike atomic vectors, their contents are not restricted to a single type.
- An element of a list can be anything, and two elements within a list can be of different types!
- Lists are sometimes called recursive vectors, because a list can contain other lists.
Create a list using the list() function
x <- list(1, "a", TRUE, 1+4i)
lst1 <- list(1, "a", TRUE, 1+4i)
print(lst1)
Indexing
print(lst1[1])
class(lst1[1])
try(print(lst1[1]*2))
# this raises an error: lst1[1] is not a number, it is still a list (of length one)
class(lst1[[1]])
print(lst1[[1]])
lst1[[1]]*2
v <- c("apple", "banana", "cherry", "orange", "mango", "pineapple")
print(v)
#creating matrix
m <- matrix(c("apple", "banana", "cherry", "orange", "mango", "pineapple"), nrow = 3, ncol =2)
print(m)
#creating array
a <- array(c("apple", "banana", "cherry", "orange", "mango", "pineapple"), dim = c(2,2,2))
print(a)
#creating dataframe
Name <- c("Ramesh", "Tarun", "Shekar")
age <- c(23, 54, 32)
height <- c(4.6,5.4,6.2)
df <- data.frame(Name, age, height)
print(df)
lst2 <- list(v,m,a,df, lst1)
print(class(lst2))
print(lst2)
lst2[[1]] # a vector stored in a list
lst2[[1]][2]
lst2[[2]] # a matrix stored in a list
lst2[[2]][2,2]
print(lst2[[3]])
print(lst2[[3]][,,2])
print(lst2[[3]][2,1,2])
print(lst2[[4]])
print(lst2[[4]][2,1])
print(lst2[[5]])
print(lst2[[5]][[2]])
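List elements can also be given names and then extracted with $ or [[ ]] instead of by position. A minimal sketch follows; the list person and its fields are hypothetical.

```r
person <- list(name = "Ramesh", age = 23, scores = c(80, 95))

print(names(person))          # "name" "age" "scores"
print(person$age)             # 23
print(person[["scores"]][2])  # second score: 95

person$age <- person$age + 1  # elements can be modified in place
print(person$age)             # 24
```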
Reshaping R Objects
vec <- 1:12 # a vector
print(vec)
mat <- matrix( vec, nrow=2) # a matrix
print(mat)
dim(mat) <- NULL
print(mat) # back to vector
print(mtcars)
ULmtcars <- unlist(mtcars) # produces a vector from the dataframe
print(ULmtcars)
UCmtcars <- unclass(mtcars) # removes the class attribute, turning the dataframe into a list
head(mtcars)
print(UCmtcars)
print(c(mtcars)) # similar to unclass but without the attributes
Assignment Operators (<- = <<- -> ->>)
Arithmetic Operators (+ - * / %% (modulo) %/% (integer divide) ^ (raised to the power of))
Relational Operators (> < <= >= == !=)
Logical Operators (& | ! && ||)
Miscellaneous Operators (: %in% %*%)
a <- 5.67
print(a)
b = 'data'
print(b)
6+5i -> c
print(c)
make.accumulator<-function(){
a<-0
function(x) {
a<-a+x
a
}
}
f<-make.accumulator()
print(f(1))
print(f(2))
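<<- is the 'superassignment' operator. It performs the assignment in the enclosing environment: starting with the enclosing frame, it works its way up towards the global environment until it finds an existing variable with that name, and creates one in the global environment if none is found. With plain <- above, each call only changed a local copy of a, so the running total was lost; with <<- the total persists between calls.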
make.accumulator<-function(){
a<-0
function(x) {
a<<-a+x
a
}
}
f<-make.accumulator()
print(f(1))
print(f(2))
Arithmetic Operators
Operator | Description | Example |
---|---|---|
+ | Addition | 5 + 1 = 6 |
- | Subtraction | 5 - 1 = 4 |
* | Multiplication | 5 * 3 = 15 |
/ | Division | 10 / 2 = 5 |
^ or ** | Exponentiation | 2*2*2*2*2 as 2 to the power of 5 |
x%%y | Modulus | 5%%2 is 1 |
x%/%y | Integer Division | 5%/%2 is 2 |
print(5 + 1)
print(5 - 1)
print(5 * 3)
print(2^5)
print(2**5)
print(5 / 2)
print(5%/%2) # Integer Division
print(5%%2) # Modulus or remainder
Description
Binary operators which allow the comparison of values in atomic vectors.
Operator | Description | Example |
---|---|---|
< | less than | 5 < 10 |
<= | less than or equal to | <= 5 |
> | greater than | 10 > 5 |
>= | greater than or equal to | >= 10 |
== | exactly equal to | == 10 |
!= | not equal to | != 5 |
x <- 5
y <- -3
print(x < y)
print(x > y)
print(x <= y)
print(x >= y)
print(x == y)
print(x != y)
Operator | Description | Example |
---|---|---|
!x | not x | x <- c(5), !x |
x \| y | x or y | x <- c(5), y <- c(10), x \| y |
x & y | x and y | x <- c(5), y <- c(10), x & y |
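Logical AND (&&) and Logical OR (||)
v <- c(3,0,TRUE,2+2i)
t <- c(1,3,TRUE,2+3i)
print(v & t) # & compares element by element
print(v[1] && t[1]) # && needs length-one operands (longer vectors are an error since R 4.3.0)
v <- c(0,0,TRUE,2+2i)
t <- c(0,3,TRUE,2+3i)
print(v | t) # | compares element by element
print(v[1] || t[1]) # || needs length-one operands as well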
Miscellaneous Operators (: %in% %*%)
: Operator
print(2:10)
print(-2:-10)
print(2:10)
%in% Operator
v1 <- 8
v2 <- 12
t <- 1:10
print(v1 %in% t)
print(v2 %in% t)
%*% Operator
a <- matrix(1:9, 3, 3)
b <- matrix(-1:-9, 3, 3)
print(a)
print(b)
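print(a*b) # element-wise product
print(a%*%b) # matrix product
print(1*-1 + 4*-2 + 7*-3) # element [1,1] of a %*% b, computed by hand
print(1*-4 + 4*-5 + 7*-6) # element [1,2] of a %*% b, computed by hand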
Special values
NA, NULL, ±Inf and NaN
# NA Stands for not available
# NA is a placeholder for a missing value
print(NA + 2)
print(sum(c(NA, 4, 6)))
print(median(c(NA, 4, 8, 4), na.rm = TRUE))
print(length(c(NA, 2, 3, 4)))
print(5 == NA)
print(NA == NA)
print(TRUE | NA)
x <- c(2,NA,5,4.89,10,TRUE,6/7)
is.na(x)
NULL
- NULL represents the null object; its class is "NULL" and its length is 0
- Does not take up any space in a vector
- The function is.null() can be used to detect NULL variables.
print(length(c(3, 4, NULL, 1)))
print(sum(c(5, 1, NULL, 4)))
x <- NULL
print(c(x, 5))
Inf
- Inf is a valid numeric that results from calculations like division of a nonzero number by zero.
- Since Inf is a numeric, operations between Inf and a finite numeric are well-defined and comparison operators work as expected.
print(32/0)
print(5 * Inf)
print(Inf - 2e+10)
print(Inf + Inf)
8 < -Inf
print(Inf == Inf)
NaN
- Stands for not a number.
- The result is unknown, but it is surely not a number.
- E.g. 0/0, Inf-Inf and Inf/Inf result in NaN.
- Computations involving numbers and NaN generally result in NaN.
NaN + 1
exp(NaN)
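NA and NaN are easy to confuse, so here is a short sketch of how the base test functions tell the special values apart. The vector passed to is.finite is an arbitrary example.

```r
print(is.nan(0/0))  # TRUE
print(is.na(0/0))   # TRUE: NaN also counts as a missing value
print(is.nan(NA))   # FALSE: NA is missing, but it is not NaN
print(NaN == NaN)   # NA: comparisons with NaN are undefined

# is.finite() is FALSE for Inf, NA and NaN alike
print(is.finite(c(1, Inf, NA, NaN)))  # TRUE FALSE FALSE FALSE
```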
Coercion and Testing an Object:
- Internal (implicit) coercion
- External coercion and testing objects
Internal (implicit) coercion:
If the two arguments are atomic vectors of different modes, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical
Guess what the following do without running them first
{r}
xx <- c(1.7, "a")
xx <- c(TRUE, 2)
xx <- c("a", TRUE)
This is called implicit coercion.
print(c(1, FALSE))
# numeric 1, 0
print(mode(c(1, FALSE)))
print(c("a", 1))
# character 'a', '1'
print(mode(c("a", 1)))
print(c(TRUE, 1L))
print(mode(c(TRUE, 1L)))
# numeric 1, 1
print(c(3.427, 1L))
print(mode(c(3.427, 1L)))
External coercion and testing objects:
The following functions are used to test an object's type, and the matching as.*() functions convert (coerce) it explicitly to another type.
While testing objects:
- Use is.atomic() to test whether an object is an atomic vector, and is.recursive() || is.list() to test for a recursive list.
- is.atomic() is more suitable for testing whether an object is an atomic vector.
- is.list() tests whether an object is truly a list.
- is.numeric(), similarly, is TRUE for either integer or double vectors, but not for lists.
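A brief sketch of explicit coercion with the as.*() family alongside the is.*() tests described above. The input values are arbitrary; suppressWarnings is used only to keep the output of the failed conversion quiet.

```r
# Explicit (external) coercion
num <- as.numeric("3.14")                  # character to double
int <- as.integer("7")                     # character to integer
chr <- as.character(42)                    # double to character: "42"
bad <- suppressWarnings(as.numeric("abc")) # cannot be converted: NA
lgl <- as.logical(c("TRUE", "yes", "F"))   # TRUE NA FALSE
print(list(num, int, chr, bad, lgl))

# Testing objects
print(is.atomic(1:3))         # TRUE
print(is.list(list(1, "a")))  # TRUE
print(is.numeric(1L))         # TRUE: integers are numeric
print(is.numeric(list(1)))    # FALSE: a list is not numeric
```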
Help: https://bookdown.org/
(decision making statements)
If Condition
An if statement associates a condition with a sequence of statements. The statements are executed only if the condition is TRUE; if the condition is FALSE, the if statement does nothing. In either case, control then passes to the next statement. A condition that is NA or NULL (zero-length) is an error in R.
num1=10
num2=20
if(num1<=num2)
{
print("Num1 is less or equal to Num2")
}
x <- 1:15
if (sample(x, 1) <= 10)
{
print("x is less than or equal to 10")
} else
{
print("x is greater than 10")
}
A vectorized alternative is ifelse(), which tests every element at once:
x <- 1:15
ifelse(x <= 10, "x less than or equal to 10", "x greater than 10")
x <- c("what","is","truth")
if("Truth" %in% x) {
print("Truth is found the first time")
} else if ("truth" %in% x) {
print("truth is found the second time")
} else {
print("No truth found")
}
For Loop:
Repeats a statement or group of statements a fixed number of times.
vector <- c("aaa","bbb","ccc")
for(i in vector){
print(i)
}
for (year in c(2010,2011,2012,2013,2014,2015)){
print(paste("The year is", year))
}
for(i in 2:5){
z <- i +1
print(z)
}
mymat <- matrix(1:9,3,3)
print(mymat)
for (i in seq_len(nrow(mymat))){
for (j in seq_len(ncol(mymat))){
print(mymat[i,j])
}
}
While Loop:
Repeats a block of code as long as a condition is true
i <- 1
while (i < 6) {
print(i)
i = i+1
}
break Statement:
break is used inside any loop, such as repeat, for or while, to stop the iterations and pass control to the code after the loop.
x <- 1:5
for (val in x) {
if (val == 3){
break
}
print(val)
}
x <- 1:10
for (num in x){
if (num==6) break
mynum <- paste(num, "and so on. ", sep = " ")
print(mynum)
}
Repeat statement:
Iterates over a block of code repeatedly until a break statement is reached.
x <- 1
repeat {
print(x)
x = x+1
if (x == 6){
break
}
}
next Statement:
- Useful for controlling the flow of R loops: it skips the rest of the current iteration
- Generally used inside a for loop or a while loop
x <- 1:5
for (val in x) {
if (val == 3){
next
}
print(val)
}
Switch Statement:
A switch statement allows a variable to be tested for equality against a list of values. Each value is called a case.
number1 <- 30
number2 <- 20
operator <- readline(prompt="Please enter any ARITHMETIC OPERATOR You wish!: ")
switch(operator,
"+" = print(paste("Addition of two numbers is: ", number1 + number2)),
"-" = print(paste("Subtraction of two numbers is: ", number1 - number2)),
"*" = print(paste("Multiplication of two numbers is: ", number1 * number2)),
"^" = print(paste("Exponent of two numbers is: ", number1 ^ number2)),
"/" = print(paste("Division of two numbers is: ", number1 / number2)),
"%/%" = print(paste("Integer Division of two numbers is: ", number1 %/% number2)),
"%%" = print(paste("Modulus of two numbers is: ", number1 %% number2))
)
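The example above depends on interactive input from readline. As a sketch that also runs non-interactively, the same idea can be wrapped in a function with a default case; apply_operator and its default arguments are hypothetical names introduced here.

```r
apply_operator <- function(operator, number1 = 30, number2 = 20) {
  switch(operator,
    "+" = number1 + number2,
    "-" = number1 - number2,
    "*" = number1 * number2,
    "/" = number1 / number2,
    # An unnamed final argument acts as the default case
    stop("Unknown operator: ", operator)
  )
}

print(apply_operator("+"))  # 50
print(apply_operator("/"))  # 1.5
```

Returning the value instead of printing inside each case keeps the function reusable in further calculations.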
Conclusion
Avoid explicit loops unless they are really needed: R is vectorized, so many operations apply to a whole vector at once.
Vectorization concept
vect <- c(1,2,3,4,5,6,7,9)
# now we multiply each element of vect by 5
print(vect * 5)
# now we add 5 to each element of vect
print(vect + 5)
# now we subtract 5 from each element of vect
print(vect - 5)
recycling concept
a <- 1:10
b <- 1:5
a + b
a <- 1:10
b <- 5
a * b # here b is a vector of length 1
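Recycling repeats the shorter vector until it matches the longer one. The sketch below shows the element pairing and what happens when the lengths are not multiples of each other; the variable res is introduced only to capture the warning text.

```r
# 1:2 is recycled three times: (1+1, 2+2, 3+1, 4+2, 5+1, 6+2)
print(1:6 + 1:2)  # 2 4 4 6 6 8

# Lengths 5 and 2 are not multiples, so R still recycles but issues a warning;
# tryCatch is used here to show the warning message instead of the result
res <- tryCatch(1:5 + 1:2, warning = function(w) conditionMessage(w))
print(res)
```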
Functions:
Functions can be described as "black boxes" that take an input and return an output based on the logic inside the function.
Built-in functions ship with R and are used to make common work easier. Eg: mean(x), sum(x), sqrt(x), toupper(x), etc.
Some Built-in functions for the Objects
Function | Description |
---|---|
c() | combines values, vectors, and/or lists to create new objects |
unique() | returns a vector containing one element for each unique value in the vector |
duplicated() | returns a logical vector that tells whether each element duplicates an earlier one |
rev() | reverses the order of the elements in a vector |
sort() | sorts the elements in a vector |
append() | appends or inserts elements in a vector |
sum() | sum of the elements of a vector |
min() | minimum value in a vector |
max() | maximum value in a vector |
cumsum() | cumulative sum |
diff() | x[i+1] - x[i] |
prod() | product |
cumprod() | cumulative product |
sample() | random sample |
mean() | average |
median() | median |
var() | variance |
sd() | standard deviation |
Function | Description |
---|---|
abs(x) | absolute value: the magnitude of a number regardless of its sign, e.g. abs(-21) is 21, which is greater than abs(19) |
sqrt(x) | square root |
floor(x) | floor(3.975) is 3 |
ceiling(x) | ceiling(3.475) is 4 |
trunc(x) | trunc(5.99) is 5 |
round(x, digits=n) | round(3.475, digits=2) is 3.48 |
exp(x) | e^x (calculate the power of e i.e. e^x) |
log10(x) | common logarithm (base 10) |
log(x) | natural logarithm (base e) |
strsplit(x, split) | Split the elements of character vector x at split. strsplit("abc", "") returns a list holding the 3-element vector "a","b","c" |
toupper(x) | Uppercase |
tolower(x) | Lowercase |
Functions Examples
{r}
(x <- c(sort(sample(1:20, 9)), NA))
(y <- c(sort(sample(3:23, 7)), NA))
which.min(x)
which.max(x)
union(x, y)
intersect(x, y)
setdiff(x, y)
setdiff(y, x)
match(x,y)
User-defined functions are sets of instructions that you want to use repeatedly: a piece of code written to carry out a specified task, created by the user to meet a specific requirement. A function has four components:
- Function Name
- Arguments
- Function Body
- Return Value
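The four components listed above can be pointed out in a minimal sketch. The function rectangle_area and its arguments are hypothetical, chosen only to label each part.

```r
# Function name: rectangle_area; arguments: width and height (height defaults to 1)
rectangle_area <- function(width, height = 1) {
  # Function body: the statements that do the work
  area <- width * height
  # Return value: handed back to the caller
  return(area)
}

print(rectangle_area(3, 4))  # 12
print(rectangle_area(5))     # 5, using the default height
```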
Objects & Functions
To understand computations in R, two slogans are helpful:
- Everything that exists is an object
- Everything that happens is a function call
syntax
{r}
function_name <- function(arg_1, arg_2, ...)
{
# Function body
}
sum_of_squares <- function(x,y)
{
x^2 + y^2
}
sum_of_squares(3,4)
pow <- function(x, y)
{
result <- x^y
print(paste(x,"raised to the power", y, "is", result))
}
pow(3,5)
Default Arguments:
new.function <- function(a = 3, b = 6) {
result <- a * b
print(result)
}
# Call the function without giving any argument.
new.function()
# Call the function with giving new values of the argument.
new.function(9,5)
# Sets default of exponent to 2 (just square)
MyThirdFun <- function(n, y = 2)
{
# Compute the power of n to the y
n^y
}
# Specify both args
MyThirdFun(2,3)
# Just specify the first arg
MyThirdFun(2)
# Specify no argument: error!
# MyThirdFun()
Named Arguments:
pow <- function(x, y) {
# function to print x raised to the power y
result <- x^y
print(paste(x,"raised to the power", y, "is", result))
}
pow(8, 2)
# 8 raised to the power 2 is 64
pow(x = 8, y = 2)
# 8 raised to the power 2 is 64
pow(y = 2, x = 8)
partial matching:
?round
print(round(9.523, d=2))
print(round(9.523, di=2))
print(round(9.523, dig=2))
testFun <- function(axb, bcd = 1, axdk) {
return(axb + axdk)
}
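try(testFun(ax=2,ax=3)) # error: ax partially matches both axb and axdk, so the match is ambiguous
testFun(axb=2,ax = 3) # works: axb matches exactly, leaving ax to match axdk; returns 5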
Functions in R are first class objects
- Can be treated much like any other object
- Can be passed as arguments to other functions
- Can be nested, so that you can define a function inside another function
- The return value is the last expression evaluated in the function body
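A short sketch of these properties in action: a function passed as an argument, a function returned from another function, and functions stored in a list. The names apply_twice, make_multiplier, triple and ops are hypothetical.

```r
# A function passed as an argument
apply_twice <- function(f, x) f(f(x))
print(apply_twice(sqrt, 16))  # sqrt(sqrt(16)) is 2

# A function defined inside, and returned by, another function
make_multiplier <- function(k) {
  function(x) x * k
}
triple <- make_multiplier(3)
print(triple(5))  # 15

# Functions stored in a list like any other object
ops <- list(sq = function(x) x^2, neg = function(x) -x)
print(ops$sq(4))  # 16
```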
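Lazy evaluation
- Function arguments are materialized (evaluated) only when necessary
- An argument's value is not computed until it is needed
- This can increase speed and save computations
fun <- function(a,b){
a^2
}
fun(2,x/0) # b is never used, so x/0 is never evaluated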
fun <- function(x){
10
}
fun("hello")
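To make the effect of lazy evaluation visible, the sketch below passes an expression that would fail if it were evaluated. The functions unused and used are hypothetical; tryCatch is used only to capture the error message.

```r
# b is never referenced, so the stop() call is never run
unused <- function(a, b) a^2
print(unused(2, stop("never evaluated")))  # 4

# Here b is referenced, so its expression is evaluated and the error surfaces
used <- function(a, b) a^2 + b
result <- tryCatch(used(2, stop("evaluated now")),
                   error = function(e) conditionMessage(e))
print(result)  # "evaluated now"
```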
Automatic Returns:
In R, it is not necessary to include a return statement. R automatically returns the value of the last expression evaluated in the body of the function. Alternatively, we can return explicitly with return().
add <- function(x,y=1,z=2){
x+y
x+z
}
add(5)
add <- function(x,y=1,z=2){
x+y
x+z
return(x+y)
}
add(5)
fahr_to_kelvin <- function(temp) {
kelvin <- ((temp - 32) * (5 / 9)) + 273.15
return(kelvin)
}
fahr_to_kelvin(32)
# boiling point of water
fahr_to_kelvin(212)
Ellipsis or three dots (...)
- Especially useful for creating customized versions of existing functions or for providing additional options to end users
- Passes arguments on to another function
- The three dots (an ellipsis) act as a placeholder for any extra arguments given to the function
- Takes any number of named or unnamed arguments
printDots <- function(...) {
myDots <- list(...)
paste(myDots)
}
printDots("how", "is", "your", "health")
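The bullet about passing arguments to another function can be sketched as a thin wrapper around mean(). The names mean_of, values and count_args are hypothetical; na.rm is a real argument of mean() that is forwarded through the dots.

```r
# Extra arguments given to mean_of are forwarded unchanged to mean()
mean_of <- function(x, ...) {
  mean(x, ...)
}

values <- c(1, 2, NA, 9)
print(mean_of(values))                # NA, because of the missing value
print(mean_of(values, na.rm = TRUE))  # 4

# The dots accept any number of arguments
count_args <- function(...) length(list(...))
print(count_args("a", 2, TRUE))  # 3
```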
Named AND Anonymous (nameless) functions:
named <- function(x) x*10
# calling a named function
named(6)
(function(x) x*10)(6) # calling an anonymous function
The apply family of functions belongs to the R base package and is populated with functions that manipulate slices of data from matrices, arrays, lists and data frames in a repetitive way. These functions let you traverse the data in a number of ways and avoid the explicit use of loop constructs.
head(mtcars)
dim(mtcars)
str(mtcars)
# one method
max(mtcars[,1])
max(mtcars[,2])
max(mtcars[,3])
max(mtcars[,4])
max(mtcars[,5])
#...etc
for (i in seq_len(ncol(mtcars)))
{
col <- mtcars[,i]
col_max <- max(col) # a variable named max would shadow the function's name
print(col_max)
}
- apply - apply over the margins of an array (e.g. the rows or columns of a matrix)
- lapply - apply a function to each element of a list in turn and get a list back.
- sapply - apply a function to each element of a list and get a simplified object like vector back, rather than a list.
- tapply - apply a function to subsets of a vector and the subsets are defined by some other vector, usually a factor.
- mapply - apply a function to the 1st elements of each, and then the 2nd elements of each, etc
apply function:
- When you want to apply a function to the rows or columns of a matrix (and higher-dimensional analogues)
?apply # open the help page
apply(mtcars, 2, max)
apply(mtcars, 1, max)
- When you want to apply a function to each element of a list/vector in turn and get a list back
lapply(1:3, function(x) x^2)
class(lapply(1:3, function(x) x^2) )
CAGO.list <- list(Diet1 = c(2,5,4,3,5,3), Diet2 =c(8,5,6,5,7,7), Diet3 =c(3,4,2,5,2,6) , Diet4 = c(2,2,3,2,5,2))
lapply(CAGO.list, mean)
- Convert the list to a data frame and check whether lapply() works on data frames
CAGO.df <- as.data.frame(CAGO.list)
CAGO.df
lapply(CAGO.df, mean) # a data frame is a list of columns, so lapply works column-wise
- We can apply a function over a vector as well using lapply()
Random <- c("This", "Is", "a", "Random", "Vector")
lapply(Random,nchar) # number of characters in each element of the vector
lapply(Random,toupper)
- When you want to apply a function to each element of a list in turn, but you want a vector back, rather than a list
sapply(1:3, function(x) x^2)
class(sapply(1:3, function(x) x^2))
- Apply the sapply() function to CAGO.list and CAGO.df
print(sapply(CAGO.list, mean)) # output as a vector
print(sapply(CAGO.df, mean)) # output as a vector
tapply applies a function or operation to subsets of a vector, broken down by a given factor variable.
To understand this clearly, imagine you have the heights of 1000 people (500 males and 500 females) and you want to know the average height of males and of females in this sample. To solve this you can group the heights by gender, giving the heights of 500 males and the heights of 500 females, and then calculate the average height within each group.
tapply(mtcars$wt,mtcars$cyl,mean)
head(iris)
tapply(iris$Sepal.Length,iris$Species,mean)
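The height-by-gender scenario described above can be sketched with a handful of made-up values in place of the 1000 measurements. The vectors height and gender are hypothetical data.

```r
# Six illustrative heights (cm) with the gender of each person
height <- c(170, 160, 180, 165, 175, 158)
gender <- factor(c("M", "F", "M", "F", "M", "F"))

# Group the heights by gender, then average within each group
avg_height <- tapply(height, gender, mean)
print(avg_height)  # F: 161, M: 175
```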
mapply is a multivariate version of sapply.
It applies FUN to the first elements of each (...) argument, then the second elements, the third elements, and so on.
Note that the first argument of mapply() is the function to apply.
It is advisable when you have several data structures (e.g. vectors, lists) and you want to apply a function over their corresponding elements.
l1 <- list(a = c(1:10), b = c(11:20))
l2 <- list(c = c(21:30), d = c(31:40))
# sum the corresponding elements of l1 and l2
print(mapply(sum, l1$a, l1$b, l2$c, l2$d))
print(mapply(sum, l1))
#sum(c(1:10))
print(mapply(sum, l1,l2))
#sum(c(1:10),c(21:30))
print(mapply(sum, l1$a, l1$b))
Q1 <- matrix(c(rep(1, 4), rep(2, 4), rep(3, 4), rep(4, 4)),4,4)
# Print `Q1`
print(Q1)
# Or use `mapply()`
Q2 <- mapply(rep,1:4,4)
# Print `Q2`
print(Q2)
mapply(rep, 1:4, 4:1)
R has many packages, which are continuously maintained and upgraded, each built for a specific kind of task.
Eg:
- stringr for various string-related operations
- dplyr for data manipulation/cleaning/analysis
- ggplot2 for data visualization
....etc
To install a package, the machine needs internet connectivity; use install.packages("package name")
Eg:
install.packages("stringr")
To load an installed package for the current session, use library(Package_Name)
Eg:
library(stringr)
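A common pattern, sketched here under the assumption that installing on demand is acceptable, is to install a package only when it is missing. The built-in stats package stands in for any package name so the snippet runs without network access.

```r
pkg <- "stats"  # placeholder; replace with e.g. "stringr"

# requireNamespace() returns FALSE instead of failing when the package is absent
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# character.only = TRUE lets library() take the name from a variable
library(pkg, character.only = TRUE)
print(paste0("package:", pkg) %in% search())  # TRUE once the package is attached
```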