Factor Functions in R

In this article, we will study the operations that can be performed on factor variables in R.

About Factors in R

A factor is R's data type for categorical variables: it stores a vector of values together with a fixed set of levels. Nationality and gender are typical examples of factor data. The levels are the distinct values the factor can take. A factor is created with the factor() function, which accepts a vector of any atomic type, such as integer, logical, or character.
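
As a minimal sketch of the description above, the following builds factors from character, integer, and logical vectors and inspects their levels. The variable names (langs, sizes, flags) are illustrative only.

```r
# Factors built from character, integer, and logical vectors
langs <- factor(c("Python", "R", "R", "Python"))
sizes <- factor(c(3L, 1L, 2L, 1L))
flags <- factor(c(TRUE, FALSE, TRUE))

# Levels are the sorted unique values, stored as character strings
print(levels(langs))   # "Python" "R"
print(levels(sizes))   # "1" "2" "3"
print(levels(flags))   # "FALSE" "TRUE"

# nlevels() returns the number of levels
print(nlevels(langs))  # 2
```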

Factor Functions

Here we will see examples of R factor functions for adding, modifying, and ordering the contents of a factor.

Function to Add new values in a Factor

A new value can be assigned to a factor only if it is already one of the factor's levels. If the value is not among the levels, it must first be added to the levels and only then assigned to the factor. Otherwise, R issues a warning and stores NA in its place.

#creating a factor
fac <- factor(c("Python","R","R","Python","Python"))

#cat() prints the integer codes of the factor, not its labels
cat("Original Factor Values : ", fac, "\n")
#printing levels of factors
print(levels(fac))

#adding a value that is already a level
fac[6] <- "R"
cat("Factor after adding existing value at the end : ", fac, "\n")

#adding a value that is not a level: R warns and stores NA
fac[7] <- "Machine Learning"
cat("Factor after adding non-existing value at the end : ", fac, "\n")

#adding the value to the levels first, keeping the existing levels in place
levels(fac) <- c(levels(fac), "Machine Learning")
#adding the value to the factor
fac[8] <- "Machine Learning"
cat("Factor after adding value to levels first and then at the end : ", fac, "\n")
The code produces the following output:
Original Factor Values :  1 2 2 1 1
[1] "Python" "R"     
Factor after adding existing value at the end :  1 2 2 1 1 2
Warning message:
In `[<-.factor`(`*tmp*`, 7, value = "Machine Learning") :
  invalid factor level, NA generated
Factor after adding non-existing value at the end :  1 2 2 1 1 2 NA
Factor after adding value to levels first and then at the end :  1 2 2 1 1 2 NA 3
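
The output above shows numbers because cat() prints the integer codes of a factor rather than its labels. A minimal sketch of how to display both, assuming a fresh copy of the same factor:

```r
fac <- factor(c("Python", "R", "R", "Python", "Python"))

# Append one level while leaving the existing levels untouched
levels(fac) <- c(levels(fac), "Machine Learning")
fac[6] <- "Machine Learning"

# as.character() returns the labels; as.integer() returns the codes
cat("Labels :", as.character(fac), "\n")
cat("Codes  :", as.integer(fac), "\n")
```

Converting with as.character() before printing is usually the easier way to check that an assignment stored the intended label.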

Function to Modify the values in a Factor

A value in a factor can be modified by selecting its position with the indexing operator [ ] and assigning a new value, which must be one of the existing levels.

#creating a factor
fac <- factor(c("Python","R","ML","R","Python","Python","ML","R","ML"))

cat("Original Factor Values : ", fac, "\n")
#printing levels of factors
print(levels(fac))

#modifying the value at the third position
fac[3] <- "R"
cat("Modified Factor Values : ", fac, "\n")
Output
Original Factor Values :  2 3 1 3 2 2 1 3 1
[1] "ML"     "Python" "R"      
Modified Factor Values :  2 3 3 3 2 2 1 3 1
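
Indexing is not limited to a single position. As a hedged sketch, a logical index can replace every matching element at once, and droplevels() then removes any level that is no longer used:

```r
fac <- factor(c("Python", "R", "ML", "R", "Python", "Python", "ML", "R", "ML"))

# Replace every "ML" entry with "R" using a logical index
fac[fac == "ML"] <- "R"
counts <- table(fac)
print(counts)

# "ML" is still a level, just unused; droplevels() removes it
fac <- droplevels(fac)
print(levels(fac))   # "Python" "R"
```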

Modifying levels of the factor 

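The factor levels can also be modified by assigning a new vector to levels(). The new vector must contain at least as many entries as the factor has levels; otherwise R raises an error. The assignment is positional: the underlying integer codes stay the same, and each element takes the new label at the position of its old level. In the example below, elements that were "ML" become "R" and elements that were "R" become "Machine Learning".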

#creating a factor
fac <- factor(c("Python","R","ML","R","Python","Python","ML","R","ML"))

cat("Original Factor Values : ", fac, "\n")
#printing levels of factors
cat("Original Factor Levels : ", levels(fac), "\n")

#replacing the levels of the factor, position by position
levels(fac) <- c("R","Python","Machine Learning")
cat("Modified Factor Levels : ", levels(fac), "\n")
The code produces the following output:
Original Factor Values :  2 3 1 3 2 2 1 3 1
Original Factor Levels :  ML Python R
Modified Factor Levels :  R Python Machine Learning
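
Because replacing the whole levels vector is positional, it is easy to relabel more than intended. A minimal sketch of renaming a single level by name instead, assuming the goal is only to expand "ML" to "Machine Learning":

```r
fac <- factor(c("Python", "R", "ML", "R", "Python", "Python", "ML", "R", "ML"))

# Rename only the "ML" level, selecting it by name rather than by position
levels(fac)[levels(fac) == "ML"] <- "Machine Learning"

print(levels(fac))            # "Machine Learning" "Python" "R"
print(as.character(fac[3]))   # "Machine Learning"
print(as.character(fac[2]))   # "R"
```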

Function to Order the levels of the Factor 

By default, factor levels are sorted alphabetically. To change their order, rebuild the factor with factor() and pass the desired order through the levels argument. The values of the factor remain the same; only the order of the levels, and with it the integer codes, changes. Assigning the new order directly to levels() would relabel the elements instead of reordering the levels.

#creating a factor
fac <- factor(c("Python","R","ML","R","Python","Python","ML","R","ML"))

cat("Original Factor Values : ", fac, "\n")
#printing levels of factors
cat("Original Factor Levels : ", levels(fac), "\n")

#reordering the levels of the factor
fac <- factor(fac, levels = c("Python","ML","R"))
cat("Modified Factor Levels : ", levels(fac), "\n")
Output
Original Factor Values :  2 3 1 3 2 2 1 3 1
Original Factor Levels :  ML Python R
Modified Factor Levels :  Python ML R
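
Checking only levels() does not reveal whether the elements themselves were preserved. The following sketch contrasts two ways of arriving at the level order Python, ML, R: rebuilding the factor with factor(), and assigning to levels(). The names reordered and relabelled are illustrative.

```r
fac <- factor(c("Python", "R", "ML", "R", "Python", "Python", "ML", "R", "ML"))

# Reorder: rebuild the factor with an explicit level order; labels are preserved
reordered <- factor(fac, levels = c("Python", "ML", "R"))

# Relabel: assigning to levels() keeps the codes and swaps the labels
relabelled <- fac
levels(relabelled) <- c("Python", "ML", "R")

print(as.character(fac[1:3]))         # "Python" "R" "ML"
print(as.character(reordered[1:3]))   # "Python" "R" "ML"
print(as.character(relabelled[1:3]))  # "ML" "R" "Python"
```

Both results report the same levels, but only the rebuilt factor still holds the original values.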