R-Data Structures

A data structure is a particular way of organizing data in a computer so that it can be used effectively. The idea is to reduce the space and time complexities of different tasks. Data structures in R programming are tools for holding multiple values.

R’s base data structures are often organized by their dimensionality (1D, 2D, or nD) and by whether they’re homogeneous (all elements must be of the same type) or heterogeneous (elements may be of different types). This gives rise to the six data structures most frequently used in data analysis.


The most essential data structures used in R include:

  • Vectors
  • Lists
  • Dataframes
  • Matrices
  • Arrays
  • Factors


The Different Kinds of Data Structures in R

Let’s take a little time to familiarize ourselves with the data structures in R. In the process, we can familiarize ourselves with common R functions.


Vector:

In R, a vector is a basic data structure that can hold multiple values of the same type (e.g., numeric, character, logical). Vectors are essential for storing data in R, and they can be created using various functions.

Technically, vectors can be one of two types:

  • atomic vectors
  • lists

although the term “vector” most commonly refers to the atomic types, not to lists.


We can categorize atomic vectors into the following types:

  • Numeric vector (real numbers stored as doubles, e.g., 1, 808, 6527, 742, 268)
  • Integer vector (positive and negative whole numbers, e.g., -3L, 0L, 42L)
  • Character vector (text strings, e.g., “a”, “hello”, “data science”)
  • Logical vector (TRUE/FALSE)
  • Complex vector (complex numbers of the form a+bi)
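
As a quick check of the categories above, the following sketch builds one small vector of each kind and inspects its storage type with typeof(). The variable names are illustrative only and are not used elsewhere in this guide.

```r
# One small vector per atomic type (names are illustrative)
num_vec <- c(1.5, 808, 6527)      # numeric, stored as double
int_vec <- c(-3L, 0L, 42L)        # integer: the L suffix requests integer storage
chr_vec <- c("a", "hello", "R")   # character
lgl_vec <- c(TRUE, FALSE, TRUE)   # logical
cpl_vec <- c(1+2i, 3-1i)          # complex

print(typeof(num_vec)) # "double"
print(typeof(int_vec)) # "integer"
print(typeof(chr_vec)) # "character"
print(typeof(lgl_vec)) # "logical"
print(typeof(cpl_vec)) # "complex"
```

Without the L suffix, whole numbers such as 42 are stored as doubles, which is why integer vectors are listed as a separate type.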

Here’s a detailed guide on how to use vectors in R:


1.  Creating a Vector:

You can create a vector using the c() function (short for concatenate).

Example: Numeric Vector

# Creating a numeric vector
numbers <- c(1, 2, 3, 4, 5)
print(numbers) # Output: 1 2 3 4 5


Example: Character Vector

# Creating a character vector
names <- c("Aaditya", "Dhruv", "Shreya")
print(names) # Output: "Aaditya" "Dhruv" "Shreya"


Example: Logical Vector

# Creating a logical vector
logical_values <- c(TRUE, FALSE, TRUE)
print(logical_values) # Output: TRUE FALSE TRUE


2.  Accessing Elements in a Vector:

You can access elements of a vector using square brackets [] and specifying the index number.

Example: Accessing Elements

# Accessing the first element
print(numbers[1])
# Output: 1
# Accessing multiple elements
print(numbers[c(2, 4)])
# Output: 2 4
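
A few more indexing forms are worth knowing. The sketch below assumes a fresh copy of the numbers vector and shows a range, a negative index (which drops elements), and an out-of-range index. Note that R indexing starts at 1, not 0.

```r
numbers <- c(1, 2, 3, 4, 5)

# A range of positions
print(numbers[2:4])  # 2 3 4

# A negative index drops that position instead of selecting it
print(numbers[-1])   # 2 3 4 5

# Indexing past the end returns NA rather than raising an error
print(numbers[10])   # NA
```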


3.  Modifying Elements in a Vector

You can modify vector elements by assigning new values to specific indices.

Example: Modifying Elements


# Changing the second element
numbers[2] <- 10
print(numbers) # Output: 1 10 3 4 5


4.  Vector Operations

You can perform arithmetic operations on vectors element-wise.


Example: Vector Addition

# Adding two vectors element-wise
result <- numbers + c(10, 20, 30, 40, 50)
print(result) # Output: 11 30 33 44 55


Example: Multiplying by a Scalar

# Multiplying a vector by a scalar
result <- numbers * 2
print(result) # Output: 2 20 6 8 10
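
Element-wise arithmetic also works when the two vectors have different lengths, because R recycles the shorter one. The sketch below uses two hypothetical vectors, a and b, to illustrate this. Multiplying by a scalar is the same mechanism with a length-one vector.

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20)

# b is recycled to 10 20 10 20 before the addition
print(a + b) # 11 22 13 24

# A scalar is a length-one vector, recycled across every element
print(a * 2) # 2 4 6 8
```

If the longer length is not a multiple of the shorter one, R still recycles but emits a warning, so it is usually safer to keep lengths compatible.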


5.  Length of a Vector

You can find the number of elements in a vector using the length() function.

Example: Length of a Vector

# Checking the length of the vector
print(length(numbers))
# Output: 5


6.  Filtering a Vector

You can filter a vector based on a condition.

Example: Filtering Numeric Values

# Filtering values greater than 3
filtered <- numbers[numbers > 3]
print(filtered) # Output: 10 4 5
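
Filtering works because the condition itself produces a logical vector, which is then used as an index. The following sketch makes that intermediate step visible; the name mask is illustrative, and numbers is redefined here so the block runs on its own.

```r
numbers <- c(1, 10, 3, 4, 5)

# The comparison yields one TRUE/FALSE per element
mask <- numbers > 3
print(mask)          # FALSE TRUE FALSE TRUE TRUE

# which() gives the positions where the condition holds
print(which(mask))   # 2 4 5

# Conditions can be combined with & (and) and | (or)
print(numbers[numbers > 3 & numbers < 10]) # 4 5
```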


7.  Combining Vectors

You can combine two or more vectors using the c() function.

Example: Combining Vectors

# Combining two vectors
vector1 <- c(1, 2, 3)
vector2 <- c(4, 5, 6)
combined <- c(vector1, vector2)
print(combined) # Output: 1 2 3 4 5 6


8.  Checking Vector Type

You can check the type of a vector using the class() function.

Example: Checking Vector Type

# Checking the type of the vector
print(class(numbers)) # Output: "numeric"
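
Because an atomic vector holds a single type, combining values of different types with c() coerces them to a common type. The sketch below, using an illustrative vector named mixed, shows how class() reveals the result.

```r
# Mixing types: everything is coerced to the most general type present
mixed <- c(1, "two", TRUE)
print(class(mixed)) # "character"
print(mixed)        # "1" "two" "TRUE"

# Integer and double combine to numeric (double)
print(class(c(1L, 2.5)))  # "numeric"

# Logical and integer combine to integer (TRUE becomes 1)
print(class(c(TRUE, 2L))) # "integer"
```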


Vectors are a fundamental data structure in R, and you can create, modify, and perform operations on them easily. You can work with vectors containing numeric, character, or logical data, and perform arithmetic, access elements, filter, and combine vectors as needed.

List:

In R, lists act as containers. Unlike atomic vectors, the contents of a list are not restricted to a single mode and can encompass any mixture of data types.

Lists are sometimes called generic vectors, because the elements of a list can be of any type of R object, even lists containing further lists. This property makes them fundamentally different from atomic vectors.

“A list is a special type of vector in which each element can be a different type.”

Create lists using list() or coerce other objects using as.list(). An empty list of the required length can be created using vector(), for example vector("list", length = 3).
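
A minimal sketch of the two alternatives to list() mentioned above; the variable names are illustrative.

```r
# Coercing an atomic vector: each element becomes its own list element
from_vector <- as.list(c(1, 2, 3))
print(length(from_vector)) # 3
print(is.list(from_vector)) # TRUE

# Pre-allocating an empty list: every slot starts out as NULL
empty <- vector("list", length = 2)
print(length(empty))        # 2
print(is.null(empty[[1]]))  # TRUE
```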

Characteristics of Lists

  1. Heterogeneous Elements: Lists can store multiple data types, such as numeric, character, logical, etc.
  2. Flexible Size: Lists can hold elements of varying sizes.
  3. Named Elements: List elements can be named for easy access.


Creating a List

You create a list using the list() function. Here’s how to use and manipulate lists in R.


1.  Creating a List

A list can contain different data types, such as numbers, strings, vectors, and other lists.


Example: Simple List

# Creating a list with different data types
my_list <- list(name = "Roshan", age = 28, scores = c(85, 90, 98))
print(my_list)
Output:
$name
[1] "Roshan"
$age
[1] 28
$scores
[1] 85 90 98


Explanation:

  • name is a character string
  • age is a numeric value
  • scores is a numeric vector


2. Accessing List Elements

You can access elements of a list using either double square brackets [[ ]] or the $ operator for named elements.

Example: Accessing Elements by Index

# Access the first element (name)
print(my_list[[1]]) # Output: "Roshan"
# Access the third element (scores)
print(my_list[[3]]) # Output: 85 90 98

Example: Accessing Elements by Name

# Access the name element
print(my_list$name) # Output: "Roshan"
# Access the scores element
print(my_list$scores) # Output: 85 90 98
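
Single square brackets also work on lists, but they behave differently from double brackets: [ ] returns a sub-list, while [[ ]] returns the element itself. The sketch below recreates my_list so it runs on its own; sub_list and element are illustrative names.

```r
my_list <- list(name = "Roshan", age = 28, scores = c(85, 90, 98))

# Single brackets keep the list wrapper
sub_list <- my_list["scores"]
print(class(sub_list)) # "list"

# Double brackets extract the element itself
element <- my_list[["scores"]]
print(class(element))  # "numeric"

# Once extracted, the element can be indexed like any vector
print(my_list$scores[2]) # 90
```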

3. Modifying List Elements

You can modify elements in a list by assigning new values to specific indices or names.


Example: Modifying List Elements

# Changing the age element
my_list$age <- 29
print(my_list$age) # Output: 29
# Restoring the original value so the later examples still show 28
my_list$age <- 28


4. Adding Elements to a List

You can add a new element to a list by assigning a value to a name that does not exist yet.


Example: Adding a New Element

# Adding a new element to the list
my_list$city <- "Delhi"
print(my_list)
Output:
$name
[1] "Roshan"
$age
[1] 28
$scores
[1] 85 90 98
$city
[1] "Delhi"

5. Removing Elements from a List

To remove an element from a list, assign it NULL.


Example: Removing an Element

# Removing the city element
my_list$city <- NULL
print(my_list)


6.  Length of a List

You can find the number of elements in a list using the length() function.

Example: Length of a List

# Finding the length of the list

print(length(my_list)) # Output: 3

7. Converting Lists to Vectors

Sometimes you may need to convert a list to a vector. This can be done using the unlist() function, which flattens a list into a vector. Because a vector holds a single type, mixed elements are coerced to a common type (here, character).

Example: Convert List to Vector


# Converting a list to a vector
vector_from_list <- unlist(my_list)
print(vector_from_list)
Output:
    name      age  scores1  scores2  scores3
"Roshan"     "28"     "85"     "90"     "98"
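
The quoted numbers in that output come from type coercion. One way to avoid it, sketched below under the assumption that only the numeric parts are needed, is to select those elements before calling unlist(). The names all_values and numeric_only are illustrative.

```r
my_list <- list(name = "Roshan", age = 28, scores = c(85, 90, 98))

# Flattening everything forces a common type: character
all_values <- unlist(my_list)
print(class(all_values))   # "character"

# Flattening only the numeric elements keeps them numeric
numeric_only <- unlist(my_list[c("age", "scores")])
print(class(numeric_only)) # "numeric"
print(names(numeric_only)) # "age" "scores1" "scores2" "scores3"
```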

Lists are highly flexible and allow you to organize diverse data types in a single structure, making them essential for managing complex datasets in R.

Data Frames

In R, a data frame is a two-dimensional, table-like data structure that stores data in rows and columns, much like a spreadsheet or SQL table. Data frames are one of the most important and commonly used data structures in R, especially for data analysis and manipulation. Each column in a data frame can contain different types of data (numeric, character, logical, etc.), but all elements within a column must be of the same type.

Characteristics of Data Frames:

  1. Heterogeneous Columns: Each column can contain a different data type (e.g., numeric, character, or logical), but all values within a column must be of the same type.
  2. Tabular Structure: Data frames have rows and columns, where rows represent observations and columns represent variables.
  3. Flexible Size: Data frames can grow or shrink as needed by adding or removing rows or columns.


Creating Data Frames

You can create a data frame using the data.frame() function.


Syntax of data.frame() Function:

data.frame(column1 = c(values), column2 = c(values), …)


1.  Creating a Data Frame

Example: Simple Data Frame

# Creating a data frame
students_df <- data.frame(
  Name = c("Roshan", "Harish", "Aaditya"),
  Age = c(27, 22, 24),
  Scores = c(95, 90, 88)
)
print(students_df)


Output:

     Name Age Scores
1  Roshan  27     95
2  Harish  22     90
3 Aaditya  24     88

Explanation:

  • Name is a character column storing the names of the students.
  • Age is a numeric column storing the age of each student.
  • Scores is a numeric column storing the exam scores of the students.
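
The column types described above can be confirmed programmatically. The sketch below rebuilds the data frame so it runs on its own. Setting stringsAsFactors = FALSE is an addition here: it keeps Name as a character column on R versions before 4.0, where strings were converted to factors by default. The name col_types is illustrative.

```r
students_df <- data.frame(
  Name = c("Roshan", "Harish", "Aaditya"),
  Age = c(27, 22, 24),
  Scores = c(95, 90, 88),
  stringsAsFactors = FALSE  # keep Name as character on older R versions too
)

# Compact overview of dimensions and column types
str(students_df)

print(nrow(students_df)) # 3
print(ncol(students_df)) # 3

# Class of every column, returned as a named character vector
col_types <- sapply(students_df, class)
print(col_types)
```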

2.  Accessing Data in a Data Frame

You can access specific elements, rows, or columns in a data frame using square brackets [] or the $ operator for column access.

Example: Accessing Columns by Name

# Accessing the Name column
print(students_df$Name) # Output: "Roshan" "Harish" "Aaditya"


Example: Accessing Elements by Index

# Accessing the element in the 2nd row and 3rd column (Harish’s score)
print(students_df[2, 3]) # Output: 90


Example: Accessing Rows or Columns

# Accessing the first row (all columns)
print(students_df[1, ])
# Accessing the Age column (all rows)
print(students_df[, "Age"]) # Output: 27 22 24
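
Rows can also be selected by a condition, using the same logical indexing that works on vectors. The following is a sketch under the assumption that students_df is defined as in the earlier example; it is rebuilt here so the block is self-contained, and high_scorers is an illustrative name.

```r
students_df <- data.frame(
  Name = c("Roshan", "Harish", "Aaditya"),
  Age = c(27, 22, 24),
  Scores = c(95, 90, 88),
  stringsAsFactors = FALSE
)

# Keep only the rows where the condition is TRUE (note the trailing comma)
high_scorers <- students_df[students_df$Scores >= 90, ]
print(high_scorers)

# Combine a row condition with a column selection
print(students_df[students_df$Age < 25, "Name"]) # "Harish" "Aaditya"

# Select several columns by name
print(students_df[, c("Name", "Scores")])
```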

3.  Combining Data Frames

You can combine two data frames using cbind() (column-wise) and rbind() (row-wise).

Example: Column Binding Two Data Frames

# Combining two data frames column-wise
df1 <- data.frame(ID = 1:3, Age = c(27, 26, 25))
df2 <- data.frame(Name = c("Roshan", "Nilesh", "Aaditya"))
combined_df <- cbind(df1, df2)
print(combined_df)

Output:

  ID Age    Name
1  1  27  Roshan
2  2  26  Nilesh
3  3  25 Aaditya
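
Row binding with rbind() follows the same pattern, with one requirement: both data frames need the same column names. The sketch below uses two hypothetical data frames, df_a and df_b, to stack rows.

```r
df_a <- data.frame(ID = 1:2, Name = c("Roshan", "Nilesh"), stringsAsFactors = FALSE)
df_b <- data.frame(ID = 3L, Name = "Aaditya", stringsAsFactors = FALSE)

# Stack the rows of df_b underneath df_a
stacked <- rbind(df_a, df_b)
print(stacked)
print(nrow(stacked)) # 3
```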


Data frames are crucial in R for handling datasets, performing data analysis, and conducting statistical operations, making them one of the most frequently used data structures in the R programming language.

Factor

In R, a factor is a data structure used to represent categorical data. Factors are particularly useful for storing data that has a limited number of unique values, such as gender, age groups, survey responses, or any type of categorical data. Factors store both the values and the levels (categories), making them ideal for handling and analyzing categorical data in statistical modeling.

Characteristics of Factors:

  1. Categorical Data Representation: Factors store categorical variables with a fixed number of unique values called levels.
  2. Levels: Factors store the distinct categories as levels and can be ordered or unordered.
  3. Efficient Storage: Factors can be more compact than character vectors for categorical data, as they store categories as integers under the hood.
  4. Useful for Analysis: Many statistical functions in R, such as regression, treat factors as categorical variables.

Creating Factors

You can create a factor using the factor() function.

Syntax of factor() Function:

factor(x, levels = c(…), ordered = TRUE/FALSE)

Explanation

  • x: The vector containing the values to be converted to a factor.
  • levels: A vector specifying the unique categories.
  • ordered: Whether the factor is ordered (i.e., if the categories have a meaningful order).

1.  Creating a Factor

Example: Simple Factor
# Creating a factor for gender
gender <- factor(c("Male", "Female", "Female", "Male", "Female"))
print(gender)

Output:

[1] Male   Female Female Male   Female
Levels: Female Male

Explanation:

  • The factor gender has two levels: “Female” and “Male”.
  • R automatically detects the unique categories and stores them as levels, sorted alphabetically by default.
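
To see what a factor stores internally, the sketch below inspects the gender factor with a few base R functions. It recreates the factor so the block runs on its own.

```r
gender <- factor(c("Male", "Female", "Female", "Male", "Female"))

# The distinct categories, in level order
print(levels(gender))     # "Female" "Male"
print(nlevels(gender))    # 2

# The integer codes that point into the levels
print(as.integer(gender)) # 2 1 1 2 1

# Frequency of each category
print(table(gender))
```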

2.  Specifying Levels in Factors

You can explicitly specify the levels of a factor using the levels argument.

Example: Specifying Levels

# Specifying levels for education levels
education <- factor(c("High School", "College", "Masters", "PhD", "Masters"),
                    levels = c("High School", "College", "Masters", "PhD"))
print(education)

Output:

[1] High School College     Masters     PhD         Masters
Levels: High School College Masters PhD
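
One consequence of specifying levels explicitly is that any value not listed becomes NA. The sketch below uses a hypothetical sizes factor to illustrate this behaviour.

```r
# "XL" is not among the declared levels, so it is stored as NA
sizes <- factor(c("S", "M", "XL"), levels = c("S", "M", "L"))
print(sizes)

print(is.na(sizes))   # FALSE FALSE TRUE

# Declared levels are kept even if no value uses them ("L" here)
print(levels(sizes))  # "S" "M" "L"
```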

3.  Ordered Factors

For ordinal data (where categories have a meaningful order), you can create ordered factors by setting ordered = TRUE.

Example: Creating an Ordered Factor

# Creating an ordered factor for satisfaction levels
satisfaction <- factor(c("Low", "Medium", "High", "Medium", "Low"),
                       levels = c("Low", "Medium", "High"), ordered = TRUE)
print(satisfaction)

Output:

[1] Low    Medium High   Medium Low
Levels: Low < Medium < High

Explanation:

  • ordered = TRUE makes the factor levels ordered, indicating a progression from “Low” to “High”.
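
The practical benefit of an ordered factor is that comparison operators respect the level order. The following sketch recreates the satisfaction factor and shows a few such comparisons.

```r
satisfaction <- factor(c("Low", "Medium", "High", "Medium", "Low"),
                       levels = c("Low", "Medium", "High"), ordered = TRUE)

print(is.ordered(satisfaction))  # TRUE

# Comparisons follow the declared order Low < Medium < High
print(satisfaction >= "Medium")  # FALSE TRUE TRUE TRUE FALSE

# max() and min() are also defined for ordered factors
print(max(satisfaction))         # High
```

On an unordered factor, comparisons such as >= are not meaningful and return NA with a warning.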

4.  Converting Factors to Numeric and Character

You can convert a factor to character values with as.character(). Note that as.numeric() returns the underlying integer level codes, not the original labels.

Example: Converting a Factor to Numeric

# Converting satisfaction to numeric values (indices of the levels)
numeric_satisfaction <- as.numeric(satisfaction)
print(numeric_satisfaction)
# Output: 1 2 3 2 1
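
The character conversion, and a common pitfall with numeric-looking labels, can be sketched as follows. The years factor is a hypothetical example added for illustration.

```r
satisfaction <- factor(c("Low", "Medium", "High", "Medium", "Low"),
                       levels = c("Low", "Medium", "High"), ordered = TRUE)

# Converting a factor back to its labels
print(as.character(satisfaction)) # "Low" "Medium" "High" "Medium" "Low"

# Pitfall: a factor whose labels look like numbers
years <- factor(c("2021", "2019", "2021"))
print(as.numeric(years))               # 2 1 2 (level codes, not the years)

# Convert to character first to recover the actual numbers
print(as.numeric(as.character(years))) # 2021 2019 2021
```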

5.  Factors in Data Frames

Factors are often used in data frames, especially for categorical columns.


Example: Factors in a Data Frame

# Creating a data frame with factors
survey_df <- data.frame(
  Respondent = c("Roshan", "Aaditya", "Chalsee", "Harish"),
  Satisfaction = factor(c("High", "Medium", "High", "Medium"),
                        levels = c("Low", "Medium", "High"), ordered = TRUE)
)
print(survey_df)
Output:
  Respondent Satisfaction
1     Roshan         High
2    Aaditya       Medium
3    Chalsee         High
4     Harish       Medium
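
Once a column is a factor, it can be summarized and filtered by category. The sketch below rebuilds survey_df (with stringsAsFactors = FALSE added so Respondent stays a character column on older R versions); counts and satisfied are illustrative names.

```r
survey_df <- data.frame(
  Respondent = c("Roshan", "Aaditya", "Chalsee", "Harish"),
  Satisfaction = factor(c("High", "Medium", "High", "Medium"),
                        levels = c("Low", "Medium", "High"), ordered = TRUE),
  stringsAsFactors = FALSE
)

# Count responses per level; unused levels such as "Low" appear with a count of 0
counts <- table(survey_df$Satisfaction)
print(counts)

# Because the factor is ordered, rows can be filtered by comparison
satisfied <- survey_df$Respondent[survey_df$Satisfaction > "Medium"]
print(satisfied) # "Roshan" "Chalsee"
```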


Factors are essential in R for representing categorical data efficiently, and they play a significant role in data analysis, especially in summarizing, visualizing, and modeling categorical data.

Conclusion:
  • Vectors are for homogeneous one-dimensional data.
  • Lists can hold heterogeneous data, including other data structures.
  • Matrices are for two-dimensional homogeneous data.
  • Arrays are for multi-dimensional homogeneous data.
  • Data Frames are for heterogeneous tabular data.
  • Factors are for categorical variables with specific levels.

Each structure has its ideal use case, depending on the type and complexity of the data.
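
Matrices and arrays appear in the summary above but are not walked through in this guide, so here is a minimal sketch of each using base R's matrix() and array() functions. The variable names m and arr are illustrative.

```r
# A matrix: two-dimensional, homogeneous, filled column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)
print(m)
print(dim(m))   # 2 3
print(m[2, 3])  # 6 (row 2, column 3)

# An array: the same idea extended to more than two dimensions
arr <- array(1:24, dim = c(2, 3, 4))
print(dim(arr))      # 2 3 4
print(arr[1, 2, 3])  # 15
```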
