Introduction to R
Programming Basics in R
Concepts
Data frame: A collection of columns containing data, similar to a spreadsheet or SQL table.
Data structure: A format for organizing and storing data.
Argument: Information needed by a function in R in order to run.
Variable: A representation of a value in R that can be stored for later use.
Vector: A group of data elements of the same type stored in a one-dimensional sequence in R.
Factor: An object that stores categorical data where the data values are limited and usually based on a finite group, such as country or year.
Function: A body of reusable code for performing specific tasks in R.
Nested: Code that performs a particular function and is contained within code that performs a broader function.
Nested function: A function that is completely contained within another function.
List: A vector whose elements can be of any type.
Matrix: A two-dimensional collection of data elements of the same type, arranged in rows and columns.
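To make the terms above concrete, here is a minimal sketch in base R. All names and values (scores, country, record, grid, results, rounded_mean) are made up for illustration and do not come from any particular dataset.

```r
# Vector: elements of one type in a one-dimensional sequence
scores <- c(88, 92, 79)

# Factor: categorical data limited to a finite set of levels
country <- factor(c("US", "FR", "US"))

# List: elements may be of different types
record <- list(name = "Ana", age = 31, passed = TRUE)

# Matrix: two-dimensional, with rows and columns
grid <- matrix(1:6, nrow = 2, ncol = 3)

# Data frame: columns of data, similar to a spreadsheet
results <- data.frame(country = country, score = scores)

# Function: reusable code; x and digits are its arguments
rounded_mean <- function(x, digits = 1) {
  # Nested call: mean() runs inside round()
  round(mean(x), digits)
}

# Variable: store the result for later use
avg_score <- rounded_mean(scores)
print(avg_score)  # prints [1] 86.3
```

Running str(results) or class(country) afterwards is a quick way to check which data structure an object actually is.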
Operators
Assignment operator: An operator used to assign values to variables and vectors, such as <-.
Arithmetic operator: An operator used to perform basic math operations such as addition, subtraction, multiplication, and division.
+ (addition), - (subtraction), * (multiplication), / (division)
Relational operator: An operator used to compare values, also known as a comparator.
Logical operator: An operator that returns a logical data type.
& (and), | (or), ! (not)
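The operator types defined so far can be tried together in a short sketch. The variable names and numbers below are hypothetical and chosen only to keep the arithmetic easy to follow.

```r
# Assignment: store values in variables
price <- 20
quantity <- 3

# Arithmetic: basic math on the stored values
total <- price * quantity        # 60
unit_check <- total / quantity   # 20

# Relational: compare values; the result is TRUE or FALSE
is_large <- total > 50           # TRUE
is_equal <- price == quantity    # FALSE

# Logical: combine or negate logical values
both <- is_large & is_equal      # FALSE (and)
either <- is_large | is_equal    # TRUE  (or)
negated <- !is_equal             # TRUE  (not)

print(c(both, either, negated))
```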
Conditional statement: A declaration that if a certain condition holds, then a certain event must take place.
If/Else if condition:
if (condition) {
  operation_1
} else if (condition) {
  operation_2
} else {
  operation_3
}
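As a concrete instance of the template above, the following sketch fills in the conditions and operations. The variable temperature, its thresholds, and the labels are all hypothetical.

```r
temperature <- 18  # hypothetical reading

if (temperature > 25) {
  label <- "warm"
} else if (temperature > 10) {
  label <- "mild"
} else {
  label <- "cold"
}

print(label)  # prints [1] "mild"
```

R checks the conditions from top to bottom and runs only the first branch whose condition is TRUE, so the order of the checks matters.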
Pipe: A tool in R for expressing a sequence of multiple operations, represented with %>%.
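The pipe is easiest to see next to the nested code it replaces. This sketch assumes the magrittr package, which provides %>%, is already installed; the values vector is made up.

```r
library(magrittr)  # provides %>%; assumed to be installed

values <- c(4, 9, 16)

# Nested form: read from the inside out
nested_result <- round(mean(sqrt(values)), 1)

# Piped form: read from left to right, one step at a time
piped_result <- values %>%
  sqrt() %>%
  mean() %>%
  round(1)

identical(nested_result, piped_result)  # TRUE
```

Each %>% passes the result on its left as the first argument of the function on its right, so both forms do the same work.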
Packages in R
CRAN (Comprehensive R Archive Network): An online archive with R packages, source code, manuals, and documentation.
Package: A unit of reproducible R code.
Library: A directory containing all of a data analyst’s installed packages.
Vignette: Documentation for an R package that describes the problem the package is designed to solve, explains how its functions can be used, and lists any dependencies on other packages.
Tidyverse: A system of packages in R with a common design philosophy for data manipulation, exploration, and visualization.
readr: An R package in Tidyverse used for importing data.
tidyr: An R package in Tidyverse used for data cleaning to make tidy data.
dplyr: An R package in Tidyverse that offers a consistent set of functions to complete common data-manipulation tasks.
ggplot2: An R package in Tidyverse that creates a variety of data visualizations by applying different visual properties to the data variables in R.
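A typical workflow with these packages might look like the sketch below. It assumes dplyr and ggplot2 are already installed from CRAN, and it uses mtcars, a dataset that ships with R. The mpg > 25 cutoff and the names efficient_cars and mpg_plot are arbitrary choices for illustration.

```r
# install.packages("tidyverse")  # one-time download from CRAN, if needed
library(dplyr)     # load the package from the library
library(ggplot2)

# browseVignettes("dplyr")  # opens the package's vignettes in a browser

# dplyr: keep fuel-efficient cars and a few columns
efficient_cars <- mtcars %>%
  filter(mpg > 25) %>%
  select(mpg, wt, hp)

# ggplot2: map weight and mileage to the x and y axes
mpg_plot <- ggplot(efficient_cars, aes(x = wt, y = mpg)) +
  geom_point()

print(mpg_plot)
```

Loading library(tidyverse) instead would attach dplyr, ggplot2, readr, and tidyr in a single call.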