Working with data, which often means a large number of variables or a large number of observations, regularly requires adding and removing variables.

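In this article, we’ll discuss four ways to add a new variable (column) to a data frame in R:

  • Assigning a new column with the $ operator
  • Using mutate() from the dplyr package
  • Using the data.frame() command
  • Using the cbind() command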

First, let’s define the dataset we’ll work with. We call set.seed() with the value 1000 so that the same dataset can be regenerated on every run:

set.seed(1000)
# Categorical variables, 1000 observations each
Gender <- sample(x = c("Male", "Female"), size = 1000, replace = TRUE)
City <- sample(x = c("NY", "LA", "Boston", "Chicago", "Denver", "London", "Paris", "Rome"),
               size = 1000, replace = TRUE)
Fav_Pet <- sample(x = c("Cat", "Dog", "Fish", "Squirrel"),
                  size = 1000, replace = TRUE)
Salary_yearly <- sample(x = seq(55000, 150000), size = 1000, replace = TRUE)

# Weight and height are drawn from a different range for each gender
Weight_kg <- ifelse(Gender == "Male", sample(x = seq(80, 120), size = 1000, replace = TRUE),
                    sample(x = seq(45, 70), size = 1000, replace = TRUE))
Height_m <- ifelse(Gender == "Male", sample(x = seq(175, 210), size = 1000, replace = TRUE) / 100,
                   sample(x = seq(155, 180), size = 1000, replace = TRUE) / 100)

data <- data.frame(Gender, City, Fav_Pet, Height_m, Weight_kg, Salary_yearly)

Let’s say we want to add the variable sweet. This is how we’ll generate it:

sweet <- sample(x = c("cookie", "ice cream", "candy"), size = 1000, replace = TRUE)

Assigning a new column to an existing data frame:

With this method, all we need to do is state the name of the data frame, followed by the “$” operator and the name of the new variable, and assign the vector to it.

Here is the code:

data$sweet <- sweet
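
As a minimal sketch of the same idea on a small stand-in data frame (the names df, snack and new_name are hypothetical and not part of the article’s dataset), the example below also shows the “[[” form, which does the same job when the column name is held in a character string:

```r
# Small stand-in data frame (hypothetical, not the article's dataset)
df <- data.frame(Gender = c("Male", "Female", "Female"),
                 Weight_kg = c(85, 60, 52))
snack <- c("cookie", "candy", "ice cream")

# Add the vector as a new column; it holds one value per row of df
df$snack <- snack

# Equivalent bracket form, useful when the column name is stored in a string
new_name <- "snack_copy"
df[[new_name]] <- snack

names(df)
```

In both forms the new column is appended after the existing ones.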

The mutate() command:

With this method, we create the new variable using the mutate() function from the dplyr package. This is how it’s done:

library(dplyr)  # provides mutate() and the %>% pipe
data <- data %>% mutate(sweet = sweet)
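
mutate() can also build a new column from columns that already exist, since it evaluates its expressions inside the data frame. The sketch below assumes dplyr is installed and uses a hypothetical BMI column (weight divided by height squared) on a two-row stand-in data frame. It is an illustration only, not something the dataset above requires:

```r
library(dplyr)

# Two-row stand-in data frame (hypothetical values)
df <- data.frame(Weight_kg = c(85, 60), Height_m = c(1.80, 1.65))

# Hypothetical derived column: body mass index = weight / height^2.
# Column names are used directly, without the df$ prefix.
df <- df %>% mutate(BMI = Weight_kg / Height_m^2)

names(df)
```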

The data.frame() command:

The data.frame() function from base R is designed to wrap a set of vectors into a data frame, to join two or more data frames side by side, or to attach one or more vectors to one or more existing data frames.

In this case, this is the command:

data <- data.frame(data, sweet)
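
One point to watch: data.frame() does not overwrite a column that already exists. With its default check.names = TRUE it makes the names unique instead, so if sweet was already added by one of the earlier methods, a second copy appears under a new name. A minimal sketch on a hypothetical three-row data frame:

```r
# Stand-in data frame and vector (hypothetical values)
df <- data.frame(id = 1:3)
sweet <- c("cookie", "candy", "ice cream")

df <- data.frame(df, sweet)   # first call adds the column "sweet"
df <- data.frame(df, sweet)   # second call adds a copy named "sweet.1"

names(df)

# The new column can also be given an explicit name
named <- data.frame(id = 1:3, Favorite_sweet = sweet)
```

To try the four methods one after another on the same data, it is therefore simplest to rebuild data before each one.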

The cbind() command:

The cbind() command also belongs to base R. It binds vectors, tables, data frames and matrices together column-wise. Here is the code:

data <- cbind(data, sweet)

Note that with both cbind() and data.frame(), the order of the columns in the result follows the order of the arguments, so we can control where the new variable appears simply by changing the order in which we pass the vectors, data frames and matrices to the function.
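
The effect of argument order can be sketched as follows, using a small hypothetical data frame (df, last and first are illustrative names only):

```r
# Stand-in data frame and vector (hypothetical values)
df <- data.frame(id = 1:3, City = c("NY", "LA", "Rome"))
sweet <- c("cookie", "candy", "ice cream")

last  <- cbind(df, sweet)   # new column placed after the existing ones
first <- cbind(sweet, df)   # new column placed before the existing ones

names(last)
names(first)
```

Swapping cbind() for data.frame() in the two calls gives the same column order.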