Financial Data Manipulation in dplyr for Quant Traders

In this post, we’re going to show how a quant trader can manipulate stock price data using the dplyr R package.


Getting set up and loading data

Load the dplyr package via the tidyverse package.

if (!require('tidyverse')) install.packages('tidyverse')
library(tidyverse)

First, load some price data.

energystockprices.RDS contains a data frame of daily price observations for 3 energy stocks.

prices <- readRDS('energystockprices.RDS')

Financial Data

We’ve organised our data so that:

  • Every column is a variable.
  • Every row is an observation.

In this data set:

  • We have 13,314 rows in our data frame.
  • Each row represents a daily price observation for a given stock.
  • For each observation, we measure the open, high, low and close prices, and the volume traded.

This is a very helpful way to structure your price data. We’ll see how we can use the dplyr package to manipulate price data for quant analysis.
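To make that shape concrete, here is a small made-up data frame laid out the same way. The tickers, values and column names (ticker, date, open, high, low, close, volume) are assumptions for illustration; the columns in energystockprices.RDS may be named differently.

```r
library(dplyr)

# Hypothetical toy data: one row per stock per day, one column per variable.
# Tickers, prices and column names are invented for this sketch.
toy_prices <- tibble(
  ticker = c("AAA", "AAA", "BBB", "BBB"),
  date   = as.Date(c("2020-01-02", "2020-01-03", "2020-01-02", "2020-01-03")),
  open   = c(10.0, 10.5, 20.0, 19.5),
  high   = c(10.8, 10.9, 20.4, 20.1),
  low    = c(9.9, 10.2, 19.2, 19.3),
  close  = c(10.5, 10.7, 19.5, 20.0),
  volume = c(1000, 1200, 800, 950)
)

# Print a compact overview: one line per column with its type and first values
glimpse(toy_prices)
```

Running glimpse(prices) on the real data should give a similar overview of its columns.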

The main dplyr verbs

There are 6 main functions to master in dplyr.

filter() picks out observations (rows) by some filter criteria
arrange() reorders the observations (rows)
select() picks out the variables (columns)
mutate() creates new variables (columns) by applying transformations to existing variables
summarise() reduces the data to a summary with fewer rows: one row for the whole data set, or one row per group when the data is grouped.

Finally, group_by() causes the verbs above to act on one group at a time, rather than on the whole dataset.
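As a quick taste, here is a minimal sketch of all six functions on a made-up data frame. The tickers, values and column names (ticker, date, close, volume) are assumptions for illustration, not the contents of energystockprices.RDS.

```r
library(dplyr)

# Hypothetical toy data: two invented tickers, three trading days each
toy_prices <- tibble(
  ticker = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  date   = as.Date(rep(c("2020-01-02", "2020-01-03", "2020-01-06"), times = 2)),
  close  = c(10.0, 10.5, 10.29, 20.0, 19.0, 19.95),
  volume = c(1000, 1200, 900, 800, 950, 700)
)

# filter(): keep only the rows for one ticker
aaa <- toy_prices %>% filter(ticker == "AAA")

# arrange(): sort rows from highest to lowest volume
by_volume <- toy_prices %>% arrange(desc(volume))

# select(): keep a subset of the columns
closes <- toy_prices %>% select(ticker, date, close)

# group_by() + mutate(): simple daily returns, computed within each ticker
# so that one stock's first return never uses another stock's last price
returns <- toy_prices %>%
  group_by(ticker) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(ret = close / lag(close) - 1) %>%
  ungroup()

# group_by() + summarise(): collapse to one row per ticker
summary_tbl <- returns %>%
  group_by(ticker) %>%
  summarise(mean_ret = mean(ret, na.rm = TRUE), total_volume = sum(volume))
```

The first return for each ticker is NA because there is no previous close to compare against, which is why mean() is called with na.rm = TRUE.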

In the next installment, we’ll go through each of these 6 functions in turn.

