Data Manipulation and Visualization Techniques in Julia – Part IV

Contributor: QuantInsti

Learn how to create new dataframes in Part I, how to perform basic mathematical operations in Part II, and how to use the RDatasets.jl package in Part III.

Dealing with missing data

Julia has a “missing” object that represents unavailable data. You can use the skipmissing() function to perform operations that ignore the missing values.
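The original code listing is not reproduced here. As a minimal sketch, assuming DataFrames.jl is installed, a dataframe with missing values can be built and aggregated as follows. The variable names (df, total_raw, total, average) are illustrative choices, not the author's.

```julia
using DataFrames
using Statistics  # provides mean()

# Columns that contain `missing` get a Union{Missing, T} element type,
# which DataFrames displays as Int64? and String?.
df = DataFrame(a = [1, missing, 3, 7],
               b = ["Apple", "Orange", missing, "Grapes"])

# Aggregating directly over a column with missing values propagates `missing`.
total_raw = sum(df.a)

# skipmissing() returns an iterator over the non-missing values only.
total = sum(skipmissing(df.a))
average = mean(skipmissing(df.a))

println(df)
println("sum with missing values:  ", total_raw)
println("sum skipping missing:     ", total)
println("mean skipping missing:    ", average)
```

skipmissing() does not copy or modify the column; it only changes which values the aggregation sees.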

A dataframe with missing values looks like this:

a        b
Int64?   String?
1        Apple
missing  Orange
3        missing
7        Grapes

You can use the dropmissing() function to remove the rows that contain missing values.
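A short sketch of this step, assuming DataFrames.jl and the same four-row example data. The names clean and clean_a are hypothetical.

```julia
using DataFrames

df = DataFrame(a = [1, missing, 3, 7],
               b = ["Apple", "Orange", missing, "Grapes"])

# dropmissing() returns a new dataframe without any row that has a missing value.
# By default it also narrows the column types (Int64? becomes Int64).
clean = dropmissing(df)

# Passing a column name restricts the check to that column only.
clean_a = dropmissing(df, :a)

println(clean)
println(clean_a)
```

The original df is left unchanged; dropmissing!() is the in-place variant.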

a      b
Int64  String
1      Apple
7      Grapes

More details for dealing with missing values can be found here.


Importing and exporting data as CSV and Excel files

Reading data is the first step in any analysis. Most of the data we come across is in either CSV or Excel format, so we’ll focus on these two. We will use the CSV.jl and XLSX.jl packages to work with CSV and Excel files.

Reading and writing CSV files

We’ll read a CSV file (infy.csv) as a dataframe. It contains historical stock price data for Infosys, downloaded from Yahoo Finance, for the period 21-Dec-2020 to 22-Dec-2021.
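The file infy.csv itself is not included with this article, so the sketch below first writes a three-row stand-in (rows taken from the data shown in this article) to a temporary location and then reads it. It assumes CSV.jl and DataFrames.jl are installed; the temporary path and the variable names are illustrative.

```julia
using CSV
using DataFrames

# Stand-in for infy.csv: three rows from the article's data,
# written to a temporary file so the example runs on its own.
sample = """
Date,Open,High,Low,Close,Adj Close,Volume
2020-12-22,16.39,16.74,16.36,16.58,16.2664,6714400
2020-12-23,16.9,16.93,16.57,16.59,16.2762,5913500
2020-12-24,16.68,16.69,16.52,16.6,16.286,1320600
"""
path = joinpath(mktempdir(), "infy.csv")
write(path, sample)

# CSV.read parses the file and materialises it into the given sink type.
df = CSV.read(path, DataFrame)

# describe() summarises each column (mean, min, median, max, nmissing, eltype).
println(describe(df))
```

With the real file, only the path passed to CSV.read would change.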

Here’s a summary for this data.

variable   mean       min         median     max         nmissing  eltype
Symbol     Union…     Any         Union…     Any         Int64     DataType
Date                  2020-12-22             2021-12-21  0         Date
Open       20.5674    16.39       20.63      24.05       0         Float64
High       20.7164    16.69       20.775     24.5        0         Float64
Low        20.4097    16.36       20.51      23.94       0         Float64
Close      20.5685    16.58       20.725     24.22       0         Float64
Adj Close  20.3422    16.2664     20.5451    24.22       0         Float64
Volume     7.09982e6  1320600     6.43815e6  22911800    0         Int64

Here, we calculate the daily range (High minus Low) and add it as a new column:
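The values in the range column of the table are consistent with High minus Low. A minimal sketch of that calculation, assuming DataFrames.jl and using two rows of the article's data entered by hand:

```julia
using DataFrames

# Two rows from the article's data, entered by hand to keep the example self-contained.
df = DataFrame(High = [16.74, 16.93], Low = [16.36, 16.57])

# Broadcasted subtraction (.-) works element by element; assigning the result
# to df.range stores it as a new column named `range`.
df.range = df.High .- df.Low

println(df)
```

Floating-point subtraction is not exact, which is why values such as 0.379999 rather than 0.38 can appear in the output.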

Date        Open     High     Low      Close    Adj Close  Volume    range
Date        Float64  Float64  Float64  Float64  Float64    Int64     Float64
2020-12-22  16.39    16.74    16.36    16.58    16.2664    6714400   0.379999
2020-12-23  16.9     16.93    16.57    16.59    16.2762    5913500   0.36
2020-12-24  16.68    16.69    16.52    16.6     16.286     1320600   0.170001
2020-12-28  16.73    16.84    16.72    16.77    16.4528    4239300   0.120001
2020-12-29  16.9     16.9     16.67    16.76    16.443     8473700   0.23
2020-12-30  16.87    17.0     16.83    16.93    16.6098    3877200   0.17
2020-12-31  17.01    17.03    16.89    16.95    16.6294    3693700   0.140002
2021-01-04  17.39    17.43    17.06    17.25    16.9237    12597600  0.370001
2021-01-05  17.32    17.67    17.32    17.65    17.3162    8109900   0.35
2021-01-06  17.4     17.79    17.34    17.73    17.3946    9136300   0.450001
2021-01-07  17.36    17.55    17.26    17.55    17.2181    10272000  0.289999
2021-01-08  18.07    18.61    18.02    18.59    18.2384    17802400  0.590001
2021-01-11  18.68    18.86    18.55    18.76    18.4052    12220600  0.310002
2021-01-12  18.92    18.94    18.54    18.6     18.2482    10629100  0.4
2021-01-13  19.03    19.07    18.4     18.43    18.0814    18409900  0.67
2021-01-14  18.57    18.65    18.14    18.22    17.8754    13286100  0.510001
2021-01-15  18.19    18.38    18.11    18.17    17.8263    7443000   0.269998
2021-01-19  18.08    18.18    17.95    18.12    17.7773    7179600   0.229999
2021-01-20  18.37    18.47    18.29    18.4     18.052     5408500   0.179998
2021-01-21  18.39    18.4     18.15    18.2     17.8558    7963400   0.25
2021-01-22  18.23    18.27    18.06    18.18    17.8361    5663500   0.210001
2021-01-25  18.15    18.22    17.84    17.92    17.5811    6012600   0.379999
2021-01-26  17.92    17.92    17.75    17.85    17.5124    5472600   0.17
2021-01-27  17.65    17.89    17.44    17.47    17.1396    11388300  0.449998
2021-01-28  17.46    17.75    17.41    17.64    17.3064    7877600   0.34
2021-01-29  17.16    17.23    16.88    16.88    16.5607    9671400   0.350001
2021-02-01  17.19    17.42    17.05    17.38    17.0513    5829200   0.370001
2021-02-02  17.45    17.51    17.34    17.44    17.1101    4119800   0.17
2021-02-03  17.6     17.75    17.49    17.65    17.3162    4677800   0.26
2021-02-04  17.54    17.64    17.36    17.59    17.2573    4439600   0.279998

This updated dataframe can be saved using the CSV.write() function.
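A minimal sketch of saving a dataframe, assuming CSV.jl and DataFrames.jl. The output file name infy_with_range.csv is a placeholder, not a name used by the author.

```julia
using CSV
using DataFrames

# Small stand-in for the updated dataframe.
df = DataFrame(High = [16.74, 16.93], Low = [16.36, 16.57])
df.range = df.High .- df.Low

# CSV.write(file, table) writes any Tables.jl-compatible table to disk.
out = joinpath(mktempdir(), "infy_with_range.csv")
CSV.write(out, df)

# Reading the file back confirms that the new column was saved.
saved_columns = names(CSV.read(out, DataFrame))
println(saved_columns)
```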

Reading and writing Excel files

We’ll use the XLSX.jl package in Julia to read and write Excel files.

Here’s how it can be done:
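The original listing is not reproduced here. The sketch below assumes XLSX.jl version 0.8 or later, where XLSX.readtable() returns a Tables.jl-compatible object (older versions return a tuple that has to be splatted into DataFrame()). The workbook is created inside the example purely as setup; the file name infy.xlsx and the sheet name prices are placeholders.

```julia
using XLSX
using DataFrames
using Dates

# Setup only: create a small workbook to read from.
path = joinpath(mktempdir(), "infy.xlsx")
sample = DataFrame(Date = [Date(2020, 12, 22), Date(2020, 12, 23)],
                   Open = [16.39, 16.9],
                   Volume = [6714400, 5913500])
XLSX.writetable(path, sample; sheetname = "prices")

# readtable() reads one worksheet, treating its first row as column headers.
# Assumption: XLSX.jl 0.8+, so the result can be passed straight to DataFrame().
df = DataFrame(XLSX.readtable(path, "prices"))

println(df)
```

Depending on the XLSX.jl version, the columns may come back with element type Any; the infer_eltypes keyword of readtable() controls whether narrower types are inferred.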

Date        Open     High     Low      Close    Adj Close  Volume
Any         Any      Any      Any      Any      Any        Any
2020-12-22  16.39    16.74    16.36    16.58    16.2664    6714400
2020-12-23  16.9     16.93    16.57    16.59    16.2762    5913500
2020-12-24  16.68    16.69    16.52    16.6     16.286     1320600
2020-12-28  16.73    16.84    16.72    16.77    16.4528    4239300
2020-12-29  16.9     16.9     16.67    16.76    16.443     8473700
2020-12-30  16.87    17.0     16.83    16.93    16.6098    3877200
2020-12-31  17.01    17.03    16.89    16.95    16.6294    3693700
2021-01-04  17.39    17.43    17.06    17.25    16.9237    12597600
2021-01-05  17.32    17.67    17.32    17.65    17.3162    8109900
2021-01-06  17.4     17.79    17.34    17.73    17.3946    9136300
2021-01-07  17.36    17.55    17.26    17.55    17.2181    10272000
2021-01-08  18.07    18.61    18.02    18.59    18.2384    17802400
2021-01-11  18.68    18.86    18.55    18.76    18.4052    12220600
2021-01-12  18.92    18.94    18.54    18.6     18.2482    10629100
2021-01-13  19.03    19.07    18.4     18.43    18.0814    18409900
2021-01-14  18.57    18.65    18.14    18.22    17.8754    13286100
2021-01-15  18.19    18.38    18.11    18.17    17.8263    7443000
2021-01-19  18.08    18.18    17.95    18.12    17.7773    7179600
2021-01-20  18.37    18.47    18.29    18.4     18.052     5408500
2021-01-21  18.39    18.4     18.15    18.2     17.8558    7963400
2021-01-22  18.23    18.27    18.06    18.18    17.8361    5663500
2021-01-25  18.15    18.22    17.84    17.92    17.5811    6012600
2021-01-26  17.92    17.92    17.75    17.85    17.5124    5472600
2021-01-27  17.65    17.89    17.44    17.47    17.1396    11388300
2021-01-28  17.46    17.75    17.41    17.64    17.3064    7877600
2021-01-29  17.16    17.23    16.88    16.88    16.5607    9671400
2021-02-01  17.19    17.42    17.05    17.38    17.0513    5829200
2021-02-02  17.45    17.51    17.34    17.44    17.1101    4119800
2021-02-03  17.6     17.75    17.49    17.65    17.3162    4677800
2021-02-04  17.54    17.64    17.36    17.59    17.2573    4439600

We can write an Excel file using the XLSX.writetable() function.
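A short sketch of writing a dataframe to a workbook, again assuming XLSX.jl 0.8 or later, where writetable() accepts a Tables.jl table directly. The file name infy_out.xlsx and the sheet name prices are placeholders.

```julia
using XLSX
using DataFrames

df = DataFrame(High = [16.74, 16.93], Low = [16.36, 16.57])

# writetable() refuses to replace an existing file unless overwrite = true.
out = joinpath(mktempdir(), "infy_out.xlsx")
XLSX.writetable(out, df; overwrite = true, sheetname = "prices")

# List the sheet names to confirm that the workbook was written.
sheets = XLSX.sheetnames(XLSX.readxlsx(out))
println(sheets)
```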

Julia has built-in open(), read(), write() and close() functions for working with text files. More details can be found here.
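These functions are part of Julia's Base library, so no package is needed. A minimal sketch, using a placeholder file notes.txt in a temporary directory:

```julia
# Placeholder file in a temporary directory.
path = joinpath(mktempdir(), "notes.txt")

# open(path, "w") creates the file for writing; close() flushes and releases it.
io = open(path, "w")
write(io, "Date,Close\n")
write(io, "2020-12-22,16.58\n")
close(io)

# The do-block form closes the file automatically, even if an error occurs.
contents = open(path, "r") do f
    read(f, String)
end

print(contents)
println(length(readlines(path)), " lines")
```

The do-block form is usually the safer choice because the file handle cannot be left open by accident.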

Data can also be saved in the .jld format, a Julia data format provided by the JLD.jl package.


Stay tuned for the next installment, in which Anshul Tayal will present how to create scripts for data visualization.

Visit QuantInsti to read the full article: https://blog.quantinsti.com/data-manipulation-visualization-using-julia/.
