Traditionally, there have been two general ways of analyzing market data:
- fundamental analysis – focused on underlying fundamental data
- technical analysis – focused on charts and price movements
In recent years, computer science and mathematics have revolutionized trading, which is now dominated by computers that help analyze the vast amounts of available data. Algorithms make trading decisions faster than any human being could. Machine learning and data mining techniques are growing in popularity, and all of this falls under one broad category called ‘quantitative trading’ or ‘algorithmic trading’.
Below, I intend to provide you with basic tools for handling and analyzing market data, with the aim of generating profit from buying and selling financial instruments.
Python Programming Language
Currently, among the hottest programming languages for finance, you’ll find R and Python, alongside languages such as C++, C# and Java. I think Python or R is the right choice for many traders today. In this post, I assume you’re more or less starting from scratch or with very basic knowledge of Python, which by the way is one of the more approachable languages.
It is good to get a feel for general Python programming before moving on to its application to trading. There are a number of books and tutorials, most of them available free or almost free:
- A Byte of Python by Swaroop C H
- Python for Everybody by Prof. Charles Severance
- Python Programming by Wikibooks
- Think Python: How to Think Like a Computer Scientist by Allen Downey
- Dive Into Python 3 by Mark Pilgrim
All code examples below are for Python 3.5 with the Anaconda distribution, available at www.continuum.io
The whole point of trading is to predict, with a certain probability, how the market will behave in the future and to take advantage of that. Very often it can be as simple as going ‘long’ when expecting market prices to rise or going ‘short’ when expecting them to drop.
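To make the long/short idea concrete, here is a minimal sketch of the profit or loss on a single trade. The function name `position_pnl` and all prices are made up for illustration, and fees, financing costs and slippage are deliberately ignored.

```python
def position_pnl(entry_price, exit_price, quantity, side):
    """Profit or loss of one trade, ignoring fees and financing costs.

    side is 'long' (buy first, sell later) or 'short' (sell first, buy back later).
    """
    if side == "long":
        direction = 1
    elif side == "short":
        direction = -1
    else:
        raise ValueError("side must be 'long' or 'short'")
    return direction * (exit_price - entry_price) * quantity


# Hypothetical scenario: the price rises from 100 to 110 and we trade 10 units.
long_pnl = position_pnl(100.0, 110.0, 10, "long")    # a long position gains
short_pnl = position_pnl(100.0, 110.0, 10, "short")  # a short position loses
print(long_pnl, short_pnl)  # 100.0 -100.0
```

The same price move produces mirror-image results, which is why getting the direction of the forecast right matters more than anything else in this simple setting.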
Forming our ‘view’ on the market, that is, our expectations about future price changes, usually takes some kind of market data analysis. To do that, we need data first.
Data Import
There are many ways to import data into Python. One of the most common is the pandas-datareader package (needed from pandas 0.19 on), which can import data from multiple sources such as Yahoo Finance, Google Finance, Quandl, the World Bank or the OECD.
The easiest way to install it is:
pip install pandas-datareader
or
conda install -c anaconda pandas-datareader
Once pandas-datareader is installed, getting historical data takes only a few inputs: the start and end dates of the period we need, the ticker symbol and the data source.
Something like this:
import pandas_datareader.data as web
import datetime
start = datetime.datetime(2000, 1, 1)
end = datetime.datetime(2017, 1, 1)
data = web.DataReader('AAPL', 'yahoo', start, end)
Let’s check what we have:
print(data.tail())
Yahoo Finance gives back:
- Date – quotation date
- Open – open price
- High – highest price for the day
- Low – lowest price of the day
- Close – close price
- Adj Close – close price adjusted for events such as stock splits or dividends
- Volume – trade volume for the day
Let’s check Google Finance. As above, we need Apple data for the period from 01/01/2000 to 01/01/2017.
import pandas_datareader.data as web
import datetime
start = datetime.datetime(2000, 1, 1)
end = datetime.datetime(2017, 1, 1)
data = web.DataReader("AAPL", 'google', start, end)
print(data.tail())
We get exactly the same set of columns, except for the adjusted close (Adj Close) value, so adjustments for stock splits and dividends are not included.
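To see why the missing adjustment matters, here is a minimal sketch with made-up prices and a hypothetical 2-for-1 split. It only illustrates the idea of back-adjusting for a split; it is not the exact procedure a data vendor uses, and dividends are left out.

```python
import pandas as pd

# Made-up closing prices. A hypothetical 2-for-1 split takes effect on
# 2020-01-03, so the raw price roughly halves overnight.
close = pd.Series(
    [100.0, 102.0, 51.5, 52.0],
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-06"]),
    name="Close",
)
split_date = pd.Timestamp("2020-01-03")
split_ratio = 2.0

# Back-adjust: keep prices from the split date on, divide earlier ones by the ratio.
adj_close = close.where(close.index >= split_date, close / split_ratio)

raw_ret = close.pct_change()      # shows a fake drop of about 50% on the split date
adj_ret = adj_close.pct_change()  # shows the real move of about 1%

print(pd.DataFrame({"Close": close, "Adj Close": adj_close,
                    "raw return": raw_ret, "adjusted return": adj_ret}))
```

Any analysis based on unadjusted prices would treat the split as a crash, which is why the adjusted series is used in the rest of this post.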
Another very valuable source of financial/economic data can be Quandl.com.
import pandas_datareader.data as web
symbol = 'WIKI/AAPL'  # or 'AAPL.US'
data = web.DataReader(symbol, 'quandl', "2000-01-01", "2017-01-01")
print(data.tail())
Again, just like Yahoo Finance, Quandl delivers adjusted close information. There are plenty of other sources accessible via pandas-datareader; for more detailed information and examples, please refer to the pandas-datareader documentation.
pandas-datareader is very useful and offers plenty of options, but it is not the only solution. You can also use libraries such as quandl.
Running this line of code installs the package:
pip install quandl
Getting data is very similar to Pandas-datareader:
import quandl
data = quandl.get("WIKI/AAPL", start_date="2000-10-01", end_date="2017-01-01")
print(data.tail())
After downloading the data, it is always useful to save a local copy for further work, because sending an online query every time the data is needed can be very time-consuming for larger data sets, e.g. 100 tickers.
import pandas_datareader.data as web
import datetime
start = datetime.datetime(2000, 1, 1)
end = datetime.datetime(2017, 1, 1)
data = web.DataReader('AAPL', 'yahoo', start, end)
data.to_csv('data.csv')
Once we have our own copy of the data, we can quickly read it back in using pandas:
import pandas as pd
data = pd.read_csv('data.csv', index_col='Date', parse_dates=True)
print(data.tail())
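The save-then-reload steps can be folded into one small helper that only goes online when no local copy exists. This is a sketch under assumptions: `load_or_fetch` is a hypothetical name, and the download is replaced by a stand-in function with made-up prices so that the example runs without a network connection.

```python
import os
import tempfile

import pandas as pd


def load_or_fetch(path, fetch):
    """Return the cached CSV at `path` if it exists, else call `fetch()` and cache it.

    `fetch` is any zero-argument callable returning a date-indexed DataFrame,
    e.g. lambda: web.DataReader('AAPL', 'yahoo', start, end).
    """
    if os.path.exists(path):
        return pd.read_csv(path, index_col="Date", parse_dates=True)
    data = fetch()
    data.to_csv(path, index_label="Date")
    return data


calls = []  # records how many times the "download" actually ran


def fake_fetch():
    # Stand-in for a real download; the prices are made up.
    calls.append(1)
    idx = pd.to_datetime(["2016-12-29", "2016-12-30"])
    return pd.DataFrame({"Adj Close": [116.7, 115.8]}, index=idx)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    first = load_or_fetch(path, fake_fetch)   # no file yet: fetches and saves
    second = load_or_fetch(path, fake_fetch)  # file exists: reads the local copy

print(len(calls))  # 1
print(second)
```

With many tickers, one file per ticker keeps each query independent, so a failed download only has to be repeated for that one symbol.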
Working with Data
With all the data saved, we can start looking at it in more detail. For the purposes of this intro, we will use Adjusted Close values only. Let’s select that column from the whole dataset:
AdjClose = data['Adj Close']
print(AdjClose.tail())
The next useful step is to plot the series, which we can do directly with pandas:
AdjClose.plot()
As we are about to use quantitative methods, let’s see some statistics about the Adjusted Close price values:
AdjClose.describe()
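For reference, this is what `describe()` reports for a numeric series. The sketch uses a tiny made-up series instead of the downloaded data so the numbers are easy to verify by hand.

```python
import pandas as pd

# Made-up prices standing in for the Adjusted Close column.
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="Adj Close")

stats = prices.describe()
print(stats)

# Individual statistics can be picked out by label.
print(stats["mean"], stats["50%"])  # 12.0 12.0
```

The result contains the count, mean, standard deviation, minimum, the 25%, 50% and 75% quantiles, and the maximum.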
For a variety of reasons that are out of scope for this text, it is better to work with daily returns rather than nominal prices of financial instruments. There is a very simple way to compute returns using pandas:
Rets = AdjClose.pct_change(1)
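Under the hood, the simple return for day t is r_t = P_t / P_(t-1) - 1. The sketch below checks that `pct_change(1)` matches this formula written out by hand; the prices are made up.

```python
import pandas as pd

# Made-up prices: up 10% on the second day, then down 10% on the third.
prices = pd.Series([100.0, 110.0, 99.0])

rets = prices.pct_change(1)

# The same calculation by hand: today's price over yesterday's, minus one.
manual = prices / prices.shift(1) - 1

# The first value is NaN because there is no previous price to compare with.
print(pd.DataFrame({"price": prices, "pct_change": rets, "manual": manual}))
```

Note that a 10% gain followed by a 10% loss leaves the price at 99, not 100, which is one of the reasons log returns are sometimes preferred.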
We can also plot returns:
Rets.plot()
Sometimes, instead of simple returns, we may prefer to use log returns, which are easy to compute using NumPy:
import numpy as np
LogRets = np.log(AdjClose.pct_change() + 1)
Let’s plot log returns:
LogRets.plot()
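One practical property of log returns is that they add up over time: the sum of daily log returns equals the log of the final price over the initial price. The following sketch verifies this on a small made-up price series and also shows the equivalent direct formula log(P_t / P_(t-1)).

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration.
prices = pd.Series([100.0, 110.0, 99.0, 105.0])

# Log returns computed from simple returns, as in the text.
log_rets = np.log(prices.pct_change() + 1)

# Equivalent direct form: log of the ratio of consecutive prices.
log_rets_direct = np.log(prices / prices.shift(1))

# Summing daily log returns (the leading NaN is skipped by default)
# gives the log return over the whole period.
total_log_ret = log_rets.sum()
whole_period = np.log(prices.iloc[-1] / prices.iloc[0])

print(total_log_ret, whole_period)
```

Simple returns do not have this property; they have to be compounded by multiplying (1 + r) terms together.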