Trading with Python Intro – Data Import

Traditionally, there have been two general ways of analyzing market data:

  • fundamental analysis – focused on underlying fundamental data
  • technical analysis – focused on charts and price movements

In recent years, computer science and mathematics have revolutionized trading, which is now dominated by computers that help analyze vast amounts of available data. Algorithms make trading decisions faster than any human being could. Machine learning and data mining techniques are growing in popularity, and all of this falls under one broad category called ‘quantitative trading’ or ‘algorithmic trading’.

Below, I intend to provide you with basic tools for handling and analyzing market data, with the aim of generating profit from buying and selling financial instruments.

Python Programming Language

Currently, among the hottest programming languages for finance, you’ll find R and Python, alongside languages such as C++, C# and Java. I think Python or R is the right choice for many traders today. In this post, I assume you’re more or less starting from scratch or with very basic knowledge of Python, which by the way is one of the more approachable languages.

It is good to get a feel for general Python programming before applying it to trading. There are a number of books and tutorials, most of them available free or almost free:

  1. A Byte of Python by Swaroop C H
  2. Python for Everybody by Prof. Charles Severance
  3. Python Programming by Wikibooks
  4. Think Python: How to Think Like a Computer Scientist by Allen Downey
  5. Dive Into Python 3 by Mark Pilgrim

All code examples below are for Python 3.5 with the Anaconda distribution, available at www.continuum.io

The whole point of trading is to predict, with a certain probability, how the market will behave in the future and to take advantage of that. Very often it can be as simple as going ‘long’ when expecting market prices to rise or going ‘short’ when expecting them to drop.
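To make the long/short idea concrete, here is a minimal sketch of how profit or loss could be computed for each kind of position. The function name position_pnl and the prices are illustrative assumptions, and real-world costs such as commissions or borrowing fees are ignored.

```python
def position_pnl(entry_price, exit_price, quantity, side):
    """Profit or loss of a simple position, ignoring fees and financing costs.

    side is 'long' (profits when the price rises) or
    'short' (profits when the price falls).
    """
    if side == 'long':
        return (exit_price - entry_price) * quantity
    if side == 'short':
        return (entry_price - exit_price) * quantity
    raise ValueError("side must be 'long' or 'short'")


# Hypothetical trade: open 10 shares at 100 and close at 110.
long_pnl = position_pnl(100.0, 110.0, 10, 'long')    # price rose, long gains
short_pnl = position_pnl(100.0, 110.0, 10, 'short')  # price rose, short loses
print(long_pnl, short_pnl)
```

The same price move produces opposite results for the two sides, which is why the direction of our ‘view’ matters as much as its size.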

Defining our ‘view’ on the market, or our expectations about future price changes, usually takes some kind of market data analysis. To do that, we need data first.

Data Import

There are many ways to import data into Python. One of the most common is the pandas-datareader package (starting with pandas 0.19), which can import data from multiple sources such as Yahoo Finance, Google Finance, Quandl, World Bank or OECD.

The easiest way to install it is:

pip install pandas-datareader

or

conda install -c anaconda pandas-datareader

Once pandas-datareader is installed, getting historical data takes only a few inputs: the start and end dates of the period we need data for, the ticker symbol and the data source.

Something like this:

import pandas_datareader.data as web
import datetime
start = datetime.datetime(2000, 1, 1)
end = datetime.datetime(2017, 1, 1)
data = web.DataReader('AAPL', 'yahoo', start, end)

Let’s check what we have:

print(data.tail())

Yahoo Finance gives back:

  • Date – quotation date
  • Open – open price
  • High – highest price for the day
  • Low – lowest price of the day
  • Close – close price
  • Adj Close – close price adjusted for events such as stock splits or dividends
  • Volume – trade volume for the day

Let’s check Google Finance. As above, we need Apple data for the period from 01/01/2000 to 01/01/2017.

import pandas_datareader.data as web
import datetime
start = datetime.datetime(2000, 1, 1)
end = datetime.datetime(2017, 1, 1)
data = web.DataReader('AAPL', 'google', start, end)
print(data.tail())

We get the same set of data, except for the adjusted close value, so adjustments for stock splits or dividends are not included.
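Why the adjusted close matters can be sketched with made-up numbers. The example below assumes a hypothetical 2-for-1 split and a simple back-adjustment (dividing earlier prices by the split ratio). It is an illustration of the idea only, not the exact method any data provider uses, and it leaves dividends out.

```python
import pandas as pd

# Hypothetical closing prices around a 2-for-1 split effective on the third day.
close = pd.Series([100.0, 102.0, 51.5, 52.0],
                  index=pd.date_range('2017-01-02', periods=4, freq='B'))
split_ratio = 2.0
split_date = close.index[2]

# Back-adjust: divide every price before the split date by the split ratio.
adj_close = close.copy()
before_split = adj_close.index < split_date
adj_close.loc[before_split] = adj_close.loc[before_split] / split_ratio

# Daily returns from raw and adjusted prices.
raw_returns = close.pct_change()
adj_returns = adj_close.pct_change()
print(raw_returns.round(4))
print(adj_returns.round(4))
```

On the split day the raw series shows a drop of roughly 50% that no shareholder actually experienced, while the adjusted series shows a move of about 1%.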

Another very valuable source of financial/economic data can be Quandl.com.

import pandas_datareader.data as web
symbol = 'WIKI/AAPL' # or 'AAPL.US'
data = web.DataReader(symbol, 'quandl', "2000-01-01", "2017-01-01")
print(data.tail())

Again, just like Yahoo Finance, Quandl delivers adjusted close information. There are plenty of other sources accessible via pandas-datareader; for more detailed information and examples, please see:

https://pandas-datareader.readthedocs.io/en/latest

Pandas-datareader is very useful and offers plenty of options, but it is not the only solution. You can also use other libraries, such as quandl.

Running this line of code installs the package:

pip install quandl

Getting data is very similar to Pandas-datareader:

import quandl 
data = quandl.get("WIKI/AAPL", start_date="2000-10-01", end_date="2017-01-01")
print(data.tail())

After downloading the data, it is always useful to save a local copy for further work, as running an online query every time the data is needed may be very time consuming for larger data sets, e.g. 100 tickers.

import pandas_datareader.data as web
import datetime
start = datetime.datetime(2000, 1, 1)
end = datetime.datetime(2017, 1, 1)
data = web.DataReader('AAPL', 'yahoo', start, end)
data.to_csv('data.csv')

Once we have our own copy of the data, we can quickly read it back in using pandas:

import pandas as pd
data = pd.read_csv('data.csv', index_col='Date', parse_dates=True)
print(data.tail())
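The save-then-read pattern above can be wrapped into a small helper that only goes online when no local copy exists. This is a sketch under assumptions: load_or_download and fake_download are hypothetical names, one CSV file is kept per ticker, and the downloader is passed in as a function so the example runs offline. In practice that function could be a thin wrapper around web.DataReader.

```python
import os
import tempfile

import pandas as pd


def load_or_download(ticker, download, cache_dir):
    """Return price data for ticker, using a local CSV copy when one exists.

    download is any function that takes a ticker and returns a DataFrame
    indexed by date.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, ticker + '.csv')
    if os.path.exists(path):
        # Local copy found: skip the online query entirely.
        return pd.read_csv(path, index_col='Date', parse_dates=True)
    data = download(ticker)
    data.index.name = 'Date'
    data.to_csv(path)
    return data


def fake_download(ticker):
    """Stand-in downloader returning made-up prices (no network needed)."""
    index = pd.date_range('2017-01-02', periods=3, freq='B', name='Date')
    return pd.DataFrame({'Adj Close': [10.0, 10.5, 10.25]}, index=index)


cache_dir = tempfile.mkdtemp()  # temporary folder used only for this demo
first = load_or_download('TEST', fake_download, cache_dir)   # downloads, writes TEST.csv
second = load_or_download('TEST', fake_download, cache_dir)  # reads TEST.csv back
print(second.tail())
```

Note that a symbol containing a slash, such as 'WIKI/AAPL', would need to be turned into a safe file name first.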

Working with Data

With all the data saved, we can start looking at it in more detail. For the purpose of this intro, we will use Adjusted Close values only. Let’s select that column from the whole dataset:

AdjClose = data['Adj Close']
print(AdjClose.tail())

The next useful thing for trading is to plot it, which we can do using pandas:

AdjClose.plot()

As we are about to use quantitative methods, let’s see some statistics about the Adjusted Close price values:

AdjClose.describe()
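To show what describe() returns, here is a small sketch on a made-up price series. The values are assumptions chosen so the statistics are easy to check by hand. When running a script rather than an interactive session, the result needs an explicit print to be displayed.

```python
import pandas as pd

# Made-up prices; with real data this would be the AdjClose series.
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name='Adj Close')

# describe() returns a Series with count, mean, std, min, quartiles and max.
stats = prices.describe()
print(stats)

# Individual statistics can be read by label.
print(stats['mean'], stats['50%'])
```

The '50%' entry is the median, which is less sensitive to extreme values than the mean.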

For a variety of reasons that are out of scope for this text, it is better to work with daily returns rather than nominal prices of financial instruments. There is a very simple way to calculate returns using pandas:

Rets = AdjClose.pct_change(1)
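What pct_change(1) computes is the simple return, today's price divided by yesterday's price minus one. A minimal sketch on made-up prices (the numbers are assumptions for illustration) shows that the manual formula gives the same result:

```python
import pandas as pd

# Made-up prices: up 10% on the second day, then down 10% on the third.
prices = pd.Series([100.0, 110.0, 99.0])

rets = prices.pct_change(1)
manual = prices / prices.shift(1) - 1  # same calculation written out

print(rets)
print(manual)

# The first value is NaN because there is no previous price to compare with.
clean_rets = rets.dropna()
```

Dropping the leading NaN, as in clean_rets, is often convenient before computing further statistics.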

We can also plot returns:

Rets.plot()

Sometimes, instead of simple returns, we may prefer to use log returns, which are easy to compute using NumPy:

import numpy as np
LogRets = np.log(AdjClose.pct_change()+1)
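One commonly cited convenience of log returns is that they add up over time. The sketch below uses made-up prices (an assumption for illustration) to check that the sum of daily log returns equals the log of the total price change, and that the formula above matches the direct definition log(P_t / P_{t-1}).

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration.
prices = pd.Series([100.0, 110.0, 99.0, 105.0])

# Direct definition and the pct_change-based form used above.
log_rets = np.log(prices / prices.shift(1))
log_rets_alt = np.log(prices.pct_change() + 1)

# Summing daily log returns (the leading NaN is skipped) gives the
# log return over the whole period.
total_log_ret = log_rets.sum()
whole_period = np.log(prices.iloc[-1] / prices.iloc[0])
print(total_log_ret, whole_period)
```

Simple returns do not have this property: they have to be compounded by multiplication rather than summed.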

Let’s plot log returns:

LogRets.plot()

In the next part of this introduction, we will move on to coding our first trading strategy.

If you want to look for more information, check out some of the free online courses available at coursera.org, edx.org or udemy.com.
