Project By:

Atish Davda - Email

Parshant Mittal - Email

Faculty Advisor

Michael Kearns - Email - Website





Movements in financial markets are directly influenced by information exchange: between a company and its owners, between the government and its citizens, and between one individual and another. The channels for distributing news have expanded from the single ticker tape in the middle of town to intra-minute delivery to the computer via RSS feeds. With information quickly available, markets are becoming increasingly efficient, as humans design intricate algorithms to continuously take advantage of any perceived mispricing (Kelly, 2007). This phenomenon, which is especially prevalent in the stock market, raises the question: is there still an active need for the human element? After all, machines are faster; given more information and better hardware, their computational power decidedly exceeds that of humans. The answer lies in the challenge of abstraction: deciding the impact of each piece of information is important, and more is not always better (Greenwald, Jennings, & Stone, 2003).

In this project we explored the field of natural language processing (NLP) and identified methods we can use to automate stock trading based on news articles. The project was implemented in three phases (see Appendix 1). The first phase was data collection from sources on the web. News articles and headlines were scraped from Yahoo! Finance; historical market data was collected from Google Finance. The data was collected for 600 small market cap stocks (SML), 400 medium market cap stocks (MID), and 500 stocks from the S&P 500 index (SP500). The second phase was sentiment analysis on the first half of the dataset, in order to compute sentiments to be tested on the out-of-sample second half. Next, we implemented an NLP approach to quantifying the headlines, using a number of NLP packages available online, including the Stanford Lex Parser, WordNet, and General Inquirer. The last stage of the project consisted of developing a trading module that combines the results of the historical market, sentiment, and NLP analyses to give a Buy, Sell, or Hold recommendation for each security under consideration. Using sentiment and NLP analysis we were able to achieve significantly improved returns: we averaged a return of 4.0% over a two-month period (27% annualized), while the market fell 8.7% during the same period (-42.1% annualized). With the help of this and other metrics, we explored the value of NLP in automated trading.
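The annualized figures quoted above are consistent with compounding the two-month return over the six two-month periods in a year. The following is a minimal sketch of that arithmetic, not the project's own code; the function name `annualize` and the assumption of geometric compounding are ours.

```python
def annualize(period_return: float, periods_per_year: float) -> float:
    """Compound a per-period return over a full year (assumed geometric compounding)."""
    return (1.0 + period_return) ** periods_per_year - 1.0


# Two-month returns quoted in the text; a year holds six two-month periods.
strategy_annual = annualize(0.040, 6)
market_annual = annualize(-0.087, 6)

print(f"strategy: {strategy_annual:.1%}")  # strategy: 26.5%
print(f"market:   {market_annual:.1%}")    # market:   -42.1%
```

Under this assumption the strategy's 4.0% compounds to roughly 26.5%, which rounds to the 27% quoted, and the market's -8.7% compounds to -42.1%.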
