Automated, machine-learning-driven trading


data

Data is to machine learning as language is to communication. Without data, no machine learning can take place. Without massive amounts of data that might correlate, however slightly, with price movements, even the best machine learning algorithms will fail to forecast price changes accurately. Predictive Technology Systems mines massive data stores, harvesting bid, ask, and volume data (among other data provided by the exchanges) at the tick level across years of trading history. One heavily traded equity might generate 200 ticks per second during trading hours. Imagine a simple Excel spreadsheet with a row for each tick and columns for bid, ask, volume, and other data elements. At that rate, each minute generates 12,000 rows; each trading hour adds 720,000 rows; a trading day might reach as many as 5 million rows. A year of trading can exceed a billion rows. And, of course, that’s only one equity. Predictive Technology Systems uses Google Cloud to manage its massive databases.
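The row counts above can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the 200 ticks per second comes from the text, while the 6.5-hour session and 252 trading days per year are assumptions about a typical US equity calendar. Under those assumptions a day comes to about 4.7 million rows and a year to roughly 1.18 billion, the same order of magnitude as the figures quoted above.

```python
# Back-of-the-envelope row counts for one heavily traded equity.
TICKS_PER_SECOND = 200        # figure quoted in the text
HOURS_PER_SESSION = 6.5       # assumption: regular US equity session
TRADING_DAYS_PER_YEAR = 252   # assumption: typical US trading calendar


def tick_row_counts(ticks_per_second=TICKS_PER_SECOND,
                    hours_per_session=HOURS_PER_SESSION,
                    trading_days=TRADING_DAYS_PER_YEAR):
    """Return the number of tick rows per minute, hour, day, and year."""
    per_minute = ticks_per_second * 60
    per_hour = per_minute * 60
    per_day = int(per_hour * hours_per_session)
    per_year = per_day * trading_days
    return {
        "minute": per_minute,
        "hour": per_hour,
        "day": per_day,
        "year": per_year,
    }


if __name__ == "__main__":
    for period, rows in tick_row_counts().items():
        print(f"rows per {period}: {rows:,}")
```

Changing the assumed session length or tick rate shows how quickly the totals grow once more than one equity is tracked.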


building automated trading software

Automated versions of human-driven trading strategies have been around for several decades, and many broker/dealers offer APIs (application programming interfaces) that allow programmers to build software links between the human interface and the exchanges. PTS has automated its ML-driven trading systems with Interactive Brokers, but it can write code to interface with the client’s preferred broker/dealer. Currently, PTS’s system trades every second of the trading day, and trades are executed within milliseconds. That said, our systems are not designed to compete with high-speed trading systems, which are designed to execute trades based on pre-existing algorithms. ML-driven decision making predicts price changes (over 1 minute, 5 minutes, and 1 hour) every second, although it does require multiple seconds to make the trading decisions.


natural language processing (nlp)

NLP technology is used to transform information into digital inputs that can be understood and processed by ML software. NLP is used to assess the positive or negative impact of a recent array of words when compared with the impact of other word arrays from the past. For example, NLP technology might “listen in” on an earnings call, translate the spoken words into text arrays, and assess the positive or negative impact of the text array when compared with past earnings calls. The “sentiment” of the text is recorded in a “column” of digital data with a numerical range spanning from -1.0 to +1.0. So a sentiment score of 0.975 would be seen as highly positive relative to past earnings calls. However, as any experienced trader knows, positive “sentiments” may or may not translate into an increase in the price of the given equity. The job of the machine learning software is to search for correlations between price changes and that particular column of data, in conjunction with many other data elements recorded at that precise second of time.
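As a rough illustration of what “searching for correlations” between a sentiment column and price changes can mean, the sketch below computes a plain Pearson correlation. This is a toy stand-in, not PTS’s method: the function names are hypothetical, the numbers are made up for the example, and a production system would consider many columns at once.

```python
import math


def clamp_sentiment(score):
    """Keep a sentiment score inside the -1.0 to +1.0 range used in the text."""
    return max(-1.0, min(1.0, score))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, not real market data.
sentiment = [0.975, 0.40, -0.30, 0.10, -0.85]
price_change_5m = [0.2, -0.1, -0.4, 0.3, 0.1]  # hypothetical 5-minute % changes

r = pearson([clamp_sentiment(s) for s in sentiment], price_change_5m)
print(f"correlation between sentiment and 5-minute price change: {r:.3f}")
```

A value near zero would reflect the point made above: a highly positive sentiment score does not by itself imply a price increase.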


proprietary feature engineering

The real challenge of machine learning applications isn’t mastering the complex mathematics embodied in the machine learning libraries or manipulating the Python or Kafka code. It’s building meaningful “features,” which become the columns of data in the increasingly complex spreadsheet described above. Predictive Technology Systems has developed hundreds of proprietary features, some of which are the subject of patent applications. Designing meaningful, relevant features is the essence of an ML-driven trading strategy. PTS is expert at building, testing, and evaluating new features.
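To make the idea of a “feature” concrete, here is a minimal sketch that derives a few extra columns from raw tick rows. The features shown (bid/ask spread, mid price, and change in mid price) are generic textbook examples chosen for illustration; they are assumptions and say nothing about PTS’s proprietary features. The `Tick` class and `build_features` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Tick:
    """One raw row of the tick-level spreadsheet."""
    bid: float
    ask: float
    volume: int


def build_features(ticks):
    """Turn raw tick rows into rows with extra derived feature columns."""
    rows = []
    prev_mid = None
    for tick in ticks:
        mid = (tick.bid + tick.ask) / 2
        rows.append({
            "bid": tick.bid,
            "ask": tick.ask,
            "volume": tick.volume,
            "spread": tick.ask - tick.bid,   # derived column 1
            "mid": mid,                      # derived column 2
            # derived column 3: change in mid price since the previous tick
            "mid_change": 0.0 if prev_mid is None else mid - prev_mid,
        })
        prev_mid = mid
    return rows


# Illustrative ticks only, not real market data.
ticks = [Tick(100.00, 100.50, 300), Tick(100.25, 100.75, 150)]
for row in build_features(ticks):
    print(row)
```

Each derived key becomes one more column alongside bid, ask, and volume, which is how the spreadsheet grows wider as features are added.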


Security & Best Practices

All client proprietary data (strategies, backtest or paper-test results, annotations, etc.) remains proprietary. We can ONLY access it if you invite us to assist you AND give us credentials, both of which expire automatically. Our solutions are built atop our FaaS offering, which follows “security by design” principles.

  • All user data lives behind a double-layered infrastructure. Access is VPN-only, IP-restricted, and point-to-point.

  • Databases are not accessible to the public internet, under any circumstances

  • All user data is encrypted at rest and in transit
    At rest: 4096-bit public/private key encryption
    In transit: SSL and HTTPS encryption

  • All database queries and flat-file storage require three user keys, along with encryption keys, for access:
    Company ID: your primary organization account ID
    User ID: the user within your organization
    Application ID: the application (trading strategy, etc.) within your account

  • No access whatsoever without all three IDs, assuring that no access can be granted unintentionally

  • Regular penetration tests and security audits by trusted third-party providers

  • All software code undergoes peer review and automated testing and is built using test-driven development
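The three-ID rule in the list above (Company ID, User ID, and Application ID must all be present) can be sketched as a simple deny-by-default check. This is an illustration under stated assumptions, not the actual implementation: `AccessKey`, `GRANTS`, and `is_access_allowed` are hypothetical names, and the encryption keys and network restrictions mentioned above are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessKey:
    """The three IDs that accompany every query (hypothetical structure)."""
    company_id: Optional[str]
    user_id: Optional[str]
    application_id: Optional[str]


# Hypothetical grant table: the (company, user, application) triples allowed.
GRANTS = {("acme", "alice", "strategy-1")}


def is_access_allowed(key: AccessKey, grants=GRANTS) -> bool:
    """Deny by default; allow only when all three IDs are present and granted."""
    ids = (key.company_id, key.user_id, key.application_id)
    if not all(ids):  # a missing or empty ID always means no access
        return False
    return ids in grants


if __name__ == "__main__":
    print(is_access_allowed(AccessKey("acme", "alice", "strategy-1")))
    print(is_access_allowed(AccessKey("acme", "alice", None)))
```

Matching on the full triple means a valid company and user still cannot reach an application they were not explicitly granted.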