… for a trading firm (data extraction from web and files)

Business Challenge

  • A Scandinavian trading firm wanted to extract key information from written text, evaluate its implications, and use it for automatic high-speed trading immediately after the information is received, without human intervention.

Analytics Approach

  • Natural language processing (NLP) and machine learning algorithms were developed to understand the 8 languages in which the financial reports (quarterly and annual) were filed, including English, German, Swedish, Danish and Finnish.
  • The key to delivering high precision and recall is not just a good dictionary in the language of interest but the ability to handle the nuances of its grammar, jargon and context.
  • Filing data were extracted from websites, PDF files, text files and other sources, and the key actionable information was pulled out of them.
  • Financially important data points such as revenue, margins, costs, EPS and merger or acquisition announcements (more than 30 data points in all) were extracted and exported in a format suitable for the client’s systems for automatic high-speed trading.
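
Precision and recall for this kind of extraction can be measured field by field against a hand-labeled filing. The sketch below is a minimal illustration, not the method used in this project; the function name `precision_recall`, the field names and the sample values are all hypothetical.

```python
def precision_recall(extracted: dict, gold: dict) -> tuple[float, float]:
    """Field-level precision and recall against a hand-labeled reference.

    A field counts as correct only if it was extracted AND its value
    matches the reference exactly.
    """
    correct = sum(1 for key, value in extracted.items() if gold.get(key) == value)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical reference values for one filing (illustrative only).
gold = {"revenue": 1250.0, "eps": 2.15, "operating_margin": 18.4, "net_income": 210.0}

# Hypothetical system output: one wrong value (eps), one missed field (net_income).
extracted = {"revenue": 1250.0, "eps": 2.51, "operating_margin": 18.4}

p, r = precision_recall(extracted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the three extracted fields are right (precision 2/3) and two of the four reference fields were found (recall 1/2). A wrong number hurts precision and a missed number hurts recall, which is why both matter when trades are placed without human review.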


  • After 6 months of testing, the client switched from their previous market-leading text analytics software to this product for automatic high-speed trading in the Oil & Gas space. This has had a remarkable impact on their ability to rapidly execute trades based on predesigned trading strategies.

Example of quarterly filings data extracted from the web and classified into appropriate categories.

Example of data extraction from PDF files.

More than 600 pages were scanned and more than 30 data points extracted in under 1 second.

Example of data extraction from a text file: an unstructured press release (or any text) can be taken and its content reorganized and presented in a structured way.

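
The idea of turning an unstructured press release into a structured record can be sketched as follows. This is a simplified, English-only illustration using regular expressions, not the NLP and machine learning pipeline described above; the sample text, the patterns, the function `extract` and the output field names are all assumptions made for the example.

```python
import json
import re

# A number with optional thousands separators and decimals, e.g. "1,250" or "2.15".
NUM = r"\d{1,3}(?:,\d{3})*(?:\.\d+)?"
SCALE = {"million": 1e6, "billion": 1e9}


def to_float(text: str) -> float:
    """Convert '1,250' style numbers to a float."""
    return float(text.replace(",", ""))


def extract(text: str) -> dict:
    """Pull a few data points out of free text; fields not found are left out."""
    record = {}

    m = re.search(rf"revenue of ([A-Z]{{3}}) ({NUM}) (million|billion)", text, re.I)
    if m:
        record["revenue_currency"] = m.group(1).upper()
        record["revenue"] = to_float(m.group(2)) * SCALE[m.group(3).lower()]

    m = re.search(rf"\(EPS\) (?:was|were) ([A-Z]{{3}}) ({NUM})", text, re.I)
    if m:
        record["eps"] = to_float(m.group(2))

    m = re.search(rf"operating margin (?:was|of) ({NUM})\s*%", text, re.I)
    if m:
        record["operating_margin_pct"] = to_float(m.group(1))

    return record


# Hypothetical press release text (illustrative only).
press_release = (
    "Acme Oil ASA reported revenue of USD 1,250 million for the third quarter. "
    "Earnings per share (EPS) were USD 2.15. The operating margin was 18.4%."
)

record = extract(press_release)
print(json.dumps(record, indent=2))
```

Fixed patterns like these break as soon as the wording or the language changes, which is the gap the grammar, jargon and context handling described above is meant to close. Leaving a field out when nothing matches, rather than guessing, keeps a missed value from being traded on.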