AI-powered stock screening that lets you type "find undervalued tech stocks with high growth and low debt" instead of configuring 50 filters. Conversational market intelligence backed by real-time data, predictive ML, and sentiment analysis.

Retail investors are drowning in data but starving for signal. Every major brokerage gives you 50+ technical and fundamental filters — P/E, EV/EBITDA, RSI, MACD, beta, free cash flow yield. But these tools are designed for quants who already know what they're looking for. A first-time investor trying to find "AI companies that are actually profitable" has no idea which combination of filters maps to that concept.
On the other end of the spectrum, Bloomberg Terminal and similar institutional tools require full-time analysts to extract value. There's a massive gap between "I know what I want to find" and "I know how to configure the system to find it."
TickerLens's bet: natural language is the right interface for investment research. The same way ChatGPT democratized text analysis, a purpose-built conversational layer on top of financial data can bridge the gap between intent and insight.
The query pipeline translates natural language into a structured screening spec. A user query like "show me profitable small-cap tech stocks with recent insider buying" gets parsed through an intent classification step, then a slot-filling LLM call that extracts filter predicates (market cap range, sector, profitability metrics, insider transaction signals). These predicates are compiled into a SQL query against the normalized financial data store. The challenge was ambiguity. "High growth" means different things in biotech vs. utilities. I built a context-aware normalization layer that adjusts threshold definitions based on inferred sector, making "high revenue growth" a top-quartile screen relative to the sector peer group rather than an absolute number.
Every ticker page generates a fresh SWOT analysis on demand, synthesizing: SEC filing data (10-K highlights, MD&A excerpts), recent earnings call transcripts, news sentiment from the past 30 days, and technical signal summary. The output is structured and grounded — each SWOT point cites its source (e.g., "Management flagged supply chain headwinds in Q3 2024 earnings call"). I built a retrieval pipeline using Pinecone to store and query chunked SEC filings and earnings transcripts. The LLM prompt is constructed with the retrieved context plus live price/volume data, ensuring the summary is factual and current rather than hallucinated.
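The chunking and prompt-assembly steps can be sketched as follows. The overlapping word-window chunker and the `build_swot_prompt` helper are simplified illustrations (the vector-store upsert/query against Pinecone is omitted); the bracketed source tags are what let each SWOT point carry its citation.

```python
def chunk_document(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split a filing or transcript into overlapping word-window chunks
    before embedding and upserting to the vector store."""
    words = text.split()
    step = max_words - overlap
    return [" ".join(words[i:i + max_words])
            for i in range(0, max(len(words) - overlap, 1), step)]

def build_swot_prompt(ticker: str, chunks: list[dict], quote: str) -> str:
    """Assemble a grounded prompt: retrieved chunks (each tagged with its
    source) plus live quote data, with an instruction to cite sources."""
    context = "\n\n".join(f"[{c['source']}] {c['text']}" for c in chunks)
    return (f"Using ONLY the sources below, write a SWOT analysis for {ticker}. "
            f"Cite the bracketed source tag for every point.\n\n"
            f"Live quote: {quote}\n\nSources:\n{context}")
```

Forcing every point to cite a bracketed tag makes hallucinated claims detectable: a point with no valid tag can be dropped before rendering.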
The forecasting module provides probability-based directional forecasts (up/down/flat) over 5, 10, and 30-day horizons. These are deliberately presented as probability ranges, not price targets: the goal is to calibrate user expectations, not to project false precision. The underlying model is an ensemble of XGBoost (trained on technical features: RSI, MACD, Bollinger Bands, volume patterns, earnings proximity) and a small LSTM for momentum pattern recognition. Out-of-sample performance is measured with walk-forward validation to prevent data leakage, a common failure mode in financial ML.
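The leakage guarantee comes from how the splits are built. A minimal expanding-window split generator (the function name and parameters are illustrative, not the production code): each fold trains only on bars strictly before its test window.

```python
def walk_forward_splits(n_samples: int, n_folds: int, min_train: int):
    """Expanding-window walk-forward splits for time-ordered samples.
    Every training index precedes every test index, so no future bar
    can leak into the model that is evaluated on that window."""
    fold_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = min(train_end + fold_size, n_samples)
        yield list(range(0, train_end)), list(range(train_end, test_end))
```

This is the opposite of a random shuffle split, which in financial data lets the model "see" tomorrow's regime while training and inflates backtest accuracy.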
Standard tables hide relationships. A stock might look cheap in isolation but expensive relative to its peer group and to its own historical range. Lens Boards are interactive visualizations that position a ticker within its sector peer set across multiple dimensions simultaneously — valuation, growth, profitability, momentum. Built on D3.js with a custom React wrapper. The graph is force-directed when showing inter-stock relationships (correlation, sector clustering) and switches to a scatter matrix when showing fundamental comparisons.
Every AI output is anchored to retrieved source documents. The SWOT analysis cites filings. The summary references the earnings call. This prevents the hallucination problem that would make a financial product actively dangerous.
Directional forecasts are presented as probability ranges with confidence scores, not point predictions. The model also surfaces its own out-of-sample accuracy for each ticker category, so users can calibrate how much weight to give it.
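The per-category accuracy surfaced next to each forecast reduces to a simple aggregation over held-out predictions. A sketch, assuming a `(category, predicted, actual)` record shape that is illustrative rather than the production format:

```python
from collections import defaultdict

def category_accuracy(history):
    """history: iterable of (category, predicted_direction, actual_direction)
    tuples from out-of-sample evaluation. Returns per-category hit rate,
    displayed alongside each forecast so users can weight it appropriately."""
    hits, totals = defaultdict(int), defaultdict(int)
    for cat, pred, actual in history:
        totals[cat] += 1
        hits[cat] += int(pred == actual)
    return {cat: hits[cat] / totals[cat] for cat in totals}
```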
Price data refreshes via WebSocket. Sentiment scores update every 4 hours via a scheduled news ingestion pipeline. Filing data updates within 2 hours of an SEC EDGAR submission. Redis caches expensive computations with TTLs matched to data freshness windows.
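The "TTLs matched to freshness windows" idea can be sketched as a small mapping handed to Redis `SETEX`. The specific key scheme and the `cache_set` helper are illustrative; the TTL values mirror the refresh cadences above.

```python
# TTLs (seconds) matched to each source's freshness window.
# Values mirror the refresh cadences described above; the key scheme is illustrative.
FRESHNESS_TTL = {
    "price": 1,              # WebSocket stream: cache barely at all
    "sentiment": 4 * 3600,   # news ingestion pipeline runs every 4 hours
    "filings": 2 * 3600,     # EDGAR data lands within 2 hours of submission
}

def cache_set(redis_client, source: str, key: str, value: str) -> None:
    """Cache an expensive computation with a TTL tied to its source's cadence
    (redis-py SETEX: key, ttl in seconds, value)."""
    redis_client.setex(f"{source}:{key}", FRESHNESS_TTL[source], value)
```

Tying TTLs to the ingestion schedule means a cached sentiment score can never outlive the pipeline run that would have replaced it.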
Filter thresholds are percentile-relative to sector peers, not absolute. A 15% revenue growth rate is middling in cloud SaaS but exceptional in traditional retail — the platform knows the difference.
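The sector-relative rule reduces to a quantile check against the peer group. A minimal illustration using a naive index-based quantile (no interpolation); the function name and cutoff are illustrative:

```python
def is_high_growth(ticker_growth: float, peer_growths: list[float],
                   quantile: float = 0.75) -> bool:
    """'High growth' means top quartile relative to the sector peer group,
    not an absolute cutoff."""
    ranked = sorted(peer_growths)
    cutoff = ranked[int(quantile * (len(ranked) - 1))]  # naive, non-interpolated quantile
    return ticker_growth >= cutoff

# The same 15% growth rate screens differently by sector:
saas_peers = [0.10, 0.20, 0.30, 0.40, 0.50]    # hypothetical cloud SaaS peer group
retail_peers = [0.00, 0.02, 0.04, 0.06, 0.08]  # hypothetical traditional retail peers
```

Against the hypothetical SaaS peers the top-quartile cutoff is 40%, so 15% fails the screen; against the retail peers the cutoff is 6%, so the same 15% passes.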