# ORDER FLOW GRAPH - Project Overview

## Status & Progress

**Current Phase:** 🚀 **Multi-Asset Visualizer** (Complete)

**Status:** ✅ **PROTOTYPE COMPLETE**

**What's Working:**
- ✅ Binance order book collection (WebSocket API)
- ✅ Multi-asset support (BTC, ETH, Gold)
- ✅ Order flow graph construction
- ✅ Anomaly detection (absorption, squeeze, exhaustion)
- ✅ Trading signal generation (entry/target/stop)
- ✅ Interactive web visualizer
- ✅ Stale-signal filtering by price proximity (auto-removes old signals)

**Latest Update (2026-05-15):**
- Switched from Sierra data to **Binance order book** (better data quality)
- Built **multi-asset visualizer** UI
- Added **BTC/ETH/Gold** support
- Implemented **price proximity filter** (removes stale signals)
- Created **all-in-one collector** script

---
- **Root folder:** `/home/ubuntu/.hermes/workspace/projects/ORDER_FLOW_GRAPH/`
- **Project file:** `ORDER_FLOW_GRAPH.md`
- **Feedback file:** `ORDER_FLOW_GRAPH_FEEDBACK.md`

## Goal

Build an **order flow graph** that models the market as a dynamic graph of buyers and sellers at price levels. Use graph structures to:

1. **Visualize order book** as buyer/seller nodes at price levels
2. **Track order flow** as edges between price levels over time
3. **Detect anomalies** in buyer/seller behavior
4. **Identify breakout patterns** before they happen
5. **Find jump-in/jump-out points** for profitable entries

---

## Core Concept

### Traditional Order Book View
```
Price    | Bid Size | Ask Size
---------|----------|----------
2450.00  | 100      | 50
2450.50  | 200      | 150
2451.00  | 150      | 300
```

### Order Flow Graph View
```
[Bid Node: 2450.00, Size: 100] ←→ [Ask Node: 2450.00, Size: 50]
        ↓                                 ↓
[Imbalance Signal]              [Absorption Signal]
        ↓                                 ↓
[Next Level Prediction]        [Breakout Probability]
```

### Graph Schema

**Node Types:**
```
PriceLevel (e.g., 2450.00)
  - Properties: price, timestamp, total_bid, total_ask, imbalance, delta

BidOrder (individual buy orders)
  - Properties: size, order_type (market/limit), timestamp, participant_id

AskOrder (individual sell orders)
  - Properties: size, order_type (market/limit), timestamp, participant_id

Participant (buyers/sellers)
  - Properties: id, type (institutional/retail), aggression_score

Anomaly (unusual activity)
  - Properties: type (absorption/squeeze/exhaustion), severity, timestamp
```

**Relationship Types:**
```
PriceLevel —[BID_AT]→ BidOrder
PriceLevel —[ASK_AT]→ AskOrder
BidOrder —[FROM]→ Participant
AskOrder —[FROM]→ Participant
PriceLevel —[NEXT_LEVEL]→ PriceLevel (sequence)
PriceLevel —[IMBALANCE_SIGNAL]→ PriceLevel (flow direction)
Anomaly —[DETECTED_AT]→ PriceLevel
Anomaly —[LEADS_TO]→ PriceLevel (predicted breakout)
```
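
As a rough illustration of the schema above, the sketch below holds a few node types and edges in plain Python objects. It is not the project's storage layer. The class names mirror the node types, but `OrderFlowGraph`, `add_level`, `link_levels`, and `attach_anomaly` are hypothetical helpers, and `NEXT_LEVEL` is interpreted here as adjacency in ascending price order, which is an assumption (the schema only says "sequence").

```python
from dataclasses import dataclass, field


@dataclass
class PriceLevel:
    price: float
    timestamp: float
    total_bid: float = 0.0
    total_ask: float = 0.0
    imbalance: float = 0.0
    delta: float = 0.0


@dataclass
class Anomaly:
    type: str          # "absorption", "squeeze", or "exhaustion"
    severity: float
    timestamp: float


@dataclass
class OrderFlowGraph:
    levels: dict = field(default_factory=dict)  # price -> PriceLevel
    edges: list = field(default_factory=list)   # (relationship, source, target)

    def add_level(self, level):
        self.levels[level.price] = level

    def link_levels(self):
        """Rebuild NEXT_LEVEL edges between adjacent prices (assumed: ascending)."""
        self.edges = [e for e in self.edges if e[0] != "NEXT_LEVEL"]
        prices = sorted(self.levels)
        for lower, upper in zip(prices, prices[1:]):
            self.edges.append(("NEXT_LEVEL", lower, upper))

    def attach_anomaly(self, anomaly, price):
        """Record an Anomaly -[DETECTED_AT]-> PriceLevel edge."""
        self.edges.append(("DETECTED_AT", anomaly, price))


# Build the three-level book from the traditional view above.
graph = OrderFlowGraph()
for price, bid, ask in [(2450.00, 100, 50), (2450.50, 200, 150), (2451.00, 150, 300)]:
    graph.add_level(PriceLevel(price, timestamp=0.0, total_bid=bid, total_ask=ask))
graph.link_levels()
graph.attach_anomaly(Anomaly("absorption", severity=0.9, timestamp=0.0), 2450.50)
```

A dictionary keyed by price keeps level lookups cheap; a graph database would replace the `edges` list once multi-hop queries are needed.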

---

## Use Cases

### 1. Breakout Detection

**Pattern:** Heavy absorption at ask level + drying up bids = Potential upside breakout

**Graph Query:**
```
MATCH (a:Anomaly {type: 'absorption'})-[:DETECTED_AT]->(pl:PriceLevel)
WHERE a.direction = 'ask' AND a.severity > 0.8
MATCH (pl)-[:NEXT_LEVEL*1..3]->(future_pl:PriceLevel)
RETURN pl.price, a.severity, future_pl.price
ORDER BY a.severity DESC
```

**Signal:** Jump in long when absorption detected at ask

### 2. Jump-In/Jump-Out Points

**Pattern:** Order exhaustion + aggressive buyers = Jump-in point

**Graph Query:**
```
MATCH (a:Anomaly {type: 'exhaustion'})-[:DETECTED_AT]->(pl:PriceLevel)
WHERE pl.imbalance > 0.7  // More bids than asks
MATCH (pl)-[:NEXT_LEVEL]->(next_pl:PriceLevel)
RETURN pl.price as entry, next_pl.price as target, a.confidence
```

**Signal:** Enter at exhaustion level, exit 1-2 levels ahead

### 3. Absorption Detection

**Pattern:** Large limit orders absorbing market orders (price doesn't move despite volume)

**Graph Query:**
```
MATCH (pl:PriceLevel)-[:BID_AT|ASK_AT]->(orders)
WHERE orders.order_type = 'limit' AND orders.size > $threshold
WITH pl, sum(orders.size) as total_absorption
WHERE total_absorption > pl.total_volume * 2
RETURN pl.price, total_absorption, pl.total_volume
```

**Signal:** Absorption = potential reversal or strong support/resistance

### 4. Squeeze Detection

**Pattern:** Narrow spread + thin order book + high delta = Volatility squeeze

**Graph Query:**
```
MATCH (pl:PriceLevel)
WHERE pl.spread < $avg_spread * 0.5
  AND pl.total_bid + pl.total_ask < $avg_liquidity * 0.3
  AND abs(pl.delta) > $avg_delta * 2
RETURN pl.price, pl.spread, pl.delta, 'SQUEEZE' AS signal
```

**Signal:** Squeeze = potential breakout (direction depends on delta)

---

## Data Sources

### 1. Sierra Chart Tick Data
**Location:** `/home/ubuntu/.hermes/workspace/tick_collector_api/ticks.db`

**Available:**
- Tick-by-tick trades with bid/ask
- Can compute order flow metrics (delta, cumulative delta)
- Can reconstruct top-of-book snapshots (best bid/ask only)

**Limitations:**
- No full order book depth (only top of book)
- No individual order IDs (can't track participants)

### 2. MT5 XAUUSD Data
**Location:** `/mnt/mt5/terminal/122160/Common/Files/Data/XAUUSD_PERIOD_M1_0.csv`

**Available:**
- OHLCV data (not order book)
- Tick volume (proxy for activity)
- Volume delta would require bid/ask prices, which this export lacks

**Limitations:**
- No order book depth
- No bid/ask prices

### 3. Binance XAUTUSDT Data
**Location:** `/opt/git/arbitrage/data/XAUTUSDT/klines/1m/*.parquet`

**Available:**
- OHLCV + trade count
- Volume profile (if available)

**Limitations:**
- No order book depth
- No bid/ask prices

### 4. Future: Order Book APIs
**Potential sources:**
- Binance order book API (depth snapshots)
- Sierra Chart Numbers Bars (footprint data)
- Exchange APIs with market depth
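
For the Binance option, a depth snapshot arrives as JSON with `lastUpdateId`, `bids`, and `asks`, where each entry is a `[price, quantity]` pair of strings. The sketch below only parses such a payload and makes no network call. `parse_depth_snapshot` is a hypothetical helper and the sample values are made up for illustration.

```python
from decimal import Decimal


def parse_depth_snapshot(snapshot):
    """Convert string price/quantity pairs into sorted Decimal tuples."""
    bids = sorted(((Decimal(p), Decimal(q)) for p, q in snapshot["bids"]), reverse=True)
    asks = sorted((Decimal(p), Decimal(q)) for p, q in snapshot["asks"])
    return {"last_update_id": snapshot["lastUpdateId"], "bids": bids, "asks": asks}


# Illustrative payload in the shape of a depth snapshot (values are invented).
sample = {
    "lastUpdateId": 1027024,
    "bids": [["2449.50", "80"], ["2450.00", "100"]],
    "asks": [["2451.00", "300"], ["2450.50", "150"]],
}

book = parse_depth_snapshot(sample)
best_bid = book["bids"][0][0]   # highest bid price
best_ask = book["asks"][0][0]   # lowest ask price
spread = best_ask - best_bid
```

`Decimal` avoids float rounding when prices are later used as dictionary keys for price levels.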

---

## Implementation Plan

### Phase 1: Prototype (Sierra Tick Data)

**Goal:** Build order flow graph from Sierra tick data

**Tasks:**
1. Load Sierra tick data from SQLite
2. Compute order flow metrics (delta, cumulative delta)
3. Build price level nodes with bid/ask imbalances
4. Create edges between price levels (temporal sequence)
5. Detect simple anomalies (absorption, exhaustion)
6. Query for breakout patterns
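
Tasks 1 to 3 could be sketched as follows. The layout of `ticks.db` is not documented here, so the `ticks` table and its columns (`ts`, `price`, `size`, `bid`, `ask`) are hypothetical, and the example runs against an in-memory database. Classifying a trade as an aggressive buy when it prints at or above the ask is a common heuristic, used here as an assumption.

```python
import sqlite3
from collections import defaultdict


def load_level_volumes(conn):
    """Aggregate traded volume per price, split by inferred aggressor side."""
    levels = defaultdict(lambda: {"ask_volume": 0.0, "bid_volume": 0.0})
    rows = conn.execute("SELECT price, size, bid, ask FROM ticks ORDER BY ts")
    for price, size, bid, ask in rows:
        if price >= ask:        # trade at the ask: aggressive buy
            levels[price]["ask_volume"] += size
        elif price <= bid:      # trade at the bid: aggressive sell
            levels[price]["bid_volume"] += size
        # trades inside the spread are left unclassified in this sketch
    return dict(levels)


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (ts REAL, price REAL, size REAL, bid REAL, ask REAL)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2450.5, 3, 2450.0, 2450.5),
        (2, 2450.0, 2, 2450.0, 2450.5),
        (3, 2450.5, 4, 2450.0, 2450.5),
    ],
)
levels = load_level_volumes(conn)
delta_by_level = {p: v["ask_volume"] - v["bid_volume"] for p, v in levels.items()}
```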

**Deliverables:**
- Order flow graph schema
- Sierra tick → graph pipeline
- Basic anomaly detection
- Simple breakout signals

**Estimated time:** 4-6 hours

### Phase 2: Enhanced Detection

**Goal:** Add sophisticated anomaly detection

**Tasks:**
1. Absorption detection (large orders absorbing flow)
2. Squeeze detection (thin book + high delta)
3. Exhaustion detection (order depletion)
4. Aggressive buyer/seller identification
5. Multi-timeframe order flow (M1/M5/M15)
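
One simple way to approach task 5 is to bucket signed trade volume by bar start time. The sketch below assumes ticks are `(timestamp_seconds, signed_volume)` pairs, with positive volume for aggressive buys; `delta_by_timeframe` is a hypothetical helper, not existing project code.

```python
from collections import defaultdict


def delta_by_timeframe(ticks, seconds):
    """Sum signed volume into bars of the given length, keyed by bar start."""
    buckets = defaultdict(float)
    for ts, signed_volume in ticks:
        bar_start = int(ts // seconds) * seconds
        buckets[bar_start] += signed_volume
    return dict(sorted(buckets.items()))


ticks = [(0, 5), (30, -2), (61, 4), (299, 1), (300, -3)]
m1 = delta_by_timeframe(ticks, 60)    # {0: 3.0, 60: 4.0, 240: 1.0, 300: -3.0}
m5 = delta_by_timeframe(ticks, 300)   # {0: 8.0, 300: -3.0}
```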

**Deliverables:**
- Advanced anomaly types
- Multi-timeframe graphs
- Confidence scores for signals

**Estimated time:** 6-8 hours

### Phase 3: Graph RAG Integration

**Goal:** Combine with existing GRAPH_RAG_TRADING_ASSETS

**Tasks:**
1. Link order flow anomalies to price patterns
2. Correlate absorption with lead-lag edges
3. Build multi-hop queries (order flow → price action)
4. Create composite signals (order flow + lead-lag)

**Deliverables:**
- Integrated graph (order flow + price relationships)
- Composite trading signals
- Enhanced breakout prediction

**Estimated time:** 4-6 hours

### Phase 4: Backtesting & Validation

**Goal:** Test order flow signals historically

**Tasks:**
1. Backtest absorption → breakout signals
2. Calculate edge scores, hit rates
3. Optimize thresholds (absorption size, imbalance level)
4. Validate on OOS data
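
Task 2 might start from something as small as the sketch below. "Edge score" is not defined in this document, so average R-multiple (expectancy) is used as a stand-in; that choice, the `evaluate` helper, and the sample results are assumptions for illustration.

```python
def evaluate(r_multiples):
    """Return (hit_rate, expectancy) for a list of per-trade R-multiples.

    An R-multiple is profit divided by initial risk: +2.0 means the trade
    made twice what it risked, -1.0 means it was stopped out.
    """
    if not r_multiples:
        return 0.0, 0.0
    wins = sum(1 for r in r_multiples if r > 0)
    hit_rate = wins / len(r_multiples)
    expectancy = sum(r_multiples) / len(r_multiples)
    return hit_rate, expectancy


# Made-up results for five signals.
hit_rate, expectancy = evaluate([2.0, -1.0, 2.0, -1.0, 1.5])
```

For these sample trades the hit rate is 0.6 and the expectancy is 0.7 R per trade.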

**Deliverables:**
- Backtest results
- Edge scores for each pattern
- OOS validation
- Trading strategy recommendations

**Estimated time:** 6-8 hours

---

## Order Flow Metrics

### Core Metrics

**Delta:**
```
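Delta = Ask_Volume - Bid_Volume
Ask_Volume = volume traded at the ask (aggressive buys)
Bid_Volume = volume traded at the bid (aggressive sells)
Positive delta = More buying pressure
Negative delta = More selling pressure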
```
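
A minimal sketch of delta and cumulative delta over a sequence of bars, assuming each bar is an `(ask_volume, bid_volume)` pair of traded volume; the numbers are invented.

```python
from itertools import accumulate


def bar_delta(ask_volume, bid_volume):
    """Delta for one bar: volume traded at the ask minus volume traded at the bid."""
    return ask_volume - bid_volume


bars = [(120, 80), (60, 90), (200, 50)]          # (ask_volume, bid_volume)
deltas = [bar_delta(a, b) for a, b in bars]      # [40, -30, 150]
cumulative_delta = list(accumulate(deltas))      # [40, 10, 160]
```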

**Imbalance:**
```
Imbalance = (Bid_Volume - Ask_Volume) / (Bid_Volume + Ask_Volume)
Bid_Volume / Ask_Volume = resting size on each side (total_bid / total_ask)
Range: -1 (all asks) to +1 (all bids)
```
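
Applied to the three-level book from the Core Concept section, the formula works out as in the sketch below. The zero-volume guard is an added assumption, since the formula is undefined for an empty level.

```python
def imbalance(bid_volume, ask_volume):
    """Normalized bid/ask imbalance in [-1, +1]; 0.0 for an empty level."""
    total = bid_volume + ask_volume
    if total == 0:
        return 0.0
    return (bid_volume - ask_volume) / total


book = {2450.00: (100, 50), 2450.50: (200, 150), 2451.00: (150, 300)}
for price, (bid, ask) in book.items():
    print(f"{price:.2f}  imbalance={imbalance(bid, ask):+.3f}")
# 2450.00  imbalance=+0.333
# 2450.50  imbalance=+0.143
# 2451.00  imbalance=-0.333
```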

**Absorption Ratio:**
```
Absorption = Limit_Orders / Market_Orders
High ratio = Large orders absorbing flow
```

**Order Exhaustion:**
```
Exhaustion = Remaining_Orders / Initial_Orders
Low ratio = Order depletion
```

**Spread:**
```
Spread = Ask_Price - Bid_Price
Wide spread = Low liquidity
```

### Derived Metrics

**Aggression Score:**
```
Aggression = Market_Order_Volume / Limit_Order_Volume
High score = Aggressive buying/selling
```

**Participation Rate:**
```
Participation = Participant_Volume / Total_Volume
Identifies institutional activity
```

**Order Flow Toxicity:**
```
Toxicity = Aggressive_Trades / Passive_Trades
High toxicity = Toxic order flow (adverse selection)
```

---

## Anomaly Patterns

### 1. Absorption

**Definition:** Large limit orders absorbing market orders without price moving

**Detection:**
- High volume at price level
- Price fails to break through
- Delta shows rejection

**Graph Representation:**
```
PriceLevel —[ABSORBS]→ (Absorption Anomaly)
Absorption Anomaly —[PREDICTS]→ Reversal
```
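
The three detection criteria could be combined as in the sketch below. The thresholds (`volume_mult`, `max_move`, `one_sided_frac`) are placeholders rather than tuned values, and reading "delta shows rejection" as "delta is strongly one-sided while price stays put" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LevelWindow:
    price: float
    traded_volume: float   # volume traded at this level during the window
    delta: float           # ask_volume - bid_volume during the window
    price_change: float    # close - open over the window


def is_absorption(w, avg_volume, volume_mult=2.0, max_move=0.5, one_sided_frac=0.5):
    heavy_volume = w.traded_volume > volume_mult * avg_volume
    price_stalled = abs(w.price_change) <= max_move
    one_sided_flow = abs(w.delta) > one_sided_frac * w.traded_volume
    return heavy_volume and price_stalled and one_sided_flow


stalled = LevelWindow(price=2450.5, traded_volume=900, delta=600, price_change=0.0)
moved = LevelWindow(price=2450.5, traded_volume=900, delta=600, price_change=2.0)
flags = (is_absorption(stalled, avg_volume=300), is_absorption(moved, avg_volume=300))
```

Here `stalled` is flagged and `moved` is not: the same aggressive buying counts as absorption only when price fails to respond.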

### 2. Squeeze

**Definition:** Narrow spread + thin order book + high delta = Volatility squeeze

**Detection:**
- Spread < 50% of average
- Total liquidity < 30% of average
- |Delta| > 2x average

**Graph Representation:**
```
PriceLevel —[SQUEEZE_SIGNAL]→ (Squeeze Anomaly)
Squeeze Anomaly —[LEADS_TO]→ Breakout
```
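
The detection thresholds above translate directly into a predicate. In this sketch the averages are passed in by the caller, `avg_abs_delta` is assumed to be the average of absolute delta, and both function names are hypothetical.

```python
def is_squeeze(spread, liquidity, delta, avg_spread, avg_liquidity, avg_abs_delta):
    """Apply the three squeeze criteria: narrow spread, thin book, high |delta|."""
    return (
        spread < 0.5 * avg_spread
        and liquidity < 0.3 * avg_liquidity
        and abs(delta) > 2 * avg_abs_delta
    )


def squeeze_direction(delta):
    """Expected breakout direction follows the sign of delta."""
    return "up" if delta > 0 else "down"


detected = is_squeeze(spread=0.2, liquidity=100, delta=-500,
                      avg_spread=0.5, avg_liquidity=500, avg_abs_delta=200)
direction = squeeze_direction(-500) if detected else None
```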

### 3. Exhaustion

**Definition:** Order depletion at key level (no more buyers/sellers)

**Detection:**
- Order count decreasing rapidly
- Volume drying up
- Imbalance shifting

**Graph Representation:**
```
PriceLevel —[EXHAUSTION]→ (Exhaustion Anomaly)
Exhaustion Anomaly —[SIGNALS]→ Reversal or Breakout
```
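
Order depletion can be approximated by comparing resting size at a level now with its size at the start of an observation window, matching the Order Exhaustion ratio defined earlier. The `threshold` of 0.2 is a placeholder, and the sketch assumes `depth_series` is a time-ordered list of resting sizes at one level.

```python
def exhaustion_ratio(depth_series):
    """Remaining / initial resting size; 1.0 if there was nothing to deplete."""
    initial = depth_series[0]
    if initial <= 0:
        return 1.0
    return depth_series[-1] / initial


def is_exhausted(depth_series, threshold=0.2):
    return exhaustion_ratio(depth_series) < threshold


depleting = [500, 320, 150, 60]   # ratio 0.12
stable = [500, 480, 450]          # ratio 0.90
results = (is_exhausted(depleting), is_exhausted(stable))
```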

### 4. Iceberg Order

**Definition:** Large hidden orders revealed in small slices

**Detection:**
- Repeated similar-sized orders at same level
- Total volume much larger than visible
- Pattern persistence over time

**Graph Representation:**
```
Participant —[PLACES_ICEBERG]→ PriceLevel
PriceLevel.has_hidden_depth = true
```
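
A first-pass heuristic for the criteria above is to count repeated fills of identical size at one price and compare the executed total with the displayed size. This sketch treats "similar-sized" as exactly equal for simplicity, which is an assumption; `iceberg_candidates` and the `min_repeats` default are hypothetical.

```python
from collections import Counter


def iceberg_candidates(fills, visible_size, min_repeats=4):
    """Return {price: executed_volume} for levels that look like icebergs.

    fills: iterable of (price, size) trades.
    visible_size: {price: displayed resting quantity}.
    """
    candidates = {}
    for (price, size), count in Counter(fills).items():
        executed = size * count
        if count >= min_repeats and executed > visible_size.get(price, 0):
            candidates[price] = candidates.get(price, 0) + executed
    return candidates


fills = [(2450.5, 10)] * 6 + [(2450.0, 3), (2450.0, 7)]
visible = {2450.5: 10, 2450.0: 50}
found = iceberg_candidates(fills, visible)   # {2450.5: 60}
```

Six fills of size 10 against a displayed size of 10 suggest the level is being refilled; real data would likely need a tolerance band on size and a time window.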

---

## Signal Generation

### Jump-In Signals

**1. Absorption Reversal:**
```
IF absorption detected at ask
AND delta turning positive
AND next level has room
THEN jump in long
Target: 1-2 levels up
Stop: Below absorption level
```
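
The rule above could be encoded as in the sketch below. How "next level has room" is measured is not specified, so it is assumed here to mean that resting ask size at the next level is below a cap (`max_next_ask`); the stop distance of one tick below the absorption level is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    side: str
    entry: float
    target: float
    stop: float


def absorption_reversal(absorption_at_ask, level, tick_size, prev_delta, delta,
                        next_level_ask, max_next_ask, levels_up=2) -> Optional[Signal]:
    """Return a long Signal when all three conditions hold, else None."""
    delta_turning_positive = prev_delta <= 0 < delta
    next_level_has_room = next_level_ask < max_next_ask
    if not (absorption_at_ask and delta_turning_positive and next_level_has_room):
        return None
    return Signal(
        side="long",
        entry=level,
        target=level + levels_up * tick_size,   # 1-2 levels up
        stop=level - tick_size,                 # below the absorption level
    )


signal = absorption_reversal(True, level=2450.5, tick_size=0.5, prev_delta=-20,
                             delta=35, next_level_ask=40, max_next_ask=100)
```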

**2. Exhaustion Breakout:**
```
IF buyers exhausted at resistance
AND new sellers entering
AND price breaks level
THEN jump in short
Target: Next support level
Stop: Above resistance
```

### Jump-Out Signals

**1. Take Profit at Next Level:**
```
IF price approaching next level
AND level has large orders (absorption)
AND momentum slowing
THEN jump out
```

**2. Squeeze Avoidance:**
```
IF squeeze detected
AND unclear direction
AND high volatility
THEN jump out
```

---

## Integration with GRAPH_RAG_TRADING_ASSETS

### Combined Graph Schema

```
[Order Flow Graph] ←→ [Price Relationship Graph]
    ↓                        ↓
Order Flow Anomalies    Lead-Lag Patterns
    ↓                        ↓
[Composite Signals] → Trading Decisions
```

### Example Composite Query

```
// Find order flow absorption + XAUTUSDT lead-lag
MATCH (a:Anomaly {type: 'absorption'})-[:DETECTED_AT]->(pl:PriceLevel)
MATCH (pl)-[:NEXT_LEVEL*1..2]->(future_pl:PriceLevel)
MATCH (xautusdt:Asset)-[:LEADS_LAG]->(xauusd:Asset)
WHERE xautusdt.name = 'XAUTUSDT'
  AND xauusd.name = 'XAUUSD'
  AND xauusd.direction = 'bearish'
  AND a.direction = 'bid'  // Absorbing bids
RETURN pl.price, a.severity, xauusd.strength
ORDER BY a.severity * xauusd.strength DESC
```

**Interpretation:** Strong absorption + strong lead-lag = High-confidence short signal

---

## Success Metrics

**Phase 1:**
- ✅ Order flow graph built from Sierra data
- ✅ Basic anomaly detection working
- ✅ Can query for absorption/exhaustion

**Phase 2:**
- ✅ Advanced anomalies detected (absorption, squeeze, exhaustion)
- ✅ Multi-timeframe analysis
- ✅ Confidence scores computed

**Phase 3:**
- ✅ Integrated with GRAPH_RAG_TRADING_ASSETS
- ✅ Composite signals generated
- ✅ Multi-hop queries working

**Phase 4:**
- ✅ Backtested with positive edge
- ✅ OOS validation passed
- ✅ Tradeable signals identified

**Overall:**
- **Hit rate:** > 60% on jump-in signals
- **Risk-reward:** > 1:2 on jump-out signals
- **False positive rate:** < 30%
- **Latency:** < 1 second for signal generation

---

## Open Questions

1. **Data sufficiency:** Sierra tick data has no order book depth - can we detect absorption without it?
2. **Alternative data:** Should we use Binance order book API for full depth?
3. **Timeframe:** Which timeframe for order flow analysis? (M1, M5, tick-level?)
4. **Thresholds:** What absorption ratio indicates true absorption? (2x volume? 3x?)
5. **Validation:** How to validate order flow signals without full order book history?

---

## Risks & Limitations

1. **No full order book:** Sierra tick data has trades with top-of-book bid/ask only, not full depth
2. **Noise:** Tick data can be noisy, hard to distinguish signal from noise
3. **Latency:** Order flow signals are time-sensitive, need fast computation
4. **Overfitting:** Risk of detecting patterns that don't generalize
5. **Market changes:** Order flow patterns can change over time

---

## Related Projects

- **GRAPH_RAG_TRADING_ASSETS:** Price relationship graph (lead-lag, correlations)
- **SierraChart Integration:** Tick data collector
- **XAUUSD TPO Trade Plans:** Price-based entry signals

**Integration opportunity:** Combine order flow signals with price-based signals for higher confidence

---

**Next Steps:**
1. Confirm project scope and focus
2. Choose data source (Sierra tick vs Binance order book)
3. Decide on Phase 1 deliverables
4. Build prototype order flow graph

---

**Last updated:** 2026-05-15
**Status:** 🆕 Awaiting user confirmation
