Synthetic Order Lifecycle
Generate synthetic order-lifecycle event traces (PLACE / MODIFY / CANCEL / FILL) that satisfy MiFID II / Reg NMS-flavour sequencing rules using a CSP solver.
What this template is for
Banking compliance and market-surveillance teams test their alert engines against synthetic order-event traces — sequences of PLACE, MODIFY, CANCEL, and FILL events that real trading desks generate. Producing realistic traces by hand is hard: the events have to honour temporal precedence (a fill cannot happen before the placement), status-transition validity (no events after a cancel), venue eligibility (the symbol has to be tradable on that venue), and quantity conservation (filled shares cannot exceed the order size).
This template formulates the trace-generation problem as a constraint satisfaction model using RelationalAI’s prescriptive reasoning. A pre-allocated pool of empty event slots is given to the solver; it picks each slot’s type, timestamp, venue, quantity, and tick price so that every sequencing rule holds (the rules are drawn from ESMA RTS 24 for MiFID II and SEC Rule 613 for Reg NMS / Consolidated Audit Trail). The solver (MiniZinc) returns one feasible trace that satisfies every constraint at once.
This template constrains type distribution and ordering. The semantics of MODIFY (how it differs from the prior state) are not enforced — a MODIFY event simply consumes a non-PLACE / non-CANCEL / non-FILL slot. See “Customize this template” below for how to add MODIFY-meaningful constraints (e.g. forbidding MODIFY after FILL, or pinning a tick_price/qty delta).
The same pattern applies to any test-data-generation problem where rows have to satisfy referential integrity, temporal precedence, and cross-row aggregate rules: claim adjudication regression suites, eligibility records, audit logs, IoT event streams.
Who this is for
- Compliance engineers building surveillance-engine regression suites
- Bank IT teams building MiFID II / SEC Rule 613 audit-trail validators
- Software developers who need synthetic events that respect referential integrity and temporal rules
- Operations researchers learning constrained generation as a CSP problem
What you’ll build
- A constraint model with binary type indicators (is_place, is_modify, is_cancel, is_fill) and integer decision properties for ts_ms, qty, tick_price, venue_id
- Categorical regular-language transitions (PLACE first, nothing-after-CANCEL) encoded as pairwise temporal rules
- Cross-table aggregate constraint: total FILL quantity per order cannot exceed Order.original_qty
- An auxiliary fill_qty decision channeled to "qty when is_fill else 0" via two implies so the per-order fill-conservation aggregate stays linear
- Value-pinning: PLACE event’s qty and tick_price pinned to the order’s original_qty and original_tick_price via implies
- Venue eligibility encoded as a relationship lookup against the event’s chosen venue_id
- Post-solve verification via problem.verify() confirming every re-evaluable constraint in the returned trace (implies-bodied and all_different-bodied ICs are solver-side only and intentionally excluded; see step 5)
What’s included
- synthetic_order_lifecycle.py: main script with ontology, decisions, constraints, and solver call
- data/symbols.csv: 5 tradable symbols (AAPL, MSFT, GOOG, NVDA, TSLA)
- data/venues.csv: 5 trading venues (NYSE, NASDAQ, ARCA, BATS, IEX)
- data/symbol_venues.csv: 13 (symbol, venue) eligible pairs out of 25 possible (sparser than full coverage so venue eligibility visibly binds)
- data/orders.csv: 6 orders, each with symbol_id, original_qty, and original_tick_price (in integer ticks of 1c, so 17500 reads as $175.00)
- data/events.csv: 36 pre-allocated event slots (6 per order)
- pyproject.toml: Python package configuration
Prerequisites
Access
- A Snowflake account that has the RAI Native App installed.
- A Snowflake user with permissions to access the RAI Native App.
Tools
- Python >= 3.10
Quickstart
- Download ZIP:

      curl -O https://docs.relational.ai/templates/zips/v1/synthetic_order_lifecycle.zip
      unzip synthetic_order_lifecycle.zip
      cd synthetic_order_lifecycle

- Create venv:

      python -m venv .venv
      source .venv/bin/activate
      python -m pip install --upgrade pip

- Install:

      python -m pip install .

- Configure (prompts for Snowflake account, role, and profile name):

      rai init

- Run:

      python synthetic_order_lifecycle.py

Expected output. The script prints the formulation (~30 lines, omitted here), the solve-result block, the full event trace (all 36 rows; abridged below to the AAPL block), and per-order fill totals. Rows are sorted by (order_id, ts_ms); the four binary is_place / is_modify / is_cancel / is_fill indicators are collapsed into a single type column on the display side. Exact event types, timestamps, quantities, prices, and venues vary across runs and solver versions:

    Solve result:
    • status: OPTIMAL
    • objective: 0
    • solve time: 1.11s
    • num_points: 1
    • solver: MiniZinc_unknown

    Generated event trace (one row per slot, sorted by order then timestamp):
    order_id  symbol  event_id  ts_ms  type    qty  tick_price  venue
    1         AAPL    3         995    PLACE   100  17500       ARCA
    1         AAPL    5         996    MODIFY  100  17501       ARCA
    1         AAPL    6         997    MODIFY  100  17501       ARCA
    1         AAPL    1         998    MODIFY  81   17501       ARCA
    1         AAPL    4         999    MODIFY  100  17501       ARCA
    1         AAPL    2         1000   FILL    100  17501       ARCA
    ... [rows for orders 2-6 omitted for brevity; the script prints all 36 rows -- 6 events per order across orders 1-6]

    Filled quantity per order (cannot exceed Order.original_qty):
       order_id  original_qty  filled_qty
    0         1           100         100
    1         2            50          50
    2         3            80          80
    3         4           200         200
    4         5           120         120
    5         6            60          60

The AAPL block above is one of six orders; the others follow the same shape. PLACE has the smallest ts_ms per order with qty / tick_price pinned to the order row, and venues are constrained to the symbol’s eligible set (here {NYSE, NASDAQ, ARCA} for AAPL). The conservation IC is sum(fill_qty) <= original_qty; the run above happens to fill exactly original_qty per order, but feasible traces with lower fill totals are also valid.
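The invariants described above can also be re-checked outside the solver. The following is a minimal sketch in plain Python, not the template's own verification code: the Event dataclass and the check_trace helper are hypothetical names introduced for illustration, and the hard-coded rows simply copy the AAPL block from the sample output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical record type mirroring the printed trace columns.
    order_id: int
    ts_ms: int
    type: str  # "PLACE", "MODIFY", "CANCEL", or "FILL"
    qty: int
    venue: str


def check_trace(events, original_qty, eligible_venues):
    """Return a list of rule violations; an empty list means the trace passes."""
    problems = []
    by_order = defaultdict(list)
    for ev in events:
        by_order[ev.order_id].append(ev)

    for order_id, evs in by_order.items():
        evs = sorted(evs, key=lambda e: e.ts_ms)
        types = [e.type for e in evs]
        # Timestamps must be distinct within an order.
        if len({e.ts_ms for e in evs}) != len(evs):
            problems.append(f"order {order_id}: duplicate ts_ms")
        # Exactly one PLACE, and it must come first.
        if types.count("PLACE") != 1 or types[0] != "PLACE":
            problems.append(f"order {order_id}: PLACE is not the unique first event")
        # A CANCEL anywhere but the last position means something follows it.
        if "CANCEL" in types[:-1]:
            problems.append(f"order {order_id}: event after CANCEL")
        # Quantity conservation: total fills cannot exceed the order size.
        filled = sum(e.qty for e in evs if e.type == "FILL")
        if filled > original_qty[order_id]:
            problems.append(f"order {order_id}: filled {filled} exceeds original_qty")
        # Venue eligibility for the order's symbol.
        bad_venues = sorted({e.venue for e in evs} - eligible_venues[order_id])
        if bad_venues:
            problems.append(f"order {order_id}: ineligible venues {bad_venues}")
    return problems


aapl = [
    Event(1, 995, "PLACE", 100, "ARCA"),
    Event(1, 996, "MODIFY", 100, "ARCA"),
    Event(1, 997, "MODIFY", 100, "ARCA"),
    Event(1, 998, "MODIFY", 81, "ARCA"),
    Event(1, 999, "MODIFY", 100, "ARCA"),
    Event(1, 1000, "FILL", 100, "ARCA"),
]
print(check_trace(aapl, {1: 100}, {1: {"NYSE", "NASDAQ", "ARCA"}}))  # []
```

A checker like this is independent of the solver, so it can double as a regression test when the constraint set is customized.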
Template structure
    .
    ├── README.md
    ├── pyproject.toml
    ├── synthetic_order_lifecycle.py
    └── data/
        ├── symbols.csv
        ├── venues.csv
        ├── symbol_venues.csv
        ├── orders.csv
        └── events.csv

How it works
The solver decides every event’s type, timestamp, venue, quantity, and tick price. The script proceeds in five steps.
1. Define the ontology and load data. Symbol, Venue, Order, and OrderEvent concepts are declared with their identifying properties; SymbolVenue and a derived NotAllowedSymbolVenue capture the eligible (and dual disallowed) symbol/venue pairs. CSV rows from data/ populate every concept and relationship. NotAllowedSymbolVenue is an encoding artifact, not a domain concept: the CSP arithmetic supports only !=, *, +, -, so the rule is encoded as “forbid these pairs” rather than the more natural “require an allowed pair”.
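The "forbid these pairs" encoding needs the complement of the allowed list. A minimal sketch of that derivation in plain Python follows; the disallowed_pairs helper and the toy ids are assumptions for illustration, and the script's actual derivation may differ.

```python
from itertools import product

# Toy data (hypothetical ids, not the template's CSV contents).
symbol_ids = [1, 2, 3]
venue_ids = [1, 2]
allowed = {(1, 1), (1, 2), (2, 1), (3, 2)}


def disallowed_pairs(symbol_ids, venue_ids, allowed):
    """Every (symbol_id, venue_id) combination that is not explicitly allowed."""
    return sorted(set(product(symbol_ids, venue_ids)) - set(allowed))


disallowed = disallowed_pairs(symbol_ids, venue_ids, allowed)
print(disallowed)  # [(2, 2), (3, 1)]
```

With the shipped data, 13 allowed pairs out of 25 leave 12 disallowed pairs.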
2. Declare decision variables. Each event slot gets binary type indicators (is_place, is_modify, is_cancel, is_fill) and integer decisions (ts_ms, qty, tick_price, venue_id). An auxiliary fill_qty decision channels qty when is_fill else 0 so the per-order fill-conservation aggregate stays linear.
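The channeling idea can be illustrated without a solver. The sketch below brute-forces a tiny domain in plain Python. The helper names are hypothetical, and the two implications shown (is_fill == 1 implies fill_qty == qty; is_fill == 0 implies fill_qty == 0) are an assumed reading of how the channel is written in the script.

```python
def implies(premise: bool, consequent: bool) -> bool:
    """Material implication: false only when the premise holds and the consequent does not."""
    return (not premise) or consequent


def channel_holds(is_fill: int, qty: int, fill_qty: int) -> bool:
    """Both channeling implications for a single event slot."""
    return implies(is_fill == 1, fill_qty == qty) and implies(is_fill == 0, fill_qty == 0)


def feasible_fill_qty(is_fill: int, qty: int, max_qty: int) -> list[int]:
    """All fill_qty values in [0, max_qty] that satisfy the channel."""
    return [f for f in range(max_qty + 1) if channel_holds(is_fill, qty, f)]


print(feasible_fill_qty(is_fill=1, qty=7, max_qty=10))  # [7]
print(feasible_fill_qty(is_fill=0, qty=7, max_qty=10))  # [0]
```

Because the two implications leave exactly one admissible fill_qty per slot, the per-order aggregate can sum fill_qty directly instead of the product is_fill * qty.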
3. Add sequencing rules as pairwise temporal constraints. Conditional rules read as ‘if premise then consequent’. PLACE-first and nothing-after-CANCEL are pairwise rules over two refs into the same concept; A.order == B.order asserts same-order without a free Order variable:
    A = OrderEvent.ref()
    B = OrderEvent.ref()

    place_first_ic = model.where(
        A.order == B.order,
        A.event_id != B.event_id,
    ).require(implies(A.is_place == 1, A.ts_ms < B.ts_ms))

Distinctness within a group is one global constraint, not pairwise !=. all_different(...).per(...) lowers to MiniZinc’s native alldifferent propagator:

    distinct_ts_ic = model.require(all_different(OrderEvent.ts_ms).per(OrderEvent.order))

4. Walk relationships in line for cross-table rules. Reading the order’s original_qty from an event, or matching disallowed venue pairs through the order’s symbol, needs no intermediate refs:
    qty_upper_ic = model.require(OrderEvent.qty <= OrderEvent.order.original_qty)

    venue_ok_ic = model.where(
        OrderEvent.order.symbol.id(NotAllowedSymbolVenue.symbol_id),
    ).require(NotAllowedSymbolVenue.venue_id != OrderEvent.venue_id)

Value-pinning couples a decision variable to a data property via implies. The PLACE event’s qty and tick_price are pinned to the order’s original_qty and original_tick_price so the generated trace stays internally consistent with the order’s stated price and size:

    place_qty_match_ic = model.require(
        implies(OrderEvent.is_place == 1, OrderEvent.qty == OrderEvent.order.original_qty)
    )
    place_price_match_ic = model.require(
        implies(
            OrderEvent.is_place == 1,
            OrderEvent.tick_price == OrderEvent.order.original_tick_price,
        )
    )

5. Solve and verify. implies and all_different are solver-only. They go to satisfy() but must NOT be passed to verify(), because the relational engine cannot re-evaluate wire-format constraint relations and would return silently-OK regardless of whether the constraint actually holds. The remaining ICs are plain relational arithmetic and ARE re-evaluated by verify():
    problem.solve("minizinc", time_limit_sec=60)
    problem.solve_info().display()

    problem.verify(
        type_sum_ic,
        exactly_one_place_ic,
        at_most_one_cancel_ic,
        qty_upper_ic,
        venue_ok_ic,
        fill_sum_ic,
    )
    model.require(problem.termination_status() == "OPTIMAL")

Customize this template
- Use your own pool by replacing the five CSV files with your symbols, venues, allowed (symbol, venue) pairs, orders, and event slots. The constraint structure does not change. Add more events per order to allow longer traces; the model adapts to the number of rows in events.csv. Two assumptions on your data:
  - venues.csv ids should be contiguous 1..N: the venue_id decision domain is [1, max(venue_id)], so non-contiguous ids would let the solver pick venue ids that don’t exist.
  - At least one disallowed (symbol, venue) pair must exist: if your symbol_venues.csv covers every symbol×venue combination, the empty disallowed list breaks model.data(...).to_schema(). In that case drop venue_ok_ic.
- Force a CANCEL on every order by changing at_most_one_cancel_ic from <= 1 to == 1, then bumping the per-order event count so the order has room for a PLACE plus other events before the CANCEL.
- Add the inverse rule “no MODIFY after FILL” with the same shape as place_first_ic, but filtering B to fills. A and B are the two OrderEvent.ref() aliases declared earlier in the script alongside place_first_ic:

      # A = OrderEvent.ref(); B = OrderEvent.ref()  # already declared earlier
      no_modify_after_fill_ic = model.where(
          A.order == B.order,
          A.event_id != B.event_id,
          B.is_fill == 1,
      ).require(implies(A.is_modify == 1, A.ts_ms < B.ts_ms))

- Generate a “smallest violating trace” instead of a positive trace by negating one of the rules (e.g. drop no_after_cancel_ic and add model.require(sum(A.is_cancel + B.is_fill - 1).per(Order) >= 0) plus a temporal predicate) and minimizing the number of events. The model is already in optimization-ready shape, so the termination-status gate stays at "OPTIMAL".
- Replace the synthetic time horizon by reading ts_ms bounds from your real session schedule (market open / market close) and updating TS_MIN / TS_MAX.
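For the last item, the bounds can be derived from a session schedule rather than typed by hand. This is a minimal sketch under stated assumptions: it treats ts_ms as milliseconds since local midnight and uses 09:30 to 16:00 as an example session. Neither the convention nor the hours come from the template, and ms_since_midnight is a hypothetical helper.

```python
from datetime import time


def ms_since_midnight(t: time) -> int:
    """Convert a wall-clock time to integer milliseconds since midnight."""
    return ((t.hour * 60 + t.minute) * 60 + t.second) * 1000 + t.microsecond // 1000


# Assumed example session; replace with your venue's actual schedule.
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = time(16, 0)

TS_MIN = ms_since_midnight(MARKET_OPEN)
TS_MAX = ms_since_midnight(MARKET_CLOSE)
print(TS_MIN, TS_MAX)  # 34200000 57600000
```

If the script measures ts_ms from a different origin (for example, epoch milliseconds), the conversion needs to change accordingly.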
Troubleshooting
Import error or AttributeError on relationalai
- Confirm your virtual environment is active: which python should point to .venv.
- Reinstall dependencies by running python -m pip install . from the template root. The pinned version (relationalai==1.1.0) ships the solve_info(), verify(), and chained where().require() APIs this template uses; older versions lack them and produce attribute errors.
- If you share a venv across templates, run python -m pip install --upgrade --force-reinstall relationalai==1.1.0.
FileNotFoundError on a CSV
- The script resolves data paths as Path(__file__).parent / "data". Run python synthetic_order_lifecycle.py from the unzipped template root, not from a parent directory.
- Confirm data/ contains symbols.csv, venues.csv, symbol_venues.csv, orders.csv, and events.csv.
Authentication or configuration errors
- Run rai init to create or update your RelationalAI/Snowflake configuration.
- If you have multiple profiles, set export RAI_PROFILE=<your_profile>.
MiniZinc solver not available
- This template uses the MiniZinc constraint solver. Ensure the RAI Native App version supports MiniZinc.
- HiGHS is not appropriate here — this is a discrete satisfaction model with categorical decisions, not LP/MILP.
Solver returns INFEASIBLE
- The pool may be too small. If you reduce events.csv below the per-order slot count required by your constraints (e.g. one PLACE plus a forced CANCEL with no room for fills), no trace can satisfy all the rules. Add more rows to events.csv for that order.
- A symbol with zero allowed venues in symbol_venues.csv will block the venue_ok_ic constraint, because every event has to land on an allowed venue. Confirm at least one (symbol, venue) row exists per symbol used in orders.csv.
- Every order in orders.csv must have original_qty >= 1 and original_tick_price >= 1. The qty and tick_price decision domains start at 1, and the PLACE-event pinning ICs equate them to the order’s stated values; a row with zero in either column produces an empty domain and immediate INFEASIBLE.
Empty disallowed pairs (full venue coverage)
- If you replace symbol_venues.csv with a list that covers every (symbol, venue) combination, the derived disallowed_csv is empty and model.data(...).to_schema() raises ValueError: empty data (or a similar zero-row schema error from pandas). Drop the venue_ok_ic constraint when full coverage is intentional.
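Several of the data conditions in this section can be caught before the solver runs. The following is a minimal pre-flight sketch in plain Python, not part of the template: preflight is a hypothetical helper, and it assumes the CSV rows have already been loaded into simple lists, pairs, and dicts.

```python
from itertools import product


def preflight(symbol_ids, venue_ids, allowed_pairs, orders):
    """Return a list of data problems; orders is a list of dicts with
    order_id, symbol_id, original_qty, and original_tick_price keys."""
    problems = []
    # The venue_id decision domain is [1, max(venue_id)], so ids must be 1..N.
    if sorted(venue_ids) != list(range(1, len(venue_ids) + 1)):
        problems.append("venue ids are not contiguous 1..N")
    allowed = set(allowed_pairs)
    # Full coverage leaves no disallowed pair for venue_ok_ic to load.
    if set(product(symbol_ids, venue_ids)) <= allowed:
        problems.append("no disallowed (symbol, venue) pair; drop venue_ok_ic")
    symbols_with_venue = {symbol for symbol, _ in allowed}
    for row in orders:
        if row["symbol_id"] not in symbols_with_venue:
            problems.append(f"order {row['order_id']}: symbol has no allowed venue")
        if row["original_qty"] < 1 or row["original_tick_price"] < 1:
            problems.append(f"order {row['order_id']}: qty and tick price must be >= 1")
    return problems


good_orders = [
    {"order_id": 1, "symbol_id": 1, "original_qty": 100, "original_tick_price": 17500}
]
print(preflight([1, 2], [1, 2, 3], {(1, 1), (1, 2), (2, 3)}, good_orders))  # []
```

Running a check like this first turns an opaque INFEASIBLE or schema error into a specific message about the offending row.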
The generated trace differs between runs
- This is constraint satisfaction, not optimization. Any feasible trace is a valid answer; the solver is free to return different ones across runs.
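- To make runs more repeatable, switch to optimization, e.g. problem.minimize(sum(OrderEvent.ts_ms)) selects a trace with the smallest total of event timestamps. If several traces tie on that objective, the solver may still return different ones across runs.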