Behind the scenes

How SanctionsScreening works

From official government feeds to a single deduplicated record and an audit-ready PDF — here's exactly what happens when you run a search.


  OFAC SDN  ─┐
  OFAC Cons ─┤      ┌────────────┐     ┌──────────────┐     ┌─────────────┐
  EU CFSP   ─┼─►   │  Ingest +  │ ──► │  Deduplicate │ ──► │  Fuzzy      │ ─► PDF Report
  UK OFSI   ─┤      │  Normalize │     │  across      │     │  Search +   │
  UN UNSC   ─┘      └────────────┘     │  regimes     │     │  AI context │
                                       └──────────────┘     └─────────────┘
1

Ingest official lists, every few hours

We pull directly from the U.S. Treasury OFAC SDN and Consolidated feeds, the EU Consolidated XML, the UK OFSI list, and the UN Security Council Consolidated List. Each sync uses Last-Modified and ETag headers to skip unchanged files, and SHA-1 row hashes to only update rows that actually changed.
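As a rough illustration of that sync logic, here is a minimal Python sketch. The function names, the `uid` key and the row serialisation are assumptions made for the example, not our production code. A server that answers a conditional request with 304 Not Modified lets the sync skip the file entirely; otherwise the per-row hashes decide which rows get written.

```python
import hashlib


def conditional_headers(etag=None, last_modified=None):
    """Build headers for a conditional GET from values saved at the last sync."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def row_hash(row):
    """SHA-1 over a canonical (key-sorted) serialisation of one parsed row."""
    canonical = "\x1f".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def changed_rows(stored_hashes, incoming_rows, key="uid"):
    """Return only the rows that are new or whose hash differs from the stored one.

    stored_hashes maps a row identifier to the hash recorded at the last sync.
    The "uid" field name is hypothetical.
    """
    return [
        row for row in incoming_rows
        if stored_hashes.get(row[key]) != row_hash(row)
    ]
```

Sorting the keys before hashing keeps the hash stable even if a feed reorders its columns between releases.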

2

Normalize & enrich the raw data

Source XML/CSV is parsed into a unified schema: primary names, aliases, dates of birth, places of birth, passport and national ID numbers, addresses, nationalities, vessels, aircraft tail numbers, crypto wallets and entity-to-entity relationships. Legal authority (e.g. EO 14024, CFSP 2014/145) is mapped onto each listing.
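One way to picture the unified schema is as a pair of Python dataclasses. The class and field names below are illustrative assumptions that mirror the attributes listed above; the real schema may be organised differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Listing:
    """One appearance of a party on one official list."""
    source: str                             # e.g. "OFAC_SDN", "EU", "UK_OFSI", "UN"
    source_id: str                          # identifier used by that list
    legal_authority: Optional[str] = None   # e.g. "EO 14024"


@dataclass
class SanctionedParty:
    """Hypothetical unified record built from any source feed."""
    primary_name: str
    aliases: List[str] = field(default_factory=list)
    dates_of_birth: List[str] = field(default_factory=list)
    places_of_birth: List[str] = field(default_factory=list)
    id_numbers: List[str] = field(default_factory=list)       # passports, national IDs
    addresses: List[str] = field(default_factory=list)
    nationalities: List[str] = field(default_factory=list)
    vessels: List[str] = field(default_factory=list)
    aircraft_tail_numbers: List[str] = field(default_factory=list)
    crypto_wallets: List[str] = field(default_factory=list)
    related_party_ids: List[str] = field(default_factory=list)
    listings: List[Listing] = field(default_factory=list)
```

Keeping listings as a separate list on the party is what allows a single record to carry several list badges later in the pipeline.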

3

Deduplicate the same person across regimes

A normalized name key (accent-stripped, token-sorted) combined with corroborating signals — DOB year, passport number, nationality — merges the same individual across OFAC, EU, UK and UN into one record. So 'Vladimir Putin' appears once, with all four list badges, not four separate hits.
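The name key can be sketched in a few lines of Python. The merge rule shown here (same key, at least one corroborating signal present on both records and in agreement, none in conflict) is an assumption made for illustration; the field names `dob_year`, `passport` and `nationality` are hypothetical.

```python
import unicodedata


def name_key(name):
    """Accent-stripped, lowercased, token-sorted key for a name."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Treat anything that is not a letter or digit as a token separator.
    cleaned = "".join(c if c.isalnum() else " " for c in stripped.lower())
    return " ".join(sorted(cleaned.split()))


def same_person(a, b, signals=("dob_year", "passport", "nationality")):
    """Decide whether two listings describe the same individual (assumed rule)."""
    if name_key(a["name"]) != name_key(b["name"]):
        return False
    agreements = 0
    for signal in signals:
        left, right = a.get(signal), b.get(signal)
        if left and right:
            if left != right:
                return False      # a conflicting signal blocks the merge
            agreements += 1
    return agreements >= 1        # a matching name alone is not enough
```

Token sorting is what makes 'PUTIN, Vladimir Vladimirovich' and 'Vladimir Vladimirovich Putin' produce the same key, regardless of how each list orders surname and given names.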

4

Fuzzy search with word-boundary safety

Postgres trigram (pg_trgm) indexes power fuzzy matching, so 'Usama bin Ladin' matches 'Osama Bin Laden'. Word-boundary regex (\m and \M) eliminates false positives where a short query accidentally substring-matches inside an unrelated name.
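To show the idea outside the database, the Python sketch below approximates pg_trgm-style similarity (shared trigrams divided by total distinct trigrams, with each word padded by two leading spaces and one trailing space) and uses Python's `\b` as a stand-in for the Postgres `\m` and `\M` anchors. It is an approximation for illustration, assumes accent-stripped ASCII input, and is not the query we run in production.

```python
import re


def trigrams(text):
    """Set of padded trigrams per word, in the style of pg_trgm."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams over all distinct trigrams, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def whole_word_match(query, name):
    """True only if the query appears in the name as a whole word or phrase."""
    pattern = r"\b" + re.escape(query) + r"\b"
    return re.search(pattern, name, flags=re.IGNORECASE) is not None
```

With this sketch, `similarity("Usama bin Ladin", "Osama Bin Laden")` comes out at 10/21, roughly 0.48, which is above pg_trgm's default similarity threshold of 0.3. `whole_word_match("Li", "Ali Hassan")` is false because 'li' only occurs inside another word, while `whole_word_match("Li", "Li Wei")` is true.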

5

Score, label and rank each hit

Every result carries a deterministic confidence score and label (Exact, High, Medium, Possible). Alias-only matches are capped at 0.80 and labelled 'POSSIBLE MATCH' so analysts treat them as leads, not conclusions.
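A scoring rule of this kind might look like the following Python sketch. Only the 0.80 alias cap and the label names come from the description above; the thresholds separating Exact, High, Medium and Possible are assumptions chosen for the example.

```python
ALIAS_CAP = 0.80          # from the description above
HIGH_THRESHOLD = 0.85     # assumed value, for illustration only
MEDIUM_THRESHOLD = 0.60   # assumed value, for illustration only


def score_hit(similarity, matched_on_alias_only):
    """Return (score, label) for one hit; the same inputs always give the same output."""
    if matched_on_alias_only:
        # Alias-only matches are capped and always flagged as leads.
        return min(similarity, ALIAS_CAP), "POSSIBLE MATCH"
    if similarity >= 1.0:
        return similarity, "Exact"
    if similarity >= HIGH_THRESHOLD:
        return similarity, "High"
    if similarity >= MEDIUM_THRESHOLD:
        return similarity, "Medium"
    return similarity, "Possible"
```

Because the function has no randomness and no hidden state, rerunning the same search against the same data reproduces the same score and label.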

6

AI risk context (Pro and above)

Google Gemini summarises why an entity was designated (the program code, the underlying statute, and the geopolitical context), combining data from every list the person appears on. Summaries are cached per entity and regenerated only when the underlying record changes.
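The caching behaviour can be sketched in Python as a cache keyed on a fingerprint of the record. The class name, the SHA-256 fingerprint and the `generate` callable (a stand-in for the model call) are assumptions for illustration; no real model API is shown here.

```python
import hashlib
import json


class SummaryCache:
    """Per-entity cache that regenerates a summary only when the record changes."""

    def __init__(self, generate):
        self._generate = generate   # callable(record) -> str, hypothetical model wrapper
        self._store = {}            # entity_id -> (fingerprint, summary)

    @staticmethod
    def fingerprint(record):
        """Stable hash of the record's content, independent of key order."""
        payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, entity_id, record):
        """Return the cached summary, or regenerate it if the record has changed."""
        current = self.fingerprint(record)
        cached = self._store.get(entity_id)
        if cached is not None and cached[0] == current:
            return cached[1]
        summary = self._generate(record)
        self._store[entity_id] = (current, summary)
        return summary
```

Comparing fingerprints rather than timestamps means a sync that re-downloads an unchanged record does not trigger a new summary.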

7

Timestamped PDF for your audit file

One click produces a signed-style PDF capturing the query, timestamp, every hit, match scores, list sources and the analyst's name. Attach it to the KYC or onboarding record as evidence the check was performed.

Why this architecture matters

Always current

Direct ingestion from official feeds, with no third-party reseller delay. Most lists are updated within hours of a change at the source.

Globally consistent

One record per person across regimes means you won't miss a hit because you searched only OFAC and the EU added the person yesterday.

Defensible

Deterministic scoring, word-boundary matching and timestamped PDFs give you a paper trail regulators can follow.

See it on a real search

Run a free check — no signup required.