# Rspamd Architecture

**How rspamd provides spam filtering, DKIM signing, and content analysis in the Mailu mail flow.**

## Role in Mail Flow

Rspamd integrates with Postfix via the milter protocol. Every incoming and outgoing message passes through rspamd for analysis before delivery.

```{mermaid}
sequenceDiagram
    participant S as Sending MTA
    participant P as Postfix
    participant R as rspamd
    participant U as Unbound DNS
    participant D as Dovecot

    S->>P: SMTP delivery (port 25)
    P->>R: milter check (port 11332)
    R->>U: RBL/SPF/DKIM DNS queries
    U-->>R: DNS responses
    R-->>P: accept / reject / greylist
    P->>D: LMTP delivery (port 2525)
```

## Spam Detection Layers

Rspamd uses multiple detection methods in parallel:

### 1. DNS-Based Blocklists (RBLs)

RBL providers maintain lists of known spam-sending IP addresses and domains. Rspamd queries these via DNS. A positive response (e.g. `127.0.0.2` from Spamhaus) adds score to the message.

**Configured RBLs:**

- Spamhaus ZEN (SBL + XBL + PBL)
- Spamhaus DBL (domain-based)
- Barracuda RBL
- Spamcop

### 2. Bayesian Classification

Statistical analysis of message content. Rspamd learns from previous spam and ham (legitimate mail) to classify new messages.

- **Auto-learning**: Messages above spam threshold are automatically learned as spam, below ham threshold as ham
- **Storage**: Bayes database persisted in `/var/lib/rspamd` PVC

### 3. Fuzzy Hashing

Compares message content hashes against databases of known spam:

- **Local fuzzy storage**: Learns from locally identified spam (port 11333)
- **Remote fuzzy servers**: rspamd.com maintains a global fuzzy database with known spam hashes

### 4. Protocol Checks

- **SPF**: Sender Policy Framework verification
- **DKIM**: DomainKeys Identified Mail signature verification
- **DMARC**: Domain-based Message Authentication policy enforcement
- **ARC**: Authenticated Received Chain for forwarded messages

### 5. Scoring System

Each check adds or subtracts from a message score.
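
To make the additive scoring concrete, here is a minimal Python sketch of how per-check scores could be summed and mapped to an action. It is only an illustration, not rspamd's actual implementation: the symbol names and weights are hypothetical, and the threshold values are the ones listed in this section's table.

```python
# Thresholds taken from this section's table, checked from most to least severe.
THRESHOLDS = [
    (12.0, "reject"),
    (5.0, "add header"),
    (3.0, "greylist"),
]


def total_score(symbols: dict[str, float]) -> float:
    """Sum the contributions of all triggered checks (negative values lower the score)."""
    return sum(symbols.values())


def action_for(score: float) -> str:
    """Return the most severe action whose threshold the score reaches."""
    for threshold, action in THRESHOLDS:
        if score >= threshold:
            return action
    return "no action"


# Hypothetical symbols and weights, chosen only for illustration.
triggered = {
    "EXAMPLE_RBL_HIT": 4.0,      # sender IP listed on a blocklist
    "EXAMPLE_BAYES_SPAM": 3.5,   # Bayes classifier leans towards spam
    "EXAMPLE_DKIM_VALID": -1.0,  # valid DKIM signature lowers the score
}

score = total_score(triggered)
print(score, action_for(score))  # 6.5 add header
```

In this sketch a valid DKIM signature offsets part of the blocklist hit, so the message is tagged as spam rather than rejected.
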
Actions are taken based on thresholds:

| Threshold | Score | Action |
|-----------|-------|--------|
| Greylist | >= 3 | Temporary rejection (legitimate servers retry) |
| Add Header | >= 5 | Mark as spam via X-Spam header |
| Reject | >= 12 | Permanent rejection |

## Unbound DNS Sidecar

RBL providers such as Spamhaus block queries that arrive via public DNS resolvers (Cloudflare, Google), because those resolvers hide the original query source and make per-client rate limiting impossible.

The solution is a dedicated recursive DNS resolver (Unbound) running as a sidecar container in the rspamd pod:

```{mermaid}
graph TB
    subgraph "rspamd Pod"
        R[rspamd container<br/>port 11332-11334]
        U[Unbound container<br/>port 53]
        R -->|DNS queries<br/>127.0.0.1:53| U
    end

    U -->|recursive queries| RBL[RBL Providers<br/>Spamhaus, Barracuda, etc.]
    U -->|forward .cluster.local| KD[kube-dns<br/>10.43.0.10]
```

Key Unbound configuration:

- **Recursive resolution**: Queries go directly to authoritative DNS servers, not public resolvers
- **Kubernetes integration**: `.cluster.local` queries forwarded to kube-dns for service discovery
- **RBL compatibility**: `private-domain` settings allow 127.0.0.x responses from RBL zones
- **QNAME minimization disabled**: Required by Spamhaus for correct query handling

## Storage Architecture

Rspamd persists learned data in a 5Gi PVC:

- **Bayes database**: Statistical model trained on spam/ham samples
- **Fuzzy hashes**: Local fuzzy hash database
- **Statistics**: Historical processing data

This data survives pod restarts and is critical for maintaining spam detection accuracy over time.

## Web Interface

Rspamd provides an HTTP API and web UI on port 11334 for:

- Viewing message processing history
- Checking symbol scores and triggered rules
- Monitoring throughput statistics
- Manual spam/ham learning

Access is protected by Traefik ForwardAuth middleware (Mailu admin authentication).

## See Also

- [Configure Spam Filtering](../how-to/configure-spam-filtering.md) - Setup guide for spam filter improvements
- [Component Specifications](../reference/component-specifications.md) - Rspamd ports and resources