Rspamd Architecture

How rspamd provides spam filtering, DKIM signing, and content analysis in the Mailu mail flow.

Role in Mail Flow

Rspamd integrates with Postfix via the milter protocol. Every incoming and outgoing message passes through rspamd for analysis before delivery.

    sequenceDiagram
    participant S as Sending MTA
    participant P as Postfix
    participant R as rspamd
    participant U as Unbound DNS
    participant D as Dovecot

    S->>P: SMTP delivery (port 25)
    P->>R: milter check (port 11332)
    R->>U: RBL/SPF/DKIM DNS queries
    U-->>R: DNS responses
    R-->>P: accept / reject / greylist
    P->>D: LMTP delivery (port 2525)

Spam Detection Layers

Rspamd uses multiple detection methods in parallel:

1. DNS-Based Blocklists (RBLs)

RBL providers maintain lists of known spam-sending IP addresses and domains. Rspamd queries these lists via DNS. A positive response (e.g. 127.0.0.2 from Spamhaus) increases the message's spam score.

Configured RBLs:

Spamhaus ZEN (SBL + XBL + PBL)
Spamhaus DBL (domain-based)
Barracuda RBL
Spamcop
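
The DNS lookup behind an RBL check can be sketched as follows. This is a minimal illustration, not rspamd's implementation: the function names `dnsbl_query_name` and `is_listed` are hypothetical, and only IPv4 is handled. It shows how the client IP's octets are reversed and prepended to the blocklist zone (here `zen.spamhaus.org`) to form the name that is queried.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to test an IPv4 address against a DNSBL zone."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    # 192.0.2.99 is looked up as 99.2.0.192.<zone>
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(answer: str) -> bool:
    """Treat any 127.0.0.x A record as a positive listing (simplified)."""
    return answer.startswith("127.0.0.")


print(dnsbl_query_name("192.0.2.99", "zen.spamhaus.org"))
```

An NXDOMAIN response means the address is not listed; a 127.0.0.x answer means it is, with the last octet indicating which sub-list matched.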

2. Bayesian Classification

Statistical analysis of message content. Rspamd learns from previous spam and ham (legitimate mail) to classify new messages.

Auto-learning: Messages scoring above the spam threshold are automatically learned as spam; messages scoring below the ham threshold are learned as ham
Storage: The Bayes database is persisted on the PVC mounted at /var/lib/rspamd
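
The auto-learning rule above amounts to a two-sided threshold test. The sketch below is only an illustration of that decision: the function name `autolearn_class` and the threshold values in the usage line are hypothetical, since this page does not state the configured auto-learn thresholds.

```python
from typing import Optional


def autolearn_class(score: float, spam_threshold: float, ham_threshold: float) -> Optional[str]:
    """Return the class a message would be auto-learned as, or None."""
    if score > spam_threshold:
        return "spam"  # confidently spam: feed to the Bayes classifier as spam
    if score < ham_threshold:
        return "ham"   # confidently legitimate: feed as ham
    return None        # ambiguous scores are not learned at all


# Hypothetical thresholds, chosen only for the example.
print(autolearn_class(15.0, spam_threshold=12.0, ham_threshold=-1.0))
```

Leaving the middle band unlearned keeps borderline messages from polluting the statistical model.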

3. Fuzzy Hashing

Compares message content hashes against databases of known spam:

Local fuzzy storage: Learns from locally identified spam (port 11333)
Remote fuzzy servers: rspamd.com maintains a global fuzzy database with known spam hashes
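
The idea of matching near-duplicate messages can be illustrated with word shingles. This is a deliberately simplified stand-in, not rspamd's actual fuzzy algorithm: the functions `shingles` and `similarity`, the window width of 3, and the use of SHA-256 with Jaccard similarity are all assumptions made for the example.

```python
import hashlib
from typing import Set


def shingles(text: str, width: int = 3) -> Set[str]:
    """Hash every run of `width` consecutive words in the text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < width:
        # Text shorter than one window: hash it as a single shingle.
        return {hashlib.sha256(" ".join(words).encode()).hexdigest()}
    return {
        hashlib.sha256(" ".join(words[i:i + width]).encode()).hexdigest()
        for i in range(len(words) - width + 1)
    }


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


known_spam = "buy cheap watches now limited offer click here today"
variant = "buy cheap watches now limited offer click here tomorrow"
print(similarity(known_spam, variant))
```

Unlike an exact hash, this kind of comparison still scores highly when a spammer changes a few words, which is the property fuzzy matching relies on.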

4. Protocol Checks

SPF: Sender Policy Framework verification
DKIM: DomainKeys Identified Mail signature verification
DMARC: Domain-based Message Authentication, Reporting, and Conformance policy evaluation
ARC: Authenticated Received Chain validation for forwarded messages
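
Of these checks, DMARC depends on a policy the sender domain publishes as a TXT record at `_dmarc.<domain>`. Rspamd handles this internally; the sketch below only illustrates the record format. The function name `parse_dmarc_record` and the example record are hypothetical.

```python
from typing import Dict


def parse_dmarc_record(txt: str) -> Dict[str, str]:
    """Split a DMARC TXT record into its tag=value pairs."""
    tags: Dict[str, str] = {}
    for part in txt.split(";"):
        name, sep, value = part.strip().partition("=")
        if not sep:
            continue  # skip empty or malformed fragments
        tags[name.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags


# Example record for a hypothetical domain.
record = parse_dmarc_record("v=DMARC1; p=reject; rua=mailto:dmarc@example.com")
print(record["p"])
```

The `p` tag (`none`, `quarantine`, or `reject`) is what the receiving side consults when SPF and DKIM alignment both fail.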

5. Scoring System

Each check adds or subtracts from a message score. Actions are taken based on thresholds:

Threshold	Score	Action
Greylist	>= 3	Temporary rejection (legitimate servers retry)
Add Header	>= 5	Mark as spam via X-Spam header
Reject	>= 12	Permanent rejection
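
The threshold table can be read as a lookup from total score to action. The sketch below uses the three thresholds listed above; the symbol names and per-symbol weights in the usage example are hypothetical and are not taken from the deployed configuration.

```python
from typing import Dict

# Thresholds from the table above, checked from most to least severe.
THRESHOLDS = [
    ("reject", 12.0),
    ("add header", 5.0),
    ("greylist", 3.0),
]


def action_for_score(score: float) -> str:
    """Return the action for the highest threshold the score reaches."""
    for action, threshold in THRESHOLDS:
        if score >= threshold:
            return action
    return "no action"


def total_score(symbols: Dict[str, float]) -> float:
    """Sum the weights of all triggered symbols (negative weights reduce the score)."""
    return sum(symbols.values())


# Hypothetical symbols and weights, for illustration only.
triggered = {"EXAMPLE_BAYES_SPAM": 4.0, "EXAMPLE_RBL_HIT": 2.5, "EXAMPLE_DKIM_OK": -0.2}
print(action_for_score(total_score(triggered)))
```

Checking the highest threshold first ensures that a message scoring 13 is rejected rather than merely greylisted.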

Unbound DNS Sidecar

RBL providers such as Spamhaus refuse queries that arrive through public DNS resolvers (Cloudflare, Google), because shared resolvers hide the original query source and make per-user rate limiting impossible.

The solution is a dedicated recursive DNS resolver (Unbound) running as a sidecar container in the rspamd pod:

    graph TB
    subgraph "rspamd Pod"
        R[rspamd container<br/>ports 11332-11334]
        U[Unbound container<br/>port 53]
        R -->|DNS queries<br/>127.0.0.1:53| U
    end

    U -->|recursive queries| RBL[RBL Providers<br/>Spamhaus, Barracuda, etc.]
    U -->|forward .cluster.local| KD[kube-dns<br/>10.43.0.10]

Key Unbound configuration:

Recursive resolution: Queries go directly to authoritative DNS servers, not public resolvers
Kubernetes integration: .cluster.local queries forwarded to kube-dns for service discovery
RBL compatibility: private-domain settings allow 127.0.0.x responses from RBL zones
QNAME minimization disabled: Required by Spamhaus for correct query handling
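
One way to confirm the sidecar is doing its job is to look at what Spamhaus returns. The sketch below classifies an A-record answer using the error codes Spamhaus documents for blocked or malformed queries, to the best of my knowledge; verify them against the current Spamhaus documentation before relying on them. The function name `classify_rbl_answer` is hypothetical.

```python
# Error answers Spamhaus returns instead of a real listing (assumed current;
# check the provider's documentation).
SPAMHAUS_ERROR_CODES = {
    "127.255.255.252": "typing error in the DNSBL zone name",
    "127.255.255.254": "query arrived via a public/open resolver",
    "127.255.255.255": "excessive number of queries",
}


def classify_rbl_answer(answer: str) -> str:
    """Distinguish a genuine listing from a provider-side error response."""
    if answer in SPAMHAUS_ERROR_CODES:
        return f"error: {SPAMHAUS_ERROR_CODES[answer]}"
    if answer.startswith("127.0.0."):
        return "listed"
    return "unexpected"


print(classify_rbl_answer("127.255.255.254"))
```

If lookups made from the rspamd pod come back with the public-resolver error, queries are not going through Unbound's recursive path.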

Storage Architecture

Rspamd persists learned data in a 5Gi PVC:

Bayes database: Statistical model trained on spam/ham samples
Fuzzy hashes: Local fuzzy hash database
Statistics: Historical processing data

This data survives pod restarts and is critical for maintaining spam detection accuracy over time.

Web Interface

Rspamd provides an HTTP API and web UI on port 11334 for:

Viewing message processing history
Checking symbol scores and triggered rules
Monitoring throughput statistics
Manual spam/ham learning

Access is protected by Traefik ForwardAuth middleware (Mailu admin authentication).
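
For scripted monitoring, the same controller can be queried over HTTP. The sketch below assumes direct in-cluster access to the controller, bypassing Traefik, and authentication via the controller's `Password` header; the base URL `http://rspamd:11334`, the password value, and the helper names are hypothetical. Requests routed through the ForwardAuth middleware would need Mailu admin credentials instead.

```python
import json
import urllib.request


def build_stat_request(base_url: str, password: str) -> urllib.request.Request:
    """Prepare a GET request for the controller's /stat endpoint."""
    url = f"{base_url.rstrip('/')}/stat"
    return urllib.request.Request(url, headers={"Password": password})


def fetch_stats(base_url: str, password: str, timeout: float = 5.0) -> dict:
    """Fetch and decode throughput statistics (requires network access)."""
    req = build_stat_request(base_url, password)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Hypothetical service name and password, for illustration only.
req = build_stat_request("http://rspamd:11334/", "changeme")
print(req.full_url)
```

Separating request construction from the network call keeps the URL and header logic easy to check without a running rspamd instance.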