Tuesday, July 8, 2008

Duplicate detection

A basic mechanism to make a service idempotent is duplicate detection. For one-way services, duplicate detection may be sufficient. Request/response services also require that exactly the same response message is returned when a request is repeated.
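
To illustrate the request/response case, here is a minimal Python sketch. The class name `IdempotentService`, the `handle` method and the in-memory dictionary are all assumptions made for the example. A real service would keep the stored responses in its database, ideally in the same transaction as the business logic.

```python
class IdempotentService:
    """Wraps a handler so that a repeated request gets the original response."""

    def __init__(self, handler):
        self._handler = handler
        # request id -> response that was already returned for it
        # (hypothetical in-memory stand-in for a database table)
        self._responses = {}

    def handle(self, request_id, payload):
        # Duplicate: do not run the handler again, replay the stored response.
        if request_id in self._responses:
            return self._responses[request_id]
        # First time: run the handler and remember what it answered.
        response = self._handler(payload)
        self._responses[request_id] = response
        return response
```

A one-way service only needs the "have I seen this id" check. The stored response is what makes the request/response variant idempotent from the caller's point of view.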

Duplicate detection isn't that hard:
  1. Unique value in the request. Multiple options exist:
    • business identifier such as purchase order number
    • hash of the input message or
    • technical identifier such as WS-Addressing MessageID
  2. Database table with unique constraint to manage the identifiers.
  3. Cleanup job to remove old entries from the above database table.
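
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for whatever database the service already uses, and the table name `processed_messages` and the functions `open_store`, `message_hash`, `process_once` and `cleanup` are made up for the example.

```python
import hashlib
import sqlite3
import time


def open_store(path=":memory:"):
    """Step 2: a table whose primary key enforces uniqueness of the identifier."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_messages ("
        " message_id TEXT PRIMARY KEY,"
        " received_at REAL NOT NULL)"
    )
    return conn


def message_hash(payload: bytes) -> str:
    """Step 1, hash option: derive the unique value from the message itself."""
    return hashlib.sha256(payload).hexdigest()


def process_once(conn, message_id, handler, now=None):
    """Run handler() only if message_id is new. Returns True if it ran."""
    now = time.time() if now is None else now
    with conn:  # one transaction: commit on success, roll back on exception
        try:
            conn.execute(
                "INSERT INTO processed_messages (message_id, received_at)"
                " VALUES (?, ?)",
                (message_id, now),
            )
        except sqlite3.IntegrityError:
            return False  # unique constraint violated: duplicate
        handler()  # if this raises, the insert above is rolled back too
    return True


def cleanup(conn, max_age_seconds, now=None):
    """Step 3: delete entries older than max_age_seconds. Returns rows removed."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_messages WHERE received_at < ?",
            (now - max_age_seconds,),
        )
    return cur.rowcount
```

Recording the identifier and doing the work in the same transaction means a failed handler does not leave the message marked as processed, so a retry is still accepted. The retention period passed to `cleanup` should be longer than the longest interval over which a sender might retry.
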
The ideal place to do duplicate detection is in the back-end services themselves: the service is invoked in a transactional context and already has access to a database. But it can also make sense to do duplicate detection in an intermediary such as an ESB, for example when the ESB bridges between an unreliable protocol such as HTTP and a reliable, persistent queuing protocol.

Integration solutions often need to maintain information about messages anyway, for message monitoring, message archiving or non-repudiation (keeping a copy of signed messages). Duplicate detection can also be combined with protection against replay attacks (even required by the core WS-Security spec).

Mule comes with an Idempotent Receiver that provides duplicate detection.
Still, why other ESBs don't come with duplicate detection mechanisms is a mystery to me!