Data Quality inside Salesforce


Salesforce is the most common mid-market and enterprise CRM. It also tends to accumulate garbage faster than other systems, because it is plugged into every form, every marketing tool, and every import the company has run in the last decade. A back-office process sits next to Salesforce, proposes deduplication and cleanup, and lets the admin approve each merge inside the merge tool they already use.

What it needs from Salesforce

A Connected App with API access. Read and Edit permission on Leads, Contacts, Accounts, and Opportunities. Standard OAuth flow, the same mechanism every third-party connector uses. No org-wide changes. No custom metadata, unless you choose to log merge proposals to a custom object (step 4 below). Works on Enterprise and Unlimited editions; works on Professional if API access is enabled.
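
As a rough illustration of the standard OAuth flow mentioned above, the sketch below builds the authorization URL for Salesforce's web-server (authorization code) flow. The endpoint path and the api and refresh_token scopes are the ones Salesforce documents; the consumer key, callback URL, and the helper name build_authorize_url are placeholders chosen for this example, not part of any Salesforce SDK.

```python
from urllib.parse import urlencode

# Production login host; sandboxes use https://test.salesforce.com instead.
LOGIN_HOST = "https://login.salesforce.com"


def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Return the URL an admin visits to grant the Connected App access."""
    params = {
        "response_type": "code",        # authorization-code (web server) flow
        "client_id": client_id,         # the Connected App's consumer key
        "redirect_uri": redirect_uri,   # must match the app's callback URL
        "scope": "api refresh_token",   # API access plus offline refresh
        "state": state,                 # opaque value echoed back, for CSRF checks
    }
    return f"{LOGIN_HOST}/services/oauth2/authorize?{urlencode(params)}"


# Placeholder values for illustration only.
url = build_authorize_url(
    client_id="EXAMPLE_CONSUMER_KEY",
    redirect_uri="https://example.com/oauth/callback",
    state="abc123",
)
print(url)
```

After the admin approves, Salesforce redirects to the callback URL with a code that the process exchanges for tokens. That exchange is omitted here.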

How the process runs

  1. Scheduled run — typically nightly or weekly, configurable per org volume.
  2. The process queries Leads and Contacts via the Salesforce API, scanning for duplicate candidates by configurable rules (email exact, email + name fuzzy, company + domain, phone + last name).
  3. Each candidate pair gets a confidence score and a proposed winner (most complete record, most recent activity, matching the admin’s policy).
  4. Draft merge proposals are logged — to a custom object in Salesforce, or to a simple daily report — and reviewed by the admin.
  5. The admin approves in the Salesforce native merge UI. The actual merge happens inside Salesforce, respecting every existing workflow rule, trigger, and validation rule.
  6. Rejected merges feed back into the process so it learns which patterns not to propose again.
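
Steps 2, 3, and 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual implementation: the similarity thresholds and confidence weights are invented for the example, the "company + domain" rule is left out for brevity, and the feedback step is simplified to suppressing pairs the admin has already rejected. Records are plain dicts keyed by standard Lead/Contact field names.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations

# Fields counted when judging which record is "most complete".
FIELDS = ("Email", "FirstName", "LastName", "Phone", "Company")


@dataclass(frozen=True)
class Proposal:
    winner_id: str
    loser_id: str
    rule: str
    confidence: float


def norm(value):
    return (value or "").strip().lower()


def digits(value):
    return "".join(ch for ch in (value or "") if ch.isdigit())


def full_name(rec):
    return f"{norm(rec.get('FirstName'))} {norm(rec.get('LastName'))}".strip()


def similar(x, y):
    return SequenceMatcher(None, x, y).ratio() if x and y else 0.0


def score_pair(a, b):
    """Return (rule, confidence) for the first rule that matches, or None."""
    email_a, email_b = norm(a.get("Email")), norm(b.get("Email"))
    if email_a and email_a == email_b:
        return "email exact", 0.95              # illustrative weight
    if similar(email_a, email_b) >= 0.90 and similar(full_name(a), full_name(b)) >= 0.85:
        return "email + name fuzzy", 0.80       # illustrative weight
    phone_a, phone_b = digits(a.get("Phone")), digits(b.get("Phone"))
    last_a, last_b = norm(a.get("LastName")), norm(b.get("LastName"))
    if phone_a and phone_a == phone_b and last_a and last_a == last_b:
        return "phone + last name", 0.70        # illustrative weight
    return None


def pick_winner(a, b):
    """Prefer the more complete record; break ties by most recent activity."""
    def key(rec):
        filled = sum(1 for f in FIELDS if norm(rec.get(f)))
        return (filled, rec.get("LastActivityDate") or "")  # ISO dates sort as text
    return (a, b) if key(a) >= key(b) else (b, a)


def propose_merges(records, rejected=frozenset()):
    """Compare every pair (fine for a sketch; large orgs would block by key first)."""
    proposals = []
    for a, b in combinations(records, 2):
        if frozenset((a["Id"], b["Id"])) in rejected:
            continue                            # admin already said no to this pair
        match = score_pair(a, b)
        if match is None:
            continue
        winner, loser = pick_winner(a, b)
        proposals.append(Proposal(winner["Id"], loser["Id"], *match))
    return sorted(proposals, key=lambda p: p.confidence, reverse=True)


leads = [
    {"Id": "L1", "Email": "ana@acme.example", "FirstName": "Ana", "LastName": "Silva",
     "Phone": "(555) 010-2000", "Company": "Acme", "LastActivityDate": "2024-05-01"},
    {"Id": "L2", "Email": "Ana@Acme.example", "FirstName": "Ana", "LastName": "Silva",
     "Phone": None, "Company": None, "LastActivityDate": "2024-06-10"},
    {"Id": "L3", "Email": "a.silva@other.example", "FirstName": "A.", "LastName": "Silva",
     "Phone": "555-010-2000", "Company": "Acme", "LastActivityDate": None},
]
proposals = propose_merges(leads)
for p in proposals:
    print(f"{p.winner_id} <- {p.loser_id} ({p.rule}, {p.confidence:.2f})")
```

With the sample data, L1 is proposed as the winner over both L2 (same email) and L3 (same phone and last name). Passing rejected={frozenset(("L1", "L3"))} drops the second proposal on the next run. Nothing here performs a merge; the output is only a list for the admin to review.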

What the process deliberately does not do

  • Does not merge silently. Every merge is admin-reviewed, because a bad auto-merge is harder to unwind than any manual cleanup.
  • Does not change field values outside the merge flow.
  • Does not bypass Salesforce validation or trigger logic.
  • Does not touch Opportunities or Cases unless explicitly configured.
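
One way to make these limits enforceable rather than a convention is a small scope check that every planned action passes through. The sketch below is hypothetical: the ScopePolicy class, the action names, and the choice of Lead and Contact as the default objects are assumptions for illustration.

```python
from dataclasses import dataclass

# Objects scanned by default (assumed here to be Leads and Contacts).
DEFAULT_OBJECTS = frozenset({"Lead", "Contact"})


@dataclass(frozen=True)
class ScopePolicy:
    # Objects the admin has explicitly opted in, e.g. {"Opportunity"}.
    extra_objects: frozenset = frozenset()

    def allows(self, sobject: str, action: str) -> bool:
        # Only proposals are ever allowed; merges and field edits stay with the admin.
        if action != "propose_merge":
            return False
        return sobject in DEFAULT_OBJECTS | self.extra_objects


default_policy = ScopePolicy()
opted_in = ScopePolicy(extra_objects=frozenset({"Opportunity"}))

print(default_policy.allows("Lead", "propose_merge"))         # True
print(default_policy.allows("Lead", "update_field"))          # False
print(default_policy.allows("Opportunity", "propose_merge"))  # False
print(opted_in.allows("Opportunity", "propose_merge"))        # True
```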

Typical quarter after go-live

Duplicate rate typically drops 30–60% in the first two months. The admin spends about 30 minutes per day reviewing merge proposals (roughly 2.5 hours over a five-day week) instead of a full day per week on manual dedup. Integration-related duplicates (marketing automation, form syncs) get caught within 24 hours of creation instead of accumulating for quarters.
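
Catching new duplicates within 24 hours depends on each run picking up the records created since the previous one. A minimal sketch of that incremental query follows, assuming a nightly schedule. The helper names, the field list, and the API version default are assumptions; the SOQL datetime literal format and the REST query endpoint path follow Salesforce's documented conventions.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

FIELDS = "Id, Email, FirstName, LastName, Phone, CreatedDate"


def recent_records_soql(sobject: str, now: datetime, hours: int = 24) -> str:
    """SOQL for records of one object created in the last `hours` hours."""
    since = (now - timedelta(hours=hours)).astimezone(timezone.utc)
    # SOQL datetime literals are unquoted ISO 8601 values.
    literal = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"SELECT {FIELDS} FROM {sobject} WHERE CreatedDate >= {literal}"


def query_url(instance_url: str, soql: str, api_version: str = "v60.0") -> str:
    """REST query endpoint URL; the api_version default is an assumption."""
    return f"{instance_url}/services/data/{api_version}/query?{urlencode({'q': soql})}"


now = datetime(2024, 6, 11, 2, 0, tzinfo=timezone.utc)  # example run time
soql = recent_records_soql("Lead", now)
print(soql)
print(query_url("https://example.my.salesforce.com", soql))
```

New records returned by this query would then be compared against the existing Leads and Contacts, so a duplicate created by a form sync today shows up in tomorrow's review list.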

See also: CRM admin role, enrichment inside HubSpot.