What is Data Deduplication?

Data deduplication is the process of identifying and eliminating duplicate records. Organizations use it to streamline storage and reduce bandwidth use, which makes the objects, records, and fields on their CRM pages open faster. Storage matters in Salesforce because allocations are granted per user license and also depend on the Edition a business subscribes to.

What should be your Objective as a Salesforce Administrator?

Your objective is to maintain a clean database free of duplicate data. This involves the following:

  • Removing redundant records
  • Organizing records
  • Updating existing records with the correct information

All of this helps your users identify records efficiently, and it helps your business manage its storage capacity cost-effectively.

Here are the steps you can take to improve your deduplication process:

  1. Identify a Model to Use –

    There are two processing models you can use to improve your deduplication process.

    1. Pre-processing – this involves setting up matching rules. Any data being added to the system passes through validation that either creates a new record or updates an existing record in the database. This minimizes duplicate records before they are saved to the database.
    2. Post-processing – schedule a deduplication run over your existing data to catch what the matching criteria missed. Examples are data created by a third-party app that bypasses the matching criteria and data entered with human errors.
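The difference between the two models can be sketched in a few lines of Python. This is only an illustration, not Salesforce code: the in-memory `database` dictionary, the `match_key` rule (a trimmed, case-insensitive email), and the function names are all assumptions made for the example.

```python
def match_key(record):
    """Hypothetical matching rule: trimmed, case-insensitive email."""
    return record["email"].strip().lower()


def upsert(database, incoming):
    """Pre-processing: update the matching record if one exists, else create it."""
    key = match_key(incoming)
    if key in database:
        database[key].update(incoming)
        return "updated"
    database[key] = dict(incoming)
    return "created"


def find_duplicates(records):
    """Post-processing: group saved records by match key and keep groups of 2+."""
    groups = {}
    for record in records:
        groups.setdefault(match_key(record), []).append(record)
    return [group for group in groups.values() if len(group) > 1]
```

Pre-processing stops a duplicate at the point of entry, while post-processing scans records that are already saved, for example records that arrived through a channel that skipped the check.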
  2. Set Up Custom Matching Rules –

    Identify the criteria you want to use in your matching rules. Salesforce provides standard matching rules for handling your business data, but custom matching rules are often more beneficial because they better reflect your organization’s needs.
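To make the idea of matching criteria concrete, here is a minimal Python sketch of a rule that combines an exact criterion with a fuzzy one. It is not how Salesforce implements matching rules; the field names (`name`, `email`, `company`), the 0.85 threshold, and the `is_match` function are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join((text or "").lower().split())


def is_match(a, b, name_threshold=0.85):
    """Hypothetical rule: same email, OR same company plus a similar name."""
    email_a = normalize(a.get("email"))
    if email_a and email_a == normalize(b.get("email")):
        return True
    company_a = normalize(a.get("company"))
    same_company = bool(company_a) and company_a == normalize(b.get("company"))
    name_score = SequenceMatcher(
        None, normalize(a.get("name")), normalize(b.get("name"))
    ).ratio()
    return same_company and name_score >= name_threshold
```

A stricter threshold produces fewer false matches but lets more misspelled duplicates through, so the value should be tuned against a sample of your own data.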

  3. Set a Maintenance Schedule –

    Run your deduplication process on a consistent schedule, especially for post-processing deduplication. A fixed schedule also helps you manage your time. Example: every Friday, near the end of the business day.

  4. Create Custom Error Messages –

    Error messages are not only for cases where a user tries to create something beyond their level of access or leaves a required field empty. You can also partner with your Salesforce Developers to create custom error messages that alert a user when a duplicate record is about to be created, or that direct the user to the existing record and give them the option to update it.

  5. Identify Software to Use for the Deduplication Process –

    Several applications can run the deduplication process. For pre-processing, Salesforce offers the Data Import Wizard and Data Loader. For post-processing, there are third-party apps such as Apsona and DemandTools. Research the options and consult with your business manager about the best tool for your organization.

  6. Set Up Merge Process Criteria –

    This step is often overlooked by Salesforce Administrators. There will be instances where you need to merge records that the deduplication process has flagged. Merging matters because the records will contain conflicting data, and you will have to select the values to retain in the merged record. Set a priority in advance for choosing between conflicting values, for example, Complete Name over Nickname.
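Merge criteria like these can be written down as explicit per-field rules. The Python sketch below is one possible way to express them and is not tied to any Salesforce tool: the rule names, the `FIELD_RULES` table, and the use of "longest value" as a stand-in for "complete name over nickname" are assumptions made for the example.

```python
def prefer_non_empty(values):
    """Default rule: keep the first non-empty value (the master record comes first)."""
    for value in values:
        if value:
            return value
    return None


def prefer_longest(values):
    """Assumed proxy for 'complete name over nickname': keep the longest value."""
    non_empty = [value for value in values if value]
    return max(non_empty, key=len) if non_empty else None


# Hypothetical priority table: fields not listed fall back to prefer_non_empty.
FIELD_RULES = {"name": prefer_longest}


def merge_records(master, duplicates, rules=FIELD_RULES):
    """Merge duplicates into one record, resolving conflicts field by field."""
    records = [master, *duplicates]
    fields = []
    for record in records:
        for field in record:
            if field not in fields:
                fields.append(field)
    merged = {}
    for field in fields:
        rule = rules.get(field, prefer_non_empty)
        merged[field] = rule([record.get(field) for record in records])
    return merged
```

For example, merging a master record named "Bob Lee" with a duplicate named "Robert Lee" keeps "Robert Lee" as the name, fills in any fields the master left blank, and otherwise keeps the master's values.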