Find similar yet misspelled values in Excel

Fuzzy Duplicate Finder is a tool for Microsoft Excel that helps you find and correct similar records. The add-in quickly performs approximate match according to the settings you select and changes all typos into the correct equivalents of your choice.

How to use Find Fuzzy Duplicates

Start Find Fuzzy Duplicates

Open the add-in pane by clicking on the Find Fuzzy Duplicates icon in the Duplicate Remover drop-down list on the Ablebits Data tab, Dedupe group:
Run Fuzzy Duplicates.
This tool is also available in the Search group on the Ablebits Tools tab: in the Find and Replace drop-down list, click Search for Typos:
Find Fuzzy Duplicates.

Configure searching options

There are some settings that you need to tweak to get more precise results:
Tweak searching options.

  1. Select the search range in your table and you will see its address in the Select range field at the top.
    Tip. If you want to find fuzzy duplicates in a single column, select this column or any cell in that column. To fuzzy match within several Excel columns or any block of cells, select them.
  2. Сlick the Expand selection icon and have the entire table selected automatically.
  3. Fine-tune the way typos are found:
    • You can limit the differences to look for using the Max number of different characters field. Simply set from 1 to 10 characters in which the searched entries will differ.
      Note. The bigger the number, the more different values you are likely to get.
    • Use the Min number of characters in a word/cell option to exclude short values from the search. Set from 1 to 50 characters and the words or cells that have fewer characters will not be searched within.
  4. If your cells may contain more than one word, take advantage of The cells contain separate words delimited by option. Tick it, and the add-in will check word combinations within your cells.
    Tip. You can also select or enter delimiters that separate the words in your cells. A space " " is set by default. Choose from a standard set of separators in the drop-down list of the Delimiters field or enter your own.

    Enter delimiters or choose one from the list.

Get the result

Click the Search for typos button and the add-in will perform approximate matching in Excel. All found items will be grouped by duplicates (nodes) in the tool's tree view:
See all found matches in the tree pane.

How to work with found fuzzy matches

Correct all misprints in the duplicate node

  1. Select a node and expand the group. To set the correct value to all similar records, click on the Check icon in the Action column on the line with the correct entry. This value will be automatically assigned to all items of the node and you will see it in the Correct Value column:
    Find the right value in the list and set it as correct.
  2. If the node doesn't contain the right entry, click on the root node and type in the correct value manually. To finish up, press Enter on your keyboard:
    You can type in the necessary value in the Correct Value column.
  3. Once the correct word is chosen, click the Apply button at the bottom of the add-in pane. The color of the records in the Correct Value column will change to blue and the found records will be corrected in your Excel worksheet:
    See the misprints corrected in both, your data and the add-in's pane.

Keep the case of the same values in the node

If you have several correct values in your node that differ in case, you may want to keep their original case or make them all look the same.

Tick the Keep case of the same values in the node option at the top to leave the items that differ only in case unchanged.

So, if you have "Street, street, street, and Streett" in the results and you consider only the value as a misprint, not the case, then select this option and the "street" value will keep its case.

Note. If you don't tick this option, the case of the value chosen as correct will be applied to all fuzzy lookup results automatically.

Remove fuzzy duplicates

If you want to remove an entry from the list of found duplicates, just select it and press the Delete button on your keyboard. Or, you can select several items by holding the Ctrl key and delete them all at once.

Note. The entry will not be removed from the worksheet.

You can remove similar data from your Excel worksheet if necessary. Just click on the value you want to delete in the list of found items. The cell with this value will be automatically highlighted in Excel worksheet, so you can select it in your table and press the Delete button on your keyboard.

Export the search results

To see the found similar records in a separate Excel worksheet, click the Export button on the Fuzzy Duplicate Finder pane.

This option may be helpful if you want to see the changes in the nodes before applying them.

Tip. To export several nodes, select them by holding the Ctrl key and click Export.

To do a new fuzzy match, click New Search at the top of the Fuzzy Duplicate Finder pane.

Responses

I am new to Fuzzy Lookup. I have a single column list of vendors names and some vendors are listed multiple times in various ways. What is the best setting to find similar names. I am wanting it to find from the list below as possible matches:

A1 Golf Cart Leasing
A-1 Industrial Parts and Supplies Inc
A-1 Key Service Inc
A-1 Golf Cart Leasing
A1 Golf Lessons Inc
A-1 Oil Inc
A-1 Pallet Co of Clarksville Inc
A-1 Rental Inc
A-1 Scale Service Inc
A-1 Welding, Inc.
A3 Taxi
AAA - 1 Lock and Key
AAA Business Systems Inc
AAA Cooper Transportation
AAA Galvanizing - Joliet
AAA Pallet & Lumber Co Inc
AAA Pallets & Lumber Company Inc
AAA Quality Services Inc

A1 Golf Cart Leasing an d A-1 Golf Cart Leasing
AAA Pallet & Lumber Co Inc and AAA Pallets & Lumber Company Inc

Hello Debbie,

Thank you for your comment. I am afraid, we can't offer you an easy solution to fulfill your task. Before running our Fuzzy Duplicate Finder tool, we'd recommend making the list of vendor names more convenient to work with. For example, you can remove the substrings before the names using our Remove Characters tool, then get rid of extra spaces in the cells with the help of the Trim Spaces tool. After that, make sure that each cell contains just one vendor name and try to run Fuzzy Duplicate Finder setting the maximum number of different characters from 7 to 10 and check what results you'll get.

Hope these recommendations will turn to be helpful.

Hello, I have a list with 40K rows. Would this fuzzy duplicates tool be able to handle 40K rows?

Katerina Bespalaya (Ablebits Team) says:
November 15, 2019 at 11:57 am

Hello Julia,

Thank you for your question. Our add-in doesn't have clearly defined limitations on the number of rows with the data to process. However, it may take some time for the add-in to handle 40K rows and the processing speed also depends on such things as the formatting of your data, other processes running at the moment, available system resources, etc.

You can install the trial version of our Ultimate Suite for Excel that contains Fuzzy Duplicate Finder and check how the tool works with your data amount. The trial is fully functional and allows you to test the add-in during 30 days for free. If you are interested, please download from this page.

Feel free to contact us at support@ablebits.com if you have any other questions or difficulties.

tanya lebedeva says:
May 22, 2019 at 2:24 am

hello
i have an excel spreadsheet of customer names and addreses and i need to be able to quickly identify all duplicates. the problem is the data looks like this:
ABC school district 123 main street Cleveland Ohio
ABC Schools 123 Main St Cleveland OH

will your duplicate catcher fuzzy matcher work for these purposes?

What is the ideal setting to find typos in a list of company names?

Hello, Anton,
Thank you for using our product and for your question. To find typos in a list of company names, first count the number of characters the company name contains. This will be the number that you should put into the Min number of characters in a word/cell field. And then think of the number of possible different characters that the correct spelling of the company name and the typo may contain. Enter this number into the Max number of different characters field. If the name of the company consists of several words, check the The cells contain separate words delimited by box and in the drop-down list select Space. That’s it! You’ll get a list of the found entries where you can select typos and get them corrected right in the add-in pane.

Please do not hesitate to contact us if you still have questions or doubts.
Thank you.

Post a comment

Seen by everyone, do not publish license keys and sensitive personal info!

If you have any questions or issues with this add-in, please feel free to post your concerns in the comments area. As soon as we answer, a notification message will be sent to your e-mail. If you do not want to share your thoughts in public, please contact us at support@ablebits.com.