Most companies track employee information in multiple databases. For example, you may track employee biographic and demographic in a payroll software tool, and employee contracts, lease agreements and license numbers in a manual spreadsheet. Managing and maintaining multiple databases can be both inefficient and time consuming when you need data on the fly.
The purpose of this article is to provide a step by step guide on how to merge data in Excel that has been extracted into a spreadsheet. By following the steps below, your new "super spreadsheet" can be created and readily available whenever you need it.
Required Tools: Merge Tables Wizard and Microsoft Excel 2010 - 2003.
- Export information from both databases into an Excel spreadsheet.
Make sure there is a common unique identifier in each spreadsheet. For example, employee social security number or employee number.
- Save both files into a secure location. Leave both spreadsheets open.
Tip. If common fields are named differently in both databases, rename the header column on both spreadsheets for quick reference. For example, if your unique identifier is the employee number, however, your payroll system refers to this number as a payroll number, rename the column to match that of your other database.
- Format your spreadsheets. If the fields in your Excel spreadsheets are formatted differently the merge will not work.
Tip. If using social security number as your unique identifier, make sure they are formatted the same. Depending on the spreadsheet the dash (-) may be formatted in one spreadsheet and not the other. To truly test this formatting, click on a cell containing the number and check the formula bar to see if the social security number is formatted with dashes.
- Determine which spreadsheet you will be using as your master spreadsheet. All data will be merged into this spreadsheet.
- Highlight all columns in sheet 1 of the master spreadsheet.
- Open the ablebits.com "Merge Tables Wizard for Excel".
- When the tool opens you will be prompted to "Select the Master Table (First Table):"
Make sure all of your data is highlighted in blue. If not:
- Click the table icon next to the cell range.
- Highlight the data.
- Click "OK".
- Click "Next".
- In the "Select the Lookup Table (second table):" select the other spreadsheet.
- Make sure all of your data is highlighted in blue. If not, refer to step 7 for selecting your data.
- Click "Next".
- In this screen, make sure the top boxes (3) are checked:
- Once you've verified the checkboxes at the top are selected, select your unique identifier. Make sure the "Master Table Columns" and "Lookup Table Columns" are the same (This is where renaming the columns with the same name comes in handy).
There may be several columns auto selected by default. Make sure that you only select the columns you intend to use as the unique identifier. I strongly recommend you use as few unique identifiers as possible when you are working with more than one database due to differences in data.
- Click "Next".
- In the 4th screen on the "Merge Tables Wizard" you select the placement of the data. There are a couple of ways you can merge this data in Excel:
Preferred Method: You can "add to the end". By adding all merged data to the end, none of the data in your master spreadsheet will be updated; instead new columns will be added to the end of the Excel spreadsheet. This method is preferred if there may be discrepancies in data (first name in one database is Christopher vs. Chris in the other database). Once data is merged, you can move data into any position or column that suits your needs.
Custom Method: You can customize how your data is merged. This method may generate errors. Please see below for definitions.
- Select the "Lookup Table Columns".
- In the "Action" section, click on the command name, change to "update values in".
- Click in the "Master Table columns", select where you which column you would like the merged data updated in.
- Click "Next".
- The final step of the wizard helps you determine how the final data will be presented in your Excel worksheet:
For Matching rows: This section assists in customizing how and when data is updated in your master spreadsheet.
- Warning: By checking these boxes you may not be including data from your secondary spreadsheet.
For NON-Matching Rows: This section is used to determine how non-matching data is handled.Tip. Check the box "Add non-matching rows to the end of the Master Table". As long as this box is checked, any person on the secondary spreadsheet that was not included on the master spreadsheet will be added to the end, this is an invaluable tool when merging data in Excel and comparing/auditing databases.
Other: By checking the box in this section, the merge tool will add a column in the master spreadsheet to assist in determining which records were updated. Below are definitions of "MTW Status" codes used:
- Updated: This code will appear when there is a matching unique identifier verified in both spreadsheets and information has been updated.
- Non-updated: This code will appear when no data was matched or updated from the secondary spreadsheet. For example, there was a unique identifier on the master spreadsheet that was not present on the secondary spreadsheet.
- New row: This code appears when there was a unique identifier on the secondary spreadsheet that could not be matched on the master spreadsheet.
- Click "Finish".
- Review merge summary. A pop-up box will appear with an action summary of what events took place during the merge process.
- Click "OK".
- Review the new spreadsheet.
- Prior to beginning the merge process, I review the counts for each spreadsheet. That way I know what "MTW Status" update codes to expect. If the numbers are relatively close, there will be less research required to determine who is missing from either database. If the numbers are off by a lot, I question the quality of the spreadsheets. There may have been a timeout in extracting data from one of the software programs that prevented data from be exported. When this happens, I re-run spreadsheets and troubleshoot before I begin the merge process.
- When working with large databases I may perform the merge in steps. For example, if you are merging more than 10 items for a large number of people (500+) I merge in groups of 5. By breaking it down, errors and/or issues within the data are less time consuming to research and correct. I've also found that when I break it down I am able to fix minor discrepancies in data before completing a second merge making the second merge virtually error free.
- None of the data "Updated": If no data was updated (all information from the secondary spreadsheet were added as "new rows") there is most likely a problem with your unique identifier formatting. Refer to step 3 for more information.
- The number of new rows is the same (or very close) as the number of original rows: Check the original headcounts for both spreadsheets. There may have been a software interruption when you were extracting your data from one of (or both) of your databases. If the headcounts are significantly off in the original spreadsheets, you may have to re-extract your data and restart the merge process.