The article explores quick and efficient ways to export data from Excel to CSV keeping all special characters and foreign symbols intact. The methods work for all versions of Excel, from 365 to 2007.
Comma-separated values (CSV) is a widely used format that stores tabular data (numbers and text) as plain text. Its popularity and viability are due to the fact that CSV files are supported by many different applications and systems, at least as an alternative import/export format. The CSV format allows users to glance at the file and immediately diagnose problems with the data, change the delimiter, text qualifier, etc. All this is possible because a CSV document is plain text, and an average user or even a novice can easily understand it without any learning curve.
CSV formats supported by Excel
Microsoft Excel allows saving a file in a few different CSV formats, and you may be curious to know the differences between them.
In essence, each CSV format saves data as comma-separated values but performs encoding in a slightly different way. For example, Mac uses a single carriage return (<CR>) represented by \r for a line break, while Windows uses a combination of carriage return and line feed (<CRLF>) represented by \r\n.
To correctly export data to other programs, Excel lets you choose the CSV formatting that best matches the program's expectations.
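To make the line-break difference concrete, here is a minimal Python sketch (not part of Excel itself) that writes the same rows with the two line endings mentioned above. The helper name `to_csv_text` and the sample rows are made up for illustration.

```python
import csv
import io

def to_csv_text(rows, line_ending):
    """Serialize rows as comma-separated text using the given line ending."""
    buffer = io.StringIO(newline="")  # newline="" leaves the terminator untouched
    writer = csv.writer(buffer, lineterminator=line_ending)
    writer.writerows(rows)
    return buffer.getvalue()

rows = [["Name", "City"], ["Anna", "Oslo"]]
windows_style = to_csv_text(rows, "\r\n")    # CRLF, as in CSV (comma delimited)
classic_mac_style = to_csv_text(rows, "\r")  # CR only, as in CSV (Macintosh)

# repr() makes the otherwise invisible line-break characters visible
print(repr(windows_style))
print(repr(classic_mac_style))
```

The data is identical in both strings; only the characters that end each row differ, which is why a program expecting one convention may show the other as a single long line.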
Here are the CSV options available in Excel 365. In your version, the list may look a little different.
CSV (comma delimited). This format saves data in a comma-separated text file that can be used in another Windows program or another version of the Windows operating system.
CSV (Macintosh). This format saves a workbook as a comma-separated file for use on the Mac operating system.
CSV (MS-DOS). Saves as a comma-separated document for use on the MS-DOS operating system.
CSV UTF-8 (comma delimited). It is Unicode Transformation Format 8-bit encoding that supports many special characters, including hieroglyphs and accented characters, and is backward compatible with ASCII. This format is recommended for files that contain any non-ASCII characters since the classic CSV format destroys them.
Besides CSV, there is one more format that may come in extremely handy for communicating with other programs.
Unicode Text (*.txt). This is a computing industry standard supported by almost all current operating systems including Windows, Macintosh, Linux and Solaris Unix. It can handle characters of almost all modern languages and some ancient ones.
Note. By strict definition, the CSV format implies separating values with commas. In reality, you may come across many other delimiters, a semicolon and tab being most common.
How to convert Excel file to CSV
When Excel data is to be transferred to some other application such as the Outlook Address book or Access database, the easiest way is to save your worksheet as a .csv file, and then import that file to another program.
To save an Excel file (.xlsx or .xls) in the CSV format, here are the steps you need to follow:
- In your workbook, switch to the target worksheet as only the active sheet will be converted.
- On the File tab, click Save As. Or press the F12 key to open the Save As dialog.
- In the Save As dialog box, pick the desired CSV format from the Save as type drop-down menu. On Windows, you'd choose either CSV (Comma delimited) or CSV UTF-8.
- Pick the destination folder and hit Save.
In case your worksheet has any formatting, formulas, charts, shapes or other objects, you will be informed that some features in your workbook might be lost if you save it as CSV (Comma delimited). If that is OK, click Yes to complete the conversion without the unsupported features.
Export Excel to CSV without destroying special characters
If your spreadsheet contains some special symbols, smart quotes or long dashes (e.g. inherited from a Word document), foreign characters (tildes, accents, etc.) or hieroglyphs, the method described above won't work.
The point is that saving to the CSV (comma delimited) format distorts any characters other than ASCII (American Standard Code for Information Interchange).
To keep non-ASCII characters undamaged, a document should be saved to a format that uses a Unicode character encoding. There exist two Unicode encoding forms: 8-bit (UTF-8) and 16-bit (UTF-16).
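The effect can be imitated outside Excel with a short Python sketch. It assumes the legacy format behaves like a single-byte Windows code page (Windows-1252 is used here purely as an example; the actual code page depends on the system locale), and the helper name `roundtrip` is hypothetical.

```python
def roundtrip(text, encoding):
    """Encode text, substituting '?' for unsupported characters, then decode it back."""
    return text.encode(encoding, errors="replace").decode(encoding)

sample = "山田 太郎"  # a Japanese name

# Assumed legacy code page: the Japanese characters have no slot in it
print(roundtrip(sample, "cp1252"))

# A Unicode encoding keeps every character
print(roundtrip(sample, "utf-8"))
```

Once a character has been replaced this way, the original cannot be recovered from the saved file, so the encoding has to be chosen before exporting.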
Before we move to the exporting steps, let us point out the key features of each encoding, so you can choose the format right for a particular case.
UTF-8 is a more compact encoding since it uses 1 to 4 bytes for each symbol. Generally, this format is recommended if ASCII characters are most prevalent in your file because most such characters are stored in one byte each. Another advantage is that a UTF-8 file containing only ASCII characters has absolutely the same encoding as an ASCII file.
UTF-16 uses 2 to 4 bytes to encode each symbol. However, a UTF-16 file does not always require more storage than UTF-8. For example, Japanese characters take 3 to 4 bytes in UTF-8 and 2 to 4 bytes in UTF-16. So, you may want to use UTF-16 if your data contains any Asian characters, including Japanese, Chinese or Korean. A noticeable disadvantage of this format is that it's not fully compatible with ASCII files and requires some Unicode-aware programs to display them. Please keep that in mind if you are going to import the resulting document somewhere outside of Excel.
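The size trade-off is easy to check for yourself. The sketch below counts the bytes that two sample strings occupy in each encoding; the function name `encoded_sizes` and the sample texts are illustrative only, and the byte order mark is left out of the count.

```python
def encoded_sizes(text):
    """Return the byte counts of text in UTF-8 and in UTF-16 (without a BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

for label, text in [("ASCII", "Hello"), ("Japanese", "こんにちは")]:
    utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{label}: UTF-8 = {utf8_bytes} bytes, UTF-16 = {utf16_bytes} bytes")
```

For the five ASCII letters, UTF-8 needs 5 bytes against 10 in UTF-16, while the five Japanese characters need 15 bytes in UTF-8 against 10 in UTF-16.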
Once you've decided on the format, the below instructions will walk you through the process.
How to convert Excel to CSV UTF-8
Suppose you have a worksheet with some foreign characters, Japanese names in our case:
Depending on the Excel version you are using, it may take 3 to 5 steps to convert this file to CSV keeping all special characters.
Export to CSV UTF-8 in Excel 2016 - 365
In Excel 2016 and later versions, you can save a file in the CSV format with UTF-8 encoding directly:
- In the target worksheet, click File > Save As or press the F12 key.
- In the Save As dialog box, select CSV UTF-8 (comma delimited) (*.csv) from the Save as type drop down.
- Click the Save button. Done!
Convert to CSV UTF-8 in Excel 2013 - 2007
As older Excel versions do not support the UTF-8 encoding, you'll need to save your document in the Unicode Text format first, and then convert it to UTF-8.
To export an Excel file to CSV and preserve special characters, follow these steps:
- In your worksheet, click File > Save As or press F12.
- In the Save As dialog box, choose Unicode Text (*.txt) from the Save as type drop-down menu, and click Save.
- Open the txt document using your preferred text editor, for example Notepad.
Note. Some simple text editors do not fully support all Unicode characters, therefore certain characters may be displayed as boxes. In most cases this won't affect the resulting file, so you can simply ignore this or use a more advanced text editor such as Notepad++.
- Since the txt file is tab-delimited while we aim for a comma-separated file, replace the tabs with commas. Here's how:
- Select any tab character, right click it and choose Copy from the context menu, or press the Ctrl + C key combination.
- Press Ctrl + H to open the Replace dialog box and paste the copied tab (Ctrl + V) in the Find what field. After you've done this, the cursor will move rightwards indicating that the tab is pasted. Type a comma in the Replace with field and click Replace All.
If you want a semicolon-delimited CSV, then replace tabs with semicolons.
If all done right, your resulting txt file should look similar to this:
- In Notepad, click File > Save As and make three important changes:
- In the File name box, change the .txt extension to .csv.
- In the Save as type box, pick All files (*.*).
- In the Encoding drop-down menu, select UTF-8 with BOM.
When done, click the Save button.
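If you repeat this conversion often, the manual steps above can be scripted. The following is only a sketch: it assumes the Unicode Text file saved by Excel is UTF-16 with a byte order mark and tab-delimited, and the function name `unicode_text_to_csv` is made up for this example.

```python
import csv

def unicode_text_to_csv(txt_path, csv_path, delimiter=","):
    """Convert a UTF-16, tab-delimited text file to CSV saved as UTF-8 with BOM."""
    # "utf-16" uses the BOM to detect byte order; newline="" lets csv manage line breaks
    with open(txt_path, encoding="utf-16", newline="") as source, \
         open(csv_path, "w", encoding="utf-8-sig", newline="") as target:
        reader = csv.reader(source, delimiter="\t")
        # "utf-8-sig" writes the EF BB BF byte order mark at the start of the file
        writer = csv.writer(target, delimiter=delimiter)
        writer.writerows(reader)
```

A call such as `unicode_text_to_csv("names.txt", "names.csv")` (hypothetical file names) would produce the comma-separated file; pass `delimiter=";"` for a semicolon-delimited one. Unlike a plain find-and-replace, the csv module wraps values that already contain the delimiter in quotation marks, so they are not split into extra columns.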
Tips and notes:
- The byte order mark (BOM) is a sequence of bytes at the start of a text stream that indicates Unicode encoding of a text document. In case of UTF-8 with BOM, the sequence 0xEF,0xBB,0xBF signals the reading program that UTF-8 encoding is used in the file. The Unicode standard permits but does not require the BOM in UTF-8. However, it is often crucial for correct UTF-8 recognition in Excel, especially when converting from Asian languages.
- If your text editor does not allow changing the file extension, you can do that in Windows Explorer.
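To see what the byte order mark looks like in practice, here is a small Python sketch that checks whether some bytes start with the UTF-8 BOM. The helper name `starts_with_utf8_bom` and the sample header text are hypothetical.

```python
import codecs

def starts_with_utf8_bom(raw_bytes):
    """Return True if the byte string begins with the UTF-8 byte order mark."""
    return raw_bytes.startswith(codecs.BOM_UTF8)

with_bom = "名前,都市".encode("utf-8-sig")  # "utf-8-sig" prepends EF BB BF
without_bom = "名前,都市".encode("utf-8")   # same text, no BOM

print(codecs.BOM_UTF8.hex())  # the three BOM bytes in hexadecimal
print(starts_with_utf8_bom(with_bom), starts_with_utf8_bom(without_bom))
```

Both byte strings are valid UTF-8 and decode to the same text; the only difference is the three leading bytes that hint at the encoding to the reading program.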
Now, you can open the CSV file in Excel and make sure all data is rendered correctly:
Note. If your file is intended for use in another application where the UTF-8 format is a must, do not edit or save the file in Excel as this may cause encoding problems. If some data does not appear right in Excel, open the file in Notepad and fix the data there. Remember to save the file in the UTF-8 with BOM format again.
How to convert Excel file to CSV UTF-16
Exporting to CSV UTF-16 is done very much the same way as to CSV UTF-8:
- Save the workbook in the Unicode Text (*.txt) file format.
- Open the .txt document in a text editor such as Notepad and replace all tabs with commas.
- Change the file extension to .csv, make sure encoding is set to UTF-16 LE, and save the file.
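The same three steps can be sketched in Python. As before, this assumes Excel's Unicode Text output is UTF-16 with a byte order mark and tab-delimited; the function name `unicode_text_to_csv_utf16` is invented for the example.

```python
import csv

def unicode_text_to_csv_utf16(txt_path, csv_path):
    """Convert a UTF-16, tab-delimited text file to a comma-separated UTF-16 LE file."""
    with open(txt_path, encoding="utf-16", newline="") as source, \
         open(csv_path, "w", encoding="utf-16-le", newline="") as target:
        # "utf-16-le" does not add a BOM by itself, so write it explicitly (bytes FF FE)
        target.write("\ufeff")
        writer = csv.writer(target)
        writer.writerows(csv.reader(source, delimiter="\t"))
```

Writing the byte order mark explicitly and naming the little-endian codec keeps the result the same regardless of the machine the script runs on.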
Convert multiple Excel sheets to CSV
As already mentioned, Excel's Save As command is only able to convert an active worksheet. But what if your workbook contains a lot of different sheets, and you wish to turn them all into separate csv files? The only alternative suggested by Microsoft is saving each sheet under a different file name, which does not sound very inspiring, huh?
So, is there a quick way to save multiple Excel sheets as CSV at once? Yes, it can be done with VBA.
The below code converts all worksheets in the current workbook to individual CSV files, one for each sheet. The file names are created from the workbook and sheet names (WorkbookName_SheetName.csv) and saved to the same folder as the original document.
Please keep in mind that the above code saves sheets in the CSV format. If there are any non-ASCII characters in your data, then you need to convert to UTF-8 CSV. This can be done by changing the file format from xlCSV to xlCSVUTF8. That is, you replace FileFormat:=xlCSV with FileFormat:=xlCSVUTF8.
Also, remember that CSV UTF-8 conversions are possible in Excel 2016 and higher.
The following guidelines will help you with adding the macro to your workbook: How to insert and run VBA code in Excel.
Apart from the methods described above, there exist a handful of other ways to convert Excel sheets to CSV. Below, I will share a couple of my favorite ones.
Excel to CSV via Google Sheets
Using Google Sheets for .xlsx to .csv conversions seems a very simple workaround:
- In Google Sheets, click File > Import.
- Click Upload and drag-and-drop the file or select from your computer, and then click Import data.
Tip. If you have a relatively small dataset, it may be easier to copy/paste it directly in the spreadsheet.
- Go to the File menu > Download > Comma-separated values (.csv, current sheet).
Open the downloaded file in some text editor to make sure all the data is exported right.
Note. If your original Excel sheet contains special characters, the resulting CSV file may not display the characters correctly when opened in Excel, though it looks perfect in many other spreadsheet programs.
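One possible cause of this symptom, stated here as an assumption rather than a confirmed diagnosis, is that the downloaded file is UTF-8 without a byte order mark, which Excel may not recognize as UTF-8. Under that assumption, the Python sketch below prepends the BOM to the file's bytes; the function name `add_utf8_bom` is hypothetical.

```python
import codecs

def add_utf8_bom(raw_bytes):
    """Return the bytes with a UTF-8 BOM prepended, unless one is already present."""
    raw_bytes.decode("utf-8")  # raises UnicodeDecodeError if the data is not UTF-8
    if raw_bytes.startswith(codecs.BOM_UTF8):
        return raw_bytes
    return codecs.BOM_UTF8 + raw_bytes
```

To apply it, read the downloaded file in binary mode, pass its contents through the function, and write the result back in binary mode. If the function raises an error, the file is not UTF-8 and the cause lies elsewhere.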
Save .xlsx to .xls and then convert to .csv
This method of converting Excel to CSV hardly needs any further explanations because the heading says it all :)
I came across this solution on some forum, cannot remember which exactly. To be honest, this method has never worked for me, but many users reported that special characters, which get lost when saving .xlsx directly to .csv, are preserved if you save the .xlsx file as .xls first, and then save the .xls as .csv as explained in How to convert Excel to CSV.
Anyway, you can try this method of exporting Excel to CSV on your side and if it works, this can be a real time-saver.
Convert Excel to CSV using OpenOffice
OpenOffice is an open-source suite of six applications. One of them is a spreadsheet app named Calc, which is really good at exporting spreadsheet data to the CSV format. In fact, it provides more options (encodings, delimiters, etc.) than Microsoft Excel and Google Sheets combined.
To convert your Excel file to CSV, follow these steps:
- Open your Excel document with OpenOffice Calc.
- Click File > Save as… and choose Text CSV (.csv) from the Save as type drop-down menu.
- Next, you will be asked to define encoding and delimiters. If your goal is the CSV format that correctly handles special characters, then choose:
- Unicode (UTF-8) for Character set.
- Comma for Field delimiter. If you need a semicolon-delimited csv file, then select semicolon (;) or whatever delimiter you want.
- Quotation mark for Text delimiter.
To complete the conversion, click OK.
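The three settings in that dialog (character set, field delimiter, text delimiter) correspond to options found in most CSV libraries. As an illustration only, and not a description of how Calc works internally, here is how the same choices look with Python's csv module; the function name `export_rows` is made up.

```python
import csv

def export_rows(rows, path, encoding="utf-8", delimiter=",", quotechar='"'):
    """Write rows to a CSV file with the chosen character set and delimiters."""
    with open(path, "w", encoding=encoding, newline="") as target:
        writer = csv.writer(
            target,
            delimiter=delimiter,        # field delimiter: comma, semicolon, etc.
            quotechar=quotechar,        # text delimiter wrapped around values when needed
            quoting=csv.QUOTE_MINIMAL,  # quote only values containing special characters
        )
        writer.writerows(rows)
```

With `delimiter=";"`, a value that itself contains a semicolon is wrapped in quotation marks, which is exactly the job of the text delimiter: it tells the importing program where such a value begins and ends.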
It would be really nice if Excel provided similar options to perform fast and painless CSV conversions, wouldn't it?
These are the ways of converting Excel to CSV I am aware of. If you know other more efficient methods, please do share in comments. Thank you for reading!