by Svetlana Cheusheva
The article explores quick and efficient ways to export data from Excel to CSV keeping all special characters and foreign symbols intact. The methods work for all versions of Excel, from 365 to 2007.
Comma-separated values (CSV) is a widely used format that stores tabular data (numbers and text) as plain text. It owes its popularity to the fact that many different applications and systems support CSV files, at least as an alternative import/export format. The CSV format allows users to glance at the file and immediately diagnose problems with the data, change the delimiter, text qualifier, etc. All this is possible because a CSV document is plain text, which an average user or even a novice can easily understand without any learning curve.
Microsoft Excel allows saving a file in a few different CSV formats, and you may be curious to know the differences between them.
In essence, each CSV format saves data as comma-separated values but writes the file in a slightly different way. For example, the classic Mac convention uses a single carriage return (<CR>), represented by \r, for a line break, while Windows uses a combination of carriage return and line feed (<CRLF>), represented by \r\n.
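To see what the line-break difference looks like in practice, here is a minimal Python sketch. It does not involve Excel at all; it simply writes the same two rows with each terminator so you can compare the raw text. The sample rows and the `to_csv_text` helper are made up for illustration.

```python
import csv
import io

# Hypothetical sample data, used only for this illustration.
rows = [["Name", "City"], ["Anna", "Oslo"]]


def to_csv_text(rows, line_ending):
    """Serialize rows to CSV text using the given line terminator."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator=line_ending)
    writer.writerows(rows)
    return buffer.getvalue()


windows_style = to_csv_text(rows, "\r\n")    # carriage return + line feed
classic_mac_style = to_csv_text(rows, "\r")  # carriage return only

# repr() makes the otherwise invisible terminators visible.
print(repr(windows_style))
print(repr(classic_mac_style))
```

The values themselves are identical in both outputs; only the character that ends each row changes, which is why a program expecting one convention may show the other as a single long line.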
To correctly export data to other programs, Excel lets you choose the CSV formatting that best matches the program's expectations.
Here are the CSV options available in Excel 365. In your version, the list may look a little different.
CSV (comma delimited). This format saves data in a comma-separated text file that can be used in another Windows program or another version of the Windows operating system.
CSV (Macintosh). This format saves a workbook as a comma-separated file for use on the Mac operating system.
CSV (MS-DOS). Saves as a comma-separated document for use on the MS-DOS operating system.
CSV UTF-8 (comma delimited). This format uses the Unicode Transformation Format 8-bit (UTF-8) encoding, which supports many special characters, including hieroglyphs and accented characters, and is backward compatible with ASCII. It is recommended for files that contain any non-ASCII characters since the classic CSV format destroys them.
Besides CSV, there is one more format that may come in extremely handy for communicating with other programs.
Unicode Text (*.txt). This is a computing industry standard supported by almost all current operating systems including Windows, Macintosh, Linux and Solaris Unix. It can handle characters of almost all modern languages and some ancient ones.
Note. By strict definition, the CSV format implies separating values with commas. In reality, you may come across many other delimiters, a semicolon and tab being most common.
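If you ever need to read such files in a script, the delimiter is usually just a parameter. The sketch below uses Python's standard csv module; the sample strings and the `parse_delimited` helper are invented for the example.

```python
import csv
import io


def parse_delimited(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# The same tiny table written with three different delimiters.
comma_rows = parse_delimited("Name,Price\nTea,4.50\n", ",")
semicolon_rows = parse_delimited("Name;Price\nTea;4,50\n", ";")
tab_rows = parse_delimited("Name\tPrice\nTea\t4.50\n", "\t")

print(comma_rows)
print(semicolon_rows)
print(tab_rows)
```

Notice that in the semicolon version the price can contain a decimal comma without being split into two columns, which is one reason this delimiter is common in regions that use a comma as the decimal separator.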
When Excel data is to be transferred to some other application such as the Outlook Address book or Access database, the easiest way is to save your worksheet as a .csv file, and then import that file to another program.
To save an Excel file (.xlsx or .xls) in the CSV format, here are the steps you need to follow:
If your worksheet has any formatting, formulas, charts, shapes or other objects, you will be informed that some features in your workbook might be lost if you save it as CSV (Comma delimited). If that is okay, click Yes to complete the conversion without the unsupported features.
If your spreadsheet contains some special symbols, smart quotes or long dashes (e.g. inherited from a Word document), foreign characters (tildes, accents, etc.) or hieroglyphs, the method described above won't work.
The point is that saving to the CSV (comma delimited) format distorts any characters other than ASCII (American Standard Code for Information Interchange).
To keep non-ASCII characters undamaged, a document should be saved to a format that uses a Unicode character encoding. Two Unicode encoding forms are relevant here: 8-bit (UTF-8) and 16-bit (UTF-16).
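The effect is easy to reproduce outside Excel. The following Python sketch pushes a Japanese name through a legacy single-byte encoding and through the two Unicode encodings. Please note the assumption: cp1252 merely stands in for whatever legacy code page a non-Unicode CSV export might use on a given system, and the sample name is made up.

```python
# Hypothetical sample value with Japanese characters.
name = "山田 太郎"

# Assumption: cp1252 represents a legacy, non-Unicode code page.
# errors="replace" substitutes "?" for every character the code page lacks.
legacy_bytes = name.encode("cp1252", errors="replace")
legacy_round_trip = legacy_bytes.decode("cp1252")

# Unicode encodings can represent every character, so nothing is lost.
utf8_round_trip = name.encode("utf-8").decode("utf-8")
utf16_round_trip = name.encode("utf-16").decode("utf-16")

print(legacy_round_trip)  # question marks instead of the original characters
print(utf8_round_trip)
print(utf16_round_trip)
```

Once the characters have been replaced, no later conversion can bring them back, so the Unicode encoding has to be chosen at the moment of saving.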
Before we move to the exporting steps, let us point out the key features of each encoding, so you can choose the format right for a particular case.
UTF-8 is a more compact encoding since it uses 1 to 4 bytes for each symbol. Generally, this format is recommended if ASCII characters are most prevalent in your file because most such characters are stored in one byte each. Another advantage is that a UTF-8 file containing only ASCII characters has absolutely the same encoding as an ASCII file.
UTF-16 uses 2 to 4 bytes to encode each symbol. However, a UTF-16 file does not always require more storage than UTF-8. For example, Japanese characters take 3 to 4 bytes in UTF-8 and 2 to 4 bytes in UTF-16. So, you may want to use UTF-16 if your data contains any Asian characters, including Japanese, Chinese or Korean. A noticeable disadvantage of this format is that it's not fully compatible with ASCII files and requires some Unicode-aware programs to display them. Please keep that in mind if you are going to import the resulting document somewhere outside of Excel.
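If you'd like to check the byte counts yourself, here is a small Python sketch. The `byte_lengths` helper and the sample characters are chosen only for illustration; "utf-16-le" is used so that the byte order mark is not included in the count.

```python
def byte_lengths(text):
    """Return the number of bytes the text occupies in UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" omits the 2-byte byte order mark from the count.
        "utf-16": len(text.encode("utf-16-le")),
    }


samples = {
    "Latin letter": "A",
    "Common kanji": "日",
    "Rare kanji": "\U00020BB7",  # a character outside the Basic Multilingual Plane
}

for label, character in samples.items():
    print(label, byte_lengths(character))

# A text made of ASCII characters only is byte-for-byte identical in UTF-8 and ASCII.
ascii_compatible = "Excel to CSV".encode("utf-8") == "Excel to CSV".encode("ascii")
print(ascii_compatible)
```

So for a mostly English sheet UTF-8 roughly halves the file size compared to UTF-16, while for a sheet full of Japanese text UTF-16 tends to be the smaller of the two.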
Once you've decided on the format, the below instructions will walk you through the process.
Suppose you have a worksheet with some foreign characters, Japanese names in our case:
Depending on the Excel version you are using, it may take 3 to 5 steps to convert this file to CSV keeping all special characters.
In Excel 2016 and later versions, you can save a file in the CSV format with UTF-8 encoding directly:
As older Excel versions do not support the UTF-8 encoding, you'll need to save your document in the Unicode Text format first, and then convert it to UTF-8.
To export an Excel file to CSV and preserve special characters, follow these steps:
Note. Some simple text editors do not fully support all Unicode characters, so certain characters may be displayed as boxes. In most cases this won't affect the resulting file, so you can simply ignore this or use a more advanced text editor such as Notepad++.
If you want a semicolon-delimited CSV, then replace tabs with semicolons.
If all is done right, your resulting txt file should look similar to this:
When done, click the Save button.
Tips and notes:
Now, you can open the CSV file in Excel and make sure all data is rendered correctly:
Note. If your file is intended for use in another application where the UTF-8 format is a must, do not make any edits or save the file in Excel as this may cause encoding problems. If some data does not appear right in Excel, open the file in Notepad and fix the data there. Remember to save the file in the UTF-8 with BOM format again.
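If you do this conversion often, the manual text-editor steps can also be scripted. Below is a minimal Python sketch, assuming the Unicode Text file saved by Excel is tab-delimited and UTF-16 encoded. The function name, file names and sample data are hypothetical. The "utf-8-sig" codec writes UTF-8 with a BOM.

```python
import csv
import os
import tempfile


def unicode_text_to_utf8_csv(txt_path, csv_path, delimiter=","):
    """Convert a tab-delimited UTF-16 text file to a UTF-8 (with BOM) CSV file."""
    with open(txt_path, "r", encoding="utf-16", newline="") as source, \
         open(csv_path, "w", encoding="utf-8-sig", newline="") as target:
        reader = csv.reader(source, delimiter="\t")
        writer = csv.writer(target, delimiter=delimiter)
        writer.writerows(reader)


# Demo on a throwaway file that mimics a Unicode Text export.
with tempfile.TemporaryDirectory() as folder:
    txt_path = os.path.join(folder, "names.txt")
    csv_path = os.path.join(folder, "names.csv")
    with open(txt_path, "w", encoding="utf-16", newline="") as handle:
        handle.write("Name\tCity\r\n山田 太郎\t東京\r\n")

    unicode_text_to_utf8_csv(txt_path, csv_path)

    with open(csv_path, "rb") as handle:
        result_bytes = handle.read()

print(result_bytes[:3])                  # the UTF-8 byte order mark
print(result_bytes.decode("utf-8-sig"))  # the comma-separated content
```

For a semicolon-delimited file, pass delimiter=";" as the third argument. Unlike a plain find-and-replace of tabs, the csv module also puts quotes around any value that happens to contain the delimiter.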
Exporting to CSV UTF-16 is done very much the same way as to CSV UTF-8:
As already mentioned, Excel's Save As command is only able to convert an active worksheet. But what if your workbook contains a lot of different sheets, and you wish to turn them all into separate csv files? The only alternative suggested by Microsoft is saving each sheet under a different file name, which does not sound very inspiring, huh?
So, is there a quick way to save multiple Excel sheets as CSV at once? Yes, it can be done with VBA.
The below code converts all worksheets in the current workbook to individual CSV files, one for each sheet. The file names are created from the workbook and sheet names (WorkbookName_SheetName.csv) and saved to the same folder as the original document.
Please keep in mind that the above code saves sheets in the CSV format. If there are any non-ASCII characters in your data, then you need to convert to UTF-8 CSV. This can be done by changing the file format from xlCSV to xlCSVUTF8. That is, you replace FileFormat:=xlCSV with FileFormat:=xlCSVUTF8.
Also, remember that CSV UTF-8 conversions are possible in Excel 2016 and higher.
The following guidelines will help you with adding the macro to your workbook: How to insert and run VBA code in Excel.
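Outside of VBA, the same one-file-per-sheet naming scheme can be sketched in Python. This is not the macro referred to above, just an illustration under a big assumption: the sheet contents are already loaded in memory as lists of rows (reading an .xlsx file itself requires a third-party library and is not shown). The `export_sheets` function and the sample data are hypothetical.

```python
import csv
import os
import tempfile


def export_sheets(workbook_name, sheets, folder, encoding="utf-8-sig"):
    """Write one CSV per sheet, named WorkbookName_SheetName.csv, and return the paths."""
    written = []
    for sheet_name, rows in sheets.items():
        path = os.path.join(folder, f"{workbook_name}_{sheet_name}.csv")
        with open(path, "w", encoding=encoding, newline="") as handle:
            csv.writer(handle).writerows(rows)
        written.append(path)
    return written


# Assumption: sheet data is already available as {sheet name: list of rows}.
sheets = {
    "Sales": [["Item", "Qty"], ["Tea", 3]],
    "Costs": [["Item", "Cost"], ["Tea", 1.5]],
}

with tempfile.TemporaryDirectory() as folder:
    paths = export_sheets("Report", sheets, folder)
    file_names = [os.path.basename(path) for path in paths]
    with open(paths[0], "r", encoding="utf-8-sig", newline="") as handle:
        first_file_text = handle.read()

print(file_names)
```

The default "utf-8-sig" encoding plays the same role as switching to the UTF-8 file format in the macro: non-ASCII characters are kept intact.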
Apart from the methods described above, there exist a handful of other ways to convert Excel sheets to CSV. Below, I will share a couple of my favorite ones.
The use of Google Sheets for .xlsx to .csv conversions seems a very simple workaround:
Tip. If you have a relatively small dataset, it may be easier to copy/paste it directly in the spreadsheet.
Open the downloaded file in some text editor to make sure all the data is exported right.
Note. If your original Excel sheet contains special characters, the resulting CSV file may not display the characters correctly when opened in Excel, though it looks perfect in many other spreadsheet programs.
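One possible cause, stated here as an assumption rather than something verified for every setup, is that the downloaded file is UTF-8 without a byte order mark (BOM), which Excel may then read using a legacy encoding. If that is what happens in your case, adding the BOM could help. A minimal Python sketch with a hypothetical `ensure_utf8_bom` helper:

```python
import codecs


def ensure_utf8_bom(data: bytes) -> bytes:
    """Return the bytes with a UTF-8 BOM at the start, adding it only if missing."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


# Hypothetical content of a CSV file saved as UTF-8 without a BOM.
without_bom = "Name,City\r\n山田 太郎,東京\r\n".encode("utf-8")

with_bom = ensure_utf8_bom(without_bom)
unchanged = ensure_utf8_bom(with_bom)  # a second call adds nothing

print(with_bom[:3] == codecs.BOM_UTF8)
print(unchanged == with_bom)
```

To apply it to a real file, read the file in binary mode, pass the bytes through the function, and write the result back in binary mode.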
This method of converting Excel to CSV hardly needs any further explanations because the heading says it all :)
I came across this solution on some forum, cannot remember which exactly. To be honest, this method has never worked for me, but many users reported that special characters, which got lost when saving .xlsx directly to .csv, are preserved if you save the .xlsx file as .xls first, and then save the .xls as .csv as explained in How to convert Excel to CSV.
Anyway, you can try this method of exporting Excel to CSV on your side and if it works, this can be a real time-saver.
OpenOffice is an open-source suite of six applications. One of them is a spreadsheet app named Calc, which is really good at exporting spreadsheet data to the CSV format. In fact, it provides more options (encodings, delimiters, etc.) than Microsoft Excel and Google Sheets combined.
To convert your Excel file to CSV, follow these steps:
To complete the conversion, click OK.
It would be really nice if Excel provided similar options to perform fast and painless CSV conversions, wouldn't it?
These are the ways of converting Excel to CSV I am aware of. If you know other more efficient methods, please do share in comments. Thank you for reading!