I found this issue quite frustrating, and given the lack of information on the internet, I thought I would post a quick tutorial on the subject. The issue occurs on the Mac when you use Excel to open a CSV file that contains Japanese, Chinese, or Korean text. When the file opens, the CSV is parsed properly and you see your records of data, but the Asian text is garbled (in Japanese we call this “mojibake”, in other words, junk text). It’s basically unreadable garbage.
Do a quick Google search and you will see that many people have this issue, and none of the solutions I found actually worked. Don’t you just love that? They suggest things like opening the file in TextEdit first and making sure the encoding is set to UTF-8 in the preferences before saving. They also tell you to convert the file from US-ASCII to UTF-8. I also saw people mention that LibreOffice (which is free) handles these files correctly. This is true, and I confirmed it with the software, but I couldn’t stand the interface and wanted this to work in Excel. I also saw a reference to Numbers, but the same article mentions that it has many problems even dealing with CSV files (which sounds extremely strange for a spreadsheet program, hah). Anyway, I want to use Excel, so I had to find a solution that involved it.
If you want, on a Mac, open up Terminal and type:
file -I filename.csv
This will tell you which character encoding your file is detected as. In my case it was UTF-8, yet the problem persisted. I recalled dealing with this some time ago on a PC and finding the solution there. I jogged my memory and remembered that it was Notepad++ that sorted it out. I know this is an incredibly ridiculous process for something that should be solved fairly easily, or not be a problem in the first place, but here is the solution.
I use VMware Fusion with a virtual Windows system. You will need to download the free Notepad++ software. Open the file in Notepad++, open the Encoding menu in the menu bar, and select Encode in UTF-8-BOM. Now save your file. If you open the file in Excel on Windows (if you have it), you will see all the text appear just fine. Move the file back to your Mac and open it in Excel there. You will see the Japanese, Korean, or Chinese text appear just fine. Oddly, if you check the file again with file -I filename.csv, the charset is still reported as UTF-8. So I have no idea where this data gets changed or how, but it works. If you are pulling out your hair in frustration, I hope this saves your day!
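As far as I can tell, what Notepad++ changes is the byte order mark (BOM): the UTF-8-BOM option writes three extra bytes (EF BB BF) at the very start of the file. The file command reports the charset as UTF-8 with or without those bytes, which would explain why the check looks the same. If that is the whole story, you should be able to add the BOM on the Mac itself without a Windows VM. Here is a minimal Python sketch under that assumption. The function names has_utf8_bom and add_utf8_bom are my own, and it assumes the CSV is already valid UTF-8. I have not tested it against every Excel version.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes (EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def add_utf8_bom(src, dst):
    """Copy src to dst, prepending the UTF-8 BOM if it is not already there.

    Assumes src is already encoded as UTF-8; raises UnicodeDecodeError
    otherwise, so a file in another encoding is not silently mislabeled.
    """
    data = Path(src).read_bytes()
    data.decode("utf-8")  # sanity check only; the decoded text is not used
    if not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data
    Path(dst).write_bytes(data)
```

To use it, call add_utf8_bom("filename.csv", "filename-bom.csv") and open the new file in Excel. Calling has_utf8_bom on each file shows the difference that file -I does not.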