
I'm writing a game in Lua, specifically using Love2D, but this question is more oriented towards Lua in general.
I need to read files in a specific format, but the files may be encoded as UTF-8, plain ASCII, or SHIFT-JIS. Is there a simple way to determine the encoding of a given file, ideally via a library? If I can do that, it would be pretty easy to write some helper functions to translate the text into something I can work with.
As far as I can tell, the file format doesn't have any sort of "doctype" field that identifies the format. I opened up one of the files in a hex editor, and there's nothing at the start that isn't visible in a text editor.
For anyone curious about the project itself, I'm writing a BMS player, so I'm working with files that could be as old as 1998, which is why I'm having to deal with SHIFT-JIS sometimes.
EDIT SOLUTION:
This entire thing is a bit convoluted, but I used /u/PhilipRoman's heuristic method outlined here to determine whether a given text file is SHIFT-JIS or not. If it's determined not to be SHIFT-JIS, I default to UTF-8. I made a simple conversion lookup table by scraping the contents of a web page and doing some small manual editing. Here it is, in case anyone else wants it. It seems accurate enough from typing some Japanese phrases via my IME. Here is the lookup table itself, in case anyone wants to use it themselves. From there you just plug the relevant bytes into the lookup table and you have valid Unicode to print to the screen. Thanks for the suggestions everybody, this is super helpful.
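For anyone who wants a starting point, here is a rough sketch of the overall approach in plain Lua (it should also run under LuaJIT/Love2D). To be clear, this is not necessarily the linked heuristic: it is a simple structural check that tries UTF-8 first and only then SHIFT-JIS, which is an assumption on my part. The names `guess_encoding`, `sjis_to_utf8`, and `SJIS_TO_UNICODE` are made up for illustration, and the table only holds two sample entries; a real one needs the full mapping.

```lua
-- Loose structural UTF-8 check (does not reject every overlong form).
local function is_valid_utf8(s)
  local i, n = 1, #s
  while i <= n do
    local b, len = s:byte(i), nil
    if b < 0x80 then len = 1
    elseif b >= 0xC2 and b <= 0xDF then len = 2
    elseif b >= 0xE0 and b <= 0xEF then len = 3
    elseif b >= 0xF0 and b <= 0xF4 then len = 4
    else return false end
    if i + len - 1 > n then return false end
    for j = i + 1, i + len - 1 do
      local c = s:byte(j)
      if c < 0x80 or c > 0xBF then return false end
    end
    i = i + len
  end
  return true
end

-- True if the bytes follow SHIFT-JIS lead/trail byte rules.
local function is_plausible_sjis(s)
  local i, n = 1, #s
  while i <= n do
    local b = s:byte(i)
    if b < 0x80 or (b >= 0xA1 and b <= 0xDF) then
      i = i + 1 -- single byte: ASCII range or half-width katakana
    elseif (b >= 0x81 and b <= 0x9F) or (b >= 0xE0 and b <= 0xFC) then
      local t = s:byte(i + 1)
      if not t or t < 0x40 or t == 0x7F or t > 0xFC then return false end
      i = i + 2
    else
      return false
    end
  end
  return true
end

-- Assumption: UTF-8 is tested first because it is the stricter format.
local function guess_encoding(s)
  if not s:find("[\128-\255]") then return "ascii" end
  if is_valid_utf8(s) then return "utf-8" end
  if is_plausible_sjis(s) then return "shift-jis" end
  return "unknown"
end

-- Hypothetical lookup table: (lead * 256 + trail) -> Unicode code point.
local SJIS_TO_UNICODE = {
  [0x82A0] = 0x3042, -- hiragana "a"
  [0x82A2] = 0x3044, -- hiragana "i"
}

-- Encode a code point as UTF-8 (BMP only, which covers JIS X 0208).
local function codepoint_to_utf8(cp)
  if cp < 0x80 then
    return string.char(cp)
  elseif cp < 0x800 then
    return string.char(0xC0 + math.floor(cp / 0x40), 0x80 + cp % 0x40)
  end
  return string.char(0xE0 + math.floor(cp / 0x1000),
                     0x80 + math.floor(cp / 0x40) % 0x40,
                     0x80 + cp % 0x40)
end

-- Convert SHIFT-JIS bytes to UTF-8; unmapped pairs become "?".
local function sjis_to_utf8(s)
  local out, i, n = {}, 1, #s
  while i <= n do
    local b = s:byte(i)
    if b < 0x80 then
      out[#out + 1] = string.char(b) -- passed through as ASCII
      i = i + 1
    elseif b >= 0xA1 and b <= 0xDF then
      out[#out + 1] = codepoint_to_utf8(b + 0xFEC0) -- half-width katakana
      i = i + 1
    else
      local cp = SJIS_TO_UNICODE[b * 256 + (s:byte(i + 1) or 0)]
      out[#out + 1] = cp and codepoint_to_utf8(cp) or "?"
      i = i + 2
    end
  end
  return table.concat(out)
end
```

Usage would be something like reading the whole file into a string, calling `guess_encoding(data)`, and running `sjis_to_utf8(data)` only when it reports `"shift-jis"`. Short files with very few non-ASCII bytes can still be misdetected, so treat the result as a best guess.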