My job occasionally requires that I handle files written and named in international character sets. Sometimes that is inconvenient, not least because file names can cause odd behavior at the command line, or interfere with the way files are managed or opened. My worst-case scenario was a file I couldn’t save after I had edited it, because the name of the file was interfering with the application’s ability to write to disk.
Manually renaming a file is the obvious solution, but if there is more than one, a fun little tool called detox will systematically convert nonstandard characters into boring equivalents. In my case, a Japanese file name might be changed from an unreadable sequence to something like “K-U_e_R_yen_ae_yo_o.doc”, which is easier for me to open, edit and send along.
By default detox will handle mundane things like converting spaces (which sometimes annoy me) to underscores, replacing unusual characters with analogues from common keysets, and weeding out characters which otherwise interfere with life at the command line, like certain quote marks or keycodes.
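To give a feel for the kind of clean-up described above, here is a rough Python sketch of the idea. To be clear, this is not detox's code or its algorithm: the function name `sanitize_name` and the exact rules are my own assumptions for illustration. It also simply drops characters that have no ASCII decomposition, rather than substituting an analogue for them.

```python
import re
import unicodedata


def sanitize_name(name: str) -> str:
    """Return a command-line-friendly file name (illustrative sketch, not detox)."""
    # Split accented letters into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop everything that has no ASCII form (combining marks, CJK, and so on).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Spaces become underscores.
    ascii_only = ascii_only.replace(" ", "_")
    # Anything other than letters, digits, dot, underscore or hyphen
    # (quote marks, brackets, control characters) becomes an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", ascii_only)
    # Collapse runs of underscores left over from the steps above.
    return re.sub(r"_+", "_", cleaned)


if __name__ == "__main__":
    print(sanitize_name("my résumé.doc"))
    print(sanitize_name("Ünïcode file (draft).txt"))
```

It only computes a new name and never touches the disk, so it is safe to play with.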
One of the nice things about detox is that it is configurable to a very low level, so if you don’t like the particular conversion it picks on its own, you can adjust it slightly for different results. That also means that specific sequences and character-to-character translations are probably doable.
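The character-to-character idea is easy to picture as a lookup table. The sketch below is plain Python and does not use detox's configuration format; the table contents and the names `CUSTOM_TABLE` and `translate_name` are hypothetical, made up only to show what a per-character translation might look like.

```python
# Hypothetical translation table: each key is a single character,
# each value is the text that should replace it.
CUSTOM_TABLE = {
    " ": "_",
    "¥": "yen",
    "&": "and",
    "ß": "ss",
}


def translate_name(name: str, table: dict = CUSTOM_TABLE) -> str:
    """Replace each character found in the table; leave the rest untouched."""
    return "".join(table.get(ch, ch) for ch in name)


if __name__ == "__main__":
    print(translate_name("100¥ & more.txt"))
```

Changing a conversion then comes down to editing one entry in the table, which is roughly the appeal of a tool that exposes its translations.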
As it is a command line tool, there’s nothing really to show for it in action. The documentation is excellent and it makes provision for dry runs with the -n flag, so you can test it once or twice if you have a fear of commitment. There may be other ways to circumvent this issue, but as an easy one-step solution to a lesser inconvenience, I find this acceptable. :D