I know the web is mostly standardizing towards UTF-8 lately and I was just wondering if there was any place where using UTF-8 would be a bad thing.
There is an argument to be made that adding unnecessary conversions is adding complexity for little benefit. So if your inputs and your outputs use the same format then there is an argument for working in that format too.
Both UTF-8 and UTF-16 are relatively well-designed multi-unit encodings: the encoding of one character never appears as a sub-sequence of the encoding of another, and a decoder that detects an error can resume decoding at the next valid code unit.
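Here is a small Python sketch of both properties as they apply to UTF-8 (the sample strings are just illustrative):

```python
# 1. The encoding of one character never appears inside the encoding of
#    another, so searching for an ASCII byte such as '/' can never produce
#    a false hit inside the bytes of some multi-byte character.
path = "répertoire/fichier".encode("utf-8")
assert path.count(b"/") == 1

# 2. A decoder that hits an invalid byte can report it and resume at the
#    next valid code unit (0xFF never occurs in well-formed UTF-8).
corrupted = b"h\xff" + "éllo".encode("utf-8")
print(corrupted.decode("utf-8", errors="replace"))   # -> 'h�éllo'
```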
Some argue that UTF-32 is "better" because it uses one code unit for every Unicode code point. What makes this less compelling, though, is that there is no 1:1 mapping between Unicode code points and what most users would regard as "characters", so being able to rapidly get the nth code point from a sequence is less useful than it first appears.
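For example, in Python (where a `str` is a sequence of code points) a single user-perceived character can occupy several code points:

```python
s = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT, rendered as 'é'
print(len(s))    # 2 -- two code points
print(s[0])      # 'e' -- "the nth code point" is not "the nth character"
```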
Also, what about in Windows programs, Linux shell and things of that nature -- can you safely use UTF-8 there?
Windows and Unix-like systems took different approaches to the introduction of Unicode. Both approaches had their pros and cons.
Windows introduced 16-bit Unicode (initially UCS-2, later UTF-16) by introducing a parallel set of APIs. Applications or frameworks that wanted Unicode support had to switch to the new APIs. This was further complicated by the fact that while Windows NT offered Unicode support in all APIs, Windows 9x only offered it in a subset.
On the filesystem side, Windows NT's native NTFS filesystem used 16-bit Unicode filenames from the start. For the FAT filesystem, which pre-dates Windows NT, Unicode was introduced as part of long filename support. Similarly for CDs, the Joliet extension added Unicode long filenames.
Unix-like systems, on the other hand, introduced Unicode by using UTF-8 and treating it like any other extended-ASCII character set. Filenames on Unix filesystems have always been sequences of bytes, where the meaning assigned to those bytes is down to the user's environment.
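A minimal Python sketch of that model: list a directory as raw bytes and only then apply a text interpretation (the fallback encoding here is just an assumption):

```python
import os

for raw in os.listdir(b"."):           # bytes in, bytes out on Unix
    try:
        name = raw.decode("utf-8")     # the usual case in a UTF-8 locale
    except UnicodeDecodeError:
        name = raw.decode("latin-1")   # assumed fallback for legacy names
    print(name)
```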
There are pros and cons to both approaches. The Unix approach allowed even non-Unicode-aware programs to handle Unicode text to some extent. On the other hand, it meant users essentially had to choose between a "Unicode" environment, where everything was UTF-8 and any pre-Unicode files needed conversion, and a "legacy" environment where Unicode was not supported.
Some programming languages or frameworks will attempt to settle on an encoding and convert everything to it. This is, however, complicated by the fact that on both Windows and Unix-like systems a program may encounter strings from the operating system that do not pass validation for their nominal encoding. This can happen for a number of reasons, including legacy data from pre-transition software, truncation that does not take account of multi-unit encodings, the use of nominally-text strings to pass non-text data, and plain old errors.
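One common mitigation, sketched here in Python, is a lossless escape mechanism such as `surrogateescape`: invalid bytes are smuggled through the decoded string and restored unchanged when re-encoding for the operating system (the filename is a made-up example):

```python
raw = b"report-\xe9-2003.txt"   # Latin-1 era name, not valid UTF-8

name = raw.decode("utf-8", errors="surrogateescape")
# `name` can be passed around as an ordinary str...
assert name.encode("utf-8", errors="surrogateescape") == raw   # ...and round-trips
```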