As far as I'm concerned, the misconceptions go back at least as far as when I learned relational databases, in the early 1980s. My easiest conceptual handle on 1NF is that the value at the intersection of a row and a column must be a simple value. However, the definition of "simple" is not all that simple (little joke).
A character string is a simple value, even though it is made up of components, namely characters. A timestamp is simple even though it is made up of a date and a time, each of which is made up of things like year, month, etc.
I treat a list of values separated by commas as a violation of 1NF, although this may simply mean that I use the informal definition.
In terms of relations and the relational data model, there is nothing in relational math to forbid the intersection of a tuple and an attribute from itself being a relation. And it was this construct that I think E.F. Codd was intending to exclude when he came up with normalization. (Later named first normal form, once the second form was discovered.)
Why did he use the term "normalization"? Well, here is what he was driving at, IMO. For any defined collection of information requirements, there is a set of relational models that all express those requirements adequately. The members of that set can be regarded as "equivalent" in this sense. One can come up with a rule for picking one member of that set and calling it the "normal" representation of the entire set. The rule Codd came up with was what became the 1NF rule: no subrelations.
Another place where the concept of "normalized" had come up earlier in computing was floating point numbers. The numerals 0.1 and 1e-1 both represent the same number, and there is no limit to the number of ways of writing that same number (0.01e1, 10e-2, and so on). One of them can be called the "normalized" representation. As soon as (binary) floating point numbers began to be supported in computer instruction sets, there was a way to "normalize" a binary floating point number in such a way that the mantissa was always at least one-half and less than one, unless the number was zero. It's not important to understand these details. My point is just that the term "normalized" was already part of computer jargon in 1970, when Codd wrote his paper.
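For anyone who does want to see that detail, here is a minimal Python sketch of the idea. It leans on the standard library's math.frexp, which happens to use the same one-half-to-one convention for the mantissa. The helper name normalize and the sample value 0.75 are my own, chosen only for illustration; this is not how any particular piece of hardware does it.

```python
import math

def normalize(x):
    """Return (mantissa, exponent) such that x == mantissa * 2**exponent.

    For nonzero x, 0.5 <= abs(mantissa) < 1. For zero, both parts are 0.
    """
    return math.frexp(x)

# Three of the many (mantissa, exponent) pairs that denote the value 0.75.
candidates = [(0.75, 0), (1.5, -1), (0.375, 1)]
values = [math.ldexp(m, e) for m, e in candidates]

# Every pair collapses to the same normalized representation.
normalized = {normalize(v) for v in values}
print(normalized)
```

The point of the sketch is only that many equivalent representations map to one agreed-upon representative, which is the same sense in which one relational model gets picked as the "normal" one.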
A larger question, IMO, is "what's so bad about violating 1NF? Why is 1NF important?"
The answer to that question has two parts: one has to do with the logical model, and one has to do with the physical implementation of the first relational DBMS, which didn't come until about 1978.
In the logical model, it's desirable to say that specifying a tuple (by providing a key value) and specifying an attribute (by name) ought to be sufficient information to pin down the value being specified to a single value, at the level of detail the DBMS deals with. The early literature called this "keyed access to all data".
In the physical model, it's important to avoid having to do a full table scan in order to do a simple search for all occurrences of a simple value. An index ought to do the trick in a few disk reads instead of maybe millions of disk reads for a table scan. To my knowledge, no DBMS ever built can create an index on the individual values stashed in CSV text stored in a column. And Codd surely wanted to avoid the need for such an index when the first relational DBMS was to be built.
It's this last point that brings me back to the "theory is practical" motto that has been so helpful to me. Neophyte database designers come in here all the time, asking why it wouldn't be "more efficient" to store a list of course codes in the student table, instead of resorting to the "complexity" of creating a junction table with two foreign keys, studentId and courseId. The answer that convinces the neophyte usually has to do with table scans, and the associated delay, for operations that a good DBMS will do via an index, provided that the appropriate index has been created. IMO, that's what Codd was driving at, too.
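To make the course-code example concrete, here is a rough sketch using Python's built-in sqlite3 module. SQLite is used only because it needs no setup. The table names (student_csv, student, course, enrollment), the index name enrollment_by_course, and the sample data are all my own inventions; only studentId and courseId come from the discussion above. The exact wording of the query plan varies between SQLite versions, so treat the plan text as indicative rather than definitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Design A: course codes packed into one column as comma-separated text.
    CREATE TABLE student_csv (
        studentId INTEGER PRIMARY KEY,
        courses   TEXT
    );
    INSERT INTO student_csv VALUES
        (1, 'MATH101,CS101'), (2, 'CS101'), (3, 'HIST200');

    -- Design B: junction table with two foreign keys, one row per enrollment.
    CREATE TABLE student (studentId INTEGER PRIMARY KEY);
    CREATE TABLE course  (courseId  TEXT PRIMARY KEY);
    CREATE TABLE enrollment (
        studentId INTEGER NOT NULL REFERENCES student(studentId),
        courseId  TEXT    NOT NULL REFERENCES course(courseId),
        PRIMARY KEY (studentId, courseId)
    );
    CREATE INDEX enrollment_by_course ON enrollment (courseId);
    INSERT INTO student VALUES (1), (2), (3);
    INSERT INTO course VALUES ('MATH101'), ('CS101'), ('HIST200');
    INSERT INTO enrollment VALUES
        (1, 'MATH101'), (1, 'CS101'), (2, 'CS101'), (3, 'HIST200');
""")

def plan(sql, params=()):
    """Return SQLite's query plan for sql as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Design A has to pattern-match inside every row's text.
csv_sql = "SELECT studentId FROM student_csv WHERE ',' || courses || ',' LIKE ?"
# Design B is a plain equality test on an indexed column.
jt_sql = "SELECT studentId FROM enrollment WHERE courseId = ?"

csv_ids = sorted(r[0] for r in conn.execute(csv_sql, ("%,CS101,%",)))
jt_ids = sorted(r[0] for r in conn.execute(jt_sql, ("CS101",)))
csv_plan = plan(csv_sql, ("%,CS101,%",))
jt_plan = plan(jt_sql, ("CS101",))

print(csv_ids, csv_plan)
print(jt_ids, jt_plan)
```

Both queries find the same students, but the plan for the first should report a scan of the whole table, while the plan for the second should report a search through the index. With three rows nobody would notice the difference; with millions of rows, that's the delay the neophyte ends up caring about.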
Just to further muddy the waters, it's possible (at least in Oracle) to define a table that contains sub-tables in one of the columns. The question might arise whether such a database is or is not in 1NF. My answer is no.
(sorry this is so long.)