2

I have a table with different numerical values in the Microsoft SQL database. All columns defined as float in SQL. Some of the values are NULL as well.

When operating on the data, using the Pandas library (version 0.23.4) in Python (version 3.6), some of the NULL values are imported as NaN and some as None.

One row for example, as observed from within the SQL Management Studio:

NULL, NULL, -1.17, 0.67, 0.42, -0.14, 0.84, -0.30, NULL

Importing with read_sql I get the below result in Python:

nan, nan, -1.17, 0.67, 0.42, -0.14, 0.84, -0.30, None

I am curious what is the difference from Pandas point of view?

Should I treat the values differently?

I understand that NaN is treated as a numerical value, allowing faster operations in Pandas, while None is simply a notion of non-existing object. But why is there an inconsistency?

Tim Mironov
  • 849
  • 8
  • 22
  • 1
    [What is the difference between NaN and None?](https://stackoverflow.com/questions/17534106/what-is-the-difference-between-nan-and-none) – Parfait Apr 01 '19 at 18:59
  • @Parfait, thank you for the reference, I've read it, but it still doesn't explain why some of `NULL`s in the same query were imported as `NaN`s and some of `NULL`s as `None`s – Tim Mironov Apr 02 '19 at 14:09
  • 1
    Is any column in table of database entirely blank (i.e., all NULLs)? – Parfait Apr 02 '19 at 14:32
  • 2
    @Parfait, I think you are on it. The last column is all `NULL`s in the time of the `read_sql` call. But it is defined as `float` in SQL Server as well. So my conclusion is that Pandas figures the type of data from the actual values in the column and not the definition of it in SQL. Since in the time of the call the column is blank - Pandas cannot associate any type to the column, thus returns `None`. Did I get it right? – Tim Mironov Apr 02 '19 at 14:40
  • 1
    I have encountered this inconsistency too (in sqlite3), and I suspect from patterns in my data that it is related to the value types inferred by pandas on the first record or something like that. Unfortunately, I cannot see any documentation for this behavior or ways to control this behavior in the arguments to this function. – matanster Dec 19 '20 at 14:46
  • 1
    This might be an explanation: https://stackoverflow.com/a/53330130/1509695 – matanster Dec 19 '20 at 15:20
  • Does this answer your question? [Pandas read\_sql\_query returning None for all values in some columns](https://stackoverflow.com/questions/53315035/pandas-read-sql-query-returning-none-for-all-values-in-some-columns) – maxschlepzig Apr 19 '21 at 21:59

0 Answers0