Environment details
- OS type and version: Ubuntu 20.04.2
- Python version: 3.9.4
- pip version: 21.1.2
- google-cloud-bigquery version: 2.20.0
Steps to reproduce
- Run the code example below
Code example
import pandas
from google.cloud import bigquery
df = pandas.DataFrame({
    "series_a": [1, 2, pandas.NA]
})
json_iter = bigquery._pandas_helpers.dataframe_to_json_generator(df)
for row in json_iter:
    print(row)
Stack trace
{'series_a': 1}
{'series_a': 2}
Traceback (most recent call last):
  File "/home/christian/code/bug_example.py", line 11, in <module>
    for row in json_iter:
  File "/home/christian/miniconda3/envs/data-services-prod/lib/python3.9/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 783, in dataframe_to_json_generator
    if value != value:
  File "pandas/_libs/missing.pyx", line 360, in pandas._libs.missing.NAType.__bool__
TypeError: boolean value of NA is ambiguous
Suggested fix
Starting with pandas 1.0, an experimental pandas.NA singleton is available to represent scalar missing values, as an alternative to numpy.nan. Comparing pandas.NA with itself (value != value) evaluates to pandas.NA rather than a boolean, and the if statement then raises a TypeError because pandas.NA cannot be converted to a boolean.
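
The difference between the two missing-value markers can be seen in isolation. The snippet below is a small illustration that needs only pandas, not google-cloud-bigquery; the variable names are chosen for the example.

```python
import pandas

# pandas.NA propagates through comparisons instead of producing a bool.
result = pandas.NA != pandas.NA
print(result is pandas.NA)

# Using that result in a boolean context (as an `if` statement does) raises.
try:
    bool(result)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# A float NaN, by contrast, compares unequal to itself, which is what the
# `value != value` idiom relies on.
nan = float("nan")
print(nan != nan)
```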
I am planning to open a PR that replaces the value != value check at _pandas_helpers.py#L783 with a call to pandas.isna, but wanted to check whether there is a better solution before I submit a patch.
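
As a minimal sketch of the proposed check, the generator below skips missing values with pandas.isna. It is a standalone stand-in, not the library's actual code: the name rows_without_missing is hypothetical, and the is_scalar guard is an assumption added because pandas.isna returns an array (whose truth value is also ambiguous) when a cell holds a list or array.

```python
import pandas
from pandas.api.types import is_scalar


def rows_without_missing(dataframe):
    """Yield one dict per row, omitting cells that hold a missing value.

    Hypothetical helper for illustration; it only mirrors the behavior
    visible in the bug report above.
    """
    for row in dataframe.itertuples(index=False, name=None):
        output = {}
        for column, value in zip(dataframe.columns, row):
            # pandas.isna handles None, NaN, NaT and pandas.NA alike.
            # Only scalars are tested, so array-like cells pass through.
            if is_scalar(value) and pandas.isna(value):
                continue
            output[column] = value
        yield output


df = pandas.DataFrame({"series_a": [1, 2, pandas.NA]})
rows = list(rows_without_missing(df))
for row in rows:
    print(row)
```

With this check the third row comes out as an empty dict instead of raising, and float NaN values are still omitted as before.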