pandas.DataFrame.isna returns a boolean mask that flags every NA-like value, whatever its kind. However, I want to treat numeric, string and timestamp NAs differently.
Example showing that pd.NA and pd.NaT are caught identically by .replace():
import pandas as pd
import numpy as np
from datetime import datetime
floaty = pd.Series([np.nan, 2.0, 3.0], dtype=float)
stringy = pd.Series(["one", pd.NA, "two"], dtype=str)
timy = pd.Series([datetime(2000, 1, 1), datetime(2000, 1, 2), pd.NaT], dtype="datetime64[ns]")
df = pd.DataFrame({"floaty": floaty, "stringy": stringy, "timy": timy})
print(df)
# Result:
#    floaty stringy       timy
# 0     NaN     one 2000-01-01
# 1     2.0    <NA> 2000-01-02
# 2     3.0     two        NaT
df_removed_na_string = df.replace({pd.NA: "fake_string"})
print(df_removed_na_string)
# Expected:
#    floaty      stringy       timy
# 0     NaN          one 2000-01-01
# 1     2.0  fake_string 2000-01-02
# 2     3.0          two        NaT
# Actual (in pandas 2.2.2 at least):
#    floaty      stringy         timy
# 0     NaN          one   2000-01-01
# 1     2.0  fake_string   2000-01-02
# 2     3.0          two  fake_string
Is there a way to test for a specific NA type? My workaround would be to loop through the columns and act according to each column's dtype, but that is more cumbersome and cannot handle object columns of mixed types.
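For illustration, here is a minimal sketch of one possible direction: test each cell's scalar by identity or type instead of relying on the column dtype. The helper names (is_pd_na, is_nat, is_float_nan, elementwise) are hypothetical and not part of pandas, the "mixed" column is added only to cover the mixed-object case, and the string column is built with dtype=object here so that pd.NA is stored as-is. This is a sketch under those assumptions, not a verified replacement for .replace().

```python
import math

import numpy as np
import pandas as pd


# Hypothetical helpers: each one inspects the scalar itself,
# so the result does not depend on the dtype of the column.
def is_pd_na(value):
    return value is pd.NA


def is_nat(value):
    return value is pd.NaT


def is_float_nan(value):
    return isinstance(value, float) and math.isnan(value)


def elementwise(frame, predicate):
    """Apply predicate to every cell and return a boolean DataFrame."""
    return pd.DataFrame(
        {col: [predicate(v) for v in frame[col]] for col in frame.columns},
        index=frame.index,
    )


df = pd.DataFrame({
    "floaty": pd.Series([np.nan, 2.0, 3.0], dtype=float),
    # dtype=object is an assumption of this sketch, so pd.NA is kept as-is
    "stringy": pd.Series(["one", pd.NA, "two"], dtype=object),
    "timy": pd.Series(
        [pd.Timestamp(2000, 1, 1), pd.Timestamp(2000, 1, 2), pd.NaT],
        dtype="datetime64[ns]",
    ),
    # extra column mixing the three NA kinds in one object column
    "mixed": pd.Series([pd.NA, pd.NaT, np.nan], dtype=object),
})

na_mask = elementwise(df, is_pd_na)       # True only where the cell is pd.NA
nat_mask = elementwise(df, is_nat)        # True only where the cell is pd.NaT
nan_mask = elementwise(df, is_float_nan)  # True only where the cell is a float NaN

# Replace only the pd.NA cells; pd.NaT and NaN cells are left alone.
out = df.mask(na_mask, "fake_string")
print(out)
```

The same masks can be passed to df.mask() with a different replacement per NA kind. Note that identity checks like these would miss other NA-like scalars such as None or np.datetime64("NaT"), which would need their own predicates.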