returns the index of any NA-like value. However, I want to treat numeric, string and timestamp NAs differently.
Example showing that pd.NA and pd.NaT are caught identically in .replace():
import pandas as pd
import numpy as np
from datetime import datetime
floaty = pd.Series([np.nan, 2.0, 3.0], dtype=float)
stringy = pd.Series(["one", pd.NA, "two"], dtype=str)
timy = pd.Series([datetime(2000, 1, 1), datetime(2000, 1, 2), pd.NaT], dtype="datetime64[ns]")
df = pd.DataFrame({"floaty": floaty, "stringy": stringy, "timy": timy})
# Result:
#    floaty stringy       timy
# 0     NaN     one 2000-01-01
# 1     2.0    <NA> 2000-01-02
# 2     3.0     two        NaT
df_removed_na_string = df.replace({pd.NA: "fake_string"})
# Expected:
#    floaty      stringy        timy
# 0     NaN          one  2000-01-01
# 1     2.0  fake_string  2000-01-02
# 2     3.0          two         NaT
# Actual (in pandas 2.2.2 at least):
#    floaty      stringy         timy
# 0     NaN          one   2000-01-01
# 1     2.0  fake_string   2000-01-02
# 2     3.0          two  fake_string
Is there a way to test for a specific NA type? My workaround would be to loop through columns and act according to their dtype, but that is harder to do and would not be able to deal with object columns of mixed types.
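
For reference, here is a minimal sketch of the kind of element-wise test I have in mind. It assumes the missing-value markers can be told apart by identity (pd.NA and pd.NaT are singletons, while NaN is an ordinary float). The helper names na_kind and fill_only_pd_na are made up for illustration and are not part of pandas. It iterates in Python, so it is probably slow on large frames, and I am not sure it is the idiomatic way to do this.

```python
import numpy as np
import pandas as pd


def na_kind(value):
    """Classify a scalar by the kind of missing-value marker it is (hypothetical helper)."""
    if value is pd.NA:
        return "NA"
    if value is pd.NaT:
        return "NaT"
    if value is None:
        return "None"
    if isinstance(value, (float, np.floating)) and np.isnan(value):
        return "NaN"
    return "not missing"


def fill_only_pd_na(column, fill_value):
    """Replace only pd.NA entries, leaving NaN and NaT untouched (hypothetical helper)."""
    # Identity check per element, so NaN and NaT do not match.
    mask = pd.Series([v is pd.NA for v in column], index=column.index)
    if not mask.any():
        # Nothing to replace: return the column as-is and keep its dtype.
        return column
    return column.mask(mask, fill_value)


# An object column of mixed types, the case the per-dtype loop cannot handle.
mixed = pd.Series(["one", pd.NA, 1.5, np.nan, pd.NaT], dtype=object)
kinds = [na_kind(v) for v in mixed]
filled = fill_only_pd_na(mixed, "fake_string")
```

Applied per column, e.g. pd.DataFrame({name: fill_only_pd_na(df[name], "fake_string") for name in df.columns}), this should leave a datetime column alone because its missing entries come back as pd.NaT rather than pd.NA.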