ehrdata.harmonize_missing_values

ehrdata.harmonize_missing_values(edata, *, layer=None, missing_values=['nan', 'np.nan', '<NA>', 'pd.NA'], copy=False)

Harmonize missing values in the EHRData object.

Replaces strings that represent missing values (for example 'nan' or '<NA>') with np.nan.

Parameters:
edata EHRData

Data object.

layer str | None (default: None)

The layer to use from the EHRData object. If None, the X layer is used.

missing_values Iterable[str] | None (default: ['nan', 'np.nan', '<NA>', 'pd.NA'])

The strings that are considered to represent missing values and should be replaced with np.nan.

copy bool (default: False)

Whether to return a copy of the EHRData object with the missing values replaced. Given the return type below, the call presumably returns None when copy is False.

Return type:

EHRData | None

Examples

>>> import ehrdata as ed
>>> edata = ed.dt.mimic_2()
>>> ed.harmonize_missing_values(edata)
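
To illustrate the kind of replacement described above, here is a minimal NumPy-only sketch. It is not ehrdata's implementation: the helper `replace_missing_strings` and the sample array `X` are hypothetical, and the sketch only assumes that the data is an object array in which missing entries appear as the default placeholder strings from the signature.

```python
import numpy as np

# Hypothetical stand-in for a data layer: an object array that mixes real
# values with string placeholders for missing data.
X = np.array(
    [["1.5", "nan", "A"],
     ["<NA>", "2.0", "pd.NA"]],
    dtype=object,
)

# Default placeholder strings, copied from the function signature.
missing_values = ["nan", "np.nan", "<NA>", "pd.NA"]


def replace_missing_strings(arr, placeholders):
    """Return a copy of arr with placeholder strings replaced by np.nan.

    Illustrative helper only; not part of ehrdata.
    """
    placeholders = set(placeholders)
    out = arr.copy()
    # Build a boolean mask marking every cell that holds a placeholder string.
    is_placeholder = np.vectorize(
        lambda v: isinstance(v, str) and v in placeholders, otypes=[bool]
    )
    out[is_placeholder(out)] = np.nan
    return out


harmonized = replace_missing_strings(X, missing_values)
```

In this sketch the input array is left untouched and a new array is returned, which loosely mirrors calling the documented function with copy=True.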