ehrdata.infer_feature_types
- ehrdata.infer_feature_types(edata, *, layer=None, output='tree', binary_as='categorical', verbose=True)
Infer feature types of an EHRData object.
For each feature in edata.var_names, the method infers one of the following types: 'date', 'categorical', or 'numeric'. The inferred types are stored in edata.var['feature_type']. Please check the inferred types and adjust them if necessary, using edata.var['feature_type']['feature1'] = 'corrected_type' or replace_feature_types(). Be aware that not all features stored numerically are of 'numeric' type, as categorical features might be stored in a numerically encoded format. For example, a feature with values [0, 1, 2] might be a categorical feature with three categories. The method accounts for this, but it is still recommended to check the inferred types.
- Parameters:
- edata
EHRData Data object.
- layer
str | None (default: None) The layer to use from the EHRData object. If None, the X field is used.
- output
Literal['tree', 'dataframe'] | None (default: 'tree') The output format. Choose between 'tree', 'dataframe', or None. If 'tree', the feature types will be printed to the console in a tree format. If 'dataframe', a DataFrame with the feature types will be returned. If None, nothing will be returned.
- binary_as
Literal['categorical', 'numeric'] (default: 'categorical') How to classify binary features with values 0 and 1. If 'categorical' (default), binary features are classified as categorical. If 'numeric', binary features are classified as numeric.
- verbose
bool (default: True) Whether to print warnings for uncertain feature types.
- Return type:
DataFrame | None
Examples
>>> import ehrdata as ed
>>> edata = ed.dt.mimic_2()
>>> ed.infer_feature_types(edata)
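
The description above warns that numerically stored features are not always 'numeric', and that binary features follow the binary_as setting. The following is a minimal pure-Python sketch of that kind of heuristic, included only to illustrate why the inferred types should be reviewed. It is not ehrdata's implementation: the function guess_feature_type, its max_categories threshold, and the sample feature names are all hypothetical.

```python
from datetime import date, datetime
from numbers import Real


def guess_feature_type(values, binary_as="categorical", max_categories=10):
    """Toy type guess for one feature (hypothetical, NOT ehrdata's algorithm)."""
    if binary_as not in ("categorical", "numeric"):
        raise ValueError("binary_as must be 'categorical' or 'numeric'")
    observed = [v for v in values if v is not None]  # ignore missing entries
    if not observed:
        raise ValueError("need at least one non-missing value")
    # Dates and timestamps map to 'date'.
    if all(isinstance(v, (date, datetime)) for v in observed):
        return "date"
    # Anything that is not a real number (strings, booleans) is categorical.
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in observed):
        return "categorical"
    distinct = set(observed)
    # Binary 0/1 features follow the binary_as switch.
    if distinct <= {0, 1}:
        return binary_as
    # A few distinct whole numbers suggest numerically encoded categories.
    # The threshold is an assumption made for this sketch.
    if len(distinct) <= max_categories and all(float(v).is_integer() for v in distinct):
        return "categorical"
    return "numeric"


# Hypothetical sample features, one list of values per feature.
features = {
    "age": [34.5, 61.0, 47.2, 58.9],
    "sex": [0, 1, 1, 0],
    "num_prior_visits": [0, 1, 2, 1],
    "admission_date": [date(2020, 1, 3), date(2020, 2, 14)],
}
inferred = {name: guess_feature_type(vals) for name, vals in features.items()}

# Manual review step: a count was guessed as categorical, so correct it.
reviewed = dict(inferred)
reviewed["num_prior_visits"] = "numeric"
```

In this sketch the count feature with values [0, 1, 2, 1] is guessed as 'categorical', which is the situation the description cautions about. A heuristic cannot tell encoded categories from small counts, so the final call needs domain knowledge.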