ehrdata.io.to_pandas
- ehrdata.io.to_pandas(edata, *, layer=None, obs_cols=None, var_col=None, format='wide')
Transform an EHRData object to a DataFrame.
- Parameters:
  - edata
    EHRData
    Central data object.
  - layer
    str | None (default: None)
    The layer whose values are exported. If not specified, X is used.
  - obs_cols
    Iterable[str] | None (default: None)
    The columns of obs to add to the dataframe.
  - var_col
    str | None (default: None)
    The column of var from which the column names of the created dataframe are taken. If not specified, the var_names are used.
  - format
    Literal['wide', 'long'] (default: 'wide')
    The format of the output dataframe. This is relevant for longitudinal data. If "wide", the output dataframe has one column for each (variable, time) tuple, named <variable_name>_t_<tem.index value>. If "long", the output dataframe is in long format, with the columns "observation_id", "variable", "time", and "value".
- Return type:
  DataFrame
Examples
>>> import ehrdata as ed
>>> edata = ed.dt.ehrdata_blobs(n_observations=2, n_variables=2, base_timepoints=3)
>>> edata
EHRData object with n_obs × n_vars × n_t = 2 × 2 × 3
    obs: "cluster"
    tem: '0', '1', '2'
    shape of .X: (2, 2)
    shape of .R: (2, 2, 3)
>>> df_wide = ed.io.to_pandas(edata, format="wide")
>>> df_wide
   feature_0_t_0  feature_0_t_1  feature_0_t_2  feature_1_t_0  feature_1_t_1  feature_1_t_2
0       3.060372       3.827524       4.680650      -1.697623      -1.816282      -2.775774
1      -3.395852      -4.948999      -5.401154      -7.347151      -9.427101     -11.793235
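The wide column names shown above follow the <variable_name>_t_<tem.index value> pattern from the format parameter. If the variable and the time label need to be recovered from such a column name later, a small helper along these lines may do. Note that split_wide_column is a hypothetical name used for illustration and is not part of ehrdata, and the sketch assumes the time label itself never contains "_t_".

```python
def split_wide_column(column: str) -> tuple[str, str]:
    """Split a wide-format column name into (variable name, time label).

    Hypothetical helper, not part of ehrdata. Assumes the column follows
    the "<variable_name>_t_<time label>" pattern and that the time label
    does not itself contain "_t_".
    """
    # Split on the LAST "_t_" so that variable names containing "_t_"
    # (for example "heart_t_rate") are kept intact.
    variable, time_label = column.rsplit("_t_", 1)
    return variable, time_label


columns = ["feature_0_t_0", "feature_0_t_1", "feature_1_t_2"]
parsed = [split_wide_column(c) for c in columns]
```

Splitting from the right is a deliberate choice here: it keeps the helper correct when a variable name happens to include the separator.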
>>> df_long = ed.io.to_pandas(edata, format="long")
>>> df_long
observation_id   variable  time       value
             0  feature_0     0    3.060372
             0  feature_0     1    3.827524
             0  feature_0     2    4.680650
             0  feature_1     0   -1.697623
             0  feature_1     1   -1.816282
             0  feature_1     2   -2.775774
             1  feature_0     0   -3.395852
             1  feature_0     1   -4.948999
             1  feature_0     2   -5.401154
             1  feature_1     0   -7.347151
             1  feature_1     1   -9.427101
             1  feature_1     2  -11.793235
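The two formats carry the same values, so one can be derived from the other with plain pandas. The following is a minimal sketch of that relationship, not a description of how to_pandas works internally. It rebuilds a long-format frame by hand from the first two time points of the example above, and it assumes that observation_id and time are integers, which the output above does not confirm.

```python
import pandas as pd

# Long-format rows copied from the example output (time points 0 and 1 only).
# Assumption: observation_id and time are stored as integers.
df_long = pd.DataFrame(
    {
        "observation_id": [0, 0, 0, 0, 1, 1, 1, 1],
        "variable": ["feature_0", "feature_0", "feature_1", "feature_1"] * 2,
        "time": [0, 1, 0, 1] * 2,
        "value": [
            3.060372, 3.827524, -1.697623, -1.816282,
            -3.395852, -4.948999, -7.347151, -9.427101,
        ],
    }
)

# Build the wide column label "<variable_name>_t_<time>" for every row.
labelled = df_long.assign(
    column=df_long["variable"] + "_t_" + df_long["time"].astype(str)
)

# One row per observation, one column per (variable, time) pair.
df_wide = labelled.pivot(index="observation_id", columns="column", values="value")
df_wide.columns.name = None  # drop the leftover "column" axis label
```

The resulting df_wide has one row per observation and columns named like those in the wide example. pivot raises an error if an (observation, column) pair occurs more than once, which makes accidental duplicates in the long data visible.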