ehrdata.integrations.torch.OMOPEHRDataset
- class ehrdata.integrations.torch.OMOPEHRDataset(con, edata, *, data_tables, target='mortality', datetime=True, idxs=None)
A Dataset built from an OMOP CDM database.
This class exposes the tensor in ehrdata.R as a Dataset, in a format suitable for DataLoader. This allows the data to be streamed in batches from the RDBMS, so the entire dataset does not have to be loaded into memory.
Note: Each item in the dataset represents an observation unit (e.g., a visit, observation period, or person), not necessarily a unique patient. A single patient can have multiple observation units.
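The streaming idea described above can be illustrated with a minimal, self-contained sketch. This is not ehrdata's implementation: the class `LazyRowDataset`, the helper `batches`, and the table `units` with its columns are hypothetical, and the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra dependencies. It only shows the general pattern of a map-style dataset that queries the database once per item, so that only the current batch is held in memory.

```python
import sqlite3


class LazyRowDataset:
    """Hypothetical stand-in for a map-style dataset backed by a database."""

    def __init__(self, con, idxs=None):
        self.con = con
        if idxs is None:
            # Default: use every observation unit in the (hypothetical) table.
            rows = con.execute("SELECT unit_id FROM units ORDER BY unit_id").fetchall()
            idxs = [row[0] for row in rows]
        self.idxs = list(idxs)

    def __len__(self):
        return len(self.idxs)

    def __getitem__(self, obs_index):
        # Fetch a single observation unit on demand instead of preloading all rows.
        unit_id = self.idxs[obs_index]
        return self.con.execute(
            "SELECT value, died FROM units WHERE unit_id = ?", (unit_id,)
        ).fetchone()


def batches(dataset, batch_size):
    """Yield lists of items, mimicking what a batch loader does with a dataset."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


# Build a tiny in-memory database with five observation units.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE units (unit_id INTEGER PRIMARY KEY, value REAL, died INTEGER)")
con.executemany(
    "INSERT INTO units VALUES (?, ?, ?)",
    [(i, i * 0.5, i % 2) for i in range(5)],
)

dataset = LazyRowDataset(con)
all_batches = list(batches(dataset, batch_size=2))
```

With the real class, the same role is played by passing the dataset to a DataLoader, which calls `__len__` and `__getitem__` in the same way.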
- Parameters:
- con
DuckDBPyConnection The connection to the database.
- edata
EHRData Central data object.
- data_tables
Sequence[Literal['measurement','observation','specimen']] The OMOP data tables to extract.
- target
Literal['mortality'] (default: 'mortality') The target variable to be used.
- datetime
bool (default: True) If True, use datetime; if False, use date.
- idxs
Sequence[int] | None (default: None) The indices of the observation units to be used. This can be used to include only a subset of the data, e.g. for train-test splits.
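As an illustration of the `idxs` parameter, the sketch below builds disjoint train and test index lists that could then be passed to two separate dataset instances. The helper `split_indices` is hypothetical and not part of ehrdata, and it assumes that `idxs` takes integer positions of observation units in the range from 0 to the number of units minus one, which is our reading of the parameter description rather than something the documentation states.

```python
import random


def split_indices(n_units, test_fraction=0.2, seed=0):
    """Return (train_idxs, test_idxs) as disjoint, sorted lists of positions."""
    positions = list(range(n_units))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(positions)
    n_test = int(round(n_units * test_fraction))
    test_idxs = sorted(positions[:n_test])
    train_idxs = sorted(positions[n_test:])
    return train_idxs, test_idxs


train_idxs, test_idxs = split_indices(n_units=10, test_fraction=0.2, seed=42)

# Hypothetical usage with the class documented here (not executed in this sketch;
# `con` and `edata` are assumed to exist already):
# train_ds = OMOPEHRDataset(con, edata, data_tables=["measurement"], idxs=train_idxs)
# test_ds = OMOPEHRDataset(con, edata, data_tables=["measurement"], idxs=test_idxs)
```

Keeping the split outside the dataset class means the same index lists can be stored and reused across runs.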
Methods
- OMOPEHRDataset.__getitem__(obs_index)