ehrdata.io.read_csv
- ehrdata.io.read_csv(filename, *, layer=None, sep=',', index_column=None, columns_obs_only=None, format='flat', wide_format_time_suffix=None, long_format_keys=None, **kwargs)
Read a comma-separated values (csv) file into an EHRData object.
It first reads the csv file using pandas.read_csv(), and then passes the resulting DataFrame to ehrdata.io.from_pandas(). See the documentation of ehrdata.io.from_pandas() for more details on table layouts.
- Parameters:
- filename
  Path | str
  Path to the file or directory to read. Delegates to pandas.read_csv().
- layer
  str | None (default: None)
  The layer to store the data in. If not specified, the data is stored in X. Delegates to from_pandas().
- sep
  str (default: ',')
  Separator in the file. Delegates to pandas.read_csv().
- index_column
  str | None (default: None)
  If specified, this column of the csv file will be used for the .obs dataframe. Delegates to from_pandas().
- columns_obs_only
  Iterable[str] | None (default: None)
  These columns will be added to the .obs dataframe only. Delegates to from_pandas().
- format
  Literal['flat', 'wide', 'long'] (default: 'flat')
  The format of the input dataframe. If the data is not longitudinal, choose format="flat". If the data is longitudinal in the long format, choose format="long". If the data is longitudinal in a wide format, choose format="wide". Delegates to from_pandas().
- wide_format_time_suffix
  str | None (default: None)
  Use only if format="wide". Suffixes in the variable columns that indicate the time of the observation. The collected suffixes will be sorted lexicographically, and the variables ordered accordingly along the 3rd axis of the EHRData object. Delegates to from_pandas().
- long_format_keys
  dict[Literal['observation_column', 'variable_column', 'time_column', 'value_column'], str] | None (default: None)
  Use only if format="long". The keys of the dataframe in the long format. The dictionary should have the following structure: {"observation_column": "<the column name of the observation ids>", "variable_column": "<the column name of the variable ids>", "time_column": "<the column name of the time>", "value_column": "<the column name of the values>"}. Delegates to from_pandas().
- **kwargs
  Passed to pandas.read_csv().
- Return type:
  EHRData
Examples
>>> import ehrdata as ed
>>> edata = ed.io.read_csv("myfile.csv")
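
The long-format options are easier to follow with a concrete file. The sketch below is illustrative only: the column names (patient_id, variable, time, value), the sample measurements, the file name, and the helper read_long_example are all hypothetical and not part of ehrdata. It writes a small semicolon-separated file in long layout using the standard library, then shows how sep, format, and long_format_keys from the signature above might be combined. The ehrdata import is deferred into the helper so that the file-writing part runs without ehrdata installed.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical long-format table: one row per (patient, variable, time) measurement.
rows = [
    {"patient_id": "p1", "variable": "heart_rate", "time": 0, "value": 72},
    {"patient_id": "p1", "variable": "heart_rate", "time": 1, "value": 75},
    {"patient_id": "p2", "variable": "heart_rate", "time": 0, "value": 64},
    {"patient_id": "p2", "variable": "heart_rate", "time": 1, "value": 66},
]

# Write the table to a temporary csv file, using ';' as the separator.
csv_path = Path(tempfile.mkdtemp()) / "long_example.csv"
with csv_path.open("w", newline="") as handle:
    writer = csv.DictWriter(
        handle,
        fieldnames=["patient_id", "variable", "time", "value"],
        delimiter=";",
    )
    writer.writeheader()
    writer.writerows(rows)

# Map the four roles named in the long_format_keys description
# to the (hypothetical) column names used in this file.
LONG_FORMAT_KEYS = {
    "observation_column": "patient_id",
    "variable_column": "variable",
    "time_column": "time",
    "value_column": "value",
}


def read_long_example(path):
    """Read the long-format file with ehrdata (requires ehrdata to be installed)."""
    import ehrdata as ed  # deferred import: only needed when the helper is called

    return ed.io.read_csv(
        path,
        sep=";",                # forwarded to pandas.read_csv()
        format="long",          # forwarded to from_pandas()
        long_format_keys=LONG_FORMAT_KEYS,
    )
```

With ehrdata installed, calling read_long_example(csv_path) should return an EHRData object built from the file. The structure of that object is described in the documentation of ehrdata.io.from_pandas().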