databricks.koalas.read_delta

databricks.koalas.read_delta(path: str, version: Optional[str] = None, timestamp: Optional[str] = None, index_col: Union[str, List[str], None] = None, **options) → databricks.koalas.frame.DataFrame

Read a Delta Lake table on some file system and return a DataFrame.
If the Delta Lake table is already stored in the catalog (aka the metastore), use read_table instead.
Parameters
----------
path : string
    Path to the Delta Lake table.
version : string, optional
    Specifies the table version (based on Delta's internal transaction version) to read from, using Delta's time travel feature. This sets Delta's 'versionAsOf' option.
timestamp : string, optional
    Specifies the table version (based on timestamp) to read from, using Delta's time travel feature. This must be a valid date or timestamp string in Spark, and sets Delta's 'timestampAsOf' option.
index_col : str or list of str, optional, default: None
    Index column of table in Spark.
options
    Additional options that can be passed onto Delta.

Returns
-------
DataFrame
Examples
--------
>>> ks.range(1).to_delta('%s/read_delta/foo' % path)
>>> ks.read_delta('%s/read_delta/foo' % path)
   id
0   0

>>> ks.range(10, 15, num_partitions=1).to_delta('%s/read_delta/foo' % path,
...                                             mode='overwrite')
>>> ks.read_delta('%s/read_delta/foo' % path)
   id
0  10
1  11
2  12
3  13
4  14

>>> ks.read_delta('%s/read_delta/foo' % path, version=0)
   id
0   0
You can preserve the index in the roundtrip as below.
>>> ks.range(10, 15, num_partitions=1).to_delta(
...     '%s/read_delta/bar' % path, index_col="index")
>>> ks.read_delta('%s/read_delta/bar' % path, index_col="index")
       id
index
0      10
1      11
2      12
3      13
4      14
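Time travel by timestamp works the same way as by version: pass a date or timestamp string that Spark can parse, and the snapshot current as of that time is read. A minimal sketch (the timestamp value below is hypothetical — it must fall at or after the commit time of the version you want, which depends on when the table was actually written):

>>> ks.read_delta('%s/read_delta/foo' % path,
...               timestamp='2021-01-01 00:00:00')  # doctest: +SKIP

This sets Delta's 'timestampAsOf' reader option under the hood, just as version= sets 'versionAsOf'; specify one or the other, not both.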