databricks.koalas.read_delta

databricks.koalas.read_delta(path: str, version: Optional[str] = None, timestamp: Optional[str] = None, index_col: Union[str, List[str], None] = None, **options) → databricks.koalas.frame.DataFrame

Read a Delta Lake table on some file system and return a DataFrame.

If the Delta Lake table is already stored in the catalog (also known as the metastore), use ‘read_table’ instead.

Parameters
path : string

Path to the Delta Lake table.

version : string, optional

Specifies the table version (based on Delta’s internal transaction version) to read from, using Delta’s time travel feature. This sets Delta’s ‘versionAsOf’ option.

timestamp : string, optional

Specifies the table version (based on timestamp) to read from, using Delta’s time travel feature. This must be a valid date or timestamp string in Spark, and sets Delta’s ‘timestampAsOf’ option.

index_col : str or list of str, optional, default: None

Name of the Spark table column (or list of column names) to use as the index.

options

Additional options that are passed on to Delta.

Returns
DataFrame

Examples

>>> ks.range(1).to_delta('%s/read_delta/foo' % path)
>>> ks.read_delta('%s/read_delta/foo' % path)
   id
0   0
>>> ks.range(10, 15, num_partitions=1).to_delta('%s/read_delta/foo' % path, mode='overwrite')
>>> ks.read_delta('%s/read_delta/foo' % path)
   id
0  10
1  11
2  12
3  13
4  14
>>> ks.read_delta('%s/read_delta/foo' % path, version=0)
   id
0   0

You can preserve the index in the roundtrip as below.

>>> ks.range(10, 15, num_partitions=1).to_delta(
...     '%s/read_delta/bar' % path, index_col="index")
>>> ks.read_delta('%s/read_delta/bar' % path, index_col="index")
       id
index
0      10
1      11
2      12
3      13
4      14
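
The parameter descriptions above say that version sets Delta’s ‘versionAsOf’ option and timestamp sets ‘timestampAsOf’. The following is a minimal sketch of that mapping in plain Python. It is an illustration only, not the Koalas implementation: the helper name build_delta_options and the option name someOption are hypothetical.

```python
from typing import Any, Dict, Optional


def build_delta_options(
    version: Optional[str] = None,
    timestamp: Optional[str] = None,
    **options: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: fold the time travel arguments into reader options."""
    # Start from a copy of the caller's extra options so nothing is mutated.
    merged = dict(options)
    # 'version' corresponds to Delta's 'versionAsOf' option.
    if version is not None:
        merged["versionAsOf"] = version
    # 'timestamp' corresponds to Delta's 'timestampAsOf' option.
    if timestamp is not None:
        merged["timestampAsOf"] = timestamp
    return merged


print(build_delta_options(version="0"))
# {'versionAsOf': '0'}
print(build_delta_options(timestamp="2019-01-01", someOption="value"))
# {'someOption': 'value', 'timestampAsOf': '2019-01-01'}
```

When neither argument is given, the sketch returns only the extra options, which corresponds to reading the latest version of the table.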