databricks.koalas.read_delta(path, version=None, timestamp=None, index_col=None, **options)
Read a Delta Lake table on some file system and return a DataFrame.
If the Delta Lake table is already stored in the catalog (aka the metastore), use 'read_table'.
Parameters
path : string
    Path to the Delta Lake table.
version : string, optional
    Specifies the table version (based on Delta's internal transaction version) to
    read from, using Delta's time travel feature. This sets Delta's 'versionAsOf'
    option.
timestamp : string, optional
    Specifies the table version (based on timestamp) to read from, using Delta's
    time travel feature. This must be a valid date or timestamp string in Spark,
    and sets Delta's 'timestampAsOf' option.
index_col : str or list of str, optional
    Index column of table in Spark.
options
    Additional options that can be passed onto Delta.
Returns
DataFrame
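As a rough sketch of how these arguments relate, the time-travel parameters end up as the Delta reader options named in the descriptions above ('versionAsOf' and 'timestampAsOf'). The helper below is purely illustrative — `build_delta_options` is not part of the Koalas API, and the either-or restriction on version/timestamp mirrors Delta's documented time-travel behavior:

```python
# Hypothetical helper (not in Koalas): shows how version/timestamp/**options
# could be merged into a single reader-options dict using the Delta option
# names given in the parameter descriptions above.

def build_delta_options(version=None, timestamp=None, **options):
    """Merge time-travel arguments with pass-through Delta options."""
    if version is not None and timestamp is not None:
        # Delta's time travel accepts one pinning mechanism at a time.
        raise ValueError("Specify either 'version' or 'timestamp', not both.")
    merged = dict(options)
    if version is not None:
        merged["versionAsOf"] = version        # version-based time travel
    if timestamp is not None:
        merged["timestampAsOf"] = timestamp    # timestamp-based time travel
    return merged

print(build_delta_options(version=0))
print(build_delta_options(timestamp="2019-01-01 00:00:00", mergeSchema="true"))
```

In the real API these keys would be forwarded to Spark's Delta reader; the sketch only makes the mapping explicit.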
See also
DataFrame.to_delta
read_table
read_spark_io
read_parquet
Examples
>>> ks.range(1).to_delta('%s/read_delta/foo' % path)
>>> ks.read_delta('%s/read_delta/foo' % path)
   id
0   0
>>> ks.range(10, 15, num_partitions=1).to_delta('%s/read_delta/foo' % path,
...                                             mode='overwrite')
>>> ks.read_delta('%s/read_delta/foo' % path)
   id
0  10
1  11
2  12
3  13
4  14
>>> ks.read_delta('%s/read_delta/foo' % path, version=0)
   id
0   0
You can preserve the index in the roundtrip as below.
>>> ks.range(10, 15, num_partitions=1).to_delta(
...     '%s/read_delta/bar' % path, index_col="index")
>>> ks.read_delta('%s/read_delta/bar' % path, index_col="index")
       id
index
0      10
1      11
2      12
3      13
4      14
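The timestamp parameter must be a string that Spark can parse as a date or timestamp. As a minimal sketch (the authoritative list of accepted formats is Spark's, not this page's), two common shapes are a date-only string and a full timestamp string; parsing them with Python's own datetime here merely demonstrates the shapes — Spark performs its own parsing:

```python
# Illustrative only: common string shapes accepted for timestamp-based
# time travel. Python's strptime is used here just to validate the shapes;
# it is not what Spark does internally.
from datetime import datetime

date_only = datetime.strptime("2019-01-01", "%Y-%m-%d")
full_ts = datetime.strptime("2019-01-01 00:00:00", "%Y-%m-%d %H:%M:%S")
print(date_only, full_ts)
```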