Version 0.20.0

Disable Arrow 0.15

Apache Arrow 0.15.0, which Koalas depends on to execute pandas UDFs, was released on October 5, 2019, but the Spark community has reported an issue with PyArrow 0.15.

We decided to set an upper bound on the pyarrow version to avoid this issue until we are sure that Koalas works with PyArrow 0.15.

Set an upper bound for pyarrow version. (#918)
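As a rough illustration of what such an upper bound means, the sketch below checks whether a given pyarrow release string falls below 0.15. The helper names (`parse_version`, `is_allowed_pyarrow`) and the exact bound are assumptions made for this example, not the code from #918. It only handles purely numeric version strings.

```python
def parse_version(version):
    """Turn a numeric release string such as '0.14.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


# Assumed bound for illustration: anything from 0.15 onward is rejected.
# In a requirements file the same idea would be written as "pyarrow<0.15".
PYARROW_UPPER_BOUND = (0, 15)


def is_allowed_pyarrow(version):
    """Return True when the version is strictly below the assumed upper bound."""
    return parse_version(version) < PYARROW_UPPER_BOUND


for candidate in ("0.14.1", "0.15.0"):
    print(candidate, is_allowed_pyarrow(candidate))
```

With this bound, 0.14.1 passes the check and 0.15.0 does not.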

Multi-index columns support

We continue improving multi-index columns support. We made the following APIs support multi-index columns:

pivot_table (#908)
melt (#920)
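To show what melting a frame with multi-index columns looks like, here is a minimal sketch that uses plain pandas as a stand-in, on the assumption that Koalas aims to mirror the pandas behavior. The data and column labels are made up for the example, and it has not been checked against Koalas 0.20.0 itself.

```python
import pandas as pd

# Two-level column labels: each column is identified by a tuple.
df = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=pd.MultiIndex.from_tuples([("A", "x"), ("B", "y")]),
)

# With multi-index columns, id_vars and value_vars are lists of tuples.
melted = df.melt(id_vars=[("A", "x")], value_vars=[("B", "y")])

print(melted)
```

Each level of the melted column label ends up in its own variable column (`variable_0` and `variable_1` in pandas when the levels are unnamed).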

Other new features and improvements

We added the following new features:

koalas.DataFrame:

xs (#892)

koalas.Series:

drop_duplicates (#896)
replace (#903)

koalas.GroupBy:

shift (#910)

Along with the following improvements:

Implement nested renaming for groupby agg (#904)
Add ‘index_col’ parameter to DataFrame.to_spark (#906)
Add more options to read_csv (#916)
Add NamedAgg (#911)
Enable DataFrame setting value as list of labels (#905)
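The groupby aggregation items above (nested renaming and NamedAgg) can be illustrated with the pandas named-aggregation syntax. This sketch uses plain pandas as a stand-in, assuming Koalas follows the same call shape. The frame and the output column names (`total`, `largest`) are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"group": ["a", "a", "b"], "value": [1, 2, 3]})

# Each keyword names an output column; the value says which input
# column to aggregate and how. A plain (column, aggfunc) tuple is the
# short form of pd.NamedAgg.
result = df.groupby("group").agg(
    total=pd.NamedAgg(column="value", aggfunc="sum"),
    largest=("value", "max"),
)

print(result)
```

The output columns take the keyword names, so no separate rename step is needed after aggregating.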
