Version 0.28.0

pandas 1.0 support

We added pandas 1.0 support (#1197, #1299), so Koalas can now work with pandas 1.0.

map_in_pandas

We implemented the DataFrame.map_in_pandas API (#1276), which lets Koalas apply an arbitrary function that takes and returns a pandas DataFrame to a Koalas DataFrame. See the example below:

>>> import databricks.koalas as ks
>>> df = ks.DataFrame({'A': range(2000), 'B': range(2000)})
>>> def query_func(pdf):
...     num = 1995
...     return pdf.query('A > @num')
...
>>> df.map_in_pandas(query_func)
         A     B
1996  1996  1996
1997  1997  1997
1998  1998  1998
1999  1999  1999
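
One point worth keeping in mind: as far as we understand it, the function receives one chunk of the data at a time as a pandas DataFrame, not the whole dataset, so it should give the same overall result however the rows are split. The sketch below imitates this with plain pandas only. The helper apply_in_chunks and the two-way split are illustrative assumptions, not Koalas internals.

```python
import pandas as pd


def query_func(pdf):
    num = 1995
    return pdf.query("A > @num")


def apply_in_chunks(pdf, func, n_chunks):
    # Hypothetical stand-in for distributed execution: split the rows into
    # consecutive chunks, apply func to each chunk, and concatenate.
    size = -(-len(pdf) // n_chunks)  # ceiling division
    chunks = [pdf.iloc[i:i + size] for i in range(0, len(pdf), size)]
    return pd.concat([func(chunk) for chunk in chunks])


pdf = pd.DataFrame({"A": range(2000), "B": range(2000)})

# A row-wise filter does not depend on how the rows are chunked.
whole = query_func(pdf)
chunked = apply_in_chunks(pdf, query_func, n_chunks=2)
assert chunked["A"].tolist() == whole["A"].tolist()

# A function that looks at the chunk as a whole does: len(p) is the length
# of the chunk it was given, not of the full dataset.
sizes = apply_in_chunks(pdf, lambda p: pd.DataFrame({"rows": [len(p)]}), n_chunks=2)
```

Row-wise operations such as filters are therefore a natural fit, while functions that rely on global properties (total length, global sort order, and so on) may return per-chunk answers.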

Standardize code style using Black

As a development-only change, we added Black integration (#1301). All code is now formatted automatically by running ./dev/reformat, and the style is checked as part of ./dev/lint-python.

Other new features and improvements

We added the following new feature:

DataFrame:

Other improvements

  • Fix DataFrame.describe() to support multi-index columns. (#1279)

  • Add util function validate_bool_kwarg (#1281)

  • Rename data columns prior to filter to make sure the column names are as expected. (#1283)

  • Add an FAQ entry about Structured Streaming. (#1298)

  • Let extra options have higher priority to allow workarounds (#1296)

  • Implement ‘keep’ parameter for drop_duplicates (#1303)

  • Add a note when type hint is provided to DataFrame.apply (#1310)

  • Add a util method to verify temporary column names. (#1262)
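
To illustrate the ‘keep’ parameter of drop_duplicates (#1303): it is meant to mirror the pandas parameter of the same name. The sketch below uses plain pandas, not Koalas, on the assumption that Koalas accepts the same values ('first', 'last', False). The sample data is made up. Because row order in a distributed DataFrame is not guaranteed in general, which duplicate counts as "first" or "last" in Koalas is worth checking against the Koalas documentation.

```python
import pandas as pd

pdf = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"]})

# Rows 0 and 1 are identical; row 2 is unique.
first = pdf.drop_duplicates(keep="first")  # keeps rows 0 and 2
last = pdf.drop_duplicates(keep="last")    # keeps rows 1 and 2
none = pdf.drop_duplicates(keep=False)     # drops all duplicated rows, keeps row 2
```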