Koalas now supports pandas 1.0 (#1197, #1299).
We implemented the DataFrame.map_in_pandas API (#1276), which lets you apply an arbitrary function that takes and returns a pandas DataFrame to a Koalas DataFrame. See the example below:
>>> import databricks.koalas as ks
>>> df = ks.DataFrame({'A': range(2000), 'B': range(2000)})
>>> def query_func(pdf):
...     num = 1995
...     return pdf.query('A > @num')
...
>>> df.map_in_pandas(query_func)
         A     B
1996  1996  1996
1997  1997  1997
1998  1998  1998
1999  1999  1999
As a development-only change, we added Black integration (#1301). Code style is now standardized automatically by running ./dev/reformat, and the style is checked as part of ./dev/lint-python.
We added the following new features:
DataFrame:
query (#1273)
unstack (#1295)
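As a rough illustration of what the two new DataFrame methods do, here is a minimal sketch written against plain pandas, whose API Koalas mirrors. The data and variable names are hypothetical, and it is an assumption (not confirmed by these notes) that a Koalas DataFrame accepts the same calls and returns equivalent results. The query expression is kept to a simple column comparison because the expression syntax Koalas supports may differ from pandas.

```python
import pandas as pd

# pandas stand-in for illustration only; with Koalas installed, the same
# calls are assumed to work on a ks.DataFrame built from the same data.
pdf = pd.DataFrame(
    {"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]},
    index=pd.MultiIndex.from_product(
        [["x", "y"], ["p", "q"]], names=["outer", "inner"]
    ),
)

# query: keep only the rows for which the boolean expression is true.
filtered = pdf.query("A > 2")

# unstack: pivot the innermost index level ("inner") into the columns,
# producing one row per "outer" label and a two-level column index.
wide = pdf.unstack()

print(filtered)
print(wide)
```

Here `filtered` keeps the two rows where A is 3 or 4, and `wide` has one row per outer label with columns such as ("A", "p") and ("B", "q").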
Fix DataFrame.describe() to support multi-index columns. (#1279)
Add util function validate_bool_kwarg (#1281)
Rename data columns prior to filter to make sure the column names are as expected. (#1283)
Add an FAQ entry about Structured Streaming. (#1298)
Let extra options have higher priority to allow workarounds (#1296)
Implement the ‘keep’ parameter for drop_duplicates (#1303)
Add a note when type hint is provided to DataFrame.apply (#1310)
Add a util method to verify temporary column names. (#1262)
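To make the ‘keep’ parameter for drop_duplicates mentioned above more concrete, the following sketch shows the three values as pandas defines them. It uses plain pandas with hypothetical data; it is an assumption that the Koalas implementation follows the same semantics, and row order in a distributed Koalas DataFrame may not match the pandas output.

```python
import pandas as pd

# pandas stand-in for illustration; Koalas is assumed to mirror these semantics.
pdf = pd.DataFrame({"a": [1, 1, 2, 2, 3], "b": ["x", "x", "y", "y", "z"]})

# keep="first": retain the first row of each group of duplicates.
first = pdf.drop_duplicates(keep="first")

# keep="last": retain the last row of each group of duplicates.
last = pdf.drop_duplicates(keep="last")

# keep=False: drop every row that has a duplicate anywhere in the frame.
unique_only = pdf.drop_duplicates(keep=False)

print(first.index.tolist())
print(last.index.tolist())
print(unique_only.index.tolist())
```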