Version 0.30.0

Slice column selection support in loc

We continued to improve the loc indexer and added support for slice column selection (#1351).

>>> from databricks import koalas as ks
>>> df = ks.DataFrame({'a': list('abcdefghij'), 'b': list('abcdefghij'), 'c': range(10)})
>>> df.loc[:, "b":"c"]
   b  c
0  a  0
1  b  1
2  c  2
3  d  3
4  e  4
5  f  5
6  g  6
7  h  7
8  i  8
9  j  9
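
Label slices in loc follow the pandas convention of including both endpoints, which is why both b and c appear above. A minimal pure-Python sketch of that rule follows. It is not the Koalas implementation, and select_label_slice is a hypothetical helper used only for illustration.

```python
def select_label_slice(columns, start, stop):
    """Return the labels from start through stop, both endpoints included."""
    first = columns.index(start)
    last = columns.index(stop)
    # Unlike a positional slice, the stop label itself is kept.
    return columns[first:last + 1]


columns = ["a", "b", "c"]
selected = select_label_slice(columns, "b", "c")
print(selected)
```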

Slice row selection support in loc for multi-index

We also added support for slices as row selection in the loc indexer for multi-index (#1344).

>>> from databricks import koalas as ks
>>> import pandas as pd
>>> df = ks.DataFrame({'a': range(3)}, index=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c"), ("b", "d")]))
>>> df.loc[("a", "c"): "b"]
     a
a c  1
b d  2

Iterable indexes support in iloc

We continued to improve the iloc indexer to support iterable indexes as row selection (#1338).

>>> from databricks import koalas as ks
>>> df = ks.DataFrame({'a': list('abcdefghij'), 'b': list('abcdefghij')})
>>> df.iloc[[-1, 1, 2, 3]]
   a  b
1  b  b
2  c  c
3  d  d
9  j  j
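
In the output above, the negative position -1 resolves to the last row, and the rows come back in index order rather than in the order requested. The pure-Python sketch below reproduces that result. It assumes every position is within bounds, and resolve_positions is a hypothetical helper that says nothing about how Koalas does this internally.

```python
def resolve_positions(positions, length):
    """Map negative positions to non-negative ones and return them sorted."""
    resolved = [pos + length if pos < 0 else pos for pos in positions]
    # Sorting mirrors the index-ordered output shown in the example above.
    return sorted(resolved)


rows = resolve_positions([-1, 1, 2, 3], length=10)
print(rows)
```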

Support for setting values via loc and iloc on Series

We added basic support for setting values via loc and iloc on Series (#1367).

>>> from databricks import koalas as ks
>>> kser = ks.Series([1, 2, 3], index=["cobra", "viper", "sidewinder"])
>>> kser.loc[kser % 2 == 1] = -kser
>>> kser
cobra        -1
viper         2
sidewinder   -3
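
The assignment above replaces only the entries where the mask is true and takes each replacement from the entry with the same label on the right-hand side. A pure-Python sketch of that behaviour, with dicts standing in for Series, follows. set_where is a hypothetical helper for illustration only.

```python
def set_where(values, mask, new_values):
    """Return a copy of values, replaced by new_values wherever mask is true."""
    return {
        label: new_values[label] if mask[label] else value
        for label, value in values.items()
    }


values = {"cobra": 1, "viper": 2, "sidewinder": 3}
mask = {label: value % 2 == 1 for label, value in values.items()}
negated = {label: -value for label, value in values.items()}
updated = set_where(values, mask, negated)
print(updated)
```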

Other new features and improvements

We added the following new features:

DataFrame:

Series:

Index:

MultiIndex:

Other improvements

  • Compute Index.is_monotonic/Index.is_monotonic_decreasing in a distributed manner (#1354)

  • Fix SeriesGroupBy.apply() to respect various output (#1339)

  • Add the support for operations between different DataFrames in groupby() (#1321)

  • Explicitly do not support disabling numeric_only in the stats APIs of DataFrame (#1343)

  • Fix index operator against Series and Frame to use iloc conditionally (#1336)

  • Make nunique in DataFrame return a Koalas DataFrame instead of a pandas one (#1347)

  • Fix MultiIndex.drop() to follow renaming et al. (#1356)

  • Add column axis in ks.concat (#1349)

  • Fix iloc for Series when the series is modified (#1368)

  • Support MultiIndex for duplicated, drop_duplicates (#1363)