loc and iloc indexers
We improved the loc and iloc indexers. loc now supports scalar values as indexers (#1172).
>>> import databricks.koalas as ks
>>>
>>> df = ks.DataFrame([[1, 2], [4, 5], [7, 8]],
...                   index=['cobra', 'viper', 'sidewinder'],
...                   columns=['max_speed', 'shield'])
>>> df.loc['sidewinder']
max_speed    7
shield       8
Name: sidewinder, dtype: int64
>>> df.loc['sidewinder', 'max_speed']
7
In addition, a Series derived from a different DataFrame can be used as an indexer when the compute.ops_on_diff_frames option is enabled (#1155).
>>> import databricks.koalas as ks
>>>
>>> ks.options.compute.ops_on_diff_frames = True
>>>
>>> df1 = ks.DataFrame({'A': [0, 1, 2, 3, 4], 'B': [100, 200, 300, 400, 500]},
...                    index=[20, 10, 30, 0, 50])
>>> df2 = ks.DataFrame({'A': [0, -1, -2, -3, -4], 'B': [-100, -200, -300, -400, -500]},
...                    index=[20, 10, 30, 0, 50])
>>> df1.A.loc[df2.A > -3].sort_index()
10    1
20    0
30    2
Lastly, when given a slice, loc now follows the natural order of the index, matching pandas' behavior (#1159, #1174, #1179). See the example below.
>>> df = ks.DataFrame([[1, 2], [4, 5], [7, 8]],
...                   index=['cobra', 'viper', 'sidewinder'],
...                   columns=['max_speed', 'shield'])
>>> df.loc['cobra':'viper', 'max_speed']
cobra    1
viper    4
Name: max_speed, dtype: int64
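For comparison, the pandas behavior that loc slicing is meant to match can be sketched with an unsorted index. This is plain pandas, not Koalas, and it reuses the index values from the earlier example purely for illustration: a label slice returns the rows between the two labels in the order they appear in the index, with both endpoints included.

```python
import pandas as pd

# A pandas DataFrame with a unique but unsorted index.
pdf = pd.DataFrame({'A': [0, 1, 2, 3, 4], 'B': [100, 200, 300, 400, 500]},
                   index=[20, 10, 30, 0, 50])

# Label slice: rows from label 10 through label 0, in the index's own order.
# Both endpoints are included, unlike positional slicing.
sliced = pdf.loc[10:0, 'B']

print(list(sliced.index))
print(list(sliced))
```

The result follows the stored order of the labels (10, 30, 0) rather than their sorted order, which is the "natural order" referred to above.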
We added the following new features:
koalas.Series:
get (#1153)
koalas.Index:
drop (#1117)
len (#1161)
set_names (#1134)
argmin (#1162)
argmax (#1162)
koalas.MultiIndex:
from_product (#1144)
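The methods listed above share their names with pandas methods. As a rough sketch of the behavior they are modeled on, the following uses plain pandas rather than Koalas; it assumes, without confirmation from these notes, that the Koalas versions accept the same basic arguments. The sample data is made up for illustration.

```python
import pandas as pd

# Series.get: look up a label, falling back to a default when it is absent.
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
present = s.get('b')
missing = s.get('z', default=-1)

# Index.drop, len, set_names, argmin, argmax.
idx = pd.Index([10, 9, 8, 7, 100], name='old')
dropped = idx.drop([9])           # new Index without the label 9
size = len(idx)                   # number of entries
renamed = idx.set_names('new')    # returns a renamed copy; idx is unchanged
pos_min = idx.argmin()            # position of the smallest value
pos_max = idx.argmax()            # position of the largest value

# MultiIndex.from_product: Cartesian product of the given iterables.
midx = pd.MultiIndex.from_product([[0, 1], ['x', 'y']],
                                  names=['number', 'letter'])

print(present, missing, list(dropped), size, renamed.name, pos_min, pos_max)
print(list(midx))
```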
Add support from_pandas for Index/MultiIndex. (#1170)
Add a hidden column __natural_order__. (#1146)
Introduce _LocIndexerLike and consolidate some logic. (#1149)
Refactor LocIndexerLike.__getitem__. (#1152)
Remove sort in GroupBy._reduce_for_stat_function. (#1147)
Randomize index in tests and fix some window-like functions. (#1151)
Explicitly don’t support Index.duplicated (#1131)
Fix DataFrame._repr_html_(). (#1177)