Version 0.27.0
head ordering
Because Koalas does not guarantee row ordering, head could return rows from any distributed partition, so the result is not deterministic, which might confuse users.
We added a configuration, compute.ordered_head (#1231). If it is set to True, Koalas performs natural ordering beforehand and the result will be the same as pandas'. The default value is False because the ordering causes a performance overhead.
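>>> import databricks.koalas as ks
>>> kdf = ks.DataFrame({'a': range(10)})
>>> pdf = kdf.to_pandas()
>>> pdf.head(3)
   a
0  0
1  1
2  2
>>> # With the default (False), repeated calls may return different rows.
>>> kdf.head(3)
   a
5  5
6  6
7  7
>>> kdf.head(3)
   a
0  0
1  1
2  2
>>> # With ordered_head enabled, the result matches pandas on every call.
>>> ks.options.compute.ordered_head = True
>>> kdf.head(3)
   a
0  0
1  1
2  2
>>> kdf.head(3)
   a
0  0
1  1
2  2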
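To illustrate why an unordered head can vary between calls, and why ordering first costs more, here is a toy model in plain Python. It is only a sketch under simplifying assumptions and does not reflect how Koalas or Spark actually implement head: the helper names make_partitions and head are hypothetical, partitions are plain lists, and "arbitrary arrival order" is simulated with a shuffle.

```python
import random


def make_partitions(values, num_partitions):
    """Split values into contiguous chunks, tagging each row with its natural position."""
    rows = list(enumerate(values))  # (natural_order_id, value)
    size = -(-len(rows) // num_partitions)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def head(partitions, n, ordered=False, rng=None):
    """Return the values of the first n rows (toy model, hypothetical helper).

    ordered=False: read partitions in whatever order they arrive (cheap, arbitrary).
    ordered=True:  sort every row by natural position first (deterministic, costlier).
    """
    if rng is None:
        rng = random.Random()
    arrival = list(partitions)
    rng.shuffle(arrival)  # models partitions finishing in an arbitrary order
    rows = [row for part in arrival for row in part]
    if ordered:
        rows.sort(key=lambda row: row[0])  # touches all rows, hence the overhead
    return [value for _, value in rows[:n]]


partitions = make_partitions(range(10), num_partitions=2)
print(head(partitions, 3))                # [0, 1, 2] or [5, 6, 7], depending on arrival order
print(head(partitions, 3, ordered=True))  # always [0, 1, 2]
```

In this model the ordered variant has to sort every row before taking the first three, which mirrors the reason given above for leaving compute.ordered_head set to False by default.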
Other improvements
Fix identical and equals for the comparison between the same object. (#1220)
Select the series correctly in SeriesGroupBy APIs (#1224)
Fix DataFrame/Series.clip function to preserve its index. (#1232)
Throw a better exception in DataFrame.sort_values when multi-index column is used (#1238)
Fix fillna not to change index values. (#1241)
Fix DataFrame.__setitem__ with tuple-named Series. (#1245)
Fix corr to support multi-index columns. (#1246)
Fix the output of print() for Series to match pandas. (#1250)
Fix fillna to support partial column index for multi-index columns. (#1244)
Add as_index check logic to groupby parameter (#1253)
Raise NotImplementedError for elements that are not actually implemented. (#1256)
Fix where to support multi-index columns. (#1249)