Version 1.5.0
Index operations support
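We improved Index operations support (#1944, #1955). Chained arithmetic on an Index now works, and arithmetic between two different Index objects is supported when “compute.ops_on_diff_frames” is enabled.
Here are some examples: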
Before
>>> kidx = ks.Index([1, 2, 3, 4, 5])
>>> kidx + kidx
Int64Index([2, 4, 6, 8, 10], dtype='int64')
>>> kidx + kidx + kidx
Traceback (most recent call last):
  ...
AssertionError: args should be single DataFrame or single/multiple Series

>>> ks.Index([1, 2, 3, 4, 5]) + ks.Index([6, 7, 8, 9, 10])
Traceback (most recent call last):
  ...
AssertionError: args should be single DataFrame or single/multiple Series
After
>>> kidx = ks.Index([1, 2, 3, 4, 5])
>>> kidx + kidx + kidx
Int64Index([3, 6, 9, 12, 15], dtype='int64')

>>> ks.options.compute.ops_on_diff_frames = True
>>> ks.Index([1, 2, 3, 4, 5]) + ks.Index([6, 7, 8, 9, 10])
Int64Index([7, 9, 13, 11, 15], dtype='int64')
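For comparison, the same arithmetic in plain pandas can be sketched as below. This is a pandas stand-in, not Koalas itself (which needs a running Spark session), and it assumes Koalas aims to return the same values as pandas for these operations. The names pidx, chained, and cross are illustrative only.

```python
import pandas as pd

# pandas reference behaviour (assumption: Koalas targets the same values).
pidx = pd.Index([1, 2, 3, 4, 5])

# Chained arithmetic on a single Index.
chained = pidx + pidx + pidx

# Arithmetic between two separately constructed Index objects.
# In Koalas this case requires compute.ops_on_diff_frames to be enabled.
cross = pd.Index([1, 2, 3, 4, 5]) + pd.Index([6, 7, 8, 9, 10])

print(chained.tolist())  # [3, 6, 9, 12, 15]
print(cross.tolist())    # [7, 9, 11, 13, 15]
```

The Koalas output shown above holds the same values for the cross-Index case, but the order printed there differs from the elementwise order pandas returns, so it seems safest not to rely on row order when combining different Koalas objects. In Koalas the option can also be enabled for a limited scope with ks.option_context instead of being set globally.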
Other new features and improvements
We added the following new features:
DataFrame:
Series:
Index:
to_list (#1948)
MultiIndex:
to_list (#1948)
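As a rough illustration of what to_list returns, here is the pandas equivalent. It is a sketch that assumes the Koalas methods added in #1948 follow the pandas semantics; the sample values are made up for the example.

```python
import pandas as pd

# pandas stand-in; assumption: ks.Index.to_list and ks.MultiIndex.to_list
# return the same kind of Python list.
idx = pd.Index([1, 2, 3])
midx = pd.MultiIndex.from_tuples([("a", 1), ("b", 2)])

flat = idx.to_list()     # plain list of scalars
pairs = midx.to_list()   # list of tuples, one tuple per row

print(flat)   # [1, 2, 3]
print(pairs)  # [('a', 1), ('b', 2)]
```

Because a Python list lives in local memory, calling to_list on a Koalas index presumably collects all values to the driver, so it is best suited to results that are expected to be small.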
Other improvements and bug fixes
Support DataFrame parameter in Series.dot (#1931)
Add a best practice for checkpointing. (#1930)
Remove implicit switch-ons of “compute.ops_on_diff_frames” (#1953)
Fix Series._to_internal_pandas and introduce Index._to_internal_pandas. (#1952)
Fix first/last_valid_index to support empty column DataFrame. (#1923)
Use pandas’ transpose when the data is expected to be small. (#1932)
Fix tail to use the resolved copy (#1942)
Avoid unneeded reset_index in DataFrameGroupBy.describe. (#1951)
Raise TypeError when Index.name / Series.name is not a hashable type (#1883)
Adjust data column names before attaching default index. (#1947)
Add plotly into the optional dependency in Koalas (#1939)
Add plotly backend test cases (#1938)
Don’t pass stacked in plotly area chart (#1934)
Set upperbound of matplotlib to avoid failure on Ubuntu (#1959)
Fix GroupBy.describe for multi-index columns. (#1922)
Upgrade pandas version in CI (#1961)
Compare Series from the same anchor (#1956)
Add videos from Data+AI Summit 2020 EUROPE. (#1963)
Set PYARROW_IGNORE_TIMEZONE for binder. (#1965)
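To illustrate the Series.dot item in the list above (#1931), the following sketch shows what passing a DataFrame to Series.dot computes in pandas. It assumes the Koalas implementation mirrors this behaviour; the data and the names s, df, and result are invented for the example.

```python
import pandas as pd

# pandas stand-in; assumption: Koalas Series.dot(DataFrame) matches this.
s = pd.Series([1, 2, 3])
df = pd.DataFrame({"x": [1, 0, 1], "y": [0, 1, 2]})

# Each output entry is the dot product of s with one column of df,
# with rows matched on the shared index.
result = s.dot(df)

print(result.to_dict())  # {'x': 4, 'y': 8}
```

The result is a Series indexed by the DataFrame's column labels: x is 1*1 + 2*0 + 3*1 = 4 and y is 1*0 + 2*1 + 3*2 = 8.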