Version 1.6.0

Improved Plotly backend support

We improved plotting support by implementing pie, histogram and box plots with the Plotly plot backend. Koalas can now plot data with Plotly via:

DataFrame.plot.pie and Series.plot.pie (#1971)
DataFrame.plot.hist and Series.plot.hist (#1999)
Series.plot.box (#2007)

In addition, we optimized histogram calculation for a DataFrame to run as a single pass (#1997) instead of launching a separate job for each Series in the DataFrame.
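The idea behind the single-pass optimization can be illustrated in plain Python. This is only a sketch under stated assumptions, not the Koalas implementation: the function name `count_bins_single_pass`, the row-as-dict layout, and the precomputed shared bin `edges` are all hypothetical.

```python
from bisect import bisect_right


def count_bins_single_pass(rows, columns, edges):
    """Bucket every column in one scan over `rows`, using shared bin `edges`."""
    n_bins = len(edges) - 1
    counts = {col: [0] * n_bins for col in columns}
    for row in rows:  # one scan over the data, all columns at once
        for col in columns:
            value = row.get(col)
            if value is None or not (edges[0] <= value <= edges[-1]):
                continue  # skip missing or out-of-range values
            # The last bin is closed on the right, as in numpy.histogram.
            idx = min(bisect_right(edges, value) - 1, n_bins - 1)
            counts[col][idx] += 1
    return counts


rows = [
    {"a": 1, "b": 4},
    {"a": 2, "b": 4},
    {"a": 2, "b": 3},
    {"a": 4, "b": None},
]
hist = count_bins_single_pass(rows, ["a", "b"], edges=[1, 2, 3, 4])
print(hist)
```

The per-Series alternative would scan `rows` once for every column; here the number of scans stays at one regardless of how many columns are plotted.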

Operations between Series and Index

Operations between Series and Index are now supported, as shown below (#1996):

>>> kser = ks.Series([1, 2, 3, 4, 5, 6, 7])
>>> kidx = ks.Index([0, 1, 2, 3, 4, 5, 6])

>>> (kser + 1 + 10 * kidx).sort_index()
0     2
1    13
2    24
3    35
4    46
5    57
6    68
dtype: int64
>>> (kidx + 1 + 10 * kser).sort_index()
0    11
1    22
2    33
3    44
4    55
5    66
6    77
dtype: int64

Support setting to a `Series` via attribute access

We added support for setting a column via attribute assignment in DataFrame (#1989).

>>> kdf = ks.DataFrame({'A': [1, 2, 3, None]})
>>> kdf.A = kdf.A.fillna(kdf.A.median())
>>> kdf
     A
0  1.0
1  2.0
2  3.0
3  2.0
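For readers curious how attribute assignment can be routed to a column at all, here is a toy sketch in plain Python. `ToyFrame` is a hypothetical stand-in, not the Koalas implementation; it assumes columns are stored as lists in a private dict and that only existing columns are updated through attribute assignment.

```python
from statistics import median


class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not the Koalas implementation."""

    def __init__(self, data):
        # Bypass our own __setattr__ so `_columns` is stored as a normal attribute.
        object.__setattr__(self, "_columns", dict(data))

    def __getattr__(self, name):
        # Only called when normal lookup fails, so columns act as a fallback.
        columns = object.__getattribute__(self, "_columns")
        if name in columns:
            return columns[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        columns = object.__getattribute__(self, "_columns")
        if name in columns:
            columns[name] = list(value)  # existing column: replace its values
        else:
            object.__setattr__(self, name, value)  # otherwise: plain attribute


frame = ToyFrame({"A": [1, 2, 3, None]})
known = [v for v in frame.A if v is not None]
frame.A = [median(known) if v is None else v for v in frame.A]
frame.note = "not a column"
print(frame.A)
```

Using `object.__setattr__` for internal state avoids recursing into the custom `__setattr__`, which is why the toy class calls it directly in `__init__`.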

Other new features, improvements and bug fixes

We added the following new features:

Series:

factorize (#1972)
sem (#1993)

DataFrame:

insert (#1983)
sem (#1993)
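As a rough illustration of what factorize and sem compute, the following plain-Python sketch follows the pandas conventions (codes assigned in order of first appearance, missing values encoded as -1, and a sample standard deviation for sem). The helper functions here are hypothetical and operate on lists, not on Koalas objects.

```python
from math import sqrt
from statistics import stdev


def factorize(values):
    """Return (codes, uniques); codes follow first appearance, None maps to -1."""
    uniques, lookup, codes = [], {}, []
    for v in values:
        if v is None:
            codes.append(-1)  # sentinel for missing values
            continue
        if v not in lookup:
            lookup[v] = len(uniques)
            uniques.append(v)
        codes.append(lookup[v])
    return codes, uniques


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


codes, uniques = factorize(["b", None, "a", "c", "b"])
error = sem([1, 2, 3, 4])
print(codes, uniques, error)
```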

In addition, we implemented the following new parameters:

Add min_count parameter for Frame.sum. (#1978)
Add ddof parameter for GroupBy.std() and GroupBy.var() (#1994)
Support ddof parameter for std and var. (#1986)
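The meaning of these two parameters can be sketched in plain Python, assuming the pandas semantics: `min_count` is the number of non-missing values required before a sum is returned, and `ddof` is subtracted from the sample size in the variance divisor. The function names below are hypothetical helpers over lists, not Koalas APIs.

```python
from math import isnan, nan, sqrt


def sum_with_min_count(values, min_count=0):
    """Sum non-missing values; return NaN if fewer than `min_count` are present."""
    valid = [v for v in values if v is not None]
    return sum(valid) if len(valid) >= min_count else nan


def variance(values, ddof=1):
    """Variance with divisor n - ddof (ddof=1: sample, ddof=0: population)."""
    valid = [v for v in values if v is not None]
    n = len(valid)
    if n - ddof <= 0:
        return nan  # not enough values for the requested degrees of freedom
    mean = sum(valid) / n
    return sum((v - mean) ** 2 for v in valid) / (n - ddof)


enough = sum_with_min_count([1, None, 3], min_count=2)
too_few = sum_with_min_count([1, None, 3], min_count=3)
sample_var = variance([1, 2, 3, 4], ddof=1)
population_var = variance([1, 2, 3, 4], ddof=0)
population_std = sqrt(population_var)
print(enough, too_few, sample_var, population_var, population_std)
```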

Along with the following fixes:

Fix stat functions with no numeric columns. (#1967)
Fix DataFrame.replace with NaN/None values (#1962)
Fix cumsum and cumprod. (#1982)
Use Python type name instead of Spark’s in error messages. (#1985)
Use object.__setattr__ in Series. (#1991)
Adjust Series.mode to match pandas Series.mode (#1995)
Adjust data when all the values in a column are nulls. (#2004)
Fix as_spark_type to not support “bigint”. (#2011)
