databricks.koalas.DataFrame.kde

DataFrame.kde(bw_method=None, ind=None, **kwds)

Generate Kernel Density Estimate plot using Gaussian kernels.

Parameters
bw_method : scalar

The method used to calculate the estimator bandwidth. See KernelDensity in PySpark for more information.

ind : NumPy array or integer, optional

Evaluation points for the estimated PDF. If None (default), 1000 equally spaced points are used. If ind is a NumPy array, the KDE is evaluated at the points passed. If ind is an integer, that many equally spaced points are used.

**kwargs : optional

Keyword arguments to pass on to Koalas.DataFrame.plot().
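
The three cases described for ind can be sketched in plain Python. This is only an illustration, not the Koalas implementation: resolve_ind is a hypothetical helper, and the range endpoints lo and hi are passed in explicitly because this page does not say how the range of the equally spaced points is chosen.

```python
def resolve_ind(ind, lo, hi, default_count=1000):
    """Return the evaluation points implied by `ind` (hypothetical helper).

    `lo` and `hi` are assumed range endpoints supplied by the caller.
    """
    if ind is None:
        # Default case: 1000 equally spaced points.
        ind = default_count
    if isinstance(ind, int):
        if ind < 2:
            raise ValueError("need at least two points to span a range")
        step = (hi - lo) / (ind - 1)
        # `ind` equally spaced points from lo to hi, inclusive.
        return [lo + i * step for i in range(ind)]
    # Otherwise treat `ind` as an explicit sequence of evaluation points.
    return [float(p) for p in ind]


default_points = resolve_ind(None, 0.0, 6.0)
five_points = resolve_ind(5, 0.0, 1.0)
explicit_points = resolve_ind([1, 2, 3, 4, 5, 6], 0.0, 6.0)
```

In the explicit case the endpoints are ignored and the density is evaluated only at the points passed.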

Returns
matplotlib.axes.Axes or numpy.ndarray of them

Examples

For DataFrame, it works in the same way as Series:

>>> import databricks.koalas as ks
>>> df = ks.DataFrame({
...     'x': [1, 2, 2.5, 3, 3.5, 4, 5],
...     'y': [4, 4, 4.5, 5, 5.5, 6, 6],
... })
>>> ax = df.plot.kde(bw_method=0.3)
>>> ax = df.plot.kde(bw_method=3)
>>> ax = df.plot.kde(ind=[1, 2, 3, 4, 5, 6], bw_method=0.3)
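
To see why bw_method=0.3 and bw_method=3 give such different curves, a Gaussian kernel density estimate can be written out by hand. The sketch below is a plain-Python illustration, not the code Koalas runs: gaussian_kde is a hypothetical function, and it assumes the bandwidth acts as the standard deviation of each Gaussian kernel, which is how PySpark's KernelDensity is understood to treat it.

```python
import math


def gaussian_kde(sample, points, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point.

    Assumption: `bandwidth` is the standard deviation of every kernel.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(sample)
    # Normalising constant so the estimate integrates to one.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    densities = []
    for x in points:
        # Sum one Gaussian bump per observation, centred on that observation.
        total = sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )
        densities.append(norm * total)
    return densities


# Column 'x' from the example above, evaluated at a single point.
x = [1, 2, 2.5, 3, 3.5, 4, 5]
narrow = gaussian_kde(x, [3.0], bandwidth=0.3)[0]
wide = gaussian_kde(x, [3.0], bandwidth=3.0)[0]
```

With a small bandwidth each observation contributes a sharp bump, so the estimate follows the data closely. With a large bandwidth the bumps overlap and the curve is smoother and flatter.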