User Guide

  • Options and settings
    • Getting and setting options
    • Operations on different DataFrames
    • Default Index type
    • Available options
  • Working with pandas and PySpark
    • pandas
    • PySpark
  • Transform and apply a function
    • transform and apply
    • koalas.transform_batch and koalas.apply_batch
  • Type Support In Koalas
    • Type casting between PySpark and Koalas
    • Type casting between pandas and Koalas
    • Internal type mapping
  • Type Hints In Koalas
    • Koalas DataFrame and Pandas DataFrame
    • Type Hinting with Names
  • Best Practices
    • Leverage PySpark APIs
    • Check execution plans
    • Use checkpoint
    • Avoid shuffling
    • Avoid computation on single partition
    • Avoid reserved column names
    • Do not use duplicated column names
    • Specify the index column in conversion from Spark DataFrame to Koalas DataFrame
    • Use distributed or distributed-sequence default index
    • Reduce the operations on different DataFrame/Series
    • Use Koalas APIs directly whenever possible
  • FAQ
    • What’s the project’s status?
    • Is it Koalas or koalas?
    • Should I use PySpark’s DataFrame API or Koalas?
    • Does Koalas support Structured Streaming?
    • How can I request support for a method?
    • How is Koalas different from Dask?
    • How can I contribute to Koalas?
    • Why a new project (instead of putting this in Apache Spark itself)?

© Copyright 2020, Databricks.