On Kubernetes

As of the Apache Spark 3.1 release in March 2021, Spark on Kubernetes is generally available. This is great in many ways, including but not limited to:

  • you have full control of your infrastructure
  • you can start a Spark cluster in less than 10 seconds
  • you can store logs in a central location and view them later via the Spark History Server
  • you can use MinIO as a local storage backend (better throughput than calling S3 over a home or work internet connection)
  • it is cheaper than all managed solutions, even serverless variants (more on this later)
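
To make the logging and MinIO points a bit more concrete, here is a minimal sketch of the Spark properties such a setup might use. The property keys are standard Spark and Hadoop S3A settings, but every value (the API server URL, image name, MinIO endpoint, and bucket) is a placeholder assumed for illustration, and the helper functions are hypothetical rather than part of Spark. Credentials are left out on purpose.

```python
def build_spark_conf(k8s_api_url, image, minio_endpoint, event_log_dir, executors=2):
    """Return Spark properties for a Kubernetes run with event logs kept in MinIO.

    All argument values are supplied by the caller; nothing here contacts a cluster.
    """
    return {
        # Kubernetes masters are addressed as k8s://<api-server-url>
        "spark.master": f"k8s://{k8s_api_url}",
        "spark.kubernetes.container.image": image,
        "spark.executor.instances": str(executors),
        # Event logs written here can be read later by the Spark History Server
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": event_log_dir,
        # Point the S3A connector at MinIO instead of AWS S3
        "spark.hadoop.fs.s3a.endpoint": minio_endpoint,
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


def to_submit_args(conf):
    """Flatten a properties dict into spark-submit style `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only
    conf = build_spark_conf(
        k8s_api_url="https://kubernetes.default.svc",
        image="example/spark:3.1.1",
        minio_endpoint="http://minio.example.local:9000",
        event_log_dir="s3a://spark-logs/events",
    )
    print(" ".join(to_submit_args(conf)))
```

The printed arguments could be appended to a `spark-submit` command. Pointing the History Server's `spark.history.fs.logDirectory` at the same bucket would let it read those logs, assuming it is given the same S3A endpoint settings.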

Read More