This self-paced guide is the “Hello World” tutorial for Apache Spark using Databricks. In the following tutorial modules, you will learn the basics of creating Spark jobs, loading data, and working with data. You’ll also get an introduction to running machine learning algorithms and working with streaming data. Databricks lets you start writing Spark queries instantly so you can focus on your data problems.
In the sidebar and on this page you can see five tutorial modules, each representing a stage in the process of getting started with Apache Spark on Databricks. Each of these modules refers to standalone usage scenarios with ready-to-run notebooks and preloaded datasets; you can jump ahead if you feel comfortable with the basics.
- Get started with Apache Spark
- DataFrames tutorial
- Datasets tutorial
- Machine learning with MLlib tutorial
- Structured Streaming tutorial
- What’s next