Graph analysis tutorial with GraphFrames

This tutorial notebook shows you how to use GraphFrames to perform graph analysis. Databricks recommends using a cluster running Databricks Runtime for Machine Learning, as it includes an optimized installation of GraphFrames.

To run the notebook:

  1. If you are not using a cluster running Databricks Runtime ML, use one of these methods to install the GraphFrames library.

  2. Download the SF Bay Area Bike Share data from Kaggle and unzip it. To download, you must sign in to Kaggle, either through third-party authentication or with a Kaggle account that you create.

  3. Upload station.csv and trip.csv using the add data UI.

    The resulting tables are named station_csv and trip_csv.
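
Once both tables exist, they can be turned into a graph along the following lines. This is a minimal sketch, not the notebook's own code. It assumes that station_csv already has an id column (GraphFrames requires one on the vertices) and that trip_csv has start_station_id and end_station_id columns, as in the Kaggle files. The helper names to_edges and build_bike_graph are hypothetical and introduced here only for illustration.

```python
def to_edges(trips):
    """Rename the trip endpoints to the src/dst columns that GraphFrames expects.

    Assumption: the trips table has start_station_id and end_station_id columns.
    """
    return (
        trips
        .withColumnRenamed("start_station_id", "src")
        .withColumnRenamed("end_station_id", "dst")
    )


def build_bike_graph(spark):
    """Build a GraphFrame from the uploaded station_csv and trip_csv tables."""
    # Imported lazily so that to_edges stays usable without GraphFrames installed.
    from graphframes import GraphFrame

    stations = spark.table("station_csv")
    trips = spark.table("trip_csv")

    # Assumption: station_csv already provides the required "id" column.
    vertices = stations.distinct()
    edges = to_edges(trips)
    return GraphFrame(vertices, edges)
```

In a Databricks notebook, where a spark session is already defined, calling g = build_bike_graph(spark) and then g.inDegrees.orderBy("inDegree", ascending=False).show(5) would list the stations where the most trips end.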

Graph Analysis with GraphFrames notebook
