Did you know that the Dask scheduler has a --jupyter flag that will start a Jupyter server running within the Dask Dashboard?
When the scheduler is started with --jupyter the dashboard will have a link to Jupyter in the top-right. You can also manually navigate to http://[scheduler ip]:[dashboard port]/jupyter/lab.
Dask Kubernetes
When launching Dask clusters on Kubernetes with dask-kubernetes you can also set this flag in your cluster config to run Jupyter in your KubeCluster.
The jupyterlab package needs to be installed in your container image. If you’re using the default Dask images, you can install it at runtime by setting the EXTRA_PIP_PACKAGES environment variable to jupyterlab.
from dask_kubernetes.operator import KubeCluster, make_cluster_spec
# Create a cluster spec
spec = make_cluster_spec(
    name="jupyter-example",
    n_workers=2,
    env={"EXTRA_PIP_PACKAGES": "jupyterlab"},
)
# Append the --jupyter flag to the scheduler command args
spec["spec"]["scheduler"]["spec"]["containers"][0]["args"].append("--jupyter")
# Create your cluster
cluster = KubeCluster(custom_cluster_spec=spec)
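If you re-run the cell that appends the flag, --jupyter ends up in the args list twice. A minimal sketch of one way to guard against that is below. Note that add_scheduler_arg is a hypothetical helper, not part of dask-kubernetes, and example_spec is a stand-in dictionary that only mimics the nesting used above rather than the full output of make_cluster_spec.

```python
def add_scheduler_arg(spec, arg):
    """Append arg to the first scheduler container's args if it is missing.

    Hypothetical helper (not part of dask-kubernetes). It assumes the spec
    uses the nesting shown above:
    spec["spec"]["scheduler"]["spec"]["containers"][0]["args"].
    """
    args = spec["spec"]["scheduler"]["spec"]["containers"][0]["args"]
    if arg not in args:
        args.append(arg)
    return spec


# Stand-in dictionary with the same nesting; the args here are illustrative only.
example_spec = {
    "spec": {
        "scheduler": {
            "spec": {"containers": [{"args": ["dask-scheduler"]}]}
        }
    }
}

add_scheduler_arg(example_spec, "--jupyter")
add_scheduler_arg(example_spec, "--jupyter")  # second call is a no-op
print(example_spec["spec"]["scheduler"]["spec"]["containers"][0]["args"])
# → ['dask-scheduler', '--jupyter']
```

With a real spec you would call add_scheduler_arg(spec, "--jupyter") in place of the direct append before passing the spec to KubeCluster.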
Now that your cluster is up and running, let’s find out where Jupyter is running.
>>> print(cluster.dashboard_link)
http://localhost:56952/status
OK, we can see that the Dask dashboard is being port forwarded to port 56952 on localhost, so we can access Jupyter at http://localhost:56952/jupyter/lab.
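The forwarded port changes from run to run, so it can be handy to derive the Jupyter address from the dashboard link rather than typing it by hand. The sketch below uses only the Python standard library. jupyter_url is a hypothetical helper, and it assumes Jupyter is served at /jupyter/lab on the same host and port as the dashboard, as in the example above.

```python
from urllib.parse import urlsplit, urlunsplit


def jupyter_url(dashboard_link):
    """Return the JupyterLab URL that shares a host and port with the dashboard.

    Hypothetical helper: it assumes Jupyter is served at /jupyter/lab on the
    dashboard's host and port, and it discards the original path and query.
    """
    parts = urlsplit(dashboard_link)
    return urlunsplit((parts.scheme, parts.netloc, "/jupyter/lab", "", ""))


print(jupyter_url("http://localhost:56952/status"))
# → http://localhost:56952/jupyter/lab
```

In a live session you would pass cluster.dashboard_link instead of the literal string. If your dashboard link goes through a proxy with a path prefix, this simple replacement will likely need adjusting.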
The Jupyter environment is also pre-configured to connect to the Dask cluster, so in your notebooks all you need to do is create a dask.distributed.Client.
from dask.distributed import Client
client = Client()