Manage process workflows and provenance across distributed resources
Geoweaver is a web-based platform that enables you to manage deep learning workflows with provenance across distributed resources.
In Geoweaver, hosts are the resources where we run our processes. Here, we are registering a machine that we connect to through SSH.
We can also add Jupyter Servers as host resources, allowing us to interact with and edit our workflow processes.
We can add many types of processes. Here, we add a bash script that downloads imagery prior to processing.
And we can upload or import Jupyter notebooks.
Jupyter Notebook source: Intermediate Earth Data Science Textbook, an online course provided by Earth Lab CU Boulder at earthdatascience.org
Geoweaver stores a detailed history of every process we run. We can view the logs and examine outputs on the server.
Processes can be linked together to form workflows, so that we can run them in parallel or sequentially across separate resources and manage everything in one place.
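To illustrate how linked processes can be ordered, here is a minimal sketch that groups a workflow into stages: processes in the same stage have no unmet dependencies and could run in parallel, while the stages themselves run sequentially. This is not Geoweaver's own scheduler. The process names and the `execution_stages` helper are hypothetical, and the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


def execution_stages(dependencies):
    """Group processes into stages.

    `dependencies` maps each process name to the list of processes it
    depends on. Processes within one returned stage are independent of
    each other, so they could be dispatched in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the workflow has a cycle
    stages = []
    while sorter.is_active():
        # Everything whose dependencies are already done is ready now.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical workflow: names are illustrative, not taken from Geoweaver.
workflow = {
    "download_imagery": [],
    "preprocess": ["download_imagery"],
    "train_model": ["preprocess"],
    "compute_statistics": ["preprocess"],
    "report": ["train_model", "compute_statistics"],
}

stages = execution_stages(workflow)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

In this example, `train_model` and `compute_statistics` land in the same stage because both depend only on `preprocess`, so they could run on separate hosts at the same time.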
Geoweaver intercepts WebSocket traffic between you and your Jupyter server and saves versions of your notebook as you edit, so you can return to previous states.
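The idea of saving notebook versions and returning to an earlier state can be sketched as follows. This is only an illustration under stated assumptions: it does not intercept any traffic, it assumes the notebook content has already been captured as a JSON-compatible dictionary, and the `NotebookHistory` class is hypothetical rather than part of Geoweaver.

```python
import copy
import hashlib
import json


class NotebookHistory:
    """Hypothetical in-memory store of notebook snapshots."""

    def __init__(self):
        self._versions = []  # list of (content digest, snapshot) pairs

    def save(self, notebook):
        """Store a snapshot and return its version number.

        An edit that leaves the content unchanged does not create a
        new version; the latest version number is returned instead.
        """
        payload = json.dumps(notebook, sort_keys=True)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if self._versions and self._versions[-1][0] == digest:
            return len(self._versions) - 1
        # Deep-copy so later edits to `notebook` cannot alter the snapshot.
        self._versions.append((digest, copy.deepcopy(notebook)))
        return len(self._versions) - 1

    def restore(self, version):
        """Return a copy of the notebook as it was at `version`."""
        return copy.deepcopy(self._versions[version][1])


# Usage: save two states of a toy notebook, then go back to the first.
history = NotebookHistory()
notebook = {"cells": [{"cell_type": "code", "source": "print('v1')"}]}
first = history.save(notebook)

notebook["cells"][0]["source"] = "print('v2')"
second = history.save(notebook)

restored = history.restore(first)
print(restored["cells"][0]["source"])
```

Hashing the serialized content is one simple way to skip duplicate snapshots; a production system would also persist the versions rather than keep them in memory.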
We are developing support for Google Earth Engine as a Geoweaver host resource.
Ziheng Sun, Center for Spatial Information Science and Systems, George Mason University
Andrew Magill, Texas Advanced Computing Center, The University of Texas at Austin
Liping Di, Center for Spatial Information Science and Systems, George Mason University
Annie Burgess, Earth Science Information Partners (ESIP)
Jason A. Tullis, Department of Geosciences and Center for Advanced Spatial Technologies, University of Arkansas
Contact