UCSC-SOE-21-06: Comparing Emulation Methods for a High-resolution Storm Surge Model

Grant Hutchings, Bruno Sanso, James Gattiker, Devin Francom, Donatella Pasqualini
06/01/2021 04:16 PM
Statistics
The availability of powerful computing resources has led scientists to increasingly utilize simulation as a research tool. The statistical analysis of simulations, referred to as computer experiments, has similarly grown. Gaussian Process (GP) models have proven exceptionally useful in this domain and have become a standard methodology for emulating simulator response. However, with moderately large training data, GPs require careful implementation to scale appropriately. A number of reasonable emulation methods are available in ready-to-use software packages. In this paper we compare four such models, BASS, BART, SEPIA, and RobustGaSP, by applying them to high-resolution hurricane inundation (flooding) data obtained from the Sea, Lake, and Overland Surges from Hurricanes (SLOSH) simulator. Both SEPIA and RobustGaSP are based on Gaussian Process modeling, while BASS implements a model based on adaptive splines, and BART is based on sums of regression trees. We describe the modeling strategies implemented in these four packages, which run in R and Python, and then compare them in terms of computation time and a variety of predictive metrics. The four models included in this comparison study were chosen for their proven and distinct methodologies, their availability through easily accessible software, and their ability to quantify prediction uncertainty in the context of our application. The data in our case study form a large spatial grid with millions of response values. We find that SEPIA and RobustGaSP provide exceptional predictive power, but do not scale to computer experiments as large as the one considered in this paper as effectively as BASS and BART.
