Marc Genton

Department of Statistics, Texas A&M University


Functional Boxplots for Visualization of Complex Curve/Image Data: An Application to Precipitation and Climate Model Output

In many statistical experiments, the observations are functions by nature, such as temporal curves or spatial surfaces/images, where the basic unit of information is the entire observed function rather than a string of numbers. For example the temporal evolution of several cells, the intensity of medical images of the brain from MRI, the spatio-temporal records of precipitation in the U.S., or the output from climate models, are such complex data structures. Our interest lies in the visualization of such data and the detection of outliers. With this goal in mind, we have defined functional boxplots and surface boxplots. Based on the center outwards ordering induced by band depth for functional data or surface data, the descriptive statistics of such boxplots are: the envelope of the 50% central region, the median curve/image and the maximum non-outlying envelope. In addition, outliers can be detected in a functional/surface boxplot by the 1.5 times the 50% central region empirical rule, analogous to the rule for classical boxplots. We illustrate the construction of a functional boxplot on a series of sea surface temperatures related to the El Nino phenomenon and its outlier detection performance is explored by simulations. As applications, the functional boxplot is demonstrated on spatio-temporal U.S. precipitation data for nine climatic regions and on climate general circulation model (GCM) output. Further adjustments of the functional boxplot for outlier detection in spatio-temporal data are discussed as well. The talk is based on joint work with Ying Sun.