Visual Analysis of High-Dimensional Event Sequence Data via Dynamic Hierarchical Aggregation
Temporal event data are collected across a broad range of domains, and a variety of visual analytics techniques have been developed to empower analysts working with this form of data. These techniques generally display aggregate statistics computed over sets of event sequences that share common patterns. Such techniques are often hindered, however, by the high-dimensionality of many real-world event sequence datasets because the large number of distinct event types within such data prevents effective aggregation. A common coping strategy for this challenge is to group event types together as a pre-process, prior to visualization, so that each group can be represented within an analysis as a single event type. However, computing these event groupings as a pre-process also places significant constraints on the analysis. This paper presents a dynamic hierarchical aggregation technique that leverages a predefined hierarchy of dimensions to computationally quantify the informativeness of alternative levels of grouping within the hierarchy at runtime. This allows users to dynamically explore the hierarchy to select the most appropriate level of grouping to use at any individual step within an analysis. Key contributions include an algorithm for interactively determining the most informative set of event groupings from within a large-scale hierarchy of event types, and a scatter-plus-focus visualization that supports interactive hierarchical exploration. While these contributions are generalizable to other types of problems, we apply them to high-dimensional event sequence analysis using large-scale event type hierarchies from the medical domain. We describe their use within a medical cohort analysis tool called Cadence, demonstrate an example in which the proposed technique supports better views of event sequence data, and report findings from domain expert interviews.
READ FULL TEXT