You are tasked with creating a custom deep neural network in Keras to forecast customer purchases based on their purchase history. To assess the performance across various model architectures, while storing training data and comparing evaluation metrics on a unified dashboard, what approach should you adopt?