Question Answering (QA) systems can support health coaches in facilitating clients' lifestyle behavior changes (e.g., in adopting healthy sleep habits). In this paper, we formulate a domain-specific QA task for sleep coaching. To this end, we release SleepQA, a dataset created from 7,005 passages comprising 4,250 training examples with single annotations and 750 examples with 5-way annotations. We train a bi-encoder retrieval system on our dataset and perform extensive automated and human evaluations of the resulting end-to-end QA system. Comparisons of our model with various baselines show improvements for domain-specific natural language processing on real-world questions. We hope that this dataset will lead to wider research interest in this important health domain.
SleepQA: A Health Coaching Dataset on Sleep for Extractive Question Answering
Iva Bojic, Qi Ong, Megh Thakkar, Esha Kamran, Irving Shua, Rei Pang, Jessica Chen, Vaaruni Nayak, Shafiq Joty, and Josip Car. In Machine Learning for Health (Proceedings Track) (ML4H@NeurIPS'22), 2022.