|Title||Data Quality in Mobile Sensing Datasets for Pervasive Healthcare|
|Publication Type||Book Chapter|
|Year of Publication||2017|
|Authors||Hernández, N, Castro, LA, Favela, J, Michán, L, Arnrich, B|
|Editor||Khan, SU, Zomaya, AY, Abbas, A|
|Book Title||Handbook of Large-Scale Distributed Computing in Smart Healthcare|
|Publisher||Springer International Publishing|
Mobile sensing is becoming a popular approach for inferring patterns of activity and behavior to determine how they affect health and wellbeing. This data-driven approach has the potential to become a major tool in epidemiology, the field that aims to determine the causes of disease in populations, as well as a means of motivating behavior change. These sensing technologies are generating large datasets that demand significant processing and data management resources. Studies in mobile sensing for healthcare have motivated the creation of large, complex datasets with information opportunistically gathered from distributed sensors in mobile devices. In this chapter, we discuss some of the architectural challenges of data gathering in a distributed, data-intensive environment such as the healthcare industry, as well as issues regarding the organization and sharing of the large amounts of data collected. These issues include the heterogeneity of devices, the diversity of sensors used, and the need for data provenance when integrating datasets from diverse studies. We highlight that assessing data quality is of paramount importance for conducting longitudinal studies and for building on historical knowledge as new data become available. Finally, we identify future research topics in the growing field of mobile sensing and its application to healthcare and wellbeing. We discuss aspects of data curation, data quality, and data provenance, and we suggest how these challenges could be addressed in the near future.