Advancing Organizational Science Through Synthetic Data: A Path to Enhanced Data Sharing and Collaboration

By Pengda Wang
Rice University

Andrew C. Loignon
Center for Creative Leadership

Sirish Shrestha
Center for Creative Leadership

George C. Banks
University of North Carolina at Charlotte

Frederick L. Oswald
Rice University

Summary

The importance of data sharing in organizational science is widely acknowledged, yet the field faces hurdles that impede it, including concerns about privacy, proprietary information, and data integrity. We propose that synthetic data generated using machine learning (ML) offers one promising way to overcome at least some of these hurdles. Although this technology has been widely researched in computer science, most organizational scientists are not familiar with it. To address this gap, we propose a systematic framework for generating and evaluating synthetic data. The framework is designed to guide researchers and practitioners through the intricacies of applying ML technologies to create robust, privacy-preserving synthetic data. We also present two empirical demonstrations that use generative adversarial networks (GANs), an ML method, to illustrate the practical application and potential of synthetic data in organizational science. Through this exploration, we aim to give the community a foundational understanding of synthetic data generation and to encourage further investigation and adoption of these methods. In doing so, we hope to foster scientific advancement by strengthening data-sharing initiatives within the field.

Citation

Wang, P., Loignon, A. C., Shrestha, S., Banks, G. C., & Oswald, F. L. (in press). Advancing Organizational Science Through Synthetic Data: A Path to Enhanced Data Sharing and Collaboration. Journal of Business and Psychology, 1-27. https://doi.org/10.1007/s10869-024-09997-w
