My work aims to ensure everyone is responsibly represented in data.

Research Interests: statistical data privacy | statistical disclosure control | differential privacy | synthetic data | data equity | statistical computing | STEM Education

Data Governance and Privacy

The Data Governance and Privacy (DGP) practice area at the Urban Institute empowers data curators, practitioners, and public policymakers to safely expand access to data sources that responsibly represent data subjects. We work at the intersection of data governance and public policy to improve data access for better evidence-based decisionmaking. We advance our goal under these four areas:

  • accessibility: collaborating with partner agencies to improve data access (e.g., Secure Transfer, Restricted-Use Data Lake)
  • privacy: developing and implementing cutting-edge and practical privacy-preserving technologies and methodologies that safely expand data access (e.g., Safe Data Technologies)
  • accuracy: applying statistical methods to increase data quality and efficiency (e.g., The Nation’s Data at Risk report)
  • usability: creating data privacy and analysis tools as well as improving data privacy and access communications and education (e.g., creating tools or hosting training sessions for local, state, and federal government agencies like the state of Nebraska)

Data Governance and Public Policy

To make evidence-based decisions, public policymakers need access to data that accurately represents the population. However, a growing concern among researchers and policymakers is that statistical data privacy methods (or statistical disclosure control) — which allow access to data while preserving participants’ privacy — often fail to explicitly consider participants’ representation such as in rural regions. Techniques such as data suppression, the addition of random noise under differential privacy, or the generation of synthetic data aim to balance the need for accurate information with privacy considerations. However, each approach involves a utility-risk tradeoff that can have equity implications for different racial groups. Without accounting for people’s representation in data, researchers risk unintentionally perpetuating harm by disproportionately affecting either the privacy risks or the utility of the information derived from the data.

This work represents my efforts at the intersection of data privacy and public policy, where I strive to address these critical issues and promote a more equitable approach to data protection and governance.

  • Bowen, CMK. & Snoke, J. (2023) “Do No Harm Guide: Applying Equity Awareness In Data Privacy Methods.” Urban Institute.
    • Click here to view the article on the project landing page (open-access).
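
To make the utility-risk tradeoff described above concrete, here is a minimal sketch, assuming a simple counting query protected with Laplace noise (a standard way to satisfy differential privacy for counts). The group sizes, the privacy-loss parameter `epsilon`, and all function names are hypothetical choices for illustration only, not taken from any of the publications listed here. The point it illustrates is that the same amount of noise distorts a small group's count far more, in relative terms, than a large group's.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    # A counting query has sensitivity 1, so the Laplace scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_relative_error(true_count, epsilon, trials=5000, seed=0):
    # Average |noisy - true| / true over repeated noisy releases.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        released = noisy_count(true_count, epsilon, rng)
        total += abs(released - true_count) / true_count
    return total / trials


# Hypothetical group sizes and privacy parameter, chosen only for illustration.
groups = {"large group": 10_000, "small group": 50}
epsilon = 0.5

errors = {name: mean_relative_error(n, epsilon) for name, n in groups.items()}
for name, err in errors.items():
    print(f"{name}: mean relative error = {err:.4f}")
```

Because the noise scale depends only on `epsilon` and not on the size of the group, the smaller group bears a much larger relative error. This is one simple way to see why a privacy method can have uneven effects across populations.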

Data Synthesis and Differentially Private Data Synthesis Methods

Within the extensive statistical data privacy literature, differential privacy, data synthesis, and their integration have gained significant popularity as solutions for releasing analytically valuable data while protecting individual privacy. It’s crucial to recognize that there is no methodological “silver bullet” that applies to all data. Therefore, ongoing development and refinement of differentially private methods and data synthesis techniques remain essential.

Below, I present my work in the development and practical application of differentially private and synthetic data methods to real-world datasets.

    • Bowen, CMK., Bryant, V., Burman, L., Khitatrakun, S., McClelland, R., Mucciolo, L., Pickens, M., and Williams, A. (2022) “Synthetic Individual Income Tax Data: Promises and Challenges.” National Tax Journal, 75(4), 767-790.
      • Click here to view the article on the journal page.
    • Bowen, CMK., Bryant, V., Burman, L., Czajka, J., Khitatrakun, S., MacDonald, G., … & Zwiefel, N. (2022). Synthetic Individual Income Tax Data: Methodology, Utility, and Privacy Implications. In International Conference on Privacy in Statistical Databases (pp. 191-204). Springer, Cham.
      • Click here to view the article on the journal page.
    • Liu, F., Eugenio, E., Jin, I., and Bowen, CMK. (2022) “Differentially Private Synthesis and Sharing of Network Data via Bayesian Exponential Random Graph Models.” Journal of Survey Statistics and Methodology, DOI 10.1093/jssam/smac017
      • Click here to view the article on the journal page.
    • Bowen, CMK., Liu, F., & Su, B. (2021) “Differentially Private Data Release via Statistical Election to Partition Sequentially.” METRON.
      • Click here to view the article on the journal page.
    • Bowen, CMK., Narayanan, A., Scally, C. (2021) “Using Differential Privacy to Advance Rural Economic Development: Applying Data Privacy and Confidentiality Methods to Industry Employment Data.” Urban Institute.
      • Click here to view the research brief (open-access).
    • Bowen, CMK., Bryant, V., Burman, L., Khitatrakun, S., McClelland, R., Stallworth, P., Ueyama, K., Williams, A. (2020) “A Synthetic Supplemental Public Use File of Low-Income Information Return Data: Methodology, Utility, and Privacy Implications.” International Conference on Privacy in Statistical Databases (pp. 257-270). Springer, Cham.
    • Eugenio, E., Liu, F., Jin, I., and Bowen, CMK. (2020) “Differentially Private Synthesis of Social Networks via Exponential Random Graph Models.” Proceedings of 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), Pages: 1695-1700, DOI 10.1109/COMPSAC48688.2020

Evaluating Statistical Data Privacy Methods

Balancing data utility against data privacy risks is a complex task. Many people looking for a way to strike this balance expect a one-size-fits-all utility or disclosure risk metric that perfectly assesses the quality of any data or statistic released under a privacy-preserving method or technology. But such a metric doesn’t exist.

Below, I outline my work in evaluating the efficacy of different statistical data privacy methods and offer insights into how we should approach the delicate balance between privacy and utility.

    • Barrientos, A. F., Williams, A. R., Snoke, J., & Bowen, CMK. (2024) “A Feasibility Study of Differentially Private Summary Statistics and Regression Analyses with Evaluations on Administrative and Survey Data.” Journal of the American Statistical Association.
      • Click here to view the article on the journal page.
      • Click here to see the arXiv copy.
    • Bowen, CMK. & Snoke, J. (2021) “Comparative Study of Differentially Private Synthetic Data Algorithms from the NIST PSCR Differential Privacy Synthetic Data Challenge” Journal of Privacy and Confidentiality, 11 (1).
      • Click here to view the article on the journal page (open-access).
    • Bowen, CMK. & Liu, F. (2020) “Comparative Study on Differentially Private Data Synthesis Methods.” Statistical Science.
      • Click here to view the article on the journal page.

Communications and Education

The following represents my efforts in introducing data synthesis, differential privacy, and various statistical data privacy methods and technologies to a wider audience, including the scientific and non-technical communities.

  • Joshua Snoke and I chatted on the podcast, Stats + Stories, discussing how the data privacy landscape is changing.
    • Click here to listen.
  • I was a guest on the podcast, Data Science Imposters, discussing what differential privacy is and how it impacts everyone.
    • Click here to listen.
    • Seeman, J., Williams, A. R., & Bowen, CMK. (2025) “Synthetic Data for the Nebraska Statewide Workforce & Educational Reporting System.” Urban Institute.
      • Click here to view the non-technical brief (open-access).
    • Bowen, CMK. (2024) “Government Data of the People, by the People, for the People: Navigating Citizen Privacy Concerns.” Journal of Economic Perspectives.
      • Click here to view the article.
    • Hu, J. & Bowen, CMK. (2024) “Advancing microdata privacy protection: A review of synthetic data methods.” WIREs.
      • Click here to view the article.
    • Williams, A. R. & Bowen, CMK. (2023) “The Promise and Limitations of Formal Privacy.” WIREs.
      • Click here to view the article.
    • Bowen, CMK., Williams, A. R., & Pickens, M. (2022) “Decennial Disclosure: An Explainer on Formal Privacy and the TopDown Algorithm.” Urban Institute.
      • Click here to view the research brief (open-access).
    • Garfinkel, S. & Bowen, CMK. (2022) “Preserving Privacy While Sharing Data.” MIT Sloan Management Review.
      • Click here to view the article.
    • Bowen, CMK. & Garfinkel, S. (2021) “The Philosophy of Differential Privacy.” Notices of the American Mathematical Society.
      • Click here to view the article (open-access).
    • Bowen, CMK. (2021) “Personal Privacy and the Public Good: Balancing Data Privacy and Data Utility.” Urban Institute.
      • Click here to view the research brief (open-access).
    • Snoke, J. & Bowen, CMK. (2020) “How Statisticians Should Grapple with Privacy in a Changing Data Landscape.” CHANCE, Special Issue: A New Generation of Statisticians Tackles Data Privacy.
      • Click here to view the article (open-access).
    • Bowen, CMK. & Eugenio, E. (2017) “Where’s Wenda: An Activity on Teaching Middle School Students Data Privacy.” Statistics Teacher.
      • Click here to find the article (open-access).

Other Research Projects

Endurance Sports

These publication(s) are on endurance sports, a topic I am personally passionate about rather than an expert in.

  • Bowen, CMK. (2024) “Significant strides: Women’s advancement in endurance sports.” Significance Magazine, Volume 21, Issue 3, 18-21.
    • Click here to find the article.

Statistical Computing

With the rapid advancement in computational power, we increasingly rely on simulated experiments over physical ones, primarily due to cost considerations. For instance, at Los Alamos National Laboratory, there is a physical experiment that lasts less than a second but costs upwards of $10 million. This underscores the value of supercomputers in resource conservation. However, it’s important to note that these high-performance systems are vulnerable to various factors, including heat, power fluctuations, and cosmic radiation. Moreover, while computational power has seen significant growth, data storage and transfer rates continue to lag behind. The following papers delve into these critical issues.

  • Bowen, CMK., DeBardeleben, N., Blanchard, S., & Anderson-Cook, C. (2019) “Do Solar Proton Events Reduce the Number of Faults in Supercomputers?: A Comparative Analysis of Faults during and without Solar Proton Events.” 2019 IEEE International Reliability Physics Symposium (IRPS).
    • Click here to find the article.
  • Myers, K., Lawrence, E., Fugate, M., Bowen, CMK., Ticknor, L., Woodring, J., Wendelberger, J., & Ahrens, J. (2016) “Partitioning a Large Simulation as It Runs.” Technometrics, doi: 10.1080/00401706.2016.1158740.
    • Click here to find the article.

Consulting

These publication(s) showcase projects for which I have provided consulting services.

  • Bowen, CMK., Liu, F., & Wheeler, J. (2015) “Are More of My Patients Developing Side Effects than Expected.” Practical Radiation Oncology, Volume 5, Issue 3, e255-e261.
    • Click here to find the article.