My work aims to ensure everyone is responsibly represented in data.
Research Interests: statistical data privacy | statistical disclosure control | differential privacy | synthetic data | data equity | statistical computing | STEM Education
Data Governance and Privacy
The Data Governance and Privacy (DGP) practice area at the Urban Institute empowers data curators, practitioners, and public policymakers to safely expand access to data sources that responsibly represent data subjects. We work at the intersection of data governance and public policy to improve data access for better evidence-based decisionmaking. We advance this goal in four areas:
- accessibility: collaborating with partner agencies to improve data access (e.g., Secure Transfer, Restricted-Use Data Lake)
- privacy: developing and implementing cutting-edge and practical privacy-preserving technologies and methodologies that safely expand data access (e.g., Safe Data Technologies)
- accuracy: applying statistical methods to increase data quality and efficiency (e.g., The Nation’s Data at Risk report)
- usability: creating data privacy and analysis tools as well as improving data privacy and access communications and education (e.g., creating tools or hosting training sessions for local, state, and federal government agencies like the state of Nebraska)
Data Governance and Public Policy
To make evidence-based decisions, public policymakers need access to data that accurately represents the population. However, a growing concern among researchers and policymakers is that statistical data privacy methods (also called statistical disclosure control), which allow access to data while preserving participants’ privacy, often fail to explicitly consider how well participants are represented, for example those in rural regions. Techniques such as data suppression, the addition of random noise under differential privacy, or the generation of synthetic data aim to balance the need for accurate information with privacy considerations. However, each approach involves a utility-risk tradeoff that can have equity implications for different racial groups. Without accounting for people’s representation in data, researchers risk unintentionally perpetuating harm by disproportionately affecting either the privacy risks or the utility of the information derived from the data.
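As a rough illustration of why adding noise can affect groups unequally, the sketch below applies the Laplace mechanism to two counts. It is a minimal example under stated assumptions, not the method used in any of the work listed here: the group names, counts, and the privacy-loss budget `epsilon` are made up, and the helper functions are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Laplace mechanism for a counting query, which has sensitivity 1."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


def expected_relative_error(true_count: int, epsilon: float) -> float:
    """The expected |noise| of Laplace(0, b) is b, so the relative error is b / count."""
    return (1.0 / epsilon) / true_count


rng = random.Random(0)  # fixed seed so the sketch is reproducible
epsilon = 0.5  # hypothetical privacy-loss budget
groups = {"large_group": 10_000, "small_group": 20}  # made-up counts

for name, count in groups.items():
    released = noisy_count(count, epsilon, rng)
    rel_err = expected_relative_error(count, epsilon)
    print(f"{name}: released={released:.1f}, expected relative error={rel_err:.2%}")
```

Both groups receive noise of the same scale, but the expected relative error is 0.02% for the large group and 10% for the small one, which is the kind of uneven utility loss described above.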
This work represents my efforts at the intersection of data privacy and public policy, where I strive to address these critical issues and promote a more equitable approach to data protection and governance.
Related Reports and Public Comments
- Opportunities for the New Administration to Improve Americans’ Well-Being (December 20, 2024)
- Response to a Request for Information from the White House Office of Science and Technology Policy on the Federal Evidence Agenda on Disability Equity (July 18, 2024)
- The Nation’s Data at Risk: Meeting America’s information needs for the 21st century (July 2024)
- Public Comment on Proposed Rulemaking to Expand Tax Data Sharing with the US Census Bureau (April 30, 2024)
- Toward a 21st Century National Data Infrastructure: Managing Privacy and Confidentiality Risks with Blended Data (January 2024)
- Public Comment on Initial Proposals from the Federal Interagency Technical Working Group on Race and Ethnicity Standards for Revising OMB’s 1997 Statistical Policy Directive No. 15 (April 27, 2023)
Related Papers
- Bowen, CMK. & Snoke, J. (2023) “Do No Harm Guide: Applying Equity Awareness In Data Privacy Methods.” Urban Institute.
- Click here to view the article on the project landing page (open-access).
Related Blogs
- My intern led and wrote a blog on Data@Urban titled, Analyzing the Privacy and Utility Trade-off for Synthetic Datasets with Imbalanced Demographic Groups
- I wrote a blog on the Urban Wire with my colleagues titled, To Advance Racial Equity, Releasing Disaggregated Data while Protecting Privacy Will Be Key
- A blog I co-authored on the Urban Wire about the Presidential Executive Order, How the Federal Government Can Use Data to Make the Most of the Executive Order on Racial Equity
- I wrote a blog on the Urban Wire titled, Will the Census’s Data Privacy Efforts Erase Rural America?
Data Synthesis and Differentially Private Data Synthesis Methods
Within the extensive statistical data privacy literature, differential privacy, data synthesis, and their combination have gained popularity as ways to release analytically valuable data while protecting individual privacy. However, no methodological “silver bullet” applies to all data, so ongoing development and refinement of differentially private methods and data synthesis techniques remain essential.
Below, I present my work in the development and practical application of differentially private and synthetic data methods to real-world datasets.
Related Papers
- Bowen, CMK., Bryant, V., Burman, L., Khitatrakun, S., McClelland, R., Mucciolo, L., Pickens, M., and Williams, A. (2022) “Synthetic Individual Income Tax Data: Promises and Challenges.” National Tax Journal, 75(4), 767-790.
- Click here to view the article on the journal page.
- Bowen, CMK., Bryant, V., Burman, L., Czajka, J., Khitatrakun, S., MacDonald, G., … & Zwiefel, N. (2022). Synthetic Individual Income Tax Data: Methodology, Utility, and Privacy Implications. In International Conference on Privacy in Statistical Databases (pp. 191-204). Springer, Cham.
- Click here to view the article on the journal page.
- Liu, F., Eugenio, E., Jin, I., and Bowen, CMK. (2022) “Differentially Private Synthesis and Sharing of Network Data via Bayesian Exponential Random Graph Models.” Journal of Survey Statistics and Methodology, DOI 10.1093/jssam/smac017
- Click here to view the article on the journal page.
- Bowen, CMK., Liu, F., & Su, B. (2021) “Differentially Private Data Release via Statistical Election to Partition Sequentially.” METRON.
- Click here to view the article on the journal page.
- Bowen, CMK., Narayanan, A., Scally, C. (2021) “Using Differential Privacy to Advance Rural Economic Development: Applying Data Privacy and Confidentiality Methods to Industry Employment Data.” Urban Institute.
- Click here to view the research brief (open-access).
- Bowen, CMK., Bryant, V., Burman, L., Khitatrakun, S., McClelland, R., Stallworth, P., Ueyama, K., Williams, A. (2020) “A Synthetic Supplemental Public Use File of Low-Income Information Return Data: Methodology, Utility, and Privacy Implications.” International Conference on Privacy in Statistical Databases (pp. 257-270). Springer, Cham.
- Eugenio, E., Liu, F., Jin, I., and Bowen, CMK. (2020) “Differentially Private Synthesis of Social Networks via Exponential Random Graph Models.” Proceedings of 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), Pages: 1695-1700, DOI 10.1109/COMPSAC48688.2020
Evaluating Statistical Data Privacy Methods
Balancing data utility against data privacy risks is a complex task. Many people expect a one-size-fits-all utility or disclosure risk metric that perfectly assesses the quality of any data or statistic released under a privacy-preserving method or technology. But no such metric exists.
Below, I outline my work in evaluating the efficacy of different statistical data privacy methods and offer insights into how we should approach the delicate balance between privacy and utility.
Related Blogs
- I wrote a blog on the Differential Privacy: NIST Blog Series titled, Utility Metrics for Differential Privacy: No One-Size-Fits-All.
Related Papers
- Barrientos, A. F., Williams, A. R., Snoke, J., & Bowen, CMK. (2024) “A Feasibility Study of Differentially Private Summary Statistics and Regression Analyses with Evaluations on Administrative and Survey Data.” Journal of the American Statistical Association.
- Bowen, CMK. & Snoke, J. (2021) “Comparative Study of Differentially Private Synthetic Data Algorithms from the NIST PSCR Differential Privacy Synthetic Data Challenge” Journal of Privacy and Confidentiality, 11 (1).
- Click here to view the article on the journal page (open-access).
- Bowen, CMK. & Liu, F. (2020) “Comparative Study on Differentially Private Data Synthesis Methods.” Statistical Science.
- Click here to view the article on the journal page.
Communications and Education
The following represents my efforts in introducing data synthesis, differential privacy, and various statistical data privacy methods and technologies to a wider audience, including the scientific and non-technical communities.
Related Blogs
- Hu, J. & Bowen, CMK., (2022) “Prescribing Privacy: Human and Computational Resource Limitations?” Amstat News, September Issue.
- Click here to view the article.
- I wrote a blog on Statisticians React to the News titled, How data privacy methods can hide the real data story
- Snoke, J. & Bowen, CMK., (2019) “Differential Privacy: What Is It?” Amstat News, March Issue.
- Click here to view the article.
Related Social Media
- Joshua Snoke and I chatted on the podcast, Stats + Stories, discussing how the data privacy landscape is changing.
- Click here to listen.
- I was a guest on the podcast, Data Science Imposters, discussing what differential privacy is and how it impacts everyone.
- Click here to listen.
Related Papers
- Seeman, J., Williams, A. R., & Bowen, CMK. (2025) “Synthetic Data for the Nebraska Statewide Workforce & Educational Reporting System.” Urban Institute.
- Click here to view the non-technical brief (open-access).
- Bowen, CMK., (2024) “Government Data of the People, by the People, for the People: Navigating Citizen Privacy Concerns” Journal of Economic Perspectives.
- Click here to view the article.
- Hu, J. & Bowen, CMK., (2024) “Advancing microdata privacy protection: A review of synthetic data methods” WIREs.
- Click here to view the article.
- Williams, A.R. & Bowen, CMK., (2023) “The Promise and Limitations of Formal Privacy” WIREs.
- Click here to view the article.
- Bowen, CMK., Williams, A. R., & Pickens, M. (2022) “Decennial Disclosure: An Explainer on Formal Privacy and the TopDown Algorithm.” Urban Institute.
- Click here to view the research brief (open-access).
- Garfinkel, S. & Bowen, CMK., (2022) “Preserving Privacy While Sharing Data” MIT Sloan Management Review.
- Click here to view the article.
- Bowen, CMK. & Garfinkel, S., (2021) “The Philosophy of Differential Privacy” Notices of the American Mathematical Society.
- Click here to view the article (open-access).
- Bowen, CMK. (2021) “Personal Privacy and the Public Good: Balancing Data Privacy and Data Utility.” Urban Institute.
- Click here to view the research brief (open-access).
- Snoke, J. & Bowen, CMK., (2020) “How Statisticians Should Grapple with Privacy in a Changing Data Landscape.” CHANCE, Special Issue: A New Generation of Statisticians Tackles Data Privacy.
- Click here to view the article (open-access).
- Bowen, CMK. & Eugenio, E., (2017) “Where’s Wenda: An Activity on Teaching Middle School Students Data Privacy.” Statistics Teacher.
- Click here to find the article (open-access).
Other Research Projects
Endurance Sports
These publication(s) are on endurance sports, a topic I am personally passionate about rather than an expert in.
Related Papers
- Bowen, CMK. (2024) “Significant strides: Women’s advancement in endurance sports.” Significance Magazine, Volume 21, Issue 3, 18-21.
- Click here to find the article.
Statistical Computing
With the rapid advancement in computational power, we increasingly rely on simulated experiments over physical ones, primarily due to cost considerations. For instance, at Los Alamos National Laboratory, there is a physical experiment that lasts less than a second but costs upwards of $10 million. This underscores the value of supercomputers in resource conservation. However, it’s important to note that these high-performance systems are vulnerable to various factors, including heat, power fluctuations, and cosmic radiation. Moreover, while computational power has seen significant growth, data storage and transfer rates continue to lag behind. The following papers delve into these critical issues.
Related Papers
- Bowen, CMK., DeBardeleben, N., Blanchard, S., & Anderson-Cook, C. (2019) “Do Solar Proton Events Reduce the Number of Faults in Supercomputers?: A Comparative Analysis of Faults during and without Solar Proton Events.” 2019 IEEE International Reliability Physics Symposium.
- Click here to find the article.
- Myers, K., Lawrence, E., Fugate, M., Bowen, CMK., Ticknor, L., Woodring, J., Wendelberger, J., & Ahrens, J. (2016) “Partitioning a Large Simulation as It Runs.” Technometrics, doi: 10.1080/00401706.2016.1158740.
- Click here to find the article.
Consulting
These publication(s) showcase projects for which I have provided consulting services.
Related Papers
- Bowen, CMK., Liu, F., & Wheeler, J. (2015) “Are More of My Patients Developing Side Effects than Expected.” Practical Radiation Oncology, Volume 5, Issue 3, e255-e261.
- Click here to find the article.