Chapter 11: Community Outreach and Engagement
One of the key goals of the project is to ensure that the user community – both internal to the agency and external – is engaged in substantive ways through workshops, webinars, and a dedicated website that provides opportunities for comment and feedback. Such engagement is not only mandated by the legislation identified in Section 2.3, but also often recommended in National Academies reports.
While each agency will have its own way of engaging with its internal constituents, there are some common threads that could shape the external engagement with researchers, survey respondents, data users, and specific under-represented groups. However, it is expected that each agency will identify the target user community to participate in the early workshops.
The general format of an early user community workshop would be to explain the platform, provide hands-on experience using Jupyter Notebooks, and gather feedback on: 1. how to improve the functionality of the platform; 2. the usability of the platform; and 3. possible future collaborations between the agency and the user community and within the user community.
Multiple workshops could be structured to serve the different potential constituencies. One might focus on survey respondents, who would react to the usage information in the dashboard. Another might focus on users of the Standard Application Process for the Federal Statistical Research Data Centers. A third might include graduate students, postdocs and other junior scholars who have yet to develop the connections to the empirical knowledge base in a research field.
The workshops will include participation from all the project partners but will primarily be supported by the University of Maryland, NYU, and the agencies. The partners are committed to working with the agencies to bring in a diverse and inclusive range of participants, particularly from academic institutions such as Historically Black Colleges and Universities and Hispanic-Serving Institutions.
Subsequent workshops could provide input into the theory of change – how investing in data creates value. That theory of change can provide the framework for developing well-grounded usage metrics and inform the development of agency questions. As such, a researcher engagement workshop might bring together active and potential data users, senior and junior researchers interested in the agency mission areas, and evaluation experts.
Subsequent workshops could also include the broader federal community. The Evidence Act requires that agencies engage with the user community and charges three key federal entities with fulfilling that task: statistical officials (through the Interagency Council on Statistical Policy), Chief Data Officers (through the Chief Data Officers Council), and Chief Evaluation Officers.
Other outreach activities are likely to include presentations at professional conferences of researchers and data users, presentations to federal cross-agency councils such as the Chief Data Officers Council and the Interagency Council on Statistical Policy, and presentations to associations such as the Council of Professional Associations on Federal Statistics. Each agency will be at the center of planning for its outreach activities, with support from the partners.
Much previous work can be used in designing the workshops. The Show Us the Data workshop provides a strong basis, since information (reproduced below) was provided by Chief Data Officers, the research community, publishers, and academic institutions.
Since the competition focused on uses of data sets in research, the outcomes were most immediately applicable to agencies with scientific mission components. CDOs from agencies for which discovery activities occurred in the competition were invited to review results in one-on-one sessions and then to attend this panel discussion. The agencies represented in the discussion were Commerce (NOAA), NSF, USDA, and Transportation. Because data work in an agency may span many mission teams, some agencies had multiple team members participate in the discussion session. Their detailed responses are summarized in Appendix D: CDOs.
A set of academic and agency researchers were asked a series of structured questions, including: (1) how they might use the tools to advance their research; (2) how the tools might advance the work of junior researchers; (3) how the tools might inspire researchers to do their work differently; and (4) how the researcher community might become engaged in this effort. Their detailed responses are summarized in Appendix D: Researchers.
Several benefits for researchers at institutions included improved discovery of what data exist and are available, better access to data, and opportunities for collaboration, especially across disciplines. More use of the data would also create motivation to improve the metadata, e.g., developing and conforming to metadata and citation standards and making sure data are complete. This would also help improve existing governance structures and help integration across existing infrastructures. Institutions want to understand usage and improve discovery and access from their data repositories. Institutions also use a lot of state and other data, so there could be wider applications beyond federal data. Detailed responses are available in Appendix D: Institutions.
The workshop participants were asked structured questions to get feedback on what the publisher stakeholder community thinks about the potential of the Rich Context Content project and its machine learning and natural language processing components. The questions related to: (1) concerns about the Machine Learning / Natural Language Processing (ML/NLP) approach to capturing data use; (2) additional functionality that would be useful; (3) the value proposition for publishers to participate; (4) how publishers could participate; and (5) where the application should reside and be managed. Their responses are summarized in Appendix D: Publishers.
As the project matures, it is hoped that the community will provide additional input. Most immediately, the community should provide input into the theory of change and into which measures should be used to assess the value of data. In subsequent activities, the community could also support the development of the broader information infrastructure. This would include improving the ML tools themselves, filling gaps in the corpus that the ML models missed, and incentivizing both data users and producers to contribute documentation, code, and analytical uses to the platform.
Initially the engagement mechanism will be workshops and webinars. However, it is expected that the platform will include substantial modalities that allow for human-computer interaction and error correction.
The initial “ask” will be for internal staff and researchers to comment on the user tools, their functionality, and the usage measures, and to develop their own questions. This will happen through staff briefings that use the provided tools – the Jupyter Notebooks, the API, and the usage/researcher dashboard.
Initially, these will be small workshops focused on training, exploring the platform, and collecting user experience data. Usability testing will also take place during these early workshops to inform and improve the tool set.
Agencies have identified a framework that they would like tested with their internal and external communities.