DORA and the Howard Hughes Medical Institute (HHMI) convened a diverse group of stakeholders to consider how to improve research assessment policies and practices.
By exploring different approaches to cultural and systems change, we discussed practical ways to reduce the reliance on proxy measures of quality and impact in hiring, promotion, and funding decisions. To focus on practical steps forward that will improve research assessment practices, we did not discuss the well-documented deficiencies of the Journal Impact Factor (JIF) as a measure of quality.
What was discussed?
What are the different approaches to developing, introducing, and implementing new research assessment policies and practices?
How might institutions align their research assessment practices with their values and policies?
What obstacles prevent change from taking place? What can positively influence behavior?
Once alignment between policy and practice has been achieved, how do we build community support to make it normative?
What do we hope to achieve?
Encourage stakeholders at all levels, including departments, institutions, and funders, to establish working groups to evaluate research assessment practices, ask how well practices align with policies, and develop and recommend new research assessment policies and practices. One important goal is to reduce the role of journal-based metrics and other proxy measures of research quality and impact in assessment.
Inspire stakeholders to look to culture change experts to help them translate new policies into practice.
Foster multi-stakeholder collaborations in research assessment reform.
How can I participate now?
In-person participation was invitation-only due to space and budget constraints.
Everyone across all academic disciplines is welcome to engage online before, during, and after the meeting. Here’s how:
Tune in to watch! Select sessions were recorded and are available on YouTube.
Join the conversation on Twitter any time using the hashtag #AssessingResearch.
Let us know what you think! We encourage everyone to review the agenda, read the participant commentary, and look at the background readings (which provide insight into the current gaps between policy and practice). Please feel welcome to contact DORA directly and share thoughts inspired by what you read.
While the emphasis of the meeting was on institutions in the United States, we hope the resources and webcast will inspire conversation more broadly.
Academic institutions and funders assess their scientists’ research outputs to help allocate their limited resources. Research assessments are codified in policies and enacted through practices. Both can be problematic: policies if they do not accurately reflect institutional mission and values; and practices if they do not reflect institutional policies.
Even if new policies and practices are developed and introduced, their adoption often requires significant cultural change and buy-in from all relevant parties – applicants, reviewers and decision makers.
We will discuss how to develop and adopt new research assessment policies and practices through panel discussions, short plenary talks, and breakout sessions. The breakout sessions will be organized around the levels of intervention described in the “Changing a Research Culture” pyramid (Nosek, 2019).
We want your input on the agenda:
Looking at the brainstorming sessions, are there questions you would want to include?
Are there other topics we should consider for brainstorming sessions and what questions would be discussed?
What’s missing from the panel discussions, short talks and brainstorming sessions?
Erin McKiernan (UNAM) – slides
How the journal impact factor is used in review, promotion, and tenure in the United States and Canada
Paula Stephan (Georgia State University) – slides
How we can change the research assessment culture
Bodo Stern (HHMI) – slides
HHMI’s approach to evaluating scientists
Joint Q&A, Cowan Auditorium. Moderated by Stephen Curry (DORA)
Evening wrap-up, Cowan Auditorium
Stephen Curry (DORA) – slides
9:00 – 11:00 pm
Social, The Pilot Lounge
Tuesday, October 22, 2019
Breakfast, Dining Room
How to develop new research assessment policies and practices
Panel Discussion, Cowan Auditorium. Moderated by Erika Shugart (ASCB and DORA)
Stephen Curry (Imperial College and DORA)
Frank Miedema (UMC Utrecht / Utrecht University)
Sandra Schmid (UT Southwestern Medical Center)
Miriam Kip (BIH QUEST Center for Transforming Biomedical Research)
Omar Quintero (University of Richmond)
Needhi Bhalla (University of California, Santa Cruz)
Participants will work in preassigned groups for all breakout sessions. Each group will have a designated discussion leader and note taker. To encourage open and honest discussion, breakout sessions will be held under the Chatham House Rule; information shared in breakout sessions may NOT be attributed to the speaker.
How can the researcher community encourage institutions to establish new research assessment policies and practices?
Once alignment between policies and practices has been achieved, how can organizations build the support of the research community to make good practices normative?
Are there other ways an organization could more clearly articulate to the research community how its priorities translate into practices?
How can different stakeholder communities, including research institutions, funders, scholarly societies, publishers, and researchers, signal their priorities, policies, and practices to each other?
Breakout topic 4. Incentives
What are the incentives for funders and research institutions to align their research assessment policies and practices?
How do incentives differ between assessors and those being assessed?
How can funders and research institutions incentivize the behaviors they would like to promote?
Breakout topic 5. Policy
Are the vision, mission, and values of an organization accurately reflected in its policies? If not, what is the process by which policies can be modified?
What are the benefits of publicly sharing an organization’s policies? What are the risks?
What are the benefits of publicly sharing an organization’s practices? What are the risks?
Is there a role for funders to influence how an organization’s policies are translated into its research assessment practices? Why or why not?
Break, Great Hall
Reports from Breakout Sessions, Cowan Auditorium. Moderated by Stephen Curry (DORA)
Discussion of Afternoon, Cowan Auditorium. Moderated by Erin McKiernan (DORA and UNAM)
Dinner, Dining Room
8:30 – 11:00 pm
Social, The Pilot Lounge
Wednesday, October 23, 2019
Breakfast, Dining Room
Applying systems thinking to research assessment reform
Introduction of Breakout Sessions, Cowan Auditorium. Moderated by Anna Hatch (DORA)
Research assessment reform involves multiple stakeholders, including funders, research institutes, libraries, publishers, and researchers. How can stakeholders work together to implement change in research assessment practices? What do different stakeholder groups need from each other to succeed? Participants will divide into groups of 6-8 people to discuss topics inspired by the ideas presented in the participant commentaries.
Finding the right balance between top-down and bottom-up approaches for research assessment reform
Facilitators: Diego Baptista (Wellcome) & Connie Lee (University of Chicago) Room D113
How might we use an institution’s stated values as a starting point to improve research assessment policies and align them with practices?
Facilitator: Erin McKiernan (UNAM) Room D115
Building trust in policies and practices—how do researchers know that research institutions and funders really mean what they say?
Facilitators: Dave Carr (Wellcome) and Prachee Avasthi (University of Kansas Medical Center) Room D116
How can scholarly societies lead from the outside to influence research assessment reform?
Facilitators: Erika Shugart (ASCB) and Brooks Hanson (AGU) Room D124
What can departments and institutions do to improve the triage phase of assessment for faculty searches?
Facilitators: Lee Ligon (RIT) and Sandra Schmid (UT Southwestern Medical Center) Room D125
Where do university rankings fit into research assessment reform?
Facilitator: Stephen Curry (DORA) South Lounge
How can departments, institutions, and funders evaluate contributions to team science?
Facilitator: Sue Biggins (Fred Hutchinson Cancer Research Center) and Miriam Kip (BIH QUEST Center For Transforming Biomedical Research) North Lounge
How might we improve equity and inclusion in academia (e.g. by examining how models of scarcity and exclusivity influence our current research assessment practices and concepts of rigor)?
Facilitators: Needhi Bhalla (UC Santa Cruz) and Olivia Rissland (University of Colorado School of Medicine) Room C123
What do preprints need to be more useful in evaluation?
Facilitator: Jessica Polka (ASAPbio) Room N140
Break, Great Hall
Reports from Breakout Sessions, Cowan Auditorium. Moderated by Boyana Konforti (HHMI)
Lunch, Dining Room
Meeting Reflections and Next Steps, Cowan Auditorium. Anna Hatch (DORA)
To better understand how research institutions use the journal impact factor (JIF) in researcher evaluations, Erin McKiernan and colleagues analyzed review, promotion, and tenure (RPT) documents from universities across the United States and Canada. Table 1, shown below, summarizes their key findings. Overall, 23% of the universities sampled mention the JIF, and this figure rises to 40% for research-intensive (R-type) institutions. Of the institutions that mention the JIF, 87% had at least one supportive mention of its use for evaluation purposes, and none prohibited its use.
The authors found inconsistencies in RPT documents between academic units of the same university. For example, one unit may caution against the use of the JIF while another is supportive. There is also variability in what institutions are measuring with the JIF: in addition to impact, it can be used as a measure of quality, importance, significance, prestige, reputation, and status.
The analysis was limited to terms that were highly similar to JIF. It did not include indirect but probable references to the JIF, such as top-tier journal, prominent journal, or leading journal. This study provides a benchmark for future studies to compare and contrast how the JIF is used.
2. Bias against novelty in science: a cautionary tale for the users of bibliometric indicators
Research assessment practices that rely on citation counts and journal impact factors may inadvertently discourage academics from pursuing explorative, novel research approaches. In this study, Wang and colleagues dissect how novelty affects research impact.
Novel approaches can lead to substantial scientific advances, but not always. Novel papers have a higher mean and a larger variance in their citation performance, illustrating the risky nature of explorative research. Despite the uncertainty, the authors find that novel approaches can drive significant scientific progress. Highly novel research articles are more likely:
to be a top 1% highly cited paper over time,
to result in highly cited follow-up studies,
and to be cited by a wider set of disciplines.
However, recognition takes time. Novel papers are less likely to be top cited over shorter citation windows. Highly novel papers become significantly more likely to be top cited only after four years (nine years for moderately novel papers), well outside the two-year window over which the JIF is calculated. Yet the time required to realize the impact of novel research is often incompatible with the time frame for faculty appointments or review, promotion, and tenure (RPT) decisions.
The authors show that novel papers are more likely to be published in journals with impact factors that are lower than expected. Current research assessment policies and practices favor individuals who publish their work in high impact factor journals, which could discourage innovation.
Scholarship is not limited to publishing papers. Many university promotion and tenure policies include language labeling teaching as a valuable component of scholarship, but often without clear criteria for how contributions to teaching should be assessed beyond student evaluations, which are known to be biased against women and underrepresented groups. In many cases, research excellence is perceived to compensate for weaker teaching and academic service.
Here, the authors suggest general guidelines for the evaluation of teaching and present three approaches that universities are taking to evaluate and reward teaching:
Three Bucket Model of merit review (University of California, Irvine; Figure 1)
Evaluation of Teaching Rubric (University of Kansas)
Teaching Quality Framework (University of Colorado, Boulder)
4. The evaluation of scholarship in academic promotion and tenure processes: past, present, and future
In this review, Schimanski and Alperin name three pressing challenges in review, promotion, and tenure:
The increased value placed on research comes at the expense of teaching and service contributions.
Proxy measures of prestige are often used as shortcuts to measure the quality of a research article.
The integration of article-level metrics into RPT processes is too slow.
They also identify a number of other issues. Evaluation for RPT is typically divided into research, teaching, and service. Service is the least valued of the three: it cannot compensate for the other two, and it diverts time away from research, the contribution valued most. To fulfill university diversity requirements, members of underrepresented groups spend more time serving on committees, which can put them at a disadvantage in RPT. Women are also more likely to spend time on service, to the detriment of their careers.
Another challenge is the level of detail provided in RPT guidelines. Ambiguous language provides a certain degree of flexibility, but it comes at a cost: candidates may not always be assessed against the same standards.
5. How significant are the public dimensions of faculty work in review, promotion and tenure documents?
Public dimensions of faculty work are often neglected in RPT guidelines, despite regular references to terms and concepts related to “public” and “community.” While 87% of institutions mention “community” and 75% mention “public,” they do not typically point to the public dimensions of faculty work. For example, community is often used in relation to the academic community.
Furthermore, terms and concepts related to “public” and “community” are associated with academic service, which is not a valued aspect of RPT. And the highly valued traditional outputs like research papers largely disregard the public dimensions of research. How we measure the success of research papers is at odds with the public dimensions of faculty work too. Traditional bibliometric indicators, including journal impact factors, encourage uptake within a scholarly discipline rather than the broader community.
RPT guidelines can discourage researchers from publishing their work in open access venues. Here, the authors found the term “open access” is explicitly mentioned by 5% of the institutions sampled, and all of the mentions are cautionary.
Culture change requires a comprehensive strategy. Efforts aimed at changing the behavior of individuals can fall flat if social and cultural systems are not addressed. These systems guide individual behavior, so if systems do not change, behavior won’t either. In this blog, Brian Nosek describes the strategy the Center for Open Science is taking to change the research culture to promote openness, integrity, and reproducibility. It requires five levels of progressive intervention (see image below). This strategy and others can guide our thinking about research assessment reform.
In this comment, Rinze Benedictus and Frank Miedema describe the strategy used by the University Medical Center (UMC) Utrecht to develop research assessment policies that focus on scholarly contributions rather than publications. Community engagement was a key factor in cultivating culture change at the university. Researchers helped define new assessment standards through a series of internal discussions. PhD students, principal investigators, professors, and department heads were all invited to participate. Reports from these meetings and additional interviews were published on UMC Utrecht’s internal website to engage the wider academic community. But change takes time. Once the discussion phase was complete, policies were developed over the next year based on the input that was received. A few years on, the evaluative approach focusing on contributions was used in a formal institutional evaluation. For professors and associate professors, a portfolio-based assessment has become standard.
In this piece, Needhi Bhalla outlines several proven strategies to improve equity in faculty hiring. Although the number of underrepresented minorities in trainee positions has increased, the number of underrepresented faculty members in academic science remains low. These strategies also increase the transparency and consistency of faculty searches, which builds trust in the process.
Additional recommended readings from participants: