DORA and the Howard Hughes Medical Institute (HHMI) convened a diverse group of stakeholders to consider how to improve research assessment policies and practices.
By exploring different approaches to cultural and systems change, we discussed practical ways to reduce the reliance on proxy measures of quality and impact in hiring, promotion, and funding decisions. To focus on practical steps forward that will improve research assessment practices, we did not discuss the well-documented deficiencies of the Journal Impact Factor (JIF) as a measure of quality.
What was discussed?
What are the different approaches to developing, introducing, and implementing new research assessment policies and practices?
How might institutions align their research assessment practices with their values and policies?
What obstacles prevent change from taking place? What can positively influence behavior?
Once alignment between policy and practice has been achieved, how do we build community support to make it normative?
What do we hope to achieve?
Encourage stakeholders at all levels, including departments, institutions, and funders, to establish working groups to evaluate research assessment practices, ask how well practices align with policies, and develop and recommend new research assessment policies and practices. One important goal is to reduce the role of journal-based metrics and other proxy measures of research quality and impact in assessment.
Inspire stakeholders to look to culture change experts to help them translate new policies into practice.
Foster multi-stakeholder collaborations in research assessment reform.
How can I participate now?
In-person participation was invitation-only due to space and budget constraints.
Everyone across all academic disciplines is welcome to engage online before, during, and after the meeting. Here’s how:
Tune in to watch! Select sessions were recorded and are available on YouTube.
Join the conversation on Twitter any time using the hashtag #AssessingResearch.
Let us know what you think! We encourage everyone to review the agenda, read the participant commentary, and look at the background readings (which provide insight into the current gaps between policy and practice). Please feel welcome to contact DORA directly and share thoughts inspired by what you read.
While the emphasis of the meeting was on institutions in the United States, we hope the resources and webcast will inspire conversation more broadly.
Academic institutions and funders assess their scientists’ research outputs to help allocate their limited resources. Research assessments are codified in policies and enacted through practices. Both can be problematic: policies if they do not accurately reflect institutional mission and values; and practices if they do not reflect institutional policies.
Even if new policies and practices are developed and introduced, their adoption often requires significant cultural change and buy-in from all relevant parties – applicants, reviewers and decision makers.
We will discuss how to develop and adopt new research assessment policies and practices through panel discussions, short plenary talks, and breakout sessions. The breakout sessions will be organized around the levels of intervention described in the “Changing a Research Culture” pyramid (Nosek, 2019).
We want your input on the agenda:
Looking at the brainstorming sessions, are there questions you would want to include?
Are there other topics we should consider for brainstorming sessions and what questions would be discussed?
What’s missing from the panel discussions, short talks and brainstorming sessions?
Erin McKiernan (UNAM) – slides
How the journal impact factor is used in review, promotion, and tenure in the United States and Canada
Paula Stephan (Georgia State University) – slides
How we can change the research assessment culture
Bodo Stern (HHMI) – slides
HHMI’s approach to evaluating scientists
Joint Q&A, Cowan Auditorium. Moderated by Stephen Curry (DORA)
Evening wrap-up, Cowan Auditorium
Stephen Curry (DORA) – slides
9:00 – 11:00 pm
Social, The Pilot Lounge
Tuesday, October 22, 2019
Breakfast, Dining Room
How to develop new research assessment policies and practices
Panel Discussion, Cowan Auditorium. Moderated by Erika Shugart (ASCB and DORA)
Stephen Curry (Imperial College and DORA)
Frank Miedema (UMC Utrecht / Utrecht University)
Sandra Schmid (UT Southwestern Medical Center)
Miriam Kip (BIH QUEST Center for Transforming Biomedical Research)
Omar Quintero (University of Richmond)
Needhi Bhalla (University of California, Santa Cruz)
Participants will work in preassigned groups for all breakout sessions. Each group will have a designated discussion leader and note taker. To encourage open and honest discussion, breakout sessions will be held under the Chatham House Rule; information shared in breakout sessions may NOT be attributed to the speaker.
How can the researcher community encourage institutions to establish new research assessment policies and practices?
Once alignment between policies and practices has been achieved, how can organizations build the support of the research community to make good practices normative?
Are there other ways an organization could more clearly articulate to the research community how its priorities translate into practices?
How can different stakeholder communities, including research institutions, funders, scholarly societies, publishers, and researchers, signal their priorities, policies, and practices to each other?
Breakout topic 4. Incentives
What are the incentives for funders and research institutions to align their research assessment policies and practices?
How do incentives differ between assessors and those being assessed?
How can funders and research institutions incentivize the behaviors they would like to promote?
Breakout topic 5. Policy
Are the vision, mission, and values of an organization accurately reflected in its policies? If not, what is the process by which policies can be modified?
What are the benefits of publicly sharing an organization’s policies? What are the risks?
What are the benefits of publicly sharing an organization’s practices? What are the risks?
Is there a role for funders to influence how an organization’s policies are translated into its research assessment practices? Why or why not?
Break, Great Hall
Reports from Breakout Sessions, Cowan Auditorium. Moderated by Stephen Curry (DORA)
Discussion of Afternoon, Cowan Auditorium. Moderated by Erin McKiernan (DORA and UNAM)
Dinner, Dining Room
8:30 – 11:00 pm
Social, The Pilot Lounge
Wednesday, October 23, 2019
Breakfast, Dining Room
Applying systems thinking to research assessment reform
Introduction of Breakout Sessions, Cowan Auditorium. Moderated by Anna Hatch (DORA)
Research assessment reform involves multiple stakeholders, including funders, research institutes, libraries, publishers, and researchers. How can stakeholders work together to implement change in research assessment practices? What do different stakeholder groups need from each other to succeed? Participants will divide into groups of 6-8 people to discuss topics inspired by the ideas presented in the participant commentaries.
Finding the right balance between top-down and bottom-up approaches for research assessment reform
Facilitators: Diego Baptista (Wellcome) & Connie Lee (University of Chicago) Room D113
How might we use an institution’s stated values as a starting point to improve research assessment policies and align them with practices?
Facilitator: Erin McKiernan (UNAM) Room D115
Building trust in policies and practices—how do researchers know that research institutions and funders really mean what they say?
Facilitators: Dave Carr (Wellcome) and Prachee Avasthi (University of Kansas Medical Center) Room D116
How can scholarly societies lead from the outside to influence research assessment reform?
Facilitators: Erika Shugart (ASCB) and Brooks Hanson (AGU) Room D124
What can departments and institutions do to improve the triage phase of assessment for faculty searches?
Facilitators: Lee Ligon (RIT) and Sandra Schmid (UT Southwestern Medical Center) Room D125
Where do university rankings fit into research assessment reform?
Facilitator: Stephen Curry (DORA) South Lounge
How can departments, institutions, and funders evaluate contributions to team science?
Facilitator: Sue Biggins (Fred Hutchinson Cancer Research Center) and Miriam Kip (BIH QUEST Center For Transforming Biomedical Research) North Lounge
How might we improve equity and inclusion in academia (e.g. by examining how models of scarcity and exclusivity influence our current research assessment practices and concepts of rigor)?
Facilitators: Needhi Bhalla (UC Santa Cruz) and Olivia Rissland (University of Colorado School of Medicine) Room C123
What do preprints need to be more useful in evaluation?
Facilitator: Jessica Polka (ASAPbio) Room N140
Break, Great Hall
Reports from Breakout Sessions, Cowan Auditorium. Moderated by Boyana Konforti (HHMI)
Lunch, Dining Room
Meeting Reflections and Next Steps, Cowan Auditorium. Anna Hatch (DORA)
To better understand how research institutions use the journal impact factor (JIF) in researcher evaluations, Erin McKiernan and colleagues analyzed review, promotion, and tenure (RPT) documents from universities across the United States and Canada. Table 1, shown below, summarizes their key findings. Overall, 23% of the universities sampled mention the JIF, and this figure rises to 40% for research-intensive (R-type) institutions. Of the institutions that mention the JIF, 87% had at least one supportive mention of its use for evaluation purposes, and none prohibited its use.
The authors found inconsistencies in RPT documents between academic units of the same university. For example, one unit may caution against the use of the JIF while another is supportive. There is also variability in what institutions are measuring with the JIF: in addition to impact, it can be used as a measure of quality, importance, significance, prestige, reputation, and status.
The analysis was limited to terms that were highly similar to JIF. It did not include indirect but probable references to the JIF, such as top-tier journal, prominent journal, or leading journal. This study provides a benchmark for future studies to compare and contrast how the JIF is used.
2. Bias against novelty in science: a cautionary tale for the users of bibliometric indicators
Research assessment practices that rely on citation counts and journal impact factors may inadvertently discourage academics from pursuing explorative, novel research approaches. In this study, Wang and colleagues dissect how novelty affects research impact.
Novel approaches can lead to substantial scientific advances, but not always. Novel papers have a higher mean and a larger variance in their citation performance, illustrating the risky nature of explorative research. Despite the uncertainty, the authors find that novel approaches can drive significant scientific progress. Highly novel research articles are more likely:
to be a top 1% highly cited paper over time,
to result in highly cited follow-up studies,
and to be cited by a wider set of disciplines.
However, recognition takes time. Novel papers are less likely to be top cited over shorter citation windows. Highly novel papers become significantly more likely to be top cited only after four years (nine years for moderately novel papers), well outside the two-year window over which the JIF is calculated. Yet the time required to realize the impact of novel research is often incompatible with the time frame for faculty appointments or review, promotion, and tenure (RPT) decisions.
The authors show that novel papers are more likely to be published in journals with impact factors that are lower than expected. Current research assessment policies and practices favor individuals who publish their work in high impact factor journals, which could discourage innovation.
Scholarship is not limited to publishing papers. Many university promotion and tenure policies include language labeling teaching as a valuable component of scholarship, but often without clear criteria for how contributions to teaching should be assessed beyond student evaluations, which are known to be biased against women and underrepresented groups. In many cases, research excellence is perceived to compensate for weaker teaching and academic service.
Here, the authors suggest general guidelines for the evaluation of teaching and present three approaches that universities are taking to evaluate and reward teaching:
Three Bucket Model of merit review (University of California, Irvine; Figure 1)
Evaluation of Teaching Rubric (University of Kansas)
Teaching Quality Framework (University of Colorado, Boulder)
4. The evaluation of scholarship in academic promotion and tenure processes: past, present, and future
In this review, Schimanski and Alperin name three pressing challenges in review, promotion, and tenure:
The increased value placed on research comes at the expense of teaching and service contributions.
Proxy measures of prestige are often used as shortcuts to measure the quality of a research article.
The integration of article-level metrics into RPT processes is too slow.
They also identify a number of other issues. Evaluation for RPT is typically divided into research, teaching, and service. Service is the least valued of the three: it cannot compensate for the other two, and it diverts time away from research, the contribution valued most. To fulfill university diversity requirements, members of underrepresented groups spend more time serving on committees, which can put them at a disadvantage in RPT. Women are also more likely to spend time on service, to the detriment of their careers.
Another challenge is the level of detail provided in RPT guidelines. Ambiguous language provides a certain degree of flexibility, but it comes at a cost: candidates may not always be assessed against the same standards.
5. How significant are the public dimensions of faculty work in review, promotion and tenure documents?
Public dimensions of faculty work are often neglected in RPT guidelines, despite regular references to terms and concepts related to “public” and “community.” While 87% of institutions mention “community” and 75% mention “public,” they do not typically point to the public dimensions of faculty work. For example, community is often used in relation to the academic community.
Furthermore, terms and concepts related to “public” and “community” are associated with academic service, which is not a valued aspect of RPT. And the highly valued traditional outputs like research papers largely disregard the public dimensions of research. How we measure the success of research papers is at odds with the public dimensions of faculty work too. Traditional bibliometric indicators, including journal impact factors, encourage uptake within a scholarly discipline rather than the broader community.
RPT guidelines can discourage researchers from publishing their work in open access venues. Here, the authors found the term “open access” is explicitly mentioned by 5% of the institutions sampled, and all of the mentions are cautionary.
Culture change requires a comprehensive strategy. Efforts aimed at changing the behavior of individuals can fall flat if social and cultural systems are not addressed. These systems guide individual behavior, so if systems do not change, behavior won’t either. In this blog, Brian Nosek describes the strategy the Center for Open Science is taking to change the research culture to promote openness, integrity, and reproducibility. It requires five levels of progressive intervention (see image below). This strategy and others can guide our thinking about research assessment reform.
In this comment, Rinze Benedictus and Frank Miedema describe the strategy used by the University Medical Center (UMC) Utrecht to develop research assessment policies that focus on scholarly contributions rather than publications. Community engagement was a key factor in cultivating culture change at the university. Researchers helped define new assessment standards through a series of internal discussions. PhD students, principal investigators, professors, and department heads were all invited to participate. Reports from these meetings and additional interviews were published on UMC Utrecht’s internal website to engage the wider academic community. But change takes time. Once the discussion phase was complete, policies were developed over the next year based on the input that was received. A few years on, the evaluative approach focusing on contributions was used in a formal institutional evaluation. For professors and associate professors, a portfolio-based assessment has become standard.
In this piece, Needhi Bhalla outlines several proven strategies to improve equity in faculty hiring. Although the number of underrepresented minorities in trainee positions has increased, the number of underrepresented faculty members in academic science remains low. These strategies also increase the transparency and consistency of faculty searches, which builds trust in the process.
Additional recommended readings from participants: