The law and averages
From the June 2023 print edition
Recent Canadian federal procurement rulings have recognized a public institution’s right to average individual evaluator scores instead of using group consensus scoring to arrive at final bid evaluation scores. This article summarizes some of those rulings and offers recommendations on how to incorporate averaging into overall group scoring.
For example, in its February 2019 determination in Temprano and Young Architects Inc. v. National Capital Commission, the Canadian International Trade Tribunal confirmed that averaging individual evaluator scores, rather than conducting group consensus scoring sessions, is an acceptable evaluation practice. The dispute dealt with a Request for Standing Offers (RFSO) issued by the National Capital Commission (NCC) for the provision of architectural consulting services. The complainant challenged the evaluation process, arguing, among other things, that the NCC failed to ensure a fair and unbiased evaluation because it did not conduct a group evaluation process. The Tribunal disagreed, finding that the contracting authority conducted the evaluation in accordance with the criteria and essential requirements specified in the RFSO. More specifically, the Tribunal confirmed that even where suppliers were required to meet a minimum scoring threshold in a specific evaluation category, averaging individual scores instead of conducting group consensus scoring sessions was an acceptable method of arriving at final evaluation scores.
The Tribunal therefore determined that the complainant’s grounds for complaint were invalid and rejected the bid protest.
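In practice, the method the Tribunal endorsed amounts to simple arithmetic: average each category's individual evaluator scores, then test the averaged result against any minimum scoring threshold. The sketch below illustrates this, assuming hypothetical category names, scores, and thresholds that are not drawn from the actual RFSO.

```python
from statistics import mean

def average_scores(evaluator_scores: dict[str, list[float]],
                   thresholds: dict[str, float]) -> tuple[dict[str, float], bool]:
    """Average each category's individual evaluator scores and check
    the averaged results against any minimum scoring thresholds."""
    averaged = {cat: mean(scores) for cat, scores in evaluator_scores.items()}
    passes = all(averaged[cat] >= thresholds.get(cat, 0.0) for cat in averaged)
    return averaged, passes

# Three evaluators score one bid in two categories (hypothetical data).
scores = {"technical_approach": [7.0, 8.0, 9.0], "experience": [6.0, 5.0, 7.0]}
minimums = {"technical_approach": 7.0, "experience": 5.0}
averaged, meets_minimums = average_scores(scores, minimums)
# The averaged scores (8.0 and 6.0) both clear their category minimums,
# with no consensus meeting required.
```

Note that the threshold is applied to the averaged score, not to each evaluator's individual score, which mirrors the Tribunal's finding that an averaged result can satisfy a minimum scoring requirement.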
Similarly, in its August 2019 decision in Harris Corporation v. Department of Public Works and Government Services, the Tribunal once again confirmed that it is unnecessary to engage in group consensus scoring since averaging individual evaluator scores is an acceptable evaluation practice.
The dispute dealt with a bidding process for the provision of night vision goggles for the Royal Canadian Mounted Police (RCMP). The evaluation process was based on direct user testing of the proposed equipment. The complainant challenged the group evaluation of the competing equipment as subjective and arbitrary, but the Tribunal disagreed. Rather than engaging in group consensus scoring after the independent evaluations, individual evaluator scores were simply collected and averaged
to arrive at a final score. The Tribunal upheld this practice, finding that there is no obligation to conduct consensus scoring and that averaging out the scoring variations between evaluators was just as acceptable as addressing those variations through group consensus scoring meetings. The Tribunal therefore dismissed the complaint.
Government’s right to average
Finally, in its February 2022 determination in Pacific Northwest Raptors Ltd. v. Canada (Department of Public Works and Government Services), the Tribunal rejected a complaint after recognizing the government’s right to average individual evaluation committee member scores to arrive
at final evaluation scores. The dispute involved a solicitation issued by the Department of Public Works and Government Services (PWGSC) on behalf of the Department of National Defence (DND) for aerodrome wildlife control services at Royal Canadian Air Force 12 Wing Shearwater, Nova Scotia (Shearwater). As the Tribunal noted, it will defer to government evaluation decisions that are reasonably conducted. The Tribunal also confirmed that several recognized evaluation methods could be used to arrive at final scores, including the averaging of individual scores, particularly where the range between initial individual scores is relatively small; larger differences between initial individual scores may require a more thorough evaluation method, such as group consensus scoring meetings. Further, the Tribunal ruled that the averaging of scores did not constitute a hidden evaluation practice, stating that it saw “no reason to rule that the averaging of scores represented an error in the scoring or represented the application of undisclosed evaluation criteria”.
As these cases illustrate, purchasing institutions are not required to engage in group consensus scoring or compel individual evaluators to agree to an identical group consensus score, and can instead average individual evaluator scores to arrive at final group scores.
In fact, many public institutions are adopting enhanced consensus scoring methods that average individual evaluator scores when those scores fall within a predetermined variation tolerance and limit group evaluation discussions to areas where there are significant discrepancies between the individual scores. These enhanced methods also ensure that group evaluation sessions are properly moderated, with clear procedures that protect the autonomy of each evaluator so that group discussions do not become an opportunity to pressure individual evaluators into changing their scores against their independent judgment. As recent legal rulings reflect, evaluator independence should prevail over artificially imposed uniformity. This blended approach helps streamline the overall evaluation process while bolstering the defensibility of evaluation and award decisions.
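The blended approach described above can be sketched as a simple decision rule: average the scores when their spread falls within the predetermined tolerance, and refer the criterion to a moderated consensus discussion otherwise. The tolerance value and scores below are hypothetical assumptions for illustration, not figures from any cited ruling or institutional policy.

```python
from statistics import mean

def blended_score(individual_scores: list[float], tolerance: float):
    """Average individual evaluator scores when they fall within the
    predetermined variation tolerance; otherwise flag the criterion
    for a moderated group consensus discussion."""
    spread = max(individual_scores) - min(individual_scores)
    if spread <= tolerance:
        return round(mean(individual_scores), 2), "averaged"
    return None, "consensus discussion required"

# Scores within tolerance are averaged without a meeting.
print(blended_score([8.0, 8.5, 9.0], tolerance=2.0))
# A wide spread triggers a moderated discussion instead of a forced average.
print(blended_score([3.0, 9.0, 8.0], tolerance=2.0))
```

A rule of this shape keeps group discussion focused on genuine discrepancies while leaving each evaluator's independent scores intact, consistent with the Tribunal's observation that averaging suits small ranges while larger differences may warrant a more thorough method.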