Stanford to Revise Study on AI Legal Research Tools Amid Methodological Controversy

Stanford University plans to augment a recently released study on generative AI legal research tools from LexisNexis and Thomson Reuters. The initial findings indicated that these tools deliver hallucinated results more often than the companies claim. The study has since drawn scrutiny over its methodology and fairness.

The preprint study, carried out by Stanford’s RegLab and its Human-Centered Artificial Intelligence research center, found that both the LexisNexis and Thomson Reuters products hallucinate in over 17% of queries. The study also found a considerable difference in accuracy between the two: the LexisNexis tool gave accurate responses 65% of the time, while the Thomson Reuters tool managed only 18%.

Critics have raised significant concerns over the study’s methodology, particularly its “apples to oranges” comparison. For LexisNexis, the study evaluated Lexis+ AI, the company’s generative AI platform for general legal research. In contrast, for Thomson Reuters, the study examined Ask Practical Law AI, which is limited to Practical Law content, rather than the more relevant AI-Assisted Research in Westlaw Precision.

Daniel E. Ho, a Stanford Law professor and one of the study’s authors, acknowledged these limitations, noting that Thomson Reuters had denied multiple requests for access to its AI-Assisted Research product. Ho confirmed that Thomson Reuters has since granted access and that the study will be augmented accordingly. The timeline for the update remains uncertain given the intensive nature of the work.

The initial findings have prompted responses from both companies. Jeffrey S. Pfeifer, chief product officer for LexisNexis, noted that the study’s broader definition of hallucinations affects the results and emphasized LexisNexis’ continuous enhancements to mitigate hallucinations. Meanwhile, Thomson Reuters said that Practical Law AI was not the appropriate product for the study’s comparisons and reiterated its commitment to transparency and the responsible development of AI.

The debate underscores the need for rigorous and transparent benchmarking of AI tools in the legal sector, as the study’s authors have emphasized. Until such standards are widely adopted, claims of hallucination-free legal AI systems should be scrutinized critically. For more details on the ongoing story, visit the full article on LawNext.
