AI Models Surpass Human Performance on Legal Ethics Exam: Implications and Potential Applications

Generative AI has passed the Multistate Professional Responsibility Examination (MPRE), the exam that tests prospective lawyers on their understanding of professional conduct rules. Two leading large language models (LLMs) achieved this result, which follows an earlier milestone in March, when OpenAI’s GPT-4 passed the bar exam with a score in the top 10% of test takers.

The testing was conducted by a team at LegalOn Technologies led by Gabor Melli, the company’s VP of artificial intelligence. According to Daniel Lewis, LegalOn’s U.S. CEO, the findings suggest that AI could help support ethical decision-making, though they do not imply that AI has any intrinsic moral understanding.

The models tested included OpenAI’s GPT-4 and GPT-3.5, Anthropic’s Claude 2, and Google’s PaLM 2 Bison. Each was evaluated on questions modeled on the MPRE. GPT-4 performed best, answering 74% of the questions correctly and outperforming the average human candidate. Claude 2 also did well, with 67% correct, while GPT-3.5 and PaLM 2 fell short with accuracy of 49% and 42%, respectively.

Both GPT-4 and Claude 2 scored above the passing threshold in every state that requires the MPRE. That threshold typically ranges from 56% to 64%, depending on the jurisdiction. The LLMs were assessed on 500 simulated exam questions developed by Dru Stevenson, a professor specializing in professional responsibility at South Texas College of Law Houston. Importantly, the models were given no prior training specific to legal ethics.
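
As a rough illustration of that comparison, the short Python sketch below checks each reported accuracy figure against the 56-64% range. The accuracy numbers come from the article; treating the two ends of the range as the lowest and highest passing bars, and the `classify` helper itself, are assumptions made for illustration and are not part of the LegalOn study.

```python
# Accuracy figures reported in the article (percent of questions correct).
reported_accuracy = {
    "GPT-4": 74,
    "Claude 2": 67,
    "GPT-3.5": 49,
    "PaLM 2 Bison": 42,
}

# The article gives only a typical range of passing thresholds (56-64%).
# Using its endpoints as the lowest and highest bars is an assumption.
LOWEST_THRESHOLD = 56
HIGHEST_THRESHOLD = 64


def classify(accuracy: float) -> str:
    """Hypothetical helper: place an accuracy figure relative to the range."""
    if accuracy > HIGHEST_THRESHOLD:
        return "above the threshold in every jurisdiction"
    if accuracy > LOWEST_THRESHOLD:
        return "above the threshold in some jurisdictions"
    return "below the threshold everywhere"


results = {model: classify(acc) for model, acc in reported_accuracy.items()}

for model, verdict in results.items():
    print(f"{model} ({reported_accuracy[model]}%): {verdict}")
```

Under these assumptions, only GPT-4 and Claude 2 clear the top of the range, which matches the article's statement that both passed in every state that requires the exam.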

While GPT-4’s proficiency was noteworthy, performance varied across subject areas. Results were strong on topics related to conflicts of interest and client relationships, whereas topics such as the safekeeping of property proved more challenging.

The intersection of AI and the legal profession is of significant interest to Dru Stevenson. He suggests that while responsibility for ethical decision-making lies unequivocally with legal professionals, technology has considerable potential to help the profession achieve uniformly high ethical standards. The study’s full report offers more detail for interested readers.

