Highest AI score in the US Bar Exam

Who: GPT-4, OpenAI
What: 298 points
Where: Not applicable
When: March 2023

The highest score achieved by an artificial intelligence on the United States Uniform Bar Exam is 298 out of 400, achieved in March 2023 by the GPT-4 Large Language Model, built by OpenAI (USA). The score puts GPT-4 ahead of 90% of the people who have taken the test, and would have comfortably qualified it to practice law in many jurisdictions within the United States.

OpenAI's GPT-4 is a type of artificial intelligence called a Large Language Model (or LLM). An LLM consists of a deep-learning neural network that has been trained on a massive quantity of written material, typically trillions of words scraped from the internet and other sources.

The training process involves the neural network cycling through text data sets, gradually learning the relationships between words and phrases in a probabilistic manner (i.e. in the context of prompt Z, word X is most likely to be followed by word Y). LLMs make these decisions by weighing an enormous number of factors (GPT-4 may have more than 1 trillion parameters that inform its text generation).
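The "word X is most likely to be followed by word Y" idea can be illustrated with a toy word-pair counter. This is only a minimal sketch, and it is not how GPT-4 works: an LLM learns its probabilities through the parameters of a neural network rather than a lookup table of counts, and it conditions on the whole prompt rather than a single preceding word. The function names and the sample sentence are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follower counts for `word` into probabilities."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the word was never seen with a follower
    return {w: c / total for w, c in followers.items()}


# Hypothetical training text, chosen only for illustration.
corpus = "the court ruled that the court may hear the appeal"
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "the")

# "the" is followed by "court" twice and "appeal" once in the corpus,
# so "court" is the more probable next word.
most_likely = max(probs, key=probs.get)
print(most_likely, probs)
```

In this sketch, "training" is simply counting. A real LLM replaces the count table with learned parameters, which is what allows it to generalize to prompts it has never seen verbatim.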

With enough training data, time and processing power (it has been estimated that GPT-4 cost $60–$100 million to train), this approach can create a believable impression of intelligence. The problem with this method of constructing artificial intelligence is that the reasoning behind the AI's decisions is entirely obscured, even to its developers. This is a particular problem in the context of a phenomenon known as "hallucination", in which LLMs provide wildly inaccurate or misleading responses that are seemingly generated spontaneously.

While GPT-4 performed well in the bar exam, attempts to set up GPT-4 instances as legal advice providers have all been met with skepticism by regulators, due to the AI's tendency to hallucinate and to provide what has been described as "poor legal advice". In one widely publicized case, a lawyer relied on ChatGPT (the public-facing application based on GPT-4) to write a legal brief, only to be sanctioned after it was discovered that the brief was full of imaginary legal citations.