
You do know Harvey's BigLaw Bench does not actually test case law research, right?

April 25, 2026 · 1 minute

Annoyingly, every news outlet states that GPT 5.5 is "great" at "legal research", but I cannot find any proof that it is any good at the #1 job of any legal professional: finding case law or legislation.

Can anyone show some proof, with snapshots, that GPT 5.5 is good at finding recent, obscure, or otherwise unpopular cases or codes in any of the ~193 legal jurisdictions the world has?

In fairness: I'm not saying GPT 5.5 is bad, I would just like to see some proof.

Speaking of bad: on Reddit, one response was a snapshot of benchmark results in which GPT 5.5 ranked #5, below GPT 5.4 🙈, and #1 was Gemini 3.1 Pro: https://lnkd.in/eDDENbPn

