📑 AI and Numbers Part 2 - Page Numbers
Another surprising challenge - what page is my clause on?
In my previous post, I shared how AI can struggle with clause numbers in legal documents — not because it’s "bad with numbers" but because document conversion often strips away important details. This time, let's talk about another, equally surprising issue: page numbers.
📝 Why Doesn’t AI Know What Page It’s On?
It seems obvious, right? Documents are broken down into pages, and AI should be able to recognize them. But it’s rarely that simple. Even if your document has page numbers, AI might not be able to see them. Why? It depends on the document format and on how the document is converted before it reaches the AI.
AI products like ChatGPT and Claude interpret documents as a continuous stream of text. Unless the document is specially formatted or pre-processed to insert explicit page markers in the text itself, the AI won’t know where the page breaks are. It simply sees a “wall of words” with no natural breaks.
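To make the pre-processing idea concrete, here is a minimal Python sketch of inserting explicit page markers. It assumes the extracted text separates pages with a form feed character, as some text extraction tools do. The function name add_page_markers and the "[Page N]" label are illustrative choices, not a standard, and your own conversion tool may mark page breaks differently or not at all.

```python
def add_page_markers(text: str, page_break: str = "\f") -> str:
    """Put an explicit '[Page N]' line at the top of every page.

    Assumes pages are separated by `page_break` (a form feed by default).
    """
    pages = text.split(page_break)
    marked = [
        f"[Page {number}]\n{page.strip()}"
        for number, page in enumerate(pages, start=1)
    ]
    return "\n\n".join(marked)


# Hypothetical three-page extract, with form feeds where the pages break.
extracted = (
    "1. Definitions\nIn this Agreement...\f"
    "2. Term\nThis Agreement begins...\f"
    "3. Termination\nEither party may..."
)

print(add_page_markers(extracted))
```

With markers like these in the text itself, the page number travels with the words, so the AI no longer has to infer where one page ends and the next begins.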
The format you choose matters, too. In Word documents, even if page numbers are visible, they usually live in the layout (headers, footers, and page breaks that are worked out when the file is displayed or printed), and that layout is often lost when converting to plain text. With PDFs, it can be hit or miss: Claude does a better job at recognizing page numbers in PDFs, while ChatGPT still struggles. So once again, AI’s ability to handle numbers relies on how well the document conversion keeps these details intact.
🔧 What Can You Do About It?
Here are some practical ways to avoid these issues and make sure AI can interpret your page numbers more accurately:
Test Your AI Tool and Document First: Before relying on AI to answer questions based on page numbers, it’s a good idea to test it out. Better yet, focus on asking about sections or clauses and try to refer to them by name rather than by number, as AI generally handles these references more accurately.
Use PDFs with Visible Page Numbers: A PDF has a fixed page layout, so it tends to preserve pagination better than a Word document. Using a PDF with clear, visible page numbers can yield better results.
Opt for Legal AI Tools: Specialized legal AI tools are designed for handling complex document structures, which can make them more reliable for interpreting page numbers and maintaining document fidelity.
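The first tip, testing your tool before relying on it, can be sketched in a few lines of Python. The idea is to work out the correct page yourself and compare it with what the AI told you. This assumes you already have the text of each page as a separate string. The names pages_containing and answer_is_correct, and the sample contract text, are hypothetical.

```python
def pages_containing(pages: list[str], phrase: str) -> list[int]:
    """Return the 1-based numbers of the pages whose text contains the phrase."""
    needle = phrase.casefold()
    return [
        number
        for number, page in enumerate(pages, start=1)
        if needle in page.casefold()
    ]


def answer_is_correct(pages: list[str], phrase: str, claimed_page: int) -> bool:
    """Check whether the page the AI named really contains the phrase."""
    return claimed_page in pages_containing(pages, phrase)


# Hypothetical three-page contract, one string per page.
pages = [
    "1. Definitions\nIn this Agreement...",
    "2. Term\nThis Agreement begins on the Effective Date.",
    "3. Termination\nEither party may end this Agreement.",
]

# Suppose the AI said the Termination clause is on page 2.
print(pages_containing(pages, "Termination"))      # [3]
print(answer_is_correct(pages, "Termination", 2))  # False
```

A simple text match like this only works for distinctive headings. A short word such as "Term" would also match "Termination", so pick test phrases that appear in one place.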
Page numbers are just one of many quirks to keep in mind when working with AI and legal documents. As I explore more of these hidden challenges, I’ll be sharing additional insights on how to work around them. If you’re finding this series helpful, don’t forget to subscribe to The Legal Engineer for updates!