The Week We Bought a PDF Prison
Why a 'Full Export' that delivers millions of PDFs is not a backup—it is a scorched earth policy designed to destroy your history.
The Tombstone of Information
The hard drive arrived by courier on a rainy Tuesday. It marked the end of a five-year relationship with a case management vendor that had become too expensive and too arrogant.
“Here it is,” the Archivist said, plugging it in. “The full history of the Social Services department.”
We opened the root folder. We expected SQL dumps, JSON files, perhaps structured XML.
Instead, we found folders labeled by year. Inside were millions of files: case_001.pdf, case_002.pdf.
I felt a cold weight in my stomach. I opened one. It was a perfect, A4-formatted image of the case file. It looked nice. But it was useless.
“Where is the database?” I asked the vendor on the phone.
“We provided a full export,” the representative replied, reading from a script. “Every record has been rendered for your convenience.”
“You have given us pictures,” I said, my voice hardening. “I cannot migrate a picture into a new system. I cannot run a query on a picture to find all cases involving ‘Asbestos.’ You have not returned our data. You have burned down the library and handed us photographs of the ashes.”
The Trap: The “Scorched Earth” Exit
Vendors know that data migration is the biggest barrier to exit. If they make the exit painful enough, we will stay.
Delivering data in Unstructured Formats (PDF, TIF, flat text) is a hostile act. It is a digital blockade. By rendering the data into a visual format, they destroy the relationships—the invisible lines that connect a Citizen to a household, a permit to a property, or a payment to an invoice.
Without those relationships, we do not have a system. We have a pile of digital paper. To rebuild the system, we would have to pay humans to manually re-type millions of records. The cost of exit becomes higher than the cost of staying. We are not customers; we are inmates in a PDF prison.
The Exit Strategy: The Machine-Readable Mandate
We never sign a contract that allows for “document-based” exports. We demand Machine-Readable Sovereignty.
Our requirements are absolute:
- Structure, Not Image: The export must be in a format that computers can read without optical character recognition (JSON, XML, CSV).
- Schema Preservation: We require the dictionary. If the column is labeled col_A, the vendor must provide the documentation proving that col_A equals “Date of Birth.”
- No Proprietary Encodings: If the data is compressed or encrypted, the tools to unpack it must be Open Source and available to us in perpetuity.
We do not accept “renderings.” The history of the municipality is a living dataset, not a static art gallery.
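The Schema Preservation requirement is the easiest one to check by machine. What follows is a minimal sketch, not a finished tool. It assumes the export includes CSV files and that the vendor's dictionary can be loaded as a simple mapping from column name to description. The function name undocumented_columns, the sample data, and the column col_B are hypothetical and exist only for illustration.

```python
import csv
import io


def undocumented_columns(csv_text, dictionary):
    """Return the CSV header names that have no usable dictionary entry."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # first row holds the column names
    # A column counts as documented only if its description is non-empty.
    return [name for name in header if not dictionary.get(name, "").strip()]


# Hypothetical sample of an export file and the vendor-supplied dictionary.
sample_export = "case_id,col_A,col_B\n1,1980-02-14,open\n"
data_dictionary = {
    "case_id": "Unique case identifier",
    "col_A": "Date of Birth",
}

# Any name printed here is a column the vendor still has to explain.
print(undocumented_columns(sample_export, data_dictionary))
```

In this sample, col_B has no entry, so it is the column we would send back to the vendor with a request for documentation.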
FAQs
Why is PDF bad for archiving?
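A PDF is a visual representation. The fields and relationships behind the page are gone, so you cannot query it as a dataset, you cannot load it into a new system, and you cannot analyze it without OCR or manual re-typing. It is dead information.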
Is this legal?
Often, yes. If your contract only says 'Export,' the vendor can argue that it technically gave you the files. You must specify 'Structured Data' to close this loophole.
How do we prevent this?
We test the export in the Proof of Concept phase. If it comes out as a PDF, we do not sign.
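A Proof of Concept check of this kind can be partly automated. The sketch below is one possible approach under stated assumptions: it judges files by extension only, treats JSON, XML, and CSV as structured and PDF and TIF as renderings, and leaves everything else for a human to review. The names classify_export and export_is_acceptable, and the sample files in the demo, are hypothetical.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Assumed extension lists; adjust them to what the contract actually names.
STRUCTURED = {".json", ".xml", ".csv"}
RENDERED = {".pdf", ".tif", ".tiff"}


def classify_export(root):
    """Count the files under root as structured, rendered, or unknown."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in STRUCTURED:
            counts["structured"] += 1
        elif suffix in RENDERED:
            counts["rendered"] += 1
        else:
            counts["unknown"] += 1  # needs manual review
    return counts


def export_is_acceptable(root):
    """Pass only if there is structured data and no rendered documents."""
    counts = classify_export(root)
    return counts["structured"] > 0 and counts["rendered"] == 0


if __name__ == "__main__":
    # Build a tiny fake export so the sketch runs without real vendor data.
    with tempfile.TemporaryDirectory() as root:
        year = Path(root) / "2019"
        year.mkdir()
        (year / "case_001.pdf").write_bytes(b"%PDF-1.4")
        (year / "cases.csv").write_text("case_id,col_A\n1,1980-02-14\n")
        print(dict(classify_export(root)))
        print(export_is_acceptable(root))
```

An extension check only catches the obvious failure. A file named cases.csv can still be unusable, so the sample export should also be loaded into the target system before anyone signs.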