Why this page?
The problems are part of the method.
HistoriaMP is not based on the belief that LLMs can simply read historical manuscripts reliably. On the contrary, many central architectural decisions were made precisely because these systems have real limits.
Anyone familiar with the subject should be able to see here that the problems have not been overlooked. They are the reason the pipeline is built so carefully.
Digital preprocessing is never neutral
The Xerox case shows why it matters how a digital file came into being.
In 2013, David Kriesel showed that scans from Xerox scanner-copiers can look clean and still be wrong: under certain conditions, the devices' image compression silently replaced digits and small image fragments with similar-looking ones. The decisive methodological point is that the error did not originate in OCR but in the image data itself.
For HistoriaMP this is an important reminder. With manuscripts, too, a digital output must not be confused with the source. Scans, compression, segmentation, OCR, HTR (handwritten text recognition) and AI outputs are all processing steps. They can help, but they can also alter or conceal information, or create a false sense of certainty.
That is why HistoriaMP documents not only results, but also the path toward them.
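One way to picture "documenting the path" is a log that records a content hash before and after every processing step, so that a gap or a silent substitution between steps becomes detectable. The following is a minimal sketch under that assumption, not HistoriaMP's actual implementation; the names `ProcessingStep`, `ProvenanceLog`, `record` and `is_unbroken`, as well as the tool name in the usage lines, are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ProcessingStep:
    # Hypothetical record of one step (scan, compression, OCR, ...).
    name: str
    tool: str
    input_sha256: str
    output_sha256: str
    lossy: bool  # True if the step may discard or alter information


@dataclass
class ProvenanceLog:
    # Hash of the earliest digital object available, e.g. the raw scan.
    source_sha256: str
    steps: List[ProcessingStep] = field(default_factory=list)

    def record(self, name: str, tool: str, data_in: bytes,
               data_out: bytes, lossy: bool) -> ProcessingStep:
        """Append a step, hashing its input and output bytes."""
        step = ProcessingStep(name, tool, sha256_hex(data_in),
                              sha256_hex(data_out), lossy)
        self.steps.append(step)
        return step

    def is_unbroken(self) -> bool:
        """Check that each step consumed exactly what the previous one produced."""
        expected = self.source_sha256
        for step in self.steps:
            if step.input_sha256 != expected:
                return False
            expected = step.output_sha256
        return True


# Usage with placeholder bytes standing in for real image data.
raw_scan = b"raw scanner bytes"
compressed = b"compressed image bytes"
log = ProvenanceLog(source_sha256=sha256_hex(raw_scan))
log.record("compression", "hypothetical-compressor", raw_scan, compressed, lossy=True)
chain_ok = log.is_unbroken()
```

A log like this cannot show that a step was correct. It only makes visible which steps happened, in which order, and which of them were lossy, which is exactly the information that was missing in the Xerox case.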
Provisional position
The answer is not trust, but control.
HistoriaMP addresses these problems through segmentation, visual-basis references, uncertainty marking, separate artifacts, review steps, validators and quality control.
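To make three of these measures concrete, the sketch below combines a visual-basis reference, an uncertainty marker and a validator. It rests on assumptions that the page itself does not state: that a visual-basis reference is a pixel rectangle on the page image, that a recognition step reports a confidence between 0 and 1, and that 0.8 is a sensible threshold. The names `Reading` and `validate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Reading:
    # Hypothetical transcription unit tied to the image region it rests on.
    text: str
    region: Tuple[int, int, int, int]  # x, y, width, height in pixels (assumed format)
    confidence: float                  # assumed to lie in the range 0.0 to 1.0
    uncertain: bool                    # explicit uncertainty marker


def validate(readings: List[Reading], page_size: Tuple[int, int],
             threshold: float = 0.8) -> List[str]:
    """Return human-readable problems; an empty list means no rule was violated."""
    page_w, page_h = page_size
    problems = []
    for i, r in enumerate(readings):
        x, y, w, h = r.region
        # Rule 1: every reading must point at a real area of the page image.
        if w <= 0 or h <= 0 or x < 0 or y < 0 or x + w > page_w or y + h > page_h:
            problems.append(f"reading {i}: region {r.region} is not inside the page")
        # Rule 2: low-confidence readings must carry the uncertainty marker.
        if r.confidence < threshold and not r.uncertain:
            problems.append(f"reading {i}: low confidence but not marked uncertain")
    return problems


readings = [
    Reading("Anno", (10, 20, 100, 30), confidence=0.95, uncertain=False),
    Reading("1643", (120, 20, 80, 30), confidence=0.55, uncertain=False),
    Reading("die", (990, 20, 50, 30), confidence=0.90, uncertain=False),
]
problems = validate(readings, page_size=(1000, 1400))
```

In this example the second reading is flagged because its low confidence is not marked, and the third because its region extends past the right edge of the page. The validator does not judge whether a reading is correct; it only enforces that each reading can be traced to the image and that doubt is made explicit.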
This page will later describe in detail which obstacles currently exist, which approaches are being tested, and where certainty is deliberately not claimed.