Derivation chain
The documented sequence of all image versions from the source image through crops, segments, scaling or model inputs. It shows which concrete image version a finding rests on.
This glossary explains central terms from HistoriaMP, digital palaeography, AI-assisted image analysis, OCR/HTR, MUFI, Unicode, glyph analysis, image integrity and artifact-based pipeline methodology.
HistoriaMP does not treat historical manuscripts as a pure OCR task. The terms on this page describe a pipeline in which visible evidence, segmentation, glyph findings, uncertainty, model input and later reading are documented separately.
A rectangular coordinate area that marks a zone, segment or finding in the image.
A planned knowledge system that can store recurring glyph variants, abbreviation forms, scribe profiles, layout patterns and typical readings.
The system property that makes later error tracing possible because runs, artifacts, prompts, segments and results remain stored.
A research field that connects digital methods with humanities questions. HistoriaMP positions itself in this field as a source-bound manuscript analysis platform.
A visible risk at image edges, such as cut-off sign forms, markings near margins or unclear edge areas that must be protected for later analysis.
A single technical or visual finding, such as a conspicuous small form above a line, a stroke near the edge or a dense minim cluster.
Short for Handwritten Text Recognition. Classical HTR systems often go directly from image to text. HistoriaMP instead proceeds from image to finding to structure to reading.
The use of AI models within a controlled pipeline. AI output is treated not as unsupported truth, but as a verifiable analytical contribution.
Structural information about the page, such as text zones, image areas, line spaces, margins, segment boundaries or visible groupings.
A more understandable text version for general users. It is explicitly an interpretive layer and not identical with the diplomatic transcription.
Large Multimodal Model. A model that can process image and text inputs. In HistoriaMP an LMM may analyze, but must not replace the first finding layer.
Signs, notes or markings near the margin. In HistoriaMP they are first treated as visible margin findings before a function is assigned.
The separation between diplomatic transcription, critical reading and readable version. Each layer has a different function, and the layers must not be mixed.
The server's neutral technical data preparation. It stores, segments and manages data, but does not interpret manuscript content.
The scholarly study of historical writing forms. HistoriaMP touches palaeographic questions, but strictly separates visible finding from later classification.
The rule that no words or signs may be reconstructed if no visible basis for them exists in the image.
The merging of results from multiple segments, modules or model runs.
A planned analysis module for entries that differ in color or form and may relate to a rubricator hand. Such an assignment still has to be evidenced.
The scholarly analysis layer of HistoriaMP. It is separated from the technical infrastructure.
The decomposition of a manuscript page into verifiable layers such as layout, segment, glyph, minim, abbreviation, reading and quality control.
An expanded abbreviation or supplemented reading without sufficient visual basis.
The comparison of competing readings, manuscript findings or transcription proposals.
Everything that can be visibly observed in the concrete image: surface, sign forms, spacing, color differences, damage, margin traces or conspicuous markings.
A derived image version, for example a scaled image, crop, segment, compressed image or input prepared for a model.
The technical check whether an image file is complete, clearly registered and robust enough for a particular analysis.
The concrete image file or image version to which a finding refers. In HistoriaMP it must be clear whether a finding arose on the original image, on a segment or on a model input.
The technical check of an uploaded image file, such as file type, image size, readability, pixel count and processability.
A bounding box in percentage coordinates. It describes a position relative to the respective image area, not an absolute pixel position.
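As an illustration of percentage coordinates, the following minimal sketch converts a percentage box into pixel positions for a given image size. The names `PercentBox` and `to_pixels`, the 0 to 100 value range and the rounding rule are assumptions made for this example, not part of HistoriaMP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PercentBox:
    """Hypothetical box in percent (0-100) of the image area it belongs to."""
    x: float
    y: float
    width: float
    height: float


def to_pixels(box: PercentBox, image_width: int, image_height: int) -> tuple[int, int, int, int]:
    """Convert a percentage box into (left, top, right, bottom) pixel values."""
    left = round(box.x / 100 * image_width)
    top = round(box.y / 100 * image_height)
    right = round((box.x + box.width) / 100 * image_width)
    bottom = round((box.y + box.height) / 100 * image_height)
    return left, top, right, bottom


box = PercentBox(x=10.0, y=20.0, width=50.0, height=25.0)
print(to_pixels(box, 2000, 1000))  # the box on a 2000 x 1000 image
print(to_pixels(box, 1000, 500))   # the same box on a half-size derivative
```

The same percentage box yields different pixel values on each image version, which is why the image area it refers to has to be stated alongside it.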
A checking layer that ensures every coordinate is bound to a defined image space and can be correctly traced back to original, segment or model input.
The rule that no coordinate may be used without stating its coordinate space. A coordinate is valid only within a specific image version.
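One way to enforce such a rule in code is to make the coordinate space part of the coordinate itself. The sketch below is only an assumption about how this could look. `SpacedPoint`, `horizontal_distance` and the space labels are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpacedPoint:
    """A pixel coordinate that always carries its coordinate space."""
    x: float
    y: float
    space: str  # hypothetical label, e.g. "source" or "segment:0007"


def horizontal_distance(a: SpacedPoint, b: SpacedPoint) -> float:
    """Compare two points only if they belong to the same coordinate space."""
    if a.space != b.space:
        raise ValueError(f"coordinate spaces differ: {a.space!r} vs {b.space!r}")
    return abs(a.x - b.x)


p1 = SpacedPoint(10, 5, "source")
p2 = SpacedPoint(40, 5, "source")
print(horizontal_distance(p1, p2))

try:
    horizontal_distance(p1, SpacedPoint(40, 5, "segment:0007"))
except ValueError as error:
    print("rejected:", error)
```

Because the space label travels with every point, a comparison across image versions fails loudly instead of producing a plausible but meaningless number.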
A controlled image excerpt. Crops can be used for detail checks, but must be documented as their own image artifacts.
A visual preview of the image with an overlaid grid. It helps check image areas and tiles in the workbench.
A grid-based approach to image division. In HistoriaMP it is a debugging and viewer tool, not the actual scholarly segmentation decision.
A technical temporary storage of image data to avoid repeated loading of large files.
An image registry that stores hash, run ID, filename and timestamp. It helps recognize identical images and keep analysis histories traceable.
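A registry of this kind can be sketched in a few lines. The example assumes SHA-256 as the hash function and an in-memory list as storage. Both are illustrative choices, and `RegistryEntry`, `ImageRegistry` and `runs_for` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RegistryEntry:
    sha256: str         # assumption: SHA-256 over the raw file bytes
    run_id: str
    filename: str
    registered_at: str  # ISO 8601 timestamp in UTC


class ImageRegistry:
    """In-memory stand-in for a persistent image registry."""

    def __init__(self) -> None:
        self._entries: list[RegistryEntry] = []

    def register(self, data: bytes, run_id: str, filename: str) -> RegistryEntry:
        entry = RegistryEntry(
            sha256=hashlib.sha256(data).hexdigest(),
            run_id=run_id,
            filename=filename,
            registered_at=datetime.now(timezone.utc).isoformat(),
        )
        self._entries.append(entry)
        return entry

    def runs_for(self, sha256: str) -> list[str]:
        """Return all run IDs in which an image with this hash was registered."""
        return [e.run_id for e in self._entries if e.sha256 == sha256]


registry = ImageRegistry()
first = registry.register(b"fake image bytes", "run-001", "folio_12r.tif")
registry.register(b"fake image bytes", "run-002", "folio_12r_copy.tif")
print(registry.runs_for(first.sha256))  # both runs used the identical image
```

Identical bytes produce the same hash regardless of filename, so the registry can recognize a re-uploaded image and link its analysis histories.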
The degree to which a model input still corresponds to the registered source file. Scaling, compression or cropping can change input fidelity.
A segmentation that follows visible structures of the source, not only a technical grid.
The digital image version of a historical manuscript. For HistoriaMP the decisive point is which concrete image version was analyzed.
The tracing of a finding from segment, crop or model input back to its position in the registered source image.
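For a segment that was cropped from the source and then uniformly scaled, back-mapping reduces to undoing the scale and adding the segment's offset. The following sketch assumes exactly this simple case, with no rotation or padding. `map_to_source` and its parameters are hypothetical names.

```python
Box = tuple[float, float, float, float]  # (left, top, right, bottom)


def map_to_source(box: Box, segment_origin: tuple[float, float], scale: float = 1.0) -> Box:
    """Map a box from segment or model-input pixels back to source pixels.

    segment_origin: top-left corner of the segment in the source image.
    scale: analyzed pixels per source pixel (0.5 means the segment was halved).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    origin_x, origin_y = segment_origin
    left, top, right, bottom = box
    return (
        origin_x + left / scale,
        origin_y + top / scale,
        origin_x + right / scale,
        origin_y + bottom / scale,
    )


# A finding at (50, 20)-(150, 60) in a half-size segment cut at (1000, 400).
print(map_to_source((50, 20, 150, 60), segment_origin=(1000, 400), scale=0.5))
```

Real derivation chains may include padding or several scaling steps. In that case each step would need its own inverse, applied in reverse order.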
A visible area of the source that mainly shows surface, damage, stains, margins or empty spaces. It is not automatically excluded as meaningless.
A record of which image version a model actually received: hash, dimensions, crop, scaling, compression and coordinate space.
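Such a record could be serialized as one JSON line per model call. The sketch below is an assumption about a possible shape. The field names follow the list in the definition, while `ModelInputLogEntry`, `log_model_input` and the use of SHA-256 are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInputLogEntry:
    sha256: str                                # hash of the bytes the model received
    width: int
    height: int
    crop: Optional[tuple[int, int, int, int]]  # crop box in source pixels, if any
    scale: float                               # model pixels per source pixel
    compression: str                           # e.g. "jpeg-q85" (hypothetical label)
    coordinate_space: str                      # e.g. "model-input:0003" (hypothetical label)


def log_model_input(data: bytes, width: int, height: int, *,
                    crop: Optional[tuple[int, int, int, int]],
                    scale: float, compression: str, coordinate_space: str) -> str:
    """Build one JSON log line describing exactly what was sent to the model."""
    entry = ModelInputLogEntry(
        sha256=hashlib.sha256(data).hexdigest(),
        width=width,
        height=height,
        crop=crop,
        scale=scale,
        compression=compression,
        coordinate_space=coordinate_space,
    )
    return json.dumps(asdict(entry), sort_keys=True)


print(log_model_input(b"model input bytes", 1024, 768,
                      crop=(0, 0, 2048, 1536), scale=0.5,
                      compression="jpeg-q85", coordinate_space="model-input:0003"))
```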
The concrete image or text version actually passed to an AI model. It is not automatically identical with the original source.
The stored version of a model input including technical metadata. It makes later model claims verifiable.
The deliberate overlap of neighboring segments. It prevents signs or structures from being cut off at segment boundaries.
A segmentation strategy in which image segments overlap. Critical image areas therefore appear in several context windows.
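The overlap idea can be sketched with plain arithmetic. The example assumes square segments of fixed size and a fixed minimum overlap. `overlapping_starts` and `overlapping_segments` are hypothetical helpers and do not describe HistoriaMP's actual segmenter.

```python
def overlapping_starts(length: int, size: int, overlap: int) -> list[int]:
    """Start offsets along one axis so that neighboring segments overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if length <= size:
        return [0]
    step = size - overlap
    starts = list(range(0, length - size, step))
    starts.append(length - size)  # last segment ends exactly at the image edge
    return starts


def overlapping_segments(width: int, height: int, size: int, overlap: int) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes that cover the whole image."""
    return [
        (x, y, min(x + size, width), min(y + size, height))
        for y in overlapping_starts(height, size, overlap)
        for x in overlapping_starts(width, size, overlap)
    ]


for box in overlapping_segments(1000, 400, size=400, overlap=100):
    print(box)
```

In this sketch the last segment on each axis is shifted back so that it ends at the image edge. It may therefore overlap its neighbor by more than the requested amount, but never by less.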
A controlled image area that is stored for analysis purposes and can later be traced back to its original position.
A planned technical cache of segments for more efficient processing.
Metadata for a segment, such as ID, file, x/y position, width and height. This metadata enables mapping back to the original image.
A planned queue for systematic processing of individual segments by modules.
Information about which segment a finding belongs to and how that segment is positioned in the source image.
The division of an image into analyzable areas. In HistoriaMP segmentation is methodologically critical because early losses can distort later readings.
A rectangular grid excerpt of an image. Tiles serve technical orientation and display in the viewer, but are not necessarily the primary scholarly analysis basis.
A workbench tool with which individual grid excerpts can be inspected.
A visually distinguishable area in the image, such as a larger writing-like area, a margin area, a color-different area or a material zone.
The detection of visible zones within a manuscript image. In HistoriaMP zones should not be functionally overinterpreted.
An abbreviation is a historical shortened form in a manuscript. In HistoriaMP it is not silently expanded, but first treated as a visible finding and only then checked as a possible reading.
A reading that is not merely asserted, but can be traced back to concrete visual evidence, segments, glyph findings, variants and uncertainties.
A transcription close to the source that does not silently smooth over signs, abbreviations, uncertainties and visible special features.
A visible sign form in the manuscript image. A glyph is first a form in the image and not automatically a modern letter or Unicode codepoint.
A planned checking instance that compares technical glyph findings with MUFI/Unicode candidates, model outputs and transcription claims.
An internal identifier for a documented glyph form. It separates the visible form from later readings, Unicode assignments or font renderings.
An upstream visual control layer that marks critical glyphic anomalies before a model turns them into text.
A prepared reading in which abbreviations and editorial decisions are made visible. It stands between diplomatic transcription and readable version.
A text hypothesis derived from findings. In HistoriaMP a reading must be traceable to visible evidence and documented uncertainty.
A connected or fused sign form. Ligatures can lead to misinterpretations if they are resolved into separate letters too early.
A short vertical stroke in historical writing forms. Several minims can form clusters that are difficult to distinguish.
The rule that minim clusters must not be automatically interpreted or supplemented through linguistic plausibility.
A dense group of similar stroke forms where several readings may be possible. Minim clusters are among the central error sources in historical transcription.
The Medieval Unicode Font Initiative. For HistoriaMP, MUFI is a reference and encoding space, but not proof of a reading.
A list of possible sign or codepoint candidates after a documented glyph finding. Candidates are hints, not final decisions.
An automated visual finding instance that marks critical special forms and prepares possible encoding spaces without transcribing by itself.
A unit close to the reading that arises later in the pipeline. A token may be stabilized only after its visual basis has been documented.
The rule that word or token boundaries may be assumed only where visible separations or sufficiently documented findings exist.
A possible reading or transcription version that can coexist with other variants as long as the image finding does not force a clear decision.
A transcription that discloses its basis: image location, segment, glyph finding, uncertainty and alternative readings.
A stored intermediate result of the pipeline, for example segment data, glyph findings, variant lists or uncertainty reports. Analysis artifacts make the path to the reading traceable.
A method in which not only a finished text is produced, but every relevant intermediate step remains available as a verifiable artifact.
A planned interface with which stored analysis artifacts of a run can be searched, checked and compared.
The area of HistoriaMP in which all analysis results, metadata, segment information and checking findings are stored in structured form.
The possibility of critically checking a reading later: which image location, segment, glyph finding and uncertainty led to this reading?
A simplified usage mode for users who mainly need a readable output. Unlike Research Mode, it does not necessarily show all analysis layers in detail.
An observable property of the source or of an image segment. A finding is not yet interpretation and not a final reading.
A stored visual or technical finding, such as a marked conspicuous glyph form with coordinates, segment reference and uncertainty status.
An analysis layer that documents visible properties before transcription or interpretation is derived from them.
Uncertainty is not hidden, but explicitly marked. It is an analysis result and not a system error.
Scholarly work on textual transmission in which readings, variants, interventions and decisions are documented traceably.
The concrete basis of a statement. In HistoriaMP, evidence primarily means a visible, documented image finding.
The comparison between technical image finding, model claim, transcription proposal and later output.
The authoritative basis of the analysis. In HistoriaMP the source is not the generated text, but the documented image basis.
An analysis in which every claim must be traced back to the concrete source or a documented image version.
A detailed usage mode that makes the complete analysis pipeline, artifacts, variants and uncertainties visible.
Silent normalization occurs when an uncertain or special image finding is smoothed in the result without the uncertainty remaining visible.
An artifact that documents where and why the analysis is uncertain.
A visible marker for uncertain readings or findings, for example `⟦...??⟧`.
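A small helper can apply and find such markers consistently. The sketch assumes that the dots in the example stand for the uncertain reading itself, so that an uncertain `mi` becomes `⟦mi??⟧`. That interpretation and the function names are assumptions.

```python
import re

# Assumption: the marker wraps the uncertain reading and ends with "??".
UNCERTAIN_PATTERN = re.compile(r"⟦(.*?)\?\?⟧")


def mark_uncertain(reading: str) -> str:
    """Wrap an uncertain reading in the visible uncertainty marker."""
    return f"⟦{reading}??⟧"


def find_uncertain(text: str) -> list[str]:
    """Return all readings in a text that are marked as uncertain."""
    return UNCERTAIN_PATTERN.findall(text)


line = f"do{mark_uncertain('mi')}ne"
print(line)
print(find_uncertain(line))
```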
A reference to the concrete visual basis of a reading, such as a segment, glyph finding or documented image area.
The rule that visible evidence has priority over linguistic, historical or statistical plausibility.
The ability to check an analysis not only as a result, but as a documented path from source to reading.
The defined image space in which a coordinate is valid. After scaling, crop, padding, segmentation or model preprocessing, a new coordinate space is created.
The methodological principle that a reading is not treated as fact, but as a justified, verifiable proposal based on visible evidence.
The impression of a secure, smooth reading although the visible finding does not sufficiently support this certainty.
A reading that seems plausible but is not sufficiently bound to visible evidence.
The unnoticed transformation of uncertain, damaged or ambiguous places into apparently secure forms or words.
The assignment of a function such as rubric, initial, comment or correction. In HistoriaMP it must not be derived from color, size or position alone.
A supplement based on linguistic, historical or editorial expectation. It must not replace visible evidence.
The error of treating a statement about a reduced, scaled or otherwise altered model image as a statement about the source image without checking it.
The expert checking of findings, variants, uncertainties and readings by a reviewing person. It remains part of the method.
A controlled space of possible signs, codepoints, glyph forms or abbreviation interpretations. A candidate is not yet a reading.
A deliberately limiting rule or prompt structure that prevents a model from reading, interpreting or smoothing uncertainty too early.
The display layer on which image excerpt, coordinates, segments, glyph findings, variants and uncertainties remain visible close to the source.
The display layer of a diplomatic or critical reading that remains bound to finding artifacts and uncertainties.
The mediating display layer for present-day readers. It may explain and translate, but must not replace the finding layer.
A checking mechanism that controls schema, vocabulary, coordinate reference, artifact references or scholarly risks.
A technical check value for a file or image version. It helps uniquely recognize inputs and document analysis paths.
An analysis module for abbreviation forms. It should examine visible signs, abbreviation strokes or additional forms without prematurely converting them into modern expanded forms.
The stepwise processing of a manuscript source from image checking through layout, segments, glyphs, minim clusters and abbreviations to justified reading and quality control.
A module that compares several findings or reading proposals. The goal is not a majority decision at any price, but a justified decision with documented uncertainty.
A module for examining individual visible sign forms. It should record graphic features before a reading is derived from them.
A planned module for analyzing recurring glyph forms. In the long term it can help identify scribe profiles or formal patterns within a codex.
A planned module for separating different visual areas, such as text, illustration, ornament, margin area or other graphic structures.
A technical system that divides an image into rectangular grid areas. It mainly serves orientation, visualization and technical control.
An upstream module that documents image files, hashes, dimensions, formats, metadata, derivatives and model-input versions. It does not read or interpret text.
A planned module for controlled image preparation, such as rotation, contrast, perspective or other technical corrections.
A module for examining page structure: visible areas, line arrangement, text zones, margin areas and structural separations.
A module for analyzing line structures, line courses, spacing, interruptions and problematic transitions.
The upstream module for image integrity, input fidelity and coordinate integrity. It checks the technical robustness of the image basis.
The first analysis module of the pipeline. It describes only visible properties of the source and does not generate a transcription.
A module for analyzing the visible page and layout structure.
A module or system area for controlled division into relevant analysis areas, such as line, word or glyph areas.
A module for analyzing individual visible glyph forms.
A module for examining minim structures and dense stroke groups.
A module for analyzing visible abbreviation forms and possible abbreviations.
A planned visual lens for conspicuous glyph areas and possible MUFI/Unicode candidates. It generates findings, not finished readings.
A module for creating a diplomatic or source-bound transcription on the basis of documented findings.
A module for comparing competing readings and findings.
A module for checking structure, consistency, uncertainties, visual basis and possible errors.
A module for analyzing minim structures. It should prevent dense stroke groups from being reconstructed too quickly into secure words.
A specialized pipeline step with a clearly limited task, such as image checking, layout analysis, glyph analysis, transcription or quality control.
The coordinated execution of several modules in a defined pipeline.
A distortion that arises when early technical or interpretive assumptions influence later results. HistoriaMP tries to reduce this through separated modules and artifacts.
A checking module for consistency, visual basis, uncertainties, error sources and possible impermissible smoothing.
The system for creating image segments. In HistoriaMP it ensures complete image coverage and protects against information loss.
The module that checks the source for visible properties without reading text or interpreting meaning.
A planned module for detecting different text areas or visual zones within a manuscript page.
A module for creating a transcription. In HistoriaMP it must not silently smooth uncertain places.
A planned module for comparing multiple manuscripts, readings or transmission variants.
The Python web framework on which HistoriaMP's server infrastructure is based.
The technical layer of HistoriaMP: upload, storage, validation, segmentation, artifact management, API and module orchestration. It does not interpret manuscript content.
An image-processing library that can be used in HistoriaMP for technical tasks such as segmentation, edge analysis or image operations.
A library of versioned module prompts. It makes traceable which instruction a module worked with.
An isolated analysis execution. Each upload creates its own run with its own directory structure, image data, segments, module results and logs.
The unique identifier of an analysis run. It connects image, segments, artifacts and module results.
The principle that each analysis is stored completely in its own run directory. This keeps analyses reproducible and separated from each other.
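A per-run directory could be created as sketched below. The subdirectory names, the use of a random UUID as run ID and the function `create_run` are assumptions for illustration only. The source states only that each run has its own directory structure.

```python
import tempfile
import uuid
from pathlib import Path

# Hypothetical layout, loosely following "image data, segments, module results and logs".
RUN_SUBDIRS = ("image", "segments", "modules", "logs")


def create_run(base_dir: Path) -> Path:
    """Create an isolated run directory and return its path."""
    run_id = uuid.uuid4().hex          # assumption: random UUID as run ID
    run_dir = base_dir / run_id
    run_dir.mkdir(parents=True, exist_ok=False)  # never reuse an existing run
    for name in RUN_SUBDIRS:
        (run_dir / name).mkdir()
    return run_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = create_run(Path(tmp))
        print(run_dir.name, sorted(p.name for p in run_dir.iterdir()))
```

Using `exist_ok=False` makes an accidental ID collision fail instead of silently mixing two analyses in one directory.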
A technical safeguard intended to prevent parallel write access to the same run.
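One common way to build such a safeguard is a lock file that is created atomically. The sketch below uses `os.open` with `O_CREAT | O_EXCL`, which fails if the file already exists. The lock file name `.lock`, the exception `RunLockedError` and the context manager `run_lock` are hypothetical, and the source does not say how HistoriaMP implements its lock.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


class RunLockedError(RuntimeError):
    """Raised when another writer already holds the lock for a run."""


@contextmanager
def run_lock(run_dir: Path):
    """Hold an exclusive lock on a run directory for the duration of the block."""
    lock_path = run_dir / ".lock"  # hypothetical lock file name
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RunLockedError(f"run is locked: {run_dir}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the lock holder
        yield lock_path
    finally:
        os.close(fd)
        lock_path.unlink()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with run_lock(Path(tmp)):
            try:
                with run_lock(Path(tmp)):
                    pass
            except RunLockedError as error:
                print("second writer rejected:", error)
```

A lock file left behind by a crashed process would block the run until it is removed. A production version would need a policy for such stale locks.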
The technical basis of HistoriaMP: FastAPI server, upload, storage, runs, segmentation, APIs and artifact management.
A tracking system that uses run ID and trace run ID to make analysis paths, module steps and results reproducible.
An identifier for tracing individual pipeline steps or execution paths within a run.
The working interface of HistoriaMP. It is used for upload, image display, grid and segment view, module control and result review.