COLLaiTE Collens

Overview
COLLaiTE Collens is a sophisticated web-based tool for digital manuscript collation and visualization, designed specifically for digital humanities scholars working with TEI XML transcriptions. It enables side-by-side comparison of multiple witnesses, advanced text processing, and integration with CollateX for automated collation.
Key Highlights
- π Multi-witness Comparison - View and compare multiple manuscript witnesses simultaneously
- π Advanced TEI XML Processing - Sophisticated parsing of complex editorial markup
- πΌοΈ Image Synchronization - Connect transcriptions with manuscript images
- π CollateX Integration - Automated text collation and variant analysis
- πΎ Offline Support - Local storage using IndexedDB for offline access
- π Editorial Statistics - Track additions, deletions, substitutions, and transpositions
Features
Core Functionality
Document Management
- Drag-and-drop upload for documents and folders
- Automatic project loading with example datasets (Infinito, Sheherazade)
- IndexedDB storage for persistent, offline-capable document access
- Batch file processing for handling multiple witnesses simultaneously
- Smart folder structure recognition for organized document import
Multi-Witness Visualization
- Side-by-side comparison view for multiple witnesses
- Dynamic witness management with real-time toggle controls
- Individual witness metrics showing editorial statistics
- Synchronized scrolling across witness panels
- Customizable witness display with show/hide controls
TEI XML Processing
Advanced Parser Features
- Three-stage transcription support:
Stage 1a
: Original text (preserves deletions, excludes additions)
Stage 1b
: Intermediate state (applies instant edits and nested changes)
Stage 1c
: Final text (includes all additions, removes deletions)
TEI Element Support
<add>
additions rendered with {}
<del>
deletions rendered with []
<unclear>
text rendered with β¨β©
<supplied>
text rendered with <>
<note>
editorial notes rendered with (())
<subst>
substitutions with proper unwrapping
<lb>
line breaks and <pb>
page breaks
- Nested editorial interventions handling
- Instant revision processing
Image Visualization
- Multi-format support (PNG, JPG, JPEG, WebP, TIFF, AVIF)
- Thumbnail sidebar for easy navigation
- Page synchronization between text and images
- Smart numerical sorting of manuscript pages
- Full-screen image viewer with zoom capabilities
CollateX Integration
- Real-time collation using CollateX.net web service
- Dekker algorithm for optimal text alignment
- Interactive variant table with horizontal/vertical orientation
- Invariant highlighting for identical text sections
- JSON export of collation results
- Dynamic updates when witnesses are toggled
User Interface
Components & Design
- Responsive layout optimized for desktop and mobile
- Toast notifications for user feedback
- Settings bar with witness controls and statistics
- Metric circles for visual representation of changes
- CodeMirror integration for XML syntax highlighting
- Dark mode support via DaisyUI themes
Getting Started
Prerequisites
- Bun - Fast all-in-one JavaScript runtime
- Modern web browser with JavaScript enabled
- (Optional) Git for cloning the repository
Installation
- Clone the repository:
git clone https://github.com/collaite/collens.git
cd collens
- Install dependencies:
bun install
- Run the development server:
bun dev
- Open in browser:
Navigate to
http://localhost:5173
Building for Production
bun build
The application will be built to the build
directory, ready for deployment.
Usage Guide
Loading Documents
-
Using Example Projects:
- Click on the Infinito or Sheherazade cards on the homepage
- Documents will be automatically loaded and displayed
-
Uploading Custom Documents:
- Drag and drop a folder containing your TEI XML files
- Ensure your folder follows the required structure (see documentation)
- Files will be processed and stored locally
Working with Witnesses
-
Toggle Witnesses:
- Use the checkboxes in the settings bar to show/hide witnesses
- View editorial statistics for each witness
-
Compare Texts:
- Witnesses appear side-by-side for easy comparison
- Scroll synchronization helps maintain alignment
-
View Images:
- Click on image thumbnails in the sidebar
- Images sync with the corresponding text sections
Using CollateX
-
Generate Collation:
- Select witnesses to include in collation
- Click "Generate CollateX" button
- View results in the collation table
-
Export Results:
- Copy JSON output for further analysis
- Toggle between horizontal/vertical table views
Documentation
User Documentation
Technical Documentation
Documentation is also accessible through the application interface after running locally.
Example Projects
Infinito Project
A comprehensive 4-witness manuscript comparison showcasing:
- Multiple manuscript versions of the same text
- Complex editorial interventions
- Full image sets for each witness
- Extensive TEI markup examples
Sheherazade Project
A focused 2-witness comparison demonstrating:
- Direct manuscript comparison
- Clear visualization of textual variants
- Simplified witness management
Technical Stack
Core Technologies
Key Libraries
- MDSvex - Markdown processing with Svelte components
- CodeMirror - Code editor for XML display
- Prism.js - Syntax highlighting
- IndexedDB - Client-side storage
Build Tools
- Vite - Fast build tool and dev server
- Bun - JavaScript runtime and package manager
Development
Project Structure
collens/
βββ src/
β βββ routes/ # SvelteKit routes and pages
β βββ lib/
β β βββ components/ # Reusable Svelte components
β β βββ stores/ # Svelte stores for state management
β β βββ utils/ # Utility functions
β β βββ types/ # TypeScript type definitions
β βββ app.html # HTML template
βββ static/ # Static assets
βββ package.json # Project configuration
Local Development
The development server includes:
- Hot module replacement for instant updates
- TypeScript checking
- Automatic error reporting
- Development-specific debugging tools
Deployment
Changes pushed to the main
branch are automatically deployed to GitHub Pages, making the latest version available at the project URL.
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests to improve COLLaiTE Collens.
License
[License information to be added]
Acknowledgments
COLLaiTE Collens is developed as part of the COLLaiTE project for digital humanities research and manuscript studies.