Skip to content

Latest commit

 

History

History
441 lines (351 loc) · 21.6 KB

File metadata and controls

441 lines (351 loc) · 21.6 KB

TejOCR - CHANGELOG

All notable changes to this project will be documented in this file chronologically.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.


[0.2.2] - 2026-03-11 - Setup Diagnostics Polish, Clipboard Hardening & PDF Runtime Fixes

Added

  • Added extension UI language selection:
    • Auto follows LibreOffice UI language first, then system locale,
    • manual selection is available in Settings,
    • validated catalogs are exposed only when they are loadable and translated.
  • Added Spanish UI translation catalog support from the merged community contribution.
  • Added completed, user-visible UI catalogs for Hindi, French, German, Simplified Chinese, Portuguese (Brazil), Arabic, Bengali, Russian, Japanese, Korean, Indonesian, Turkish, Vietnamese, Italian, Polish, Dutch, Ukrainian, Persian, Urdu, Tamil, Telugu, Marathi, Punjabi, and Swahili.
  • Added a Windows-first setup helper script:
  • Added richer Setup & Diagnostics actions:
    • Copy Support Snapshot
    • Save Script...
    • Open Install Guide
  • Added FilterTube.in (Free Open Source) to the top-level TejOCR Writer menu.
  • Added test coverage for:
    • setup dialog command generation,
    • clipboard fallback behavior,
    • multi-flavor text transfer,
    • UTF-8-safe OCR/PDF subprocess decoding.

Changed

  • Setup & Diagnostics now treats the dependency surface more truthfully:
    • Tesseract is the core OCR requirement,
    • Pillow is recommended for preprocessing,
    • numpy and pytesseract are optional compatibility extras,
    • PDF runtime guidance is separated from core image OCR readiness.
  • Settings readiness now shows independent status for:
    • Tesseract
    • PDF
    • Extras (optional)
  • Install guidance now makes LibreOffice Python explicit, especially for Windows and first-time users.
  • Platform reference and support snapshot text were reformatted for readability and copy/export workflows.
  • Extension UI language selection now has a taller selector so users can see more language choices at once.
  • Settings now applies the selected extension UI language to XDL-backed labels at runtime instead of leaving the main Settings surface in English.
  • The extension UI language selector was tuned down to a three-row height so it no longer overlaps the OCR language Search / Refresh controls.
  • Hindi Settings copy now uses निःशुल्क मुक्त-स्रोत for the FilterTube label, and the Japanese UI-language selector entry now uses Japanese (ja) to avoid unreadable tofu boxes on LibreOffice dialog fonts that lack Japanese glyphs.

Fixed

  • Fixed a PDF/runtime failure caused by ASCII decoding of subprocess output; OCR/PDF subprocess paths now decode as UTF-8 with replacement instead of failing on non-ASCII bytes.
  • Fixed clipboard robustness for setup/support copy flows by:
    • advertising multiple plain-text clipboard flavors in UNO,
    • adding stronger macOS clipboard fallbacks (pbcopy, then osascript).
  • Fixed setup/status messaging drift where Settings could imply missing core readiness while Setup & Diagnostics showed Tesseract/PDF as available.
  • Fixed install/troubleshooting docs and metadata guidance around package/license errors and platform-specific setup commands.

[0.2.0] - 2026-03-07 - UI Consolidation, Packaging Fixes & Writer Output Polish

Added

  • Added a dedicated OCR Complete dialog with:
    • grouped colored sections,
    • a scrollable source list for larger batches,
    • separated profile/runtime blocks,
    • cleaner requested-vs-effective diagnostics,
    • better language presentation for multi-language OCR runs.
  • Added an A Message button directly in the main Settings UI and a dedicated advocacy dialog explaining:
  • Added regression coverage for:
    • OCR output font-size handling,
    • package metadata validation,
    • dialog formatting and structured output presentation.

Changed

  • cleaner dialogs,
  • clearer runtime summaries,
  • more stable package/install behavior,
  • and better Writer output defaults.
  • OCR Complete moved away from a dense message-box/log-dump feel toward a structured, readable result surface.
  • Batch success summaries now separate successful source listing from failed-source reporting instead of mixing long install hints into the same block.
  • OCR-inserted text now defaults to 6 pt in Writer for:
    • insert at cursor,
    • new text box,
    • replace image.
  • The message/help/settings family now follows a more consistent visual language.

Fixed

  • Fixed Settings save regressions in the newer UI path, including:
    • preview_control not defined,
    • incomplete persistence for preset/grayscale/binarize values,
    • parameter value normalization issues in advanced PSM/OEM handling.
  • Fixed multiple OCR Complete layout issues:
    • collapsed multiline profile/runtime sections,
    • long source lists wrapping into clipped text,
    • footer/button clipping,
    • requested/effective details merging into a single unreadable line.
  • Fixed success-dialog source breakdown behavior for mixed batch runs so long file names and failed-source notes no longer corrupt the visible source list.
  • Fixed Windows extension installation failures caused by invalid simple-license metadata in description.xml, which surfaced as:
    • Could not obtain path to license. Possible error in description.xml
  • Fixed package metadata validation gaps by keeping description.xml, META-INF/manifest.xml, and shipped resources in sync.

[0.1.9] - 2026-03-07 - OCR Hardening, Safer Runtime Detection & Release Documentation

Added

  • Added a pure runtime planning layer in ocr_runtime.py for:
    • preset resolution,
    • bounded retry planning,
    • requested-vs-effective option tracking,
    • language normalization and install hints,
    • PDF DPI selection and diagnostics formatting.
  • Added structured OCR stats objects and benchmark coverage under tests/.
  • Added a committed benchmark corpus and benchmark comparison workflow for modern vs legacy executor checks.
  • Added dedicated documentation for:
    • OCR option semantics,
    • hardening rollout checks,
    • UI alignment planning,
    • current security/risk review.

Changed

  • Fast, Balanced, Accuracy, and Custom presets now map to bounded, predictable OCR behavior.
  • PDF handling is now page-streamed and adaptive instead of treating PDFs as a single up-front rasterization problem.
  • Requested-vs-effective OCR diagnostics are now captured consistently across image, PDF, and batch flows.
  • Setup and diagnostics flows now generate platform-aware install guidance using a safe LibreOffice Python interpreter path.
  • Runtime language handling preserves valid requested codes in order and reports missing-language guidance more clearly.

Fixed

  • Fixed multiple LibreOffice macOS helper-launch crashes caused by probing unsafe wrapper scripts such as:
    • Contents/Resources/python
    • LibreOfficePython
    • python3-config / python3.11-config
  • Stopped executing candidate Python helpers just to build pip install commands.
  • Fixed settings-state regressions that could leave language and engine controls uninitialized.
  • Fixed packaging consistency so release-note and metadata files referenced by description.xml / META-INF/manifest.xml are included in release builds.

Documentation

  • Updated top-level and deep docs to reflect:
    • the current OCR runtime model,
    • adaptive PDF behavior for both single-page and multi-page PDFs,
    • hidden maintainer rollback key,
    • current known risks and trust boundaries.

Known limitations

  • The OCR runtime is substantially hardened, but the LibreOffice dialog UX is still only partially aligned with that runtime model.
  • Completion, review, and setup/diagnostic UI still rely too heavily on dense message-box output and remain a follow-up UX redesign target.

[0.1.8] - 2026-02-27 - Batch OCR, PDF Support & UI/UX Polish

🎉 MAJOR FEATURES

  • Batch Image Processing: "OCR Image from File" now supports selecting multiple images at once.
  • PDF Document OCR: Added support for multi-page PDF files, automatically rasterizing and OCR-ing each page.
  • Merge Output: New option to merge all batch/PDF results into a single consolidated output block with file/page headers.

🎨 UI & UX ENHANCEMENTS

  • Semantic Dialogs: Redesigned Help and Setup dialog text with semantic prefixes (ℹ️, ✅, ⚠️, 💡) for improved readability and quicker scanning.
  • Interactive Options Layout: Shrunk the PSM and OEM combo boxes to make room for inline usage hints within the OCR Options dialog.
  • Robustness: Fixed an XDL namespace bug that caused the Setup & Diagnostics page to crash into a generic text box on some LibreOffice builds.

[0.1.4] - 2025-05-24 - PHASE 1 COMPLETE: Core Stability & OCR Functionality

MAJOR ACHIEVEMENTS

  • Both OCR workflows fully functional: OCR from File and OCR from Selected Image
  • Robust multi-strategy fallback systems implemented and tested
  • All critical crashes eliminated - extension now handles errors gracefully
  • Dependency detection working perfectly - Tesseract, NumPy, Pytesseract, Pillow

🔧 CRITICAL FIXES APPLIED

Text Insertion Error - RESOLVED

  • Issue: After successful OCR, text insertion failed with RuntimeException: no text selection
  • Solution: Implemented 4-Strategy Robust Text Insertion in tejocr_output.py:
    1. Strategy 1: Standard view cursor approach
    2. Strategy 2: Fallback to text cursor at document end
    3. Strategy 3: Direct insertion via insertString method
    4. Strategy 4: Focus window and retry cursor operations
  • Result: Text insertion now succeeds even when view cursor is in invalid state

GraphicExporter Failure - RESOLVED

  • Issue: Selected images couldn't be exported for OCR due to GraphicExporter service unavailability
  • Solution: Implemented 6-Strategy Graphic Export in uno_utils.py:
    1. Strategy 1: Standard GraphicExporter
    2. Strategy 2: Alternative exporter service names
    3. Strategy 3: GraphicProvider.storeGraphic method
    4. Strategy 4: Direct bitmap data extraction
    5. Strategy 5: URL-based file copying for linked graphics
    6. Strategy 6: PIL-generated placeholder with helpful error message
  • Result: OCR Selected Image now works with multiple fallback approaches

NumPy Dependency Detection - ENHANCED

  • Issue: Pytesseract failing due to missing NumPy in LibreOffice Python environment
  • Solution: Enhanced dependency detection and installation guidance
  • Result: Clear diagnostics and user guidance for all required dependencies

🚀 FUNCTIONALITY CONFIRMED

  • OCR from File: ✅ End-to-end workflow functional
  • OCR Selected Image: ✅ Image export + OCR + text insertion working
  • Settings Dialog: ✅ Accurate dependency status reporting
  • Error Handling: ✅ Graceful with helpful user messages
  • Selection Detection: ✅ Properly detects image selections
  • Version Management: ✅ Centralized version constants

📊 TEST RESULTS

>>> TejOCR.Output - INFO: Strategy 1 SUCCESS: Inserted 861 characters at view cursor.
>>> TejOCR.uno_utils - INFO: Strategy 3 SUCCESS: Graphic stored using GraphicProvider
>>> TejOCR.Engine - INFO: OCR completed. Extracted 1154 characters.
>>> TejOCR.Output - INFO: Strategy 2 SUCCESS: Inserted 1153 characters using text cursor at document end.

🔄 TECHNICAL IMPROVEMENTS

  • Enhanced error handling and logging throughout codebase
  • Robust UNO service creation with multiple fallback strategies
  • Improved module loading with lazy imports for performance
  • Centralized version management via constants
  • Comprehensive diagnostic tools for troubleshooting

[0.1.3] - 2025-05-24 - Real OCR Functionality Enabled

🎉 MAJOR BREAKTHROUGH: Full Working OCR Implementation

Finally! The extension now performs real OCR on images!

Added - Real OCR Functionality

  • ✅ OCR Selected Image: Now extracts actual text from selected images in LibreOffice Writer
  • ✅ OCR Image from File: Opens file picker, processes image files, extracts text
  • ✅ Text Insertion: Automatically inserts extracted text at cursor position
  • ✅ Smart Error Handling: User-friendly error messages with helpful tips
  • ✅ Success Feedback: Shows extraction results and character count

Enhanced - User Experience

  • Simple Workflow: Select image → Menu → Text appears at cursor (Mac-like simplicity!)
  • File Processing: Choose image file → Menu → Text appears at cursor
  • No Complex Dialogs: Streamlined experience focused on getting OCR done quickly
  • Clear Feedback: Success messages show character count and extraction status
  • Helpful Error Messages: Guide users when things go wrong

Technical Implementation

  • Real Engine Functions: extract_text_from_selected_image() and extract_text_from_image_file()
  • Integrated pytesseract: Proper path detection and configuration
  • Temporary File Handling: Safe image extraction and cleanup
  • Robust Error Handling: Comprehensive exception handling with user feedback
  • Development Mode: Set DEVELOPMENT_MODE_STRICT_PLACEHOLDERS = False for real functionality

Fixed - Core Issues

  • ✅ Settings UI: Now shows proper dependency detection results in UI dialog
  • ✅ Real OCR: No more placeholder messages - actual text extraction
  • ✅ File Picker: Real file selection dialog for image processing
  • ✅ Text Output: Working text insertion at cursor position

User Workflow Now

  1. For Selected Images: Select image in Writer → TejOCR Menu → OCR Selected Image → Text appears!
  2. For Image Files: TejOCR Menu → OCR Image from File → Choose file → Text appears!
  3. Simple Settings: View dependency status and installation guidance

[0.1.2] - 2025-05-24 - Enhanced Settings & Dependency Management

🎯 Major UX Improvements - Focus on Common Users

Philosophy: Prioritize user experience over technical complexity. The extension should work seamlessly for non-technical users.

Added

  • Smart Settings Dialog: Works perfectly even without OCR dependencies installed
  • Dependency Status Checker: Real-time detection of Tesseract and Python packages
  • Auto-Installation Assistant: One-click dependency installation for users
  • Comprehensive Guidance: Clear instructions for different operating systems
  • Graceful Degradation: Extension remains functional and helpful without dependencies

Enhanced

  • Professional UI Dialogs: Beautiful, branded dialogs with consistent TejOCR styling
  • Intelligent Status Display: Shows exactly what's installed and what's missing
  • User-Friendly Messaging: Clear, non-technical language throughout
  • Cross-Platform Support: Automatic detection of installation paths on macOS, Linux, Windows

Technical Improvements

  • Robust Dependency Detection: Multiple fallback methods for finding installations
  • Smart Path Resolution: Auto-detect Tesseract and Python installations
  • Error Handling: Comprehensive error handling with user-friendly messages
  • Logging System: Detailed logging for troubleshooting without overwhelming users

[0.1.1] - 2025-05-24 - Critical Crash Resolution & UI Fixes

🔧 Complete Crash Elimination & UI Dialog Implementation

This version resolved ALL critical crashes and implemented working UI dialogs.

Fixed - Installation & Settings Issues

  • FIXED: Invalid dependency liberation-minimal-version in description.xml causing installation errors
  • FIXED: Settings dialog module loading error - _ensure_modules_loaded now correctly loads dialogs module
  • FIXED: Settings now display comprehensive extension information instead of crashing

Fixed - OCR Function Crashes

  • FIXED: "OCR Image from File" RuntimeException crash - Added development mode bypass for FilePicker creation
  • FIXED: Complex UNO operations failing in development mode - Implemented safe fallbacks
  • FIXED: All menu items now work without RuntimeException crashes

Fixed - UI Dialog Visibility

  • BREAKTHROUGH: Implemented robust 3-method fallback system for parent window detection
  • FIXED: getPeer errors preventing message box display
  • RESULT: UI dialogs now appear reliably on screen!

Implementation Details

  • Method 1: Use provided parent_frame if available
  • Method 2: Get desktop's current frame as fallback
  • Method 3: Use toolkit's desktop window as final fallback
  • Graceful Handling: Message boxes work even with None parent

Development Mode Features

  • Added: DEVELOPMENT_MODE_STRICT_PLACEHOLDERS = True in constants.py
  • Purpose: Prevents complex UNO operations that can crash during development
  • Dual Output: Console output (always reliable) + UI dialogs (enhanced robustness)
  • Safety First: Ensures extension stability while building features incrementally

User Experience Improvements

  • Professional Dialogs: Beautiful TejOCR-branded message boxes
  • Clear Communication: Users see both console and UI feedback
  • Development Status: Transparent about current capabilities
  • Zero Crashes: All menu interactions now completely stable

[0.1.0] - 2025-05-24 - Foundation Release & Initial Crash Fixes

🏗️ Stable Foundation Establishment

Initial release focused on creating a rock-solid foundation before implementing complex OCR features.

Added - Core Extension Structure

  • LibreOffice Integration: Full protocol handler implementation
  • Menu System: TejOCR menu with three main functions
  • Service Architecture: Proper UNO service registration and dispatch handling
  • Logging System: Comprehensive logging with file and console handlers

Added - UI Framework

  • Settings Dialog: Extension configuration and status display
  • OCR Options: Placeholder dialogs for OCR functionality
  • Professional Branding: Consistent TejOCR styling and messaging
  • Internationalization: Multi-language support framework

Initial Crash Resolution

  • FIXED: ImportError and NameError issues in module loading
  • FIXED: Protocol Handler registration problems
  • FIXED: Python path resolution in OXT structure
  • ESTABLISHED: Safe module loading patterns with error handling

Development Philosophy Established

  • Foundation-First: Build stability before adding complexity
  • Systematic Debugging: Address root causes, not symptoms
  • User-Focused: Prioritize end-user experience over technical convenience
  • Incremental Development: Each feature must be stable before moving to next

Technical Architecture

  • Modular Design: Clean separation between service, dialogs, engine, and output
  • Error Resilience: Comprehensive exception handling throughout
  • Logging Strategy: Detailed debugging without overwhelming users
  • Extension Packaging: Proper OXT structure with all required manifests

Development Insights & Lessons Learned

🎯 Key Success Factors

  1. Stability First: Building a crash-proof foundation was the RIGHT approach
  2. Systematic Debugging: Methodically resolving each issue without breaking working parts
  3. User Experience Focus: Every decision considered the end-user impact
  4. Incremental Progress: Each version adds solid functionality without compromising stability

🛠️ Technical Breakthroughs

  1. UI Dialog Resolution: Solving the getPeer issue was crucial for user experience
  2. Development Mode Strategy: Using placeholders during development prevents crashes
  3. Robust Error Handling: Multiple fallback methods ensure extension always works
  4. Proper UNO Integration: Understanding LibreOffice's architecture enabled smooth integration

🚀 Next Phase Preparation

With version 0.1.2, the extension is perfectly positioned for implementing real OCR functionality:

  • Stable Foundation: Zero crashes, reliable UI
  • User-Friendly: Works great for non-technical users
  • Dependency Management: Smart detection and installation assistance
  • Professional Quality: Ready for production use

Future Roadmap

Phase 2: Real OCR Implementation

  • Implement actual text extraction from images
  • Add file processing capabilities
  • Create advanced options dialog

Phase 3: Advanced Features

  • Batch processing
  • Multiple output formats
  • Language pack management

Phase 4: Distribution

  • Bundle common dependencies
  • Create installer packages
  • Submit to LibreOffice Extension repository

The systematic, stability-first approach has created an excellent foundation for a professional-grade LibreOffice extension! 🎉

[Unreleased]

Added

  • Support for internationalization (i18n) with initial language support for English, Spanish, French, German, Chinese (Simplified), and Hindi
  • Build script (build.py) for packaging the extension as .oxt
  • Icon generation utility (generate_icons.py)
  • Translation template generator (generate_translations.py)
  • Test script for verifying Tesseract OCR functionality (test_ocr_engine.py)

Changed

  • Improved XML file handling for better compatibility with LibreOffice

Fixed

  • XML parsing errors related to encoding and format

Deprecated

  • N/A

Removed

  • N/A

Security

  • N/A

Phase 1 Complete - Core functionality stable and tested 🎉