contentCrawler

Make Every Document Searchable:

contentCrawler’s OCR module identifies non-searchable content across a Document Management System (DMS) database, or within a subset of documents selected by specific queries. It converts this content to text-searchable PDFs and saves them back into the content repository as new or replacement documents.

Reduce File Size:

The contentCrawler Compression module enables Administrators to compress image and PDF documents in their DMS. Converting image documents to PDF and applying compression and downsampling to the files reduces overall file size.

Key features:

  • Assesses and analyses documents in a content repository for OCR and/or compression processing
  • Processes image-based documents such as TIF, JPG, PNG and image PDFs
  • Converts image-based documents to text-searchable PDFs, adding a text layer for enhanced searching
  • Reduces image-based document file size using a variety of JPEG compression standards
  • Processes image-based attachments in emails
  • Applies configurable compression and text thresholds to optimize processing, ignoring documents that do not meet the requirement
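
The threshold behaviour described in the last feature can be pictured with a small sketch. This is not contentCrawler's API or configuration format: the names `DocumentStats`, `needs_ocr`, `needs_compression` and `plan`, the default threshold values, and the interpretation of a "text threshold" as characters of existing text per page are all assumptions made for illustration only.

```python
from dataclasses import dataclass

# Image formats named in the feature list (hypothetical constant).
IMAGE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}


@dataclass
class DocumentStats:
    """Hypothetical per-document figures gathered during an assessment pass."""
    name: str
    extension: str
    text_chars_per_page: float     # existing searchable text, if any
    estimated_saving_ratio: float  # 0.0-1.0, expected size reduction


def needs_ocr(doc: DocumentStats, max_text_chars_per_page: float = 50.0) -> bool:
    """Assumed text threshold: image files always qualify; PDFs qualify
    only when they carry less text per page than the threshold."""
    ext = doc.extension.lower()
    if ext in IMAGE_EXTENSIONS:
        return True
    return ext == ".pdf" and doc.text_chars_per_page < max_text_chars_per_page


def needs_compression(doc: DocumentStats, min_saving_ratio: float = 0.2) -> bool:
    """Assumed compression threshold: skip documents whose expected
    saving is too small to be worth processing."""
    return doc.estimated_saving_ratio >= min_saving_ratio


def plan(docs: list[DocumentStats]) -> dict[str, list[str]]:
    """Sort documents into OCR, compression and ignored queues."""
    queues: dict[str, list[str]] = {"ocr": [], "compress": [], "ignored": []}
    for doc in docs:
        ocr, compress = needs_ocr(doc), needs_compression(doc)
        if ocr:
            queues["ocr"].append(doc.name)
        if compress:
            queues["compress"].append(doc.name)
        if not (ocr or compress):
            queues["ignored"].append(doc.name)
    return queues


if __name__ == "__main__":
    sample = [
        DocumentStats("scan.pdf", ".pdf", 0.0, 0.6),
        DocumentStats("report.pdf", ".pdf", 1800.0, 0.05),
        DocumentStats("fax.tif", ".tif", 0.0, 0.4),
    ]
    print(plan(sample))
```

In this sketch a document that fails both checks lands in the ignored queue and is never touched, which mirrors the idea that thresholds keep processing time for documents where OCR or compression is likely to pay off.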