Document Deskewer

Introduction
The scanning of printed documents involves several challenges that result from the quality of the material to be scanned and the scanning process itself. One of these challenges is the accidental production of skewed images due to an imprecise alignment of the printed document (Figure 1).



The Document Deskewer is an easy-to-use command-line tool for automatically correcting skewed pages. Given a skewed input image, the tool detects the skew angle and corrects the rotation over the full range of 0 to 360 degrees. Processing can be adjusted with parameters that let the user select the resampling method or the colour used to fill blank areas introduced by the rotation.

This tool is particularly useful for institutions whose printed material is difficult to scan and is therefore likely to produce skewed images. Correcting skew can improve the results of subsequent post-processing steps such as OCR, as well as the overall visual appearance of images that are shown to users.

The Document Deskewer can be integrated into an existing scanning and post-processing workflow to correct these images. The best results are achieved for documents written in Roman scripts.

Non-technical requirements
The Document Deskewer does not have a graphical user interface (GUI) and therefore requires some basic knowledge of how to run command-line programs.

Licensing
The tool is produced at the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) in Sankt Augustin, Germany. For more information on the terms and conditions for using the tool, please contact Fraunhofer IAIS, Department NetMedia: Dr. Joachim Köhler, Joachim.koehler(at)iais.fraunhofer.de

Files
The installation package of this tool contains the following files:

Filename: deskew.exe (Windows) / deskew (Linux)

Description: The Document Deskewer command-line executable. The executable is available for Windows and Linux operating systems.

Installation Instructions
The Document Deskewer is a command-line tool and does not require any installation. After the executable has been extracted from the installation package, it can be used by calling it from the command line.

Quick Start Guide
An easy way to start using the Document Deskewer is to try one of the examples in the “Examples” section. A detailed description of the parameters for configuring the tool can be found in the following section.

Documentation
To use the Document Deskewer effectively, it is important to understand the configuration options explained in the following sections. For example, the Document Deskewer will refuse to rotate an image if the confidence in the calculated angle is below a certain value. In that case it produces the following error message:

Error: confidence in calculated skew angle -90 is low, use --force to deskew anyway

However, the “-f” parameter (referred to as “--force” in the error message) can be used to force deskewing even if the confidence in the calculated angle is low.
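When the tool is called from a script, the low-confidence case can be recognised by looking for this message in the captured output. The following is a minimal sketch, assuming the message always has the form quoted above. The function name `parse_low_confidence` is hypothetical, and this document does not state whether the message is written to standard output or standard error, so the caller passes in whatever text was captured.

```python
import re
from typing import Optional

# Pattern modelled on the error message quoted above. The exact wording in
# other versions of the tool may differ (assumption).
_LOW_CONFIDENCE = re.compile(
    r"confidence in calculated skew angle\s+(-?\d+(?:\.\d+)?)\s+is low"
)


def parse_low_confidence(output: str) -> Optional[float]:
    """Return the rejected skew angle if the low-confidence error is
    present in `output`, otherwise None."""
    match = _LOW_CONFIDENCE.search(output)
    return float(match.group(1)) if match else None


message = ("Error: confidence in calculated skew angle -90 is low, "
           "use --force to deskew anyway")
angle = parse_low_confidence(message)
if angle is not None:
    print(f"Low confidence for angle {angle}; consider re-running with -f")
```

Whether a forced rotation is acceptable depends on the material, so a script may prefer to log such pages for manual review rather than retry automatically.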

Configuration and Customization
The Document Deskewer can be configured using command-line parameters. The following table gives an overview of the available parameters and explains how they affect the output image.

Workflow Integration
The Document Deskewer can be integrated into any workflow or application that allows the execution of command-line tools. See the section “Configuration and Customization” for details on how the tool can be configured using parameters.
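As an illustration of such an integration, the sketch below wraps the executable in a small Python batch script. It uses only the parameters that appear in this document (-i, -o, -f, --fill-black, --resampling-method). The function names `build_command` and `deskew_folder` are hypothetical, and the script assumes that the tool returns a non-zero exit status on failure, which this document does not confirm.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_command(src: Path, dst: Path, force: bool = False,
                  fill_black: bool = False,
                  resampling: Optional[str] = None,
                  exe: str = "deskew") -> List[str]:
    """Assemble the argument list for a single deskew call."""
    cmd = [exe, "-i", str(src), "-o", str(dst)]
    if force:
        cmd.append("-f")
    if fill_black:
        cmd.append("--fill-black")
    if resampling is not None:
        cmd += ["--resampling-method", resampling]
    return cmd


def deskew_folder(in_dir: Path, out_dir: Path, **options) -> List[Path]:
    """Run the tool on every .tif file in in_dir and return the inputs
    for which the call failed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for src in sorted(in_dir.glob("*.tif")):
        cmd = build_command(src, out_dir / src.name, **options)
        # Assumption: a non-zero exit status signals that deskewing failed.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(src)
    return failed
```

A call such as `deskew_folder(Path("scans"), Path("deskewed"), fill_black=True)` would process a folder of scans, provided the executable is on the search path; the folder names here are placeholders. Keeping the command construction separate from the execution makes the argument list easy to inspect or log before anything is run.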

Examples
1. Deskew a sample image, display verbose output:

deskew -i sample.tif -o result.tif -v

Output:

Loading input file: sample.tif
Determining skew angle
maxpos 0 max0 24.47806 max1 18.97955 max2 4.14367 res 1
Determined skew angle: -1.1
Rotating image
Writing output file: result.tif
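If the detected angle is needed later in a workflow, for example for quality reports, it can be read from the verbose output. The sketch below assumes the "Determined skew angle:" line always has the form shown in this example; the function name `detected_angle` is hypothetical.

```python
import re
from typing import Optional

# Matches the "Determined skew angle" line of the verbose output shown above.
_ANGLE = re.compile(r"Determined skew angle:\s*(-?\d+(?:\.\d+)?)")


def detected_angle(verbose_output: str) -> Optional[float]:
    """Return the skew angle reported in the verbose output, or None if
    no such line is present."""
    match = _ANGLE.search(verbose_output)
    return float(match.group(1)) if match else None


sample = """Loading input file: sample.tif
Determining skew angle
maxpos 0 max0 24.47806 max1 18.97955 max2 4.14367 res 1
Determined skew angle: -1.1
Rotating image
Writing output file: result.tif
"""
print(detected_angle(sample))  # prints -1.1
```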

2. Deskew a sample image, force deskewing, and fill with black any pixels for which the gray value cannot be computed:

deskew -i sample.tif -o result.tif -f --fill-black

Output:



3. Deskew a sample image, choose “triangle” as the resampling method:

deskew -i sample.tif -o result.tif --resampling-method triangle