WSEAS Transactions on Signal Processing
Print ISSN: 1790-5052, E-ISSN: 2224-3488
Volume 10, 2014
Text/ Background Separation in the Degraded Document Images by Combining Several Thresholding Techniques
Authors: , ,
Abstract: Extracting the text from the background is an important step in every document analysis and recognition process. For document images of good quality, this extraction is easy with simple global thresholding techniques, but images of degraded documents require a more accurate analysis, and local methods are used instead. Local methods are generally more effective and give better results than global ones, but they are very slow because the threshold is computed separately for each pixel from the information in its neighborhood. In this article, we try to solve this problem by proposing a hybrid thresholding technique that combines the advantages of the two families of methods: the speed of global methods and the accuracy of local ones. The idea is to proceed in two passes: a global pass that classifies most of the pixels, followed by a local pass on the remaining pixels. The approach has been tested on a standard collection and compared with well-known methods, and the results are encouraging.
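The two-pass idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which global or local method is used, nor how the "remaining" pixels are selected, so everything specific here is an assumption made for illustration: Otsu's method for the global pass, Sauvola's formula for the local pass, and a hypothetical `margin` parameter that marks pixels close to the global threshold as undecided. The function names are also hypothetical.

```python
import numpy as np

def otsu_threshold(gray):
    """Global Otsu threshold of a uint8 grayscale image (assumed global method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                      # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))        # cumulative first moment
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / denom   # between-class variance
    sigma_b[denom == 0] = 0.0
    # If several levels tie for the maximum, take the middle of the plateau.
    best = np.flatnonzero(sigma_b == sigma_b.max())
    return int(best.mean())

def hybrid_binarize(gray, margin=30, window=15, k=0.2, r=128.0):
    """Two-pass binarization: 0 = text, 255 = background.

    margin, window, k and r are illustrative defaults, not values from the paper.
    """
    gray = np.asarray(gray, dtype=np.uint8)
    t = otsu_threshold(gray)

    # Pass 1 (global): classify every pixel with the single threshold t.
    out = np.where(gray > t, 255, 0).astype(np.uint8)

    # Pixels within `margin` gray levels of t are treated as undecided
    # (assumed selection rule).
    undecided = np.abs(gray.astype(np.int32) - t) <= margin

    # Pass 2 (local): recompute only the undecided pixels with a
    # Sauvola threshold taken from a window around each of them.
    pad = window // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="reflect")
    for y, x in zip(*np.nonzero(undecided)):
        patch = padded[y:y + window, x:x + window]
        local_t = patch.mean() * (1.0 + k * (patch.std() / r - 1.0))
        out[y, x] = 255 if gray[y, x] > local_t else 0
    return out
```

In this sketch the cost of the local pass grows with the number of undecided pixels rather than with the image size, which is the source of the speed-up the abstract refers to. Setting `margin=255` sends every pixel to the local pass and reduces the sketch to a purely local method.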
Keywords: Binarization, Degraded Documents, Document Preprocessing, Combination of Thresholding Methods, Evaluation of Binarization Methods, Hybrid Thresholding
Pages: 436-443
WSEAS Transactions on Signal Processing, ISSN / E-ISSN: 1790-5052 / 2224-3488, Volume 10, 2014, Art. #46