Digital Signal Processing


Minimum Amount of Text Overlapping in Document Separation

Authors: Antonio Boccuto, Ivan Gerace, Valentina Giorgetti

We consider a Blind Source Separation problem. In particular, we focus on the reconstruction of digital documents degraded by bleed-through and show-through effects. Since the mixing matrix, the source images, and the data images are all nonnegative in this setting, the solution is given by a Nonnegative Factorization. As the problem is ill-posed, further assumptions are necessary to estimate the solution. In this paper we propose an iterative algorithm that estimates the correct level of overlap from the verso to the recto of the document involved. The proposed method is thus a Correlated Component Analysis technique. It has low computational cost and is fully unsupervised. Moreover, we extend the proposed algorithm to handle a model that is not translation invariant. Our experimental results confirm the effectiveness of the method.
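
To make the setting concrete, the following is a minimal sketch of a linear recto/verso mixing model with a nonnegative 2x2 mixing matrix. It is not the authors' algorithm. The unit-diagonal parameterization, the function names `mix` and `unmix`, the overlap coefficients `a_rv` and `a_vr`, and the toy pixel values are all assumptions made for illustration; the abstract does not specify them.

```python
def mix(recto, verso, a_rv, a_vr):
    """Linearly mix two nonnegative source images given as flat lists.

    a_rv: assumed fraction of the verso that shows on the recto.
    a_vr: assumed fraction of the recto that shows on the verso.
    """
    observed_recto = [r + a_rv * v for r, v in zip(recto, verso)]
    observed_verso = [a_vr * r + v for r, v in zip(recto, verso)]
    return observed_recto, observed_verso


def unmix(observed_recto, observed_verso, a_rv, a_vr):
    """Invert the mixing matrix [[1, a_rv], [a_vr, 1]] pixel by pixel."""
    det = 1.0 - a_rv * a_vr
    if det == 0:
        raise ValueError("mixing matrix is singular")
    pairs = list(zip(observed_recto, observed_verso))
    recto = [(x - a_rv * y) / det for x, y in pairs]
    verso = [(y - a_vr * x) / det for x, y in pairs]
    return recto, verso


# Toy two-pixel sources: ink on the verso only at pixel 0, on the recto only at pixel 1.
true_recto = [0.0, 0.8]
true_verso = [0.5, 0.0]

obs_recto, obs_verso = mix(true_recto, true_verso, a_rv=0.3, a_vr=0.2)

# With the true overlap levels, the sources are recovered up to rounding.
rec_recto, rec_verso = unmix(obs_recto, obs_verso, a_rv=0.3, a_vr=0.2)

# Overestimating the verso-to-recto overlap drives a recovered pixel negative.
over_recto, _ = unmix(obs_recto, obs_verso, a_rv=0.6, a_vr=0.2)
```

In this toy case, an overestimated overlap level produces a negative value in the recovered recto, which violates the nonnegativity of the sources. This illustrates, under the stated assumptions, why the overlap level matters and why nonnegativity alone constrains but does not determine the solution.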

Comments: 91 Pages.


Submission history

[v1] 2018-05-15 05:31:48
[v2] 2018-09-25 09:11:52

Unique-IP document downloads: 106 times
