Printed and scanned document authentication using robust layout descriptor matching
Published in: | Multimedia tools and applications 2023-10, Vol.83 (16), p.47477-47502 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Automatic document authentication is a complex task. The aim is to prove that the document at hand is not fraudulent. This can be achieved through a fingerprint based on the document’s content. To this end, it is necessary to analyze and describe the constituent elements of the document: graphics, text, tables, and the layout. In this context, this article focuses on layout description and authentication. The Delaunay layout descriptor (Eskenazi et al. 2015) is a robust descriptor that allows fast comparison and authentication of layouts based on the spatial relationships of the regions composing the document. Because the page layout description requires segmenting the document into regions, the Delaunay layout descriptor cannot match an authentic copy with the original when the number of segmented regions differs between the two documents. This is mainly due to the use of a global matching approach. To overcome this drawback, we present a new refined matching algorithm for the Delaunay layout descriptor, which combines global and local matching. Furthermore, we present a storage and retrieval scheme to match a Delaunay layout descriptor efficiently against a layout database. In addition to its ability to compare layouts with different numbers of segmented regions, the proposed method outperforms related work. We obtain false negative and false positive rates of 0.011 and 0.0, respectively, for a data set of printed and scanned layouts, and of 0.3978 and 0.0029, respectively, for a data set of real documents. |
---|---|
ISSN: | 1380-7501, 1573-7721 |
DOI: | 10.1007/s11042-023-17021-1 |
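
The false negative and false positive rates quoted in the summary can be read as follows. This is a minimal sketch that assumes the usual definitions and assumes that "positive" means a layout accepted as an authentic copy, which the record itself does not state. The function name `error_rates` and the sample data are hypothetical and are not taken from the article.

```python
def error_rates(is_authentic, accepted):
    """Return (false_negative_rate, false_positive_rate).

    Assumption (not from the article): a "positive" is a layout that the
    matcher accepts as an authentic copy of the original.
    is_authentic[i] is the ground truth, accepted[i] the matcher's decision.
    """
    pairs = list(zip(is_authentic, accepted))
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    tn = sum(1 for truth, pred in pairs if not truth and not pred)

    # Guard against empty classes so the function never divides by zero.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr


# Hypothetical toy data: four authentic copies (one rejected) and
# two non-matching layouts (both correctly rejected).
truth = [True, True, True, True, False, False]
decision = [True, True, True, False, False, False]
fnr, fpr = error_rates(truth, decision)
print(fnr, fpr)
```

Under these assumed definitions, a false negative rate of 0.011 would mean that about 1.1% of authentic copies are rejected, and a false positive rate of 0.0 would mean that no non-matching layout is accepted.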