DANIEL

Software for extracting named entities from handwritten texts

Technology No.

DANIEL

Description:

An OCR system transcribes a written document, captured as an image, into a digital file that is easier to manipulate. It can be useful to go beyond this first step and obtain a rich digital text in which each named entity carries a specific label. This is what DANIEL does: it extracts the named entities from a written document.

How it works:

Acquisition of the written document as an image
Scanning of the document to extract its text
The text is then analysed to detect the named entities
A label is associated with each named entity
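
The last two steps can be illustrated with a minimal sketch of how a labelled transcription might be consumed downstream. The tag format (`<label>entity</label>`), the label names, and the helper functions `extract_entities` and `strip_tags` are assumptions made for illustration; the source does not describe DANIEL's actual output format.

```python
import re
from typing import List, Tuple

# Hypothetical output format: each named entity is wrapped as <label>text</label>.
TAG_PATTERN = re.compile(r"<(?P<label>\w+)>(?P<text>.*?)</(?P=label)>", re.DOTALL)


def extract_entities(tagged: str) -> List[Tuple[str, str]]:
    """Return (entity text, label) pairs in reading order."""
    return [(m.group("text"), m.group("label")) for m in TAG_PATTERN.finditer(tagged)]


def strip_tags(tagged: str) -> str:
    """Return the plain transcription with the entity tags removed."""
    return TAG_PATTERN.sub(lambda m: m.group("text"), tagged)


# Illustrative sample, not real system output.
sample = "<person>Marie Dupont</person> was born in <place>Rouen</place> in <date>1887</date>."
entities = extract_entities(sample)
plain = strip_tags(sample)
```

Here `entities` holds one pair per labelled span and `plain` is the ordinary transcription, so the same output can feed both a text archive and an entity database.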

DANIEL is end-to-end software that performs handwritten text recognition and named entity recognition on full-page documents. It uses a fully convolutional encoder, so it can handle images of any size, and an attention network with an LLM to extract named entities.
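
To see why a fully convolutional encoder accepts images of any size, the sketch below applies the standard convolution output-size formula to a stack of layers. The layer configuration (five layers with kernel 3, stride 2, padding 1) is a hypothetical example, not DANIEL's actual architecture.

```python
from typing import List, Tuple


def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard output length of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical encoder: five layers of (kernel, stride, padding).
LAYERS: List[Tuple[int, int, int]] = [(3, 2, 1)] * 5


def feature_map_shape(height: int, width: int,
                      layers: List[Tuple[int, int, int]] = LAYERS) -> Tuple[int, int]:
    """Propagate an input size through the stack; no fixed input size is needed."""
    for kernel, stride, padding in layers:
        height = conv_output_size(height, kernel, stride, padding)
        width = conv_output_size(width, kernel, stride, padding)
    return height, width


shape_a = feature_map_shape(1024, 768)   # (32, 24)
shape_b = feature_map_shape(1000, 700)   # (32, 22)
```

Because no layer has a fixed-size input, the feature map simply grows or shrinks with the page, and an attention mechanism can then attend over however many positions result.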

Applications:

Creation of databases linking entities within documents
Analysing historical documents
Searching through documents for a specific named entity
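
As an illustration of the database and search applications, the following sketch builds a small inverted index from extracted entities to the documents that contain them. The document identifiers, labels, and the functions `build_entity_index` and `search` are hypothetical and not part of DANIEL.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Entity = Tuple[str, str]  # (entity text, label)


def build_entity_index(documents: Dict[str, List[Entity]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each (label, lower-cased text) pair to the ids of the documents containing it."""
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for doc_id, entities in documents.items():
        for text, label in entities:
            index[(label, text.lower())].add(doc_id)
    return index


def search(index: Dict[Tuple[str, str], Set[str]], label: str, text: str) -> List[str]:
    """Return the sorted ids of documents mentioning the given entity."""
    return sorted(index.get((label, text.lower()), set()))


# Illustrative data, as if produced by an entity extraction step.
docs = {
    "register_01": [("Marie Dupont", "person"), ("Rouen", "place")],
    "register_02": [("Jean Martin", "person"), ("Rouen", "place")],
}
index = build_entity_index(docs)
hits = search(index, "place", "rouen")
```

The shared `place` entry links both registers, which is the kind of cross-document relation such a database would expose.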

Advantages:

State-of-the-art results in text recognition and named entity extraction
Works with multiple languages
Faster than other solutions
End-to-end architecture

Authors (1)

Thomas Constum
Supporting documents (2)

Datasheet Daniel English

datasheet_Daniel_en.pdf (435 KB)

Datasheet Daniel French

datasheet_Daniel_fr.pdf (471 KB)

Get in touch to discuss licensing options