Paper document data processing

Complaint management system for public transport

What you can do with Arkeda:

  • Create a database of digitized paper documents
  • Tune classification: define document groups and classes
  • Search and filter items by fields
  • Find connections between documents
  • Build reports and forecasts

Create your own AI experience:

  • Create mockups for fast text recognition, use your own fields, edit automatic labeling
  • Evaluate the labeling done by your employees
  • Retrain ML models for better text and image recognition

One paper document takes a specialist 120 to 180 minutes to process

ARKEDA needs only 3 seconds to process a single document

ARKEDA can process 100 000 documents in 3.5 days!
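
The figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming documents are processed one after another around the clock; the function names are illustrative and not part of ARKEDA.

```python
# Figures taken from the text above.
SECONDS_PER_DOCUMENT = 3
DOCUMENTS = 100_000
MANUAL_MINUTES = (120, 180)  # specialist time per document

SECONDS_PER_DAY = 24 * 60 * 60


def batch_days(documents: int, seconds_each: float) -> float:
    """Days needed to process all documents sequentially (assumption: 24/7, no parallelism)."""
    return documents * seconds_each / SECONDS_PER_DAY


def speedup(manual_minutes: float, seconds_each: float) -> float:
    """How many times faster automated processing is than one specialist."""
    return manual_minutes * 60 / seconds_each


print(f"{batch_days(DOCUMENTS, SECONDS_PER_DOCUMENT):.2f} days")  # 3.47 days
low, high = (speedup(m, SECONDS_PER_DOCUMENT) for m in MANUAL_MINUTES)
print(f"{low:.0f}x to {high:.0f}x faster than manual processing")  # 2400x to 3600x
```

At 3 seconds per document, 100 000 documents take 300 000 seconds, which is about 3.47 days and matches the rounded 3.5 days quoted above.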

Our technology stack:

OCR

OpenCV + Tesseract

Text processing
  1. Multilingual encoder (CNN neural network)
  2. DeepPavlov: NER with BERT
Classifier
  1. K-means and logistic regression: 98.9% accuracy on
    doc2vec neural network vectors
  2. Parsing: regex, BERT NER, multilingual encoder vectors, Elasticsearch
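
As an illustration of the regex part of the parsing step, here is a minimal sketch using only Python's standard library. The field names, patterns, and sample text are hypothetical and chosen for the example; they are not taken from ARKEDA itself.

```python
import re

# Hypothetical field patterns (assumption): a document number and a DD.MM.YYYY date.
FIELD_PATTERNS = {
    "number": re.compile(r"\bNo\.?\s*(\d+)", re.IGNORECASE),
    "date": re.compile(r"\b(\d{2}\.\d{2}\.\d{4})\b"),
}


def extract_fields(text: str) -> dict:
    """Return the first match for each field, or None when a field is absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up OCR output used only to exercise the function.
sample = "Contract No. 1457 dated 03.02.2020 between the parties"
print(extract_fields(sample))  # {'number': '1457', 'date': '03.02.2020'}
```

Regex rules like these suit fields with a rigid format; free-form entities such as names or organizations are the kind of thing an NER model is better placed to handle.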