Mykola Melnyk in John Snow Labs · DICOM de-identification at scale in Visual NLP - Part 3 · This post will delve into the utilization of Visual NLP to manipulate pixel and overlay data within DICOM images. · 6 min read · Sep 29, 2023
Mykola Melnyk in John Snow Labs · DICOM de-identification at scale in Visual NLP - Part 2 · In this post, we are taking a deep dive into working with metadata using Visual NLP. · 4 min read · Sep 25, 2023
Mykola Melnyk in John Snow Labs · DICOM de-identification at scale in Visual NLP - Part 1 · Visual NLP is an advanced tool built on top of the Apache Spark framework, designed to handle OCR tasks, including DICOM de-identification. · 4 min read · Sep 20, 2023
Mykola Melnyk in spark-nlp · Text Detection in Spark OCR · To simplify text recognition on images with natural scenes, complex backgrounds, or rotated text, we added a DL-based approach for detecting text. · 5 min read · Jan 11, 2022
Mykola Melnyk in spark-nlp · Extract Tabular Data from PDF in Spark OCR · Introduction to Table Extraction · 4 min read · Jul 21, 2021
Mykola Melnyk in spark-nlp · Signature Detection in Spark OCR · How to detect signatures in image-based documents · 3 min read · Jul 1, 2021
Mykola Melnyk in spark-nlp · Table Detection & Extraction in Spark OCR · Converting tables in scanned documents & images into structured data · 8 min read · Jun 21, 2021
Mykola Melnyk in spark-nlp · GPU image pre-processing in Spark OCR 3.1.0 · Have you had issues with poor-quality results after applying OCR? I think so. Image preprocessing can significantly improve results. I… · 3 min read · Apr 19, 2021
Mykola Melnyk · Processing RVL-CDIP dataset using Spark OCR on Databricks · We decided to validate some of our models on the RVL-CDIP dataset. To use it, we need to extract text from the images in HOCR format. Let’s do it… · 3 min read · Mar 25, 2021
Mykola Melnyk · Spark OCR: How to calculate cluster size for process 1 million documents · Input: 1 million documents, 4 pages per document on average, an average Spark OCR pipeline from John Snow Labs · 1 min read · Mar 13, 2021