DICOM de-identification at scale in Visual NLP-Part 3. This post delves into using Visual NLP to manipulate pixel and overlay data within DICOM images. Sep 29, 2023
Published in John Snow Labs. DICOM de-identification at scale in Visual NLP-Part 2. In this post, we are taking a deep dive into working with metadata using Visual NLP. Sep 25, 2023
Published in John Snow Labs. DICOM de-identification at scale in Visual NLP-Part 1. Visual NLP is an advanced tool built on top of the Apache Spark framework, designed to handle OCR tasks, including DICOM de-identification. Sep 20, 2023
Published in spark-nlp. Text Detection in Spark OCR. To simplify text recognition on images with natural scenes, complex backgrounds, or rotated text, we added a DL-based approach for detecting text. Jan 11, 2022
Published in spark-nlp. Extract Tabular Data from PDF in Spark OCR. Introduction to Table Extraction. Jul 21, 2021
Published in spark-nlp. Signature Detection in Spark OCR. How to detect signatures in image-based documents. Jul 1, 2021
Published in spark-nlp. Table Detection & Extraction in Spark OCR. Converting tables in scanned documents & images into structured data. Jun 21, 2021
Published in spark-nlp. GPU image pre-processing in Spark OCR 3.1.0. Have you had issues with poor-quality results after applying OCR? I think so. Image preprocessing can significantly improve results. I… Apr 19, 2021
Processing RVL-CDIP dataset using Spark OCR on Databricks. We decided to validate some of our models on the RVL-CDIP dataset. To use it, we need to extract text from the images in HOCR format. Let’s do it… Mar 25, 2021
Spark OCR: How to calculate cluster size for process 1 million documents. Input: 1 million documents, an average of 4 pages per document, and an average Spark OCR pipeline from John Snow Labs. Mar 13, 2021