Data extraction using OCR (optical character recognition) means pulling data out of PDF documents quickly and efficiently. It also works on scanned documents that have been converted to PDF. It is one of the most widely used techniques, though not one of the most advanced. It requires some pre-processing: the PDF files need to be arranged in the correct sequence, the background should be free of noise, the text should be clearly visible to the eye, and the material should not be damaged. Once these conditions are met, the document can be processed by the system. Data extraction using OCR struggles with low-quality files and images, and with handwritten characters. For these reasons, OCR does not give accurate results all the time.
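The pre-processing checks described above can be sketched as a small gate that runs before any page reaches the OCR engine. This is only an illustration under stated assumptions, not a definitive implementation: the `Page` class, the function names, and both thresholds are hypothetical, and each page is assumed to be already rasterized into a grid of grayscale values (0 = black, 255 = white). It puts pages in sequence, then rejects pages whose text is too faint or whose background is too noisy.

```python
from dataclasses import dataclass
from statistics import pstdev

# Both thresholds are illustrative assumptions; tune them on real scans.
MIN_CONTRAST = 40.0   # minimum std. deviation of pixel intensities
MAX_SPECKLE = 0.02    # maximum share of isolated dark pixels (noise)
DARK = 128            # intensities below this count as "ink"


@dataclass
class Page:
    number: int   # position of the page in the document sequence
    pixels: list  # rows of grayscale values, 0 = black, 255 = white


def contrast(pixels):
    """Spread of intensities; faint or washed-out text gives a low value."""
    return pstdev([value for row in pixels for value in row])


def speckle_ratio(pixels):
    """Share of dark pixels that have no dark neighbour above, below, left or right."""
    height, width = len(pixels), len(pixels[0])
    isolated = 0
    for y in range(height):
        for x in range(width):
            if pixels[y][x] >= DARK:
                continue
            neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if not any(
                0 <= ny < height and 0 <= nx < width and pixels[ny][nx] < DARK
                for ny, nx in neighbours
            ):
                isolated += 1
    return isolated / (height * width)


def ready_for_ocr(page):
    """True when the text is visible enough and the background is clean enough."""
    return (
        contrast(page.pixels) >= MIN_CONTRAST
        and speckle_ratio(page.pixels) <= MAX_SPECKLE
    )


def prepare(pages):
    """Sort pages into sequence, then split them into accepted and rejected."""
    ordered = sorted(pages, key=lambda page: page.number)
    accepted = [page for page in ordered if ready_for_ocr(page)]
    rejected = [page for page in ordered if not ready_for_ocr(page)]
    return accepted, rejected


def make_page(number, background, ink, is_ink):
    """Build a 10x10 synthetic page for demonstration purposes."""
    pixels = [
        [ink if is_ink(y, x) else background for x in range(10)]
        for y in range(10)
    ]
    return Page(number, pixels)


# Three synthetic pages, deliberately supplied out of order.
clean = make_page(2, 255, 0, lambda y, x: 2 <= y <= 5 and 2 <= x <= 5)
faint = make_page(1, 200, 180, lambda y, x: 2 <= y <= 5 and 2 <= x <= 5)
noisy = make_page(3, 255, 0, lambda y, x: y % 2 == 0 and x % 2 == 0)

accepted, rejected = prepare([noisy, clean, faint])
```

Here only the clean page is accepted; the faint page fails the contrast check and the speckled page fails the noise check. Rejected pages could be rescanned or cleaned up before OCR is attempted, rather than being passed through to produce unreliable text.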