How to remove OCR from a PDF?

I have been searching Google for some time but cannot find an answer to my question.

I have an unwanted OCR layer in a document that I recently scanned with Adobe Acrobat. The document was not OCRed properly, and I want to redact some information, but the OCR layer is causing information I want to keep to be erased as well. I converted the files to TIFFs, but noticed a (very) significant quality loss. I have heard that printing to another PDF either keeps the text or reduces the image quality.

I appreciate any help in solving this issue ASAP.

Thank You.

edited Oct 12 '14 at 15:00

asked Oct 11 '14 at 6:32

Sanoo

1282521

pdf adobe-acrobat ocr tif

6 Answers

In Acrobat Pro DC, the appropriate command is "Remove Hidden Information," which is available through both the "Protect" and "Redact" tools.

Running the command only searches for hidden information; it does not change the document. You must then tell Acrobat which information to remove. In this case, select "Hidden Text" in the Results pane, then click the Remove button and save the changed document.
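For anyone wondering what the "Hidden Text" item refers to: an OCR layer is commonly stored as ordinary PDF text drawn in text rendering mode 3 (invisible) on top of the scanned image. The Python sketch below is only an illustration of that idea, not what Acrobat does internally. It assumes you already have a decompressed page content stream as bytes, and it only handles the simple case where `3 Tr` is set inside the same `BT ... ET` text object. The function name `strip_invisible_text` and the sample stream are made up for this example.

```python
import re

# One text object: BT ... ET, matched as whole whitespace-delimited tokens.
TEXT_OBJECT = re.compile(rb"(?<!\S)BT\s.*?\sET(?!\S)", re.DOTALL)
# The operator that selects invisible text: rendering mode 3.
INVISIBLE_MODE = re.compile(rb"(?<!\S)3\s+Tr(?!\S)")


def strip_invisible_text(content: bytes) -> bytes:
    """Drop text objects that switch to rendering mode 3 (invisible).

    Simplification (assumption): a real parser must also handle "ET"
    appearing inside string literals and a mode set outside the text object.
    """
    def keep_or_drop(match):
        block = match.group(0)
        return b"" if INVISIBLE_MODE.search(block) else block

    return TEXT_OBJECT.sub(keep_or_drop, content)


# Hypothetical page: a scanned image, an invisible OCR line, a visible footer.
sample = (
    b"q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    b"BT 3 Tr /F1 10 Tf 72 700 Td (Invoice 1234) Tj ET\n"
    b"BT 0 Tr /F1 12 Tf 72 50 Td (Page 1) Tj ET\n"
)
cleaned = strip_invisible_text(sample)
print(cleaned.decode("ascii"))
```

The image operator and the visible footer are left alone; only the text object that selects mode 3 is dropped. Real files usually need a proper PDF library to decompress and rewrite the streams, so treat this as a way to understand the layer rather than a tool to run on important documents.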

edited Sep 22 '17 at 1:06

Warren Young

2,24711424

answered Apr 11 '17 at 4:11

user1125483

1313

I have tried "Remove Hidden Information", but for some reason it just removes parts of the image on certain pages. Thanks for your reply, however.

– Sanoo
Apr 11 '17 at 4:20

This is not universally true. Somehow (probably macOS PDFKit bugs) my ABBYY FineReader-OCRed text got corrupted, and checking "Hidden text" under Redact → Remove Hidden did remove the text without any issues; I was then able to successfully use Enhance Scans → Recognize Text to perform OCR within Acrobat itself.

– Nicholas Riley
Jan 21 '18 at 20:16

The problem for me is that after I remove the hidden text, I'm still not able to run an OCR with "ClearScan" (i.e. "Editable Text and Images"). It's strange because the text layer appears to be gone, yet running OCR produces the error "Acrobat could not perform recognition because: page contains renderable text."

– user1125483
Sep 18 '18 at 10:38

After a lot of experimenting, I found that printing to Adobe PDF from Adobe Acrobat produces a document without the OCR layer and without a visible loss of quality (some resolution is lost, but it is not noticeable at first glance).

However, many sites claim that this does not work. I also tried other PDF printers, such as Foxit Reader and OneNote, but the quality was reduced. Exporting to JPEG reduced the quality as well.

Please keep in mind that your mileage may vary.

Note: I am leaving this thread marked as unanswered in the hope of finding a better answer than mine.

edited Oct 13 '14 at 7:53

answered Oct 13 '14 at 6:06

Sanoo

1282521

In Acrobat Pro, use "Remove Hidden Information" (under "Protection"). Select all, execute, and the OCR is gone.

answered Oct 20 '16 at 15:55

jazzzz

111

In Acrobat X, under Protection, there is a Sanitize Document button that removes EVERYTHING except what can be seen (including the OCR'd text layer), converting the document to a flattened bitmap.

edited Jan 30 '18 at 16:51

darthbith

340215

answered Dec 14 '17 at 8:49

Dave

111

(one year ago...)

If, as you say, the documents were scanned and not printed to PDF from Word, for example, you can easily remove the OCR layer in Acrobat:

Select Document, then Examine Document, and you can then remove the hidden text (OCR).

answered Dec 10 '15 at 10:50

Fran

Thanks for your reply. I'll test it out as soon as I can and let you know. Thanks for the answer!

– Sanoo
Feb 19 '16 at 14:31

I thought I already commented on this, but the problem is that I have Acrobat DC Pro, and those menus have been removed. Thanks for your answer anyway.

– Sanoo
Jul 17 '16 at 7:43

I built a free tool to do this: PDF Redactor. If you upload the document and just click Redact, it will flatten your PDF and remove the OCR layer. If you want, you can also draw redaction marks on the document.

edited Jan 31 at 8:19

answered Jan 31 at 7:31

levinology

1113
