Convert PDF file back to ODT after incorrect export.
Solved
Whoviana
Hello,
People are going to say I believe in Santa Claus, but here goes. After exporting an entire book to PDF, I realized I no longer had the original .odt file (I don't have Word!). I need it just for the "word count" function, which isn't available for the PDF. I then read this solution, posted on this site back in 2016 by someone named Shunesburg:
In LibreOffice you can re-import PDFs, although it was not originally designed for that:
To open one as text, launch LibreOffice, click Open, choose the format "PDF - Portable Document Format (Writer) (*.pdf)" and select your file. It will then open in the word processor, and you just have to save it as ODT once you are done.
Except I don't understand what "launch LibreOffice" means. When I "launch" this program, I land directly in Writer, and apart from the "Format" drop-down menu (which concerns the layout of the current document), I don't see "PDF - Portable Document Format" anywhere. Still, just in case, I clicked "Open," chose my file (the PDF I wanted to convert back to ODT) and clicked "Open." After a rather long wait my text did appear, but... in something called "drawing OF something," with 4 save formats all related to drawing and, of course, no "word count" function.
The other replies to that person (who was in the same situation) suggested retyping the whole text. Well, hers was a letter; mine is a book of over 70,000 words, so you can see why that solution doesn't work for me...
I also see some paid sites, but it remains to be seen whether they accept such long texts, and all this just to "count the words"?...
Next time I create a PDF, I will pick the "editable" option (if that allows converting back?), but I don't understand how you can convert an ODT to PDF but not the other way around (okay, sure, it may be like not understanding how you can break a vase into 10,000 pieces and not be able to put it back together; I'm aware of that, but still!!!!) :-)
Thank you for your responses
Configuration: Windows / Firefox 89.0
9 replies
Hello,
There are several solutions, and your problem with counting words is minor.
But first of all you need to know whether your PDF actually stores the characters as text, or whether they were converted to an image along the way.
This is very simple to check: open your PDF and, if you can select text with the cursor, it is "real" text.
I don't understand how you can convert an ODT to PDF but not the other way around
When you make a cake, you use various ingredients that you can handle separately. But once the cake is ready to eat, it is impossible to separate the sugar or eggs from the rest of the cake! Or, with syrup diluted in water, you can no longer separate the two.
It's the same thing here. Your original document has been transformed into a format that does not store the information and instructions the way the original did.
First question: why not just convert your PDF into an ODT document, which plenty of programs can do? (For example, my favorite: FormatFactory.)
If your text is real text, that should be enough.
If it is text in the form of an image, you will then need to run OCR (optical character recognition), which detects writing in images. Some websites can also do this. Disadvantage: your layout may get a bit messed up.
And all of this is without considering that it is entirely possible to count the words of a PDF document directly if you use a reasonably advanced PDF program, for instance PDF-XChange Editor.
Hello,
You can convert PDF to DOC online, for example here: https://www.ilovepdf.com/fr/pdf_en_word and then open the DOC file in LibreOffice Writer.
Nothing stops you from doing the same with TXT: https://pdftotext.com/fr/. In that case you have to count the words yourself, for example from the command line.
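For the plain-text route, the count does not have to come from the command line. Here is a minimal sketch in Python, assuming the converter's output was saved as a UTF-8 text file. The function names and the file name book.txt are made up for illustration, and the total may differ slightly from what Writer reports, since each tool splits words its own way.

```python
import re
from pathlib import Path


def count_words(text: str) -> int:
    """Count whitespace-separated tokens that contain at least one letter or digit."""
    # A lone dash or other punctuation-only token is not counted as a word.
    return sum(1 for token in text.split() if re.search(r"\w", token))


def count_words_in_file(path: str, encoding: str = "utf-8") -> int:
    """Read a text file (encoding is an assumption) and count its words."""
    text = Path(path).read_text(encoding=encoding, errors="replace")
    return count_words(text)


# Hypothetical usage, once the PDF has been converted to book.txt:
# print(count_words_in_file("book.txt"))
```

Tokens made only of punctuation are skipped on purpose, so dialogue dashes in a novel do not inflate the total.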
Hello,
A PDF can be of several types: a text-type file (which may be what you call editable), where a PDF reader lets you select the text with the mouse, character by character; or an image-type file, as if you had photographed the document. Your file seems to be the latter, so the text cannot be retrieved from it directly.
An image can be converted into text with OCR software. Printers and scanners often come with software of this type, to turn a scanned document into text.
--
a stranger is a friend that we haven't met yet.
Hello.
Simply use an online document converter:
https://www.online-convert.com/fr
Choose Document Converter with target format ODT.
Upload the PDF file.
Set the options, or leave the defaults.
Start the conversion.
Retrieve the document.
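Once a converter has returned an .odt, the word count can also be read without opening Writer, because an ODT file is a ZIP archive whose text lives in content.xml. What follows is only a sketch using Python's standard library: the function names and book.odt are illustrative, and the figure may not match Writer's own count exactly (text kept in tracked changes, for example, would be included).

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces, written in ElementTree's "{uri}tag" form.
OFFICE = "{urn:oasis:names:tc:opendocument:xmlns:office:1.0}"
TEXT = "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}"
BLOCKS = {TEXT + "p", TEXT + "h"}                        # paragraphs and headings
SPACERS = {TEXT + "s", TEXT + "tab", TEXT + "line-break"}  # whitespace elements


def _collect(elem, parts):
    """Walk the XML tree in document order, gathering text pieces."""
    if elem.tag in SPACERS:
        parts.append(" ")
    if elem.text:
        parts.append(elem.text)
    for child in elem:
        _collect(child, parts)
        if child.tail:
            parts.append(child.tail)
    if elem.tag in BLOCKS:
        parts.append("\n")  # keep words of adjacent paragraphs apart


def odt_text(source) -> str:
    """Return the body text of an ODT given a path or a binary file object."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    body = root.find(OFFICE + "body/" + OFFICE + "text")
    parts = []
    if body is not None:
        _collect(body, parts)
    return "".join(parts)


def odt_word_count(source) -> int:
    """Count whitespace-separated words in the document body."""
    return len(odt_text(source).split())


# Hypothetical usage with the converted file:
# print(odt_word_count("book.odt"))
```

Odd fonts or sizes introduced by the conversion do not affect the result, since only the text content is read.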
Hello
When the source document was in ODT, did you convert it to PDF with the command File > Export > PDF format, selecting a different destination folder?
If so, the source document should still be there as an ODT in the original folder (that's what happens with OpenOffice, which is similar to LibreOffice).
Thank you all for these almost immediate replies (what a great site CCM is!). I'm mainly answering Alouminium, since he asked me several questions, but of course this applies to everyone:
So, I checked: the lines (or words; let's just say "the text") can be selected with the mouse. So it is indeed stored as "text," and you say there are programs that can convert it. I will therefore download one.
In that case, do I copy-paste my entire book into the conversion software? I thought there was a maximum text length, and that a whole book would be too much. That is also why, to answer Dhyd, I wasn't turning to online converters, and also because I had understood they were paid.
Yes, I suspected a word count was possible on a PDF, but the problem is that when you only have a basic PDF (which is my case), this option is not included; you have to pay to access other options (including this one).
I understand the cake analogy, which echoes my story of the vase broken into 10,000 pieces (I prefer the cake as an example! :-))
Otherwise, to answer Jee Pee about what I call "editable" (perhaps not the right word): when I convert an ODT file to PDF, this is how I do it:
In the File menu, I select "Export as PDF". That opens a window where, on the left, I have a choice:
All
Page
Selection
I check "all" (which is actually pre-checked)
And on the right a choice:
"hybrid format" (pre-checked)
pdf archive (and other details, with a sub-choice)
PDF
Create a form
I keep the "hybrid" checked option
If I understood correctly, since I can select the text in what I get, this choice produces a "text" file.
Finally, to answer Yclick, and following on from the above: when I convert as described, I don't see a "destination folder" choice in that window, so I don't change anything at that stage.
That's it. If any of these details prompt further clarification from you, thank you; in any case, thank you all already for your replies.
Tonight I will download the program Alouminium mentioned, and I will definitely come back with an update, as I always do when I'm helped.
A big thank you!
Whoviana
On the advice of AluMinioume, I tested the conversion in FormatFactory, which I usually use for videos.
With a PDF book (text format), it's possible to obtain a .docx of the book (I think LibreOffice recognizes Word documents). Since FormatFactory does not include OCR, a PDF (image format) will be converted into a .docx containing images instead of text.
https://www.ilovepdf.com/fr converted my book into a .doc.
https://pdfsimpli.com/fr/ also converted the PDF, but you need to create an account to retrieve the file.
https://www.online-convert.com/fr also successfully converted the file.
Thank you Jee Pee. Something came up tonight and I didn't have time to do what I had planned; I'll try tomorrow night, but I'm carefully noting these links. I'd prefer not to open an account, so I'll start with the sites that don't require one.
The book is pure text. No images.
Have a good evening!
Good evening
In response to Trotte-Menu: I did as you said, but unfortunately, when I click "Open with," I only have 4 choices: Adobe Acrobat Reader DC, Adobe Acrobat touch, Firefox, Edge. There is also "open in another application," but it shows the same options.
So I have no way to open it with LibreOffice.
Otherwise, I notice something strange in the window showing the folders and files for this book: in the "Type" column, my LibreOffice text files are listed as "Open Document Text" but my PDF files are listed as Microsoft Edge P... Is that normal?
Me again; I couldn't come back here sooner.
So, Jee Pee, does that mean we can choose what a PDF is associated with? I hadn't noticed that the question was put to me... Is it in the general Windows settings? And does this association matter when converting the file to ODT, or does it make no difference? If it makes no difference, that's fine with me; Edge or something else, it's all the same to me, but I'm asking because I like to (try to) understand.
Well then, here we go (and the end?)
I just converted it with online-convert.
Quick (they told me "files over 1000 KB will take time," but less than 20 seconds later I had my file!!!)
However, I got something quite strange typographically:
on the same page, and from text in a uniform style (no bold, no italics, no different font sizes...), I end up with normal lines, bold lines, lines in 2 larger sizes... Quite funny, and I must say not ugly if it had been a document for graphic purposes :-).
That said, this oddity doesn't matter since I was simply trying to "count the words," which is now done!
And maybe it won't happen with the other sites you gave me, which I will test.
And it occurred to me (after the fact, as often!) that since the book has been published (and is selling quite well, it being summer and the book a small cozy mystery that's easy to read on the beach or elsewhere :-)), the publisher could simply have given me the figure!!!
But there's a silver lining to every misfortune: I learned some things from this thread, things that I am eager to note in my "useful computer tips" file.
If Jee Pee could answer me about the "association" question, it would help me complete my notes; otherwise, I want to say a very, very big thank you again.
Have a great end of the week!
Online-convert works very well for practically all formats.
Generally speaking, a document's formatting is not preserved when it is converted, and that's fairly normal: how would you expect software to recognize white space and turn it back into formatting? It's nearly impossible to map gaps to any specific formatting.
Any typographic character will be recognized by OCR, but not bold, italics, etc.
But isn't the main goal to retrieve the text? Can you imagine retyping everything by hand?
And in the future, when you delete a file, think carefully about what you're doing before clicking YES.
Nowadays disks, both internal and external, can be quite large, so keeping a few backups somewhere won't harm your computer's health, and it will certainly help yours by sparing you hours of redoing everything (if redoing it is even possible).
Come on, hang in there; this unfortunate episode has at least taught you a few things, such as making backups to preserve your work.
Dhyd, if you had read me carefully, you would have seen that I am not complaining at all about the different rendering of my text after conversion (on the contrary, I find it quite charmingly fun). I clearly say in my message that it didn't bother me, since the goal was simply to count the words.
The important thing, as you said (and as I thought I was clearly implying), was to "retrieve my text" in order to count the words.
Text that, indeed, I could not see myself retyping because... I would simply have copied and pasted it (since that can be done from a PDF).
As for backups... believe me, I make them. I have been a copywriter in advertising and direct marketing, I have written entire catalogs, and my automatic saves occur every 2 minutes. I save, I save, I save, not only the work in progress, but all previous versions of that same work, born from different ideas that the advertiser might have preferred. Backups that are well organized in files and then folders.
But you see, Dhyd, there are simply, in human nature, little things that we can call "distraction", "clumsiness", and even more, "wrong manipulation".
It's one of those "wrong manipulations" that brought me here, where kind souls rescued me without giving me... um... that gentle little moral lesson (are you a teacher?) :-)
Thanks again to everyone (I naturally include you in that, Dhyd, because you also took the time to propose a solution, and I thank you just like the others)
A file extension (.odt, .doc, .txt, .pdf, ...) can be associated with a default program. If you click on a file in the explorer, the associated program will launch to open the file.
A .pdf will be associated with software capable of opening it. Edge can do this, Firefox too. Or you can have specialized PDF software: Adobe Acrobat, Foxit.
You can change the default program associated with the extension:
- in the explorer, right-click on a pdf
- Open with, there you should find programs capable of opening a pdf
- or you have the option "Choose another app" to look for others
- by going through this last option, you can choose a program and, by checking the box "Always use this app to open .pdf files," make it the default program for this type of file.
Ah okay, I understand. I guess opening a PDF in one program rather than another matters to people who want to work on that PDF, since each program offers options another might not, but at my level it makes no difference. Thank you very much for this insight, Jee Pee, and for your patient and very clear answers.
I don't know whether we can consider this thread resolved. I think so, pending some future magic program that makes this as simple as Trotte-menu seemed to imply (I really don't see how he/she managed to get LibreOffice into the "Open with" list. Is there a way? If so, that would solve everything, but it seems to me it would be widely known, right?...)
I will therefore mark it as Resolved, and I thank again everyone who took the time to look into my question.
Have a good end of the week!
Hello,
When it is created, a LibreOffice file can be exported as a hybrid PDF.
To do this, check the hybrid PDF option (ODF file embedded) in File > Export As > Export as PDF > General tab > General.
This PDF can then be reopened in the module that was used to create it.
Example with a .odt file
If the option is not checked, right-clicking and choosing Open with > LibreOffice opens the PDF in Draw.
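For anyone who lacks the "Open with" entry, LibreOffice can also be driven from the command line and told to use the Writer PDF import filter instead of Draw. The sketch below rests on several assumptions: that the soffice executable is on the PATH (on Windows it usually is not, and lives in LibreOffice's program folder), that the function names and book.pdf are only illustrative, and that the layout of the result may be rough, much like with the online converters.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(pdf_path, out_dir=".", soffice="soffice"):
    """Build the argument list asking LibreOffice to import a PDF via Writer."""
    return [
        soffice,
        "--headless",                     # run without opening a window
        "--infilter=writer_pdf_import",   # Writer's PDF import filter, not Draw's
        "--convert-to", "odt",
        "--outdir", str(out_dir),
        str(pdf_path),
    ]


def convert_pdf_to_odt(pdf_path, out_dir="."):
    """Run the conversion and return the expected path of the .odt file."""
    exe = shutil.which("soffice")  # assumption: soffice is reachable on PATH
    if exe is None:
        raise FileNotFoundError("soffice executable not found on PATH")
    subprocess.run(build_convert_command(pdf_path, out_dir, exe), check=True)
    return Path(out_dir) / (Path(pdf_path).stem + ".odt")


# Hypothetical usage:
# print(convert_pdf_to_odt("book.pdf", "converted"))
```

Building the command separately from running it makes it easy to print and check the exact arguments before launching a long conversion.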
Trotte-menu, I did export as a hybrid PDF with the settings you mention, but I'm sorry, I simply do not have the "open with LibreOffice" option.
But anyway, for the very small need I had to convert back after my mistake (deleting my original .odt file!...), the online converters suited me perfectly. You are lucky to have the "open with LibreOffice" option all the same, and I'm not sure other people have that choice; otherwise, it's so simple and basic that I think someone would have suggested it to me...
Best regards.
