Read a docx document with python3

Solved
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   Ambassadeur -  
mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention   -

Hello,

I'm also starting out with doc, I want to read a Word file in Python:

import docx
doc = docx.Document("L'ANGE GABRIEL")
print (read(doc))

ModuleNotFoundError: No module named 'docx'

Where is the error? Thank you.


17 replies

mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention   7 927
 

Hello,

I'm marking the topic as resolved (see #65). @quentin2121, please remember to do this yourself in the future, as explained here.

Thanks to everyone (especially 633049 and @Diablo76) who helped quentin2121 find the answer. Below is a summary of this long discussion.

Summary of the problem

Reading a docx file in python3 on Windows.

Summary of the solution

  • Install pycharm.
  • Create a new project in pycharm.
    • It is recommended to create a virtual environment for the project using the python 3.11 interpreter, as python 3.12 is currently causing issues.
    • If python3.11 is not installed, see #59 for instructions.
  • Install python-docx. Be careful to install python-docx and not docx (see this link). Two methods are possible:
    • via the Python Package menu in pycharm
    • via the terminal (shell) (not in the python interpreter!) of pycharm, by invoking pip yourself:
      python -m pip install --upgrade pip  # Upgrading pip
      pip install python-docx  # Installing the package
    • If there is a pip error, particularly:
      AttributeError: module 'pkgutil' has no attribute 'ImpImporter'

      ... you should consider using another version of python, such as python 3.11. The above error currently affects python3.12. (The error "No module named 'exceptions'" reported in this thread is a different problem: it comes from installing docx instead of python-docx.)

Good luck and thanks to everyone

3
Anonymous user
 

In the bottom right corner of Pycharm, you can see the interpreter in use (in my screenshot 3.11)

You click on it and you can see all the installed interpreters (in my screenshot 3.9 and 3.11) as well as menus to configure everything.

There, either you choose 3.11, or you install it if it hasn't been done, and that will be fine at least for your current project.

As we have already told you, Pycharm creates a virtual environment for each project. All the dependencies of the project are in this environment (For example, docx if you install it, will only be accessible to this project, from another one you would have to install it again).
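
To see this in practice, the following minimal sketch reports whether the running interpreter belongs to a virtual environment and where that environment lives. It only uses the standard library; the function name in_virtualenv is chosen here for illustration and does not come from the thread.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Environment:", sys.prefix)
print("Inside a virtual environment:", in_virtualenv())
```

Run from the PyCharm run button, this should show a path inside the project's venv folder if the project uses its own environment.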

But there are settings that are global to Pycharm, and I don't know this IDE well enough to tell you whether the interpreter is global or not.

But since it says Python 3.11 (SAGC) which is the name of my project, I tend to think it is not a global setting.


When I was little, the Dead Sea was only sick.
George Burns

1
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311
 

When I click in the bottom right, I only see Python 3.12. I searched for the path of 3.11, which is as follows:

C:\Users\quent\AppData\Local\Programs\Python\Python311

I can't get it to register in PyCharm as the interpreter! Is there a way to install it?

0
Diablo76 Posted messages 344 Registration date   Status Membre Last intervention   140 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

Yes, you can add an interpreter to your existing project via the "Add new interpreter" menu

Then, you select your existing project and enter the path to python 3.11 (C:\Users\quent\AppData\Local\Programs\Python\Python311)

1
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > Diablo76 Posted messages 344 Registration date   Status Membre Last intervention  
 

Once on the path C:\Users\quent\AppData

I can't go any further, and click OK to validate the path?

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

Basically, Pycharm doesn't want to take Python 311! Maybe because we often install Python first, then Pycharm?

0
Anonymous user
 

Have you installed the module on your PC?



0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311
 

I thought this had been done on Pycharm during my classes. Please give me the instructions.

with:

pip install python-docx

it's giving an error!

0
Anonymous user
 

What mistake?



0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 
import docx
doc = docx.Document
# doc.read("THE ANGEL GABRIEL")

In fact, the error may be that my document is on drive D, how can I access it from Python?

By entering cd dir on line 2?

I added my document in PythonProject via the Windows file explorer on drive C, it doesn't work any better.

In my courses, the docx module is already installed in PyCharm, as we don't add it at the start of the exercise.

0
Phil_1857
 

Hello,

No matter where the file is located, put the path:

import docx
doc = docx.Document("C:\\Phil\\Dev\\Python\\tests\\Word_test.docx")
for para in doc.paragraphs:
    print('Paragraph:\n',para.text)
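
If python-docx cannot be installed yet, the text can still be inspected with the standard library alone, because a .docx file is a ZIP archive whose main text is stored in word/document.xml. The sketch below is an alternative to the python-docx code above, not what Phil_1857 proposed. read_docx_paragraphs is a hypothetical helper name, and the approach ignores headers, footers and footnotes.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by Word for the main document part.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_docx_paragraphs(path):
    """Return the text of each paragraph of a .docx file, in document order."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join every <w:t> piece.
        pieces = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

For example, calling read_docx_paragraphs("doc.docx") on a file located next to the script returns one string per paragraph, which can then be printed in a loop.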
0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

I just want the text to be displayed in Pycharm, read, "read", if that's possible!

Do I need to enter the path to Pycharm?

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   Ambassadeur 1 311
 
import docx
doc = docx.Document("C:\\Users\\quent\\PycharmProjects\\pythonProject\\L'ANGE_GABRIEL.docx")

I have this error:

    doc = docx.Document("C:\\Users\\quent\\PycharmProjects\\pythonProject\\L'ANGE_GABRIEL.docx")

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape


0
jee pee Posted messages 31926 Registration date   Status Modérateur Last intervention   9 950
 

Do not use the backslash (\), which is the escape character, in the path; use the forward slash (/) instead.

Then, you should start with a file with a simple name located in the same directory as the code: doc.docx
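
The usual ways of writing a Windows path in Python can be compared as follows. A plain "C:\Users..." string fails because \U is read as the start of a Unicode escape. The path used here is only an example, and PureWindowsPath is used so that the comparison behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# Option 1: double every backslash so it is not read as an escape.
escaped = "C:\\Users\\quent\\PycharmProjects\\pythonProject\\doc.docx"
# Option 2: raw string, where backslashes are taken literally.
raw = r"C:\Users\quent\PycharmProjects\pythonProject\doc.docx"
# Option 3: forward slashes, which Windows also accepts.
forward = "C:/Users/quent/PycharmProjects/pythonProject/doc.docx"

print(escaped == raw)  # True
print(PureWindowsPath(forward) == PureWindowsPath(raw))  # True
```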

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > jee pee Posted messages 31926 Registration date   Status Modérateur Last intervention  
 

I did this, but I'm still getting errors

I was taught to do it like this:

import docx
doc = docx.Document()
# doc.add_paragraph("hello word")
doc.save("hello word.docx")
paraObj1 = doc.add_paragraph("second paragraph")
paraObj2 = doc.add_paragraph("third paragraph")

paraObj1.add_run("additional text")

doc.save("multiple_paragraphs.docx")

0
goulu > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

Good evening,

In programming, when you have errors, you need to indicate these errors. Python is great for that, so please post the complete traceback of the errors, and especially no images, plain text is more than sufficient.

You should probably follow a tutorial regarding this module because right now, you seem to be completely struggling.

For example, this one:

https://www.geeksforgeeks.org/python-working-with-docx-module/

(there are other tutorials at the bottom of the page regarding this module)

Or even:

https://stackabuse.com/reading-and-writing-ms-word-files-in-python-via-python-docx-module/

Good luck.

0
georges97 Posted messages 14512 Registration date   Status Contributeur Last intervention   2 899 > goulu
 

Good evening goulu,

I hope Quentin will follow your excellent advice.
 

As for the line spacing, it seems to be a bug that hasn't been fixed for over a year, which affects the comments in posts.

You just have to press shift + enter to create the line spacing.

1
goulu > georges97 Posted messages 14512 Registration date   Status Contributeur Last intervention  
 

Oh okay, thanks for the info, I didn't know that ;)

0
Phil_1857
 

Hello,

Have you tried this:

import docx
doc = docx.Document("C:/Users/quent/PycharmProjects/pythonProject/L'ANGE_GABRIEL.docx")
0
Anonymous user > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

At the risk of repeating what has already been asked of you by Goulu

In programming, when we have errors, we need to indicate those errors, Python is great for that, so please post the complete traceback of the errors, and especially no images, plain text is more than enough

1
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > Anonymous user
 

  File "C:\Users\quent\PycharmProjects\pythonProject\l'ange gabriel.py", line 1, in <module>

    import docx

  File "C:\Users\quent\PycharmProjects\pythonProject4\venv\Lib\site-packages\docx.py", line 21, in <module>

    from PIL import Image

  File "C:\Users\quent\PycharmProjects\pythonProject4\venv\Lib\site-packages\PIL\Image.py", line 39, in <module>

    import tempfile

  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\tempfile.py", line 45, in <module>

    from random import Random as _Random

  File "C:\Users\quent\PycharmProjects\pythonProject\random.py", line 2

    generate a random number(1-10)

              ^^

SyntaxError: invalid syntax

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   Ambassadeur 1 311
 

I’m starting to wonder if I have the docx module installed on Pycharm? What path or console command can I use to check it, please?


0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   Ambassadeur 1 311
 

Hello,

Is there no one to help me? I'm stuck here.


0
Anonymous user
 

Small reminders:

  1. We are volunteers
  2. Our lives take precedence over any help on the forum
  3. When we ask you questions to better understand your problem, you don't always respond quickly, when you respond at all...

So if you want to continue receiving help from us, a little patience and respect are required.

A priori yes, the module is installed in your project's virtual environment; we can see in the call stack that the error goes through

 File "C:\Users\quent\PycharmProjects\pythonProject4\venv\Lib\site-packages\docx.py",

And if we continue to follow the stack, it then tries to import the random module.

I haven't checked and I don't have time right now, but the call path surprises me.

Did you perhaps make the poor decision to write your own random module, with a bug in it?

1
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > Anonymous user
 

Excuse me for my impatience in resolving my issue.

I haven't done anything in the Random module, apart from using it during my classes in exercises to discover it.

Thank you for your interest in my topic, have a great Sunday!

0
Anonymous user
 

On my PC that has Pycharm, I do have 2 files named random.py but

  • they are not organized in a folder that looks like a personal folder
  • the comments are not in French.

So let me rephrase my question: did you write the file "C:\Users\quent\PycharmProjects\pythonProject\random.py"?



0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311
 

I checked in my Pycharm file explorer, it was indeed me who created it for a course that covered the Random module.

from random import randint
# generate a random number(1-10)

I only copied the beginning of the code, I have a "syntax error" on line 2.

0
Anonymous user
 

So for some reason I don't know, PyCharm thinks the random module is your file and not the original one.

We need to make sure the real one is taken into account, maybe reinstall it.

As for

I only copied the beginning of the code, I have an error with "syntax error" on line 2.

yes it is doubly obvious, first because in your other program that is exactly what the error message is telling you. And moreover, a comment starts with #.
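
A likely explanation, stated here as general Python behaviour rather than something confirmed for this project: when a script is run, its own folder is searched first for imports, so a file named random.py next to the script hides the standard library module. A minimal sketch to check which file a module name resolves to (module_origin is a hypothetical helper name):

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# A path inside your own project, instead of the Python installation's
# Lib folder, means a local file is shadowing the standard module.
print(module_origin("random"))
```

If the printed path points into the project, renaming or removing that local file is enough; the standard random module does not need to be reinstalled.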



0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311
 

I deleted the file Random.py in my Windows explorer (after backing it up elsewhere), but while testing my code L'ANGE_GABRIEL.py, there are still errors:

File "C:\Users\quent\PycharmProjects\pythonProject\L'ANGE_GABRIEL.py", line 1, in <module>

import docx

File "C:\Users\quent\PycharmProjects\pythonProject4\venv\Lib\site-packages\docx.py", line 30, in <module>

from exceptions import PendingDeprecationWarning

ModuleNotFoundError: No module named 'exceptions'

By the way, what command should I use in the console to reinstall Random?

0
mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention   7 927
 

Hello,

I haven't read everything, so my apologies if what I'm saying is a repeat:

  • As explained in this link, for modern versions of Python, you need to install python-docx and not docx. This will solve the error you mentioned in #28.
    pip uninstall docx
    pip install python-docx
  • Regarding #26 and the naming of your function
    • A Python function name should not contain spaces.
    • I strongly discourage the use of non-ASCII characters (typically accented characters). So rename your function "générer un nombre aléatoire" to "generer_un_nombre_aleatoire". If you really wanted to use Unicode characters, your file would need to start with the appropriate header. Example: 
      #!/usr/bin/env python3
      # -*- coding: utf-8 -*-
      def f():
          print("Hello world")
    • I recommend naming your functions in English; this helps develop good habits for the future and eliminates the use of accented characters by default.
    • In your particular case, the function you are trying to implement is probably random.randint
      from random import randint
      n = randint(1, 10)  # 1 <= n <= 10
  • Instead of taking screenshots of your call stack, copy-paste it in full (as you did in #28). Also, please include it in a code section as explained here.
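
The rule about spaces can be checked with str.isidentifier, which tells whether a string is a valid Python name. A small sketch using the names from this reply (the English name is a hypothetical suggestion):

```python
candidates = [
    "générer un nombre aléatoire",  # contains spaces: not a valid name
    "generer_un_nombre_aleatoire",  # ASCII with underscores: valid
    "generate_random_number",       # English name: valid
]

for name in candidates:
    print(repr(name), name.isidentifier())
```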

Good luck

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311
 
  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\code.py", line 63, in runsource
    code = self.compile(source, filename, symbol)
  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\codeop.py", line 153, in __call__
    return _maybe_compile(self.compiler, source, filename, symbol)
  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\codeop.py", line 73, in _maybe_compile
    return compiler(source, filename, symbol)
0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

The above is what the PyCharm console returned for this command:

pip remove docx

0
mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention   7 927 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

The trace seems incomplete to me. At worst, check if docx is installed (with pip list). Then install python-docx with pip install python-docx.

Good luck
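
Besides pip list, the same check can be done from Python with importlib.metadata (in the standard library since Python 3.8). This is only a sketch, and installed_version is a hypothetical helper name. The names passed here are distribution names as shown by pip, not import names.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Ideally python-docx shows a version number and docx shows None.
for name in ("python-docx", "docx"):
    print(name, "->", installed_version(name))
```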

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention  
 

pip list:

  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\code.py", line 63, in runsource
    code = self.compile(source, filename, symbol)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\codeop.py", line 153, in __call__
    return _maybe_compile(self.compiler, source, filename, symbol)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\codeop.py", line 73, in _maybe_compile
    return compiler(source, filename, symbol)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\quent\AppData\Local\Programs\Python\Python312\Lib\codeop.py", line 118, in __call__
    codeob = compile(source, filename, symbol, self.flags, True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<input>", line 1
    pip list
        ^^^^
0
mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention   7 927 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

Given the error, I have the impression that you are typing your pip commands in a Python script or console, whereas you should type them in a terminal (for example, the Windows command prompt or the Terminal tab if your IDE offers one).
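
The difference can be illustrated with a short sketch: "pip list" is a shell command, so Python rejects it as a syntax error, which matches the trace above. If a command really has to be launched from Python, one common approach is to run pip as a module of the current interpreter. The helper names is_valid_python and pip_command are hypothetical.

```python
import sys


def is_valid_python(source):
    """Return True if the text compiles as Python code (nothing is executed)."""
    try:
        compile(source, "<input>", "exec")
    except SyntaxError:
        return False
    return True


def pip_command(*args):
    """Build the command line that runs pip with the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


print(is_valid_python("pip list"))     # False: this belongs in a terminal
print(is_valid_python("import docx"))  # True: this belongs in a script
print(pip_command("list"))
```

The list returned by pip_command could be passed to subprocess.run, but typing the command in the PyCharm Terminal tab remains the simpler option.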

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   Ambassadeur 1 311
 

Maybe Pycharm is bugging, maybe it needs to be uninstalled, reinstalled? While saving my projects?


0
Diablo76 Posted messages 344 Registration date   Status Membre Last intervention   140
 

Hello,

First, you need to select the Terminal in your toolbar on the left:

Another method to install/uninstall a module is the package manager:

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > Diablo76 Posted messages 344 Registration date   Status Membre Last intervention  
 

I did the same as you for the package and:

0
Anonymous user
 

I opened a project in PyCharm where I have docx.

With the package manager (as @Diablo76 showed you), I can see it in the list of installed components.

Do you see it?

I'm asking you this because earlier your error trace seemed to go through this package. So maybe it wasn't the right package as @mamiemando suggested.

In that case, maybe you should uninstall it first.

To the right of the package manager, you can click on the three dots.

Otherwise, in the PyCharm terminal (as Diablo also showed you), with pip list, you can also see the installed packages in your project.

To be sure, you should see

 (venv) followed by the path to your project

And the install commands have already been given to you, here's the uninstall one:

pip uninstall docx



0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311
 

Hello,

In the packages, I have this installed:

When I try to install python-docx, I get an error message, failed:

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

When I run pip list in the terminal, I get:

  File "<frozen runpy>", line 198, in _run_module_as_main

  File "<frozen runpy>", line 88, in _run_code

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Scripts\pip.exe\__main__.py", line 4, in <module>

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_internal\cli\main.py", line 9, in <module>

    from pip._internal.cli.autocompletion import autocomplete

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_internal\cli\autocompletion.py", line 10, in <module>

    from pip._internal.cli.main_parser import create_main_parser

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_internal\cli\main_parser.py", line 8, in <module>

    from pip._internal.cli import cmdoptions

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_internal\cli\cmdoptions.py", line 23, in <module>

    from pip._internal.cli.parser import ConfigOptionParser

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_internal\cli\parser.py", line 12, in <module>

    from pip._internal.configuration import Configuration, ConfigurationError

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_internal\configuration.py", line 20, in <module>

    from pip._internal.exceptions import (

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_internal\exceptions.py", line 7, in <module>

    from pip._vendor.pkg_resources import Distribution

  File "C:\Users\quent\PycharmProjects\pythonProject1\venv\Lib\site-packages\pip\_vendor\pkg_resources\__init__.py", line 2164, in <module>

    register_finder(pkgutil.ImpImporter, find_on_path)

                    ^^^^^^^^^^^^^^^^^^^

AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?

0
Diablo76 Posted messages 344 Registration date   Status Membre Last intervention   140 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

Ok, update your packages by clicking on the links and try reinstalling python-docx:

0
goulu > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

Hello Quentin.

Sorry for you, but you have a version of Python that's too recent XD, you'll need to start over using an earlier version.

https://stackoverflow.com/questions/77364550/attributeerror-module-pkgutil-has-no-attribute-impimporter-did-you-mean#answer-77364602

0
mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention   7 927 > Diablo76 Posted messages 344 Registration date   Status Membre Last intervention  
 

The problem seems to be related to python3.12 (see here). This discussion suggests reverting to either python3.11 or adopting this solution.
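
For context, pkgutil.ImpImporter was removed in Python 3.12, which is why the older pip bundled in that virtual environment fails with the AttributeError shown above. A short sketch to check whether the current interpreter is affected (has_legacy_importer is a hypothetical helper name):

```python
import pkgutil
import sys


def has_legacy_importer():
    """Return True if this Python still provides pkgutil.ImpImporter."""
    return hasattr(pkgutil, "ImpImporter")


print("Python version:", ".".join(map(str, sys.version_info[:3])))
print("pkgutil.ImpImporter available:", has_legacy_importer())
```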

1
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   Ambassadeur 1 311
 

Thank you to Whismeril, goulu, and mamiemando for taking an interest in my problem. I uninstalled Python 3.12 and installed Python 3.11. When I run pip install python-docx, I get:

(venv) PS C:\Users\quent\PycharmProjects\pythonProject> pip install python-docx

No Python at '"C:\Users\quent\AppData\Local\Programs\Python\Python312\python.exe'

(venv) PS C:\Users\quent\PycharmProjects\pythonProject>

Python is installed on drive C, under AppData\Local. During the installation, at the very beginning, I checked the option at the bottom that adds Python to the PATH. Could this be the issue?


0
Diablo76 Posted messages 344 Registration date   Status Membre Last intervention   140
 

Hello,

If it's still the same project, you need to specify the new version of Python (3.11) in its settings.
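
The message "No Python at ..." usually means that the project's virtual environment was created with an interpreter that no longer exists. An environment created by venv records its base interpreter folder in a pyvenv.cfg file, under the key home. A minimal sketch to read it, assuming a standard venv layout (venv_home is a hypothetical helper name):

```python
import sys
from pathlib import Path


def venv_home(venv_dir):
    """Return the 'home' entry of a venv's pyvenv.cfg, or None if missing."""
    cfg = Path(venv_dir) / "pyvenv.cfg"
    if not cfg.is_file():
        return None
    for line in cfg.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "home":
            return value.strip()
    return None


# Inside a virtual environment, sys.prefix is the environment folder.
print(venv_home(sys.prefix))
```

If the printed folder no longer exists (for example a removed Python312 folder), recreating the environment with the new interpreter is one way out.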

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > Diablo76 Posted messages 344 Registration date   Status Membre Last intervention  
 

Thank you for helping me too. In fact, do you need to specify the version of Python in PyCharm? Or for each project, right-click, settings?

0
Anonymous user
 

And if you copy and paste the path into the text area of the explorer window?



0
Diablo76 Posted messages 344 Registration date   Status Membre Last intervention   140
 

Yes, Ctrl/C Ctrl/V work, I just tested.

After that, I can't help any further, I'm not on Windows...

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311
 

I managed to set up the Python 3.11 interpreter by creating a new project. However, when typing in the Pycharm terminal: pip install python-docx, I get this:

 No Python at '"C:\Users\quent\AppData\Local\Programs\Python\Python312\python.exe'

Version 3.12 is uninstalled! Finally, I have this left:

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   1 311 > quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention  
 

Oh, I managed to do it by using the Pycharm package, I installed python-docx, FINALLY!

0
quentin2121 Posted messages 9063 Registration date   Status Membre Last intervention   Ambassadeur 1 311
 

Hello Mamiemando,

Thanks for closing the topic. But I hadn't done it because I still have questions. I installed pyinstaller on my PC to run Python projects on any PCs not equipped with Python. It works for other files, but not for the ange_gabriel?

Actually, the file opens fine in PyCharm with the lines I included in the project, but I wanted the whole content of the text file to be displayed in the PyCharm run console, and then on another PC once I have packaged it with pyinstaller.

Do I need to add any other lines of code for it to display in Pycharm? Will it be possible to create a .exe file with pyinstaller? Thanks again for the help!


0
mamiemando Posted messages 33540 Registration date   Status Modérateur Last intervention   7 927
 

Hello quentin2121: this question is not directly related to the initial topic, so you should ask it in a new discussion if you want more details. But in essence, either the target machine has a Python interpreter and the dependencies your project needs, or you need to turn your project into a standalone executable, for example with pyinstaller.

0