I am trying to write a Python program to transform a Word document in small font size to one in large font size with some specific formatting requirements.
So far I am failing at the first hurdle, as the Word files my program creates are not being recognised as Word files in Replit. When I download them, they are recognised as Word files but will not open because they are said to include corrupted content (even when they contain just 2 words). I am working on macOS within the Replit website or app. I have installed the python-docx library.
Within my repl, the files I import or create should look like Word files (with a blue W icon). Hopefully they will then open correctly, both directly from the repl and after being downloaded.
Tbh, the appearance of file icons in an IDE (like Replit) depends on the IDE's ability to recognize the file type, not on the Python code itself.
Likewise, Replit not recognizing the file doesn’t mean a problem with the file itself (you can try downloading the file and opening it with a local Word processor to see if it works correctly).
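If you want to check the file itself, independently of how Replit displays it, here is a minimal sketch using only the standard library. It relies on the fact that a .docx file is a ZIP package; the function name `check_docx` and the list of required parts are my own choices for illustration, and `word/document.xml` is only the usual location of the main document part, not a guaranteed one.

```python
import zipfile

# Parts that a .docx package normally contains (an assumption for this sketch;
# the main part is usually, but not necessarily, stored at word/document.xml).
REQUIRED_PARTS = ("[Content_Types].xml", "word/document.xml")


def check_docx(path_or_file):
    """Return a list of problems; an empty list means the basic package structure looks fine."""
    if not zipfile.is_zipfile(path_or_file):
        return ["not a ZIP archive, so not a valid .docx package"]

    problems = []
    with zipfile.ZipFile(path_or_file) as package:
        # testzip() returns the name of the first member with a bad checksum, or None.
        bad_member = package.testzip()
        if bad_member is not None:
            problems.append(f"damaged member: {bad_member}")

        names = set(package.namelist())
        for part in REQUIRED_PARTS:
            if part not in names:
                problems.append(f"missing part: {part}")
    return problems
```

Calling `check_docx('Hello World output.docx')` on the downloaded file should give an empty list if the package is structurally intact. If it reports that the file is not a ZIP archive, the problem is probably in how the file was written or transferred rather than in the formatting code.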
What I do recommend changing is how you set the font name and size: set them directly on the run, not on the paragraph style.
I have tried downloading the .docx files created by my Python program but get error messages saying they are corrupted and won’t open: “Word found unreadable content in “Hello World output.docx”. Do you want to recover the contents of this document? If you trust the source of this document, click Yes.”. I’m unable to recover the content.
Thanks for your advice regarding setting the font name and size directly on the run. I’m not sure what command to use for this. I’ll experiment but do let me know if you know the syntax I should be using. Thanks again!
You don’t have to change much from your original program, just the paragraph part:
from docx import Document
from docx.shared import Pt
from docx.enum.section import WD_ORIENTATION
from docx.enum.text import WD_PARAGRAPH_ALIGNMENT


def transform_document(input_file, output_file):
    # Open the input Word document
    doc = Document(input_file)

    # Create a new output Word document with specified settings
    new_doc = Document()
    section = new_doc.sections[0]
    section.orientation = WD_ORIENTATION.LANDSCAPE
    # Setting the orientation alone does not resize the page,
    # so swap width and height to get a landscape page
    section.page_width, section.page_height = section.page_height, section.page_width
    section.left_margin = Pt(20)
    section.right_margin = Pt(20)
    section.top_margin = Pt(20)
    section.bottom_margin = Pt(20)

    # Loop through each paragraph in the input document
    for para in doc.paragraphs:
        # Add the paragraph to the new document with specified formatting
        new_para = new_doc.add_paragraph()
        # Font settings go on the run, not on the paragraph style
        run = new_para.add_run(para.text)
        run.bold = True
        run.font.name = 'Arial'
        run.font.size = Pt(36)
        new_para.paragraph_format.line_spacing = 1.5

    # Save the new document
    new_doc.save(output_file)


# Provide the input and output file paths
input_file = 'Hello World input.docx'
output_file = 'Hello World output.docx'

# Call the function to transform the document
transform_document(input_file, output_file)