Script: multiple documents into a single PDF + directory structure

rb12000 (Member, 22 posts)
Hello,

I have just recovered a mini "EDMS" (Electronic Document Management System), but the documents were scanned page by page (front side only), so each document consists of X .jpg files, X being its number of pages. I would like to automatically generate one PDF per document from these files, using a CMD (batch) script or an Excel macro. I have already created the directory structure with two Excel macros.



Can you please help me?

Thank you.

4 answers

barnabe0057 (Contributor, 14329 posts)
 
I found the solution, and it took me quite a bit of time ;)

I made you a Batch script to convert your documents, here’s the code:

@echo off

:: testing if nconvert.exe is present

if not exist %windir%\system32\nconvert.exe goto :eof

:: source directory containing the scanned documents

set rep_source=D:\RECUP_GED\3000057\Administratif

:: testing if the source directory exists

cd /d "%rep_source%"
if not %errorlevel%==0 goto :eof

if not exist "%rep_source%\Documents_PDF" mkdir "%rep_source%\Documents_PDF"

:: defining a temporary directory

set dir_temp=D:\TMP_jpg_to_pdf

:: grouping all pages of each document

for /f "tokens=1,2,3 delims=_." %%a in ('dir /b /a-d-s-l "*.jpg"') do (

if not exist "%dir_temp%\%%a" mkdir "%dir_temp%\%%a"

copy "%%a_%%b.%%c" "%dir_temp%\%%a\" > nul

)

echo.

:: processing documents one by one

cd /d "%dir_temp%"

:: delayed expansion is enabled once, before the loop, so that setlocal calls do not pile up (cmd allows at most 32 nested levels)

Setlocal enableextensions enabledelayedexpansion

for /f "tokens=*" %%a In ('dir /b /ad "*.*"') do (

set fichier=%%a

cd /d "%dir_temp%\%%a"

set /a compteur=0

For /r %%i In (*.jpg) Do (set /a compteur+=1)

if !compteur! GTR 1 (nconvert -in jpeg -out pdf -multi -o "res_!fichier!.pdf" *.jpg) else (nconvert -in jpeg -out pdf -o "res_!fichier!.pdf" *.jpg)

echo.

move /y "res_!fichier!.pdf" "!rep_source!\Documents_PDF" > nul

)

Endlocal

cd /d "%USERPROFILE%"

rmdir /s /q "%dir_temp%"

explorer /select,"%rep_source%\Documents_PDF"

:eof

barnabe0057 (Contributor, 14329 posts)
 
First, download the NConvert tool, available here:

https://www.xnview.com/fr/nconvert/

Extract the file nconvert.exe and copy it to C:\Windows\System32.


PS 1: I have tested and retested the script, and it works very well.

PS 2: The script does not modify your .jpg files (it only works on copies), so there is nothing to worry about.
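
Because the script exits without any message when nconvert.exe is missing, a quick check beforehand can be useful. This is only a minimal sketch: it assumes nconvert.exe sits in a folder listed in the PATH (System32 is one) and relies on the standard `where` command.

```bat
@echo off
rem Look for nconvert.exe in the folders listed in the PATH
where nconvert >nul 2>nul
if errorlevel 1 (
    echo nconvert.exe was not found on the PATH.
    exit /b 1
)
echo nconvert.exe found.
```

If the first message appears, copy nconvert.exe to C:\Windows\System32 as described above and run the check again.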

barnabe0057 (Contributor, 14329 posts), replying to barnabe0057
 
What is the naming convention for the documents?

For example, do all the files shown in your message (008816) belong to one and the same document?

If so, you can use my script as is.
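
For reference, the script groups pages by the part of the file name that precedes the first underscore, using `for /f` with `_` and `.` as delimiters. The sketch below shows how one name is split; the name 008816_001.jpg is a hypothetical example of the assumed pattern (document number, underscore, page number), not a file taken from this thread.

```bat
@echo off
rem Hypothetical file name following the assumed pattern document_page.jpg
for /f "tokens=1,2,3 delims=_." %%a in ("008816_001.jpg") do (
    echo document=%%a page=%%b extension=%%c
)
```

Every file whose first token is identical ends up in the same temporary folder, and therefore in the same PDF. Names that contain additional underscores or dots would not be rebuilt correctly by the `copy` command, so they would need a different pattern.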

rb12000 (Member, 22 posts)
 
Thank you very much for taking the time to respond to me. I will take a closer look at this today and keep you updated. Have a great day!!

barnabe0057 (Contributor, 14329 posts), replying to rb12000
 
Thank you, have a nice day as well.

rb12000 (Member, 22 posts), replying to barnabe0057
 
Hi, sorry for the delay! It works really well in testing! However, my directory tree is laid out like this: D:\RECUP_GED\%NumDoss%, where %NumDoss% covers 17,000 folders, each containing 1 to 3 subfolders. The images to convert are in those subfolders.
Could you apply the same principle, but with a loop over my directory tree? Or read a list from the batch file (since I have that list)?

Second point: I notice that the generated PDF files are extremely large (a 200 KB image is converted into an 8 MB PDF)... Would you have any idea why?

Thanks a lot in any case ;-) !!

rb12000 (Member, 22 posts)
 


Here is part of the list, it is in XLSM format.
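
If the list is used instead of scanning the whole tree, one possible approach is to export it from Excel to a plain text file and read it with `for /f`. This is only a sketch: the file name folders.txt and its layout (one folder number per line) are assumptions, not something provided in this thread.

```bat
@echo off
rem Assumption: the XLSM list was exported to folders.txt, one folder number per line
set "root=D:\RECUP_GED"

for /f "usebackq delims=" %%N in ("folders.txt") do (
    if exist "%root%\%%N\" (
        echo Processing "%root%\%%N"
    ) else (
        echo Missing folder: %%N
    )
)
```

The `echo Processing` line is a placeholder for the conversion steps; the listing only shows how each folder number from the list is turned into a path and checked.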

barnabe0057 (Contributor, 14329 posts)
 
Here is the beast:

@echo off

if not exist %windir%\system32\nconvert.exe goto :eof

cd /d D:\Recup_GED

if not %errorlevel%==0 goto :eof

for /f "tokens=*" %%K In ('dir /b /ad "*.*"') do (

for /f "tokens=*" %%E In ('dir /b /ad "%%~dpnK\*.*"') do (

if not exist "%%~dpnK\%%E\Documents_PDF" mkdir "%%~dpnK\%%E\Documents_PDF"

if not exist "%%~dpnK\%%E\TMP_jpg_to_pdf" mkdir "%%~dpnK\%%E\TMP_jpg_to_pdf"

for /f "tokens=1,2,3 delims=_." %%A in ('dir /b /a-d-s-l "%%~dpnK\%%E\*.jpg"') do (

if not exist "%%~dpnK\%%E\TMP_jpg_to_pdf\%%A" mkdir "%%~dpnK\%%E\TMP_jpg_to_pdf\%%A"

copy "%%~dpnK\%%E\%%A_%%B.%%C" "%%~dpnK\%%E\TMP_jpg_to_pdf\%%A" > nul

)

for /f "tokens=*" %%R In ('dir /b /ad "%%~dpnK\%%E\TMP_jpg_to_pdf\*.*"') do (

set fichier=%%R

set /a compteur=0

Setlocal enableextensions enabledelayedexpansion

rem count only the pages of the current document, not every jpg under the current directory

For %%I In ("%%~dpnK\%%E\TMP_jpg_to_pdf\%%R\*.jpg") Do (set /a compteur+=1)

if !compteur! GTR 1 (nconvert -in jpeg -out pdf -multi -c 5 -o "res_!fichier!.pdf" "%%~dpnK\%%E\TMP_jpg_to_pdf\%%R\*.jpg") else (nconvert -in jpeg -out pdf -c 5 -o "res_!fichier!.pdf" "%%~dpnK\%%E\TMP_jpg_to_pdf\%%R\*.jpg")

echo.

move /y "res_!fichier!.pdf" "%%~dpnK\%%E\Documents_PDF" > nul

Endlocal

)

rmdir /s /q "%%~dpnK\%%E\TMP_jpg_to_pdf"

)

)

explorer /select,"%CD%"

:eof

I hope you will like it ;)

I compressed the PDFs as much as possible; they are now 3 to 4 times smaller than the original images.

rb12000 (Member, 22 posts)
 
Thank you for your great work ;-) I’ll test this as soon as possible!!!

barnabe0057 (Contributor, 14329 posts), replying to rb12000
 
You're welcome, I look forward to your feedback ;)

barnabe0057 (Contributor, 14329 posts), replying to barnabe0057
 
Regarding the destination folder for the PDFs, does the current layout suit you or not?

I can group them in a single folder if needed.

barnabe0057 (Contributor, 14329 posts)
 
Here is a slightly modified version where all the PDFs are grouped in a single folder, I think it's better this way:

@echo off

if not exist %windir%\system32\nconvert.exe goto :eof

set source=D:\RECUP_GED

set destination=Documents_PDF

cd /d "%source%" || goto :eof

if not exist "%destination%" mkdir "%destination%"

for /f "tokens=*" %%K In ('dir /b /ad "*.*"') do (

for /f "tokens=*" %%E In ('dir /b /ad "%%~dpnK\*.*"') do (

if not exist "%%~dpnK\%%E\TMP_jpg_to_pdf" mkdir "%%~dpnK\%%E\TMP_jpg_to_pdf"

for /f "tokens=1,2,3 delims=_." %%A in ('dir /b /a-d-s-l "%%~dpnK\%%E\*.jpg"') do (

if not exist "%%~dpnK\%%E\TMP_jpg_to_pdf\%%A" mkdir "%%~dpnK\%%E\TMP_jpg_to_pdf\%%A"

copy "%%~dpnK\%%E\%%A_%%B.%%C" "%%~dpnK\%%E\TMP_jpg_to_pdf\%%A" > nul

)

for /f "tokens=*" %%R In ('dir /b /ad "%%~dpnK\%%E\TMP_jpg_to_pdf\*.*"') do (

nconvert -in jpeg -out pdf -multi -c 5 -o recup_%%R.pdf "%%~dpnK\%%E\TMP_jpg_to_pdf\%%R\*.jpg"

echo.

move /y "recup_%%R.pdf" "%destination%" > nul

)

rmdir /s /q "%%~dpnK\%%E\TMP_jpg_to_pdf"

)

)

explorer /select,"%destination%"

:eof

barnabe0057 (Contributor, 14329 posts)
 
I simplified the code and fixed it to handle spaces in folder names; you never know, it might come in handy.
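
To illustrate why the quotes matter, here is a small self-contained sketch. The folder name "scan demo" and the file name 008816_001.jpg are hypothetical and only serve the demonstration; the folder is created under %TEMP% and removed at the end.

```bat
@echo off
rem Hypothetical folder and file names, used only for this demonstration
set "demo=%TEMP%\scan demo"
if not exist "%demo%" mkdir "%demo%"
copy nul "%demo%\008816_001.jpg" > nul

rem Quoted paths stay in one piece even though the folder name contains a space
for /f "tokens=*" %%F in ('dir /b "%demo%\*.jpg"') do echo Found "%demo%\%%F"

rmdir /s /q "%demo%"
```

Without the quotes, `mkdir` and `copy` would treat the part after the space as a separate argument, which is the kind of failure the quoted version of the script avoids.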