• Utility to select smaller of two file versions?

    From Philip Herlihy@21:1/5 to All on Sun Aug 4 23:00:24 2024
    I'm looking for a utility to select the smaller of two versions of a PDF document.

    I'm helping a friend compress PDFs of music scores to load onto a tablet to play from. We've found that Corel's PDF Fusion can batch process compression, usually taking about 40% off the file size. But some files come out bigger, occasionally much bigger! We'd like to use the original if it's smaller than the processed version.

    I'd thought Robocopy could likely do this - but it can't. (It will set a min or max file size to be copied/moved/merged.) I've looked at Xcopy, and I can't think of a command-line (or other) utility that will do this. CoPilot will write you a PowerShell script to do it, but not (still) having learned PowerShell I'd be wary of trusting a generated script.

    Can anyone think of a utility that will do this? I'd get PDF Fusion to dump processed files in a separate folder, then we'd want to compare sizes with the folder with the originals - unless there's a better way?

    --

    Phil, London

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Burns@21:1/5 to Philip Herlihy on Mon Aug 5 12:38:52 2024
    Philip Herlihy wrote:

    I'm looking for a utility to select the smaller of two versions of a PDF document.

    You could do it using the %~z modifier to get the file size into a
    variable in a CMD batch file.

    You could use wsh/jscript/vbscript/filesystemobject but that's going
    away, so nowadays I'd probably use powershell.

  • From Philip Herlihy@21:1/5 to All on Mon Aug 5 13:41:44 2024
    In article <lhbrudFo7tvU1@mid.individual.net>, Andy Burns wrote...

    Philip Herlihy wrote:

    I'm looking for a utility to select the smaller of two versions of a PDF document.
    You could do it using the %~z modifier to get the file size into a
    variable in a CMD batch file.

    You could use wsh/jscript/vbscript/filesystemobject but that's going
    away, so nowadays I'd probably use powershell.

    Clearly, for scripting PowerShell is the way to go. I've just never yet found time to learn it. I was something of a guru in Unix shell-scripting in my day, but there's always something pressing getting in the way of learning things I don't often have a need for these days!

    --

    Phil, London

  • From Andy Burns@21:1/5 to Philip Herlihy on Mon Aug 5 14:16:52 2024
    Philip Herlihy wrote:

    Clearly, for scripting PowerShell is the way to go. I've just never yet found
    time to learn it.

    I can't say that using CoPilot as a buddy-programmer interests me much,
    but let it have a go at writing you a script. I suspect that explaining
    to it how the corresponding old and new PDF files will be named might be
    a challenge?
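
On the naming question: one simple convention is for the processed files to keep their original filenames in a separate folder, so that pairs can be matched by name alone. The sketch below assumes that convention (the thread does not confirm that PDF Fusion preserves names). `pair_by_name` and its folder arguments are hypothetical names chosen for illustration. It also reports files that have no counterpart, so nothing is silently skipped.

```python
from pathlib import Path


def pair_by_name(original_dir, processed_dir):
    """Match PDFs in two folders by filename, ignoring case.

    Returns (pairs, only_original, only_processed). pairs is a list of
    (original_path, processed_path) tuples; the other two are lists of
    paths that have no counterpart in the other folder.
    """
    def index(folder):
        # Key on the lower-cased name so "Score.PDF" matches "score.pdf".
        return {
            p.name.lower(): p
            for p in Path(folder).iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf"
        }

    originals = index(original_dir)
    processed = index(processed_dir)

    pairs = [(originals[k], processed[k])
             for k in sorted(originals.keys() & processed.keys())]
    only_original = [originals[k]
                     for k in sorted(originals.keys() - processed.keys())]
    only_processed = [processed[k]
                      for k in sorted(processed.keys() - originals.keys())]
    return pairs, only_original, only_processed
```

If the compression tool renames its output (for example by adding a suffix), the key built inside `index` is the one place that would need to change.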

  • From GB@21:1/5 to Philip Herlihy on Mon Aug 5 15:57:21 2024
    On 04/08/2024 23:00, Philip Herlihy wrote:

    I'm looking for a utility to select the smaller of two versions of a PDF document.

    I'm helping a friend compress PDFs of music scores to load onto a tablet to play from. We've found that Corel's PDF Fusion can batch process compression, usually taking about 40% off the file size. But some files come out bigger, occasionally much bigger! We'd like to use the original if it's smaller than the processed version.

    I'd thought Robocopy could likely do this - but it can't. (It will set a min or max file size to be copied/moved/merged.) I've looked at Xcopy, and I can't think of a command-line (or other) utility that will do this. CoPilot will write you a PowerShell script to do it, but not (still) having learned PowerShell I'd be wary of trusting a generated script.

    Can anyone think of a utility that will do this? I'd get PDF Fusion to dump processed files in a separate folder, then we'd want to compare sizes with the folder with the originals - unless there's a better way?

    ChatGPT came up with this, on the assumption that the filenames come in
    pairs: filename_1.pdf and filename_2.pdf.


    import os
    import shutil

    def get_file_size(filepath):
        """Returns the size of the file at filepath."""
        return os.path.getsize(filepath)

    def copy_smaller_files(source_dir, dest_dir):
        """Copies the smaller file of each pair from source_dir to dest_dir."""
        if not os.path.exists(dest_dir):
            os.makedirs(dest_dir)

        # Get a list of files in the source directory
        files = os.listdir(source_dir)

        # Group the files by their base name without the _1.pdf or _2.pdf suffix
        file_pairs = {}
        for file in files:
            if file.endswith('.pdf'):
                base_name = file.rsplit('_', 1)[0]
                if base_name not in file_pairs:
                    file_pairs[base_name] = []
                file_pairs[base_name].append(file)

        # Iterate over the file pairs and copy the smaller file to the
        # destination directory
        for base_name, pair in file_pairs.items():
            if len(pair) == 2:
                file_1 = os.path.join(source_dir, pair[0])
                file_2 = os.path.join(source_dir, pair[1])
                if get_file_size(file_1) <= get_file_size(file_2):
                    smaller_file = file_1
                else:
                    smaller_file = file_2
                shutil.copy(smaller_file, os.path.join(dest_dir, os.path.basename(smaller_file)))
                print(f"Copied {os.path.basename(smaller_file)} to {dest_dir}")
            else:
                print(f"Warning: Pair for base name {base_name} is incomplete or malformed.")

    # Example usage
    source_directory = r'C:\path\to\source_directory'
    destination_directory = r'C:\path\to\destination_directory'
    copy_smaller_files(source_directory, destination_directory)
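
The script above assumes both versions sit in one folder with _1/_2 suffixes. For the two-folder layout described in the original post, a minimal sketch might look like the following. It assumes each processed file has the same name as its original. `pick_smaller`, its parameters, and the `dry_run` switch are illustrative names, not part of any existing tool. With `dry_run=True` (the default here) nothing is copied, so the choices can be inspected first.

```python
import shutil
from pathlib import Path


def pick_smaller(original_dir, processed_dir, output_dir, dry_run=True):
    """Choose the smaller version of each PDF found in original_dir.

    Assumes the processed copy has the same filename as its original.
    Returns a list of (filename, "original" | "processed") tuples.
    Files are only copied to output_dir when dry_run is False.
    """
    original_dir = Path(original_dir)
    processed_dir = Path(processed_dir)
    output_dir = Path(output_dir)

    choices = []
    for original in sorted(original_dir.iterdir()):
        if not original.is_file() or original.suffix.lower() != ".pdf":
            continue
        processed = processed_dir / original.name
        if processed.is_file() and processed.stat().st_size < original.stat().st_size:
            source, label = processed, "processed"
        else:
            # A tie or a missing processed file falls back to the original.
            source, label = original, "original"
        choices.append((original.name, label))
        if not dry_run:
            output_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, output_dir / original.name)
    return choices
```

Calling `pick_smaller(originals, processed, output)` with three folder paths returns the list of choices without touching anything. Passing `dry_run=False` performs the copies into the output folder and leaves both source folders unchanged.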

  • From David@21:1/5 to Philip Herlihy on Mon Aug 5 19:46:28 2024
    On Sun, 04 Aug 2024 23:00:24 +0100, Philip Herlihy wrote:

    I'm looking for a utility to select the smaller of two versions of a PDF document.

    I'm helping a friend compress PDFs of music scores to load onto a tablet to play from. We've found that Corel's PDF Fusion can batch process compression, usually taking about 40% off the file size. But some files come out bigger, occasionally much bigger! We'd like to use the original if it's smaller than the processed version.

    I'd thought Robocopy could likely do this - but it can't. (It will set a min or max file size to be copied/moved/merged.) I've looked at Xcopy, and I can't think of a command-line (or other) utility that will do this. CoPilot will write you a PowerShell script to do it, but not (still) having learned PowerShell I'd be wary of trusting a generated script.

    Can anyone think of a utility that will do this? I'd get PDF Fusion to dump processed files in a separate folder, then we'd want to compare sizes with the folder with the originals - unless there's a better way?

    Never a Perl Monk around when you need them. ;-)

    Cheers

    Dave R

    --
    AMD FX-6300 in GA-990X-Gaming SLI-CF running Windows 10 x64

  • From Philip Herlihy@21:1/5 to All on Wed Aug 7 12:35:05 2024
    In article <MPG.411ad8e327d88648989ae1@news.eternal-september.org>, Philip Herlihy wrote...

    In article <lhbrudFo7tvU1@mid.individual.net>, Andy Burns wrote...

    Philip Herlihy wrote:

    I'm looking for a utility to select the smaller of two versions of a PDF document.
    You could do it using the %~z modifier to get the file size into a
    variable in a CMD batch file.

    You could use wsh/jscript/vbscript/filesystemobject but that's going
    away, so nowadays I'd probably use powershell.

    Clearly, for scripting PowerShell is the way to go. I've just never yet found
    time to learn it. I was something of a guru in Unix shell-scripting in my day,
    but there's always something pressing getting in the way of learning things I don't often have a need for these days!

    Thanks for all suggestions. Scripting seems to be the way forward. I wouldn't want to go that way if there was already a command-line utility which could do this, but that doesn't seem to be an option.

    --

    Phil, London
