Scripts and tricks I used to scan a large number of photo albums, automatically crop the scanned photos, and build a PDF of each album.
Creating PDFs of the Albums
After taking “from the top” pictures of each page of the album, I used this script, called convert-images.sh, to crop a left half and a right half out of each picture:
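#!/usr/bin/env bash
# Usage: convert-images.sh FULLX FULLY LEFTOFF TOPOFF
# example values:
# FULLX=1600    (total width to keep; each half is FULLX / 2 wide)
# FULLY=940     (height to keep)
# LEFTOFF=188   (pixels to skip from the left edge)
# TOPOFF=195    (pixels to skip from the top edge)
FULLX=${1}
FULLY=${2}
LEFTOFF=${3}
TOPOFF=${4}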
#EXTRAPARAMS="-brightness-contrast -20x20"
EXTRAPARAMS=""
find raw-scans -type f -iname '*.jpg' | sort | while IFS= read -r INFILE; do
    # Strip the extension in either case, since find matches case-insensitively
    OUTFILE=auto-cropped/$(basename "${INFILE}" | sed 's|\.[Jj][Pp][Gg]$||')
    # Optional per-file overrides (a changed LEFTOFF also applies to later files)
    #if [[ "$(basename "${INFILE}")" == "0007_2024-09-22.JPG" ]]; then
    #    LEFTOFF=173
    #elif [[ "$(basename "${INFILE}")" == "0022_2024-09-22.JPG" ]]; then
    #    LEFTOFF=102
    #fi
    HALFX=$((FULLX / 2))
    OFFXR=$((HALFX + LEFTOFF))
    LEFT="${HALFX}x${FULLY}+${LEFTOFF}+${TOPOFF}"
    RIGHT="${HALFX}x${FULLY}+${OFFXR}+${TOPOFF}"
    echo -e "$(basename "${INFILE}")\t${LEFT}\t${RIGHT}"
    # EXTRAPARAMS is left unquoted on purpose so it splits into separate options
    magick "${INFILE}" -crop "${LEFT}" ${EXTRAPARAMS} "${OUTFILE}-left.jpg"
    magick "${INFILE}" -crop "${RIGHT}" ${EXTRAPARAMS} "${OUTFILE}-right.jpg"
done
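To make the crop arithmetic in the script above easier to check by hand, here is a minimal sketch that only computes the two geometry strings, without touching any image. The function name crop_geometries is made up for this illustration; the numbers are the ones passed to convert-images.sh in the commands below.

```shell
# Hypothetical helper: print the left and right crop geometries (WxH+X+Y)
# for the four values that convert-images.sh takes as arguments.
crop_geometries() (
    fullx=$1 fully=$2 leftoff=$3 topoff=$4
    halfx=$((fullx / 2))
    # Left half starts at LEFTOFF; right half starts one half-width further in.
    printf '%sx%s+%s+%s\n' "$halfx" "$fully" "$leftoff" "$topoff"
    printf '%sx%s+%s+%s\n' "$halfx" "$fully" "$((halfx + leftoff))" "$topoff"
)

crop_geometries 1660 1030 15 139
# prints:
# 830x1030+15+139
# 830x1030+845+139
```

Both crops share the same top offset and height, so only the horizontal offset differs between the two halves.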
Then I ran these commands for each zip file of page pictures (this example is the “Job Drama” album):
unzip raw-book-scans/Job\ Drama\ c.zip -d job-drama
cd job-drama/
mv Job\ Drama raw-scans
cd raw-scans
mogrify -rotate 180 *.JPG
cd ..
open raw-scans/*
rm -rf auto-cropped "$(basename "$(pwd)").pdf" && mkdir -p auto-cropped && ../convert-images.sh 1660 1030 15 139 && magick auto-cropped/*.jpg "./$(basename "$(pwd)").pdf" && open "$(basename "$(pwd)").pdf"
mv job-drama.pdf ~/GoogleDrive/to-file-server/family/memories/walter-shirley-memories/completed-albums/
cd raw-scans/
zip ../$(basename $(dirname $(pwd)))-raw-book-scans.zip *
cd ..
mv job-drama-raw-book-scans.zip ~/GoogleDrive/to-file-server/family/memories/walter-shirley-memories/raw-album-scans/
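The PDF step relies on the shell expanding auto-cropped/*.jpg in sorted order, so the pages land in the PDF in album order with each left half before its right half. A small sketch of that ordering, using empty placeholder files in a temporary directory (the file names are hypothetical, modeled on the 0007_2024-09-22.JPG name that appears in the script):

```shell
# Create placeholder files in scrambled order, then let the glob sort them.
demo=$(mktemp -d)
touch "$demo/0002_2024-09-22-right.jpg" "$demo/0001_2024-09-22-right.jpg" \
      "$demo/0002_2024-09-22-left.jpg" "$demo/0001_2024-09-22-left.jpg"

# The glob expands alphabetically: page number first, then "left" before "right".
page_order=$(cd "$demo" && printf '%s\n' *.jpg)
echo "$page_order"

rm -rf "$demo"
```

This only works because the page numbers are zero-padded; names like 2.jpg and 10.jpg would not sort in page order.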
Automatically Cropping Large Numbers of Scanned Pictures
After we made a PDF of each album, we deconstructed the album, removing all the pictures that were worth saving. Pictures that were large enough fed easily through the scanner's feeder, giving me one JPEG per picture. Then I just needed to crop the whitespace off each one. When a large batch was all the same size, I simply used ImageMagick to crop every image in it to the same fixed dimensions.
Note: mogrify (an ImageMagick command you can use in place of magick) edits files in place rather than writing new ones, so use it only if you’re very confident in your crop boundaries. To test the crop dimensions first, use magick and write each result to a new file, for example:
mkdir -p auto-cropped
find raw-scans -type f -iname '*.jpg' | sort | while IFS= read -r INFILE; do
    magick "${INFILE}" -crop 1614x1072+30+186 "auto-cropped/$(basename "${INFILE}" | sed 's|JPG$|jpg|')"
done
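Before running a fixed crop over a whole batch, it can help to confirm that the crop rectangle actually fits inside the scans. The sketch below is one way to check that; crop_fits is a hypothetical helper, and the 1700x1300 image size is an assumed example, not a measurement from my scans.

```shell
# Hypothetical helper: succeed only if a WxH+X+Y crop lies inside an image
# of the given width and height.
crop_fits() (
    geometry=$1 img_w=$2 img_h=$3
    # Split "1614x1072+30+186" into width, height, x offset, and y offset.
    w=${geometry%%x*}
    rest=${geometry#*x}
    h=${rest%%+*}
    rest=${rest#*+}
    x=${rest%%+*}
    y=${rest#*+}
    [ $((x + w)) -le "$img_w" ] && [ $((y + h)) -le "$img_h" ]
)

if crop_fits 1614x1072+30+186 1700 1300; then
    echo "crop fits"
else
    echo "crop runs past the edge"
fi
```

The real width and height of a scan could come from something like magick identify -format '%w %h' on one of the files.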
Splitting Scanned Pictures Into Individual Images With ImageMagick
But other pictures wouldn’t feed through, either because they were too small or because their fancy scalloped edges would jam the scanner. So I scanned those on the flatbed scanner, which gave me 8-12 pictures per scanned page that then had to be split into individual images. As with most amazing tools like ImageMagick and ffmpeg, there is a way to automate the work, and someone on the internet has already written a script to help. I found this answer on Stack Overflow that formed the basis of my script, below.
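#!/usr/bin/env bash
# Usage: pass the scanned page image as the only argument
INFILE="${1}"
INNAME=$(magick -ping "$INFILE" -format "%t" info:)
TMPFILE="$(dirname "${INFILE}")/${INNAME}-tmp.png"
OLDIFS=$IFS
IFS=$'\n'
echo -e "$(date)\t${INFILE}\tmaking blurred image"
# Blur and threshold the page to black and white (also saved to TMPFILE), then list
# each connected region, one per line: "id: WxH+X+Y centroid area color"
ARR=($(magick "$INFILE" -blur 0x5 -auto-level -threshold 99% -type bilevel +write "${TMPFILE}" \
    -define connected-components:verbose=true \
    -connected-components 8 \
    null: | tail -n +2 | sed 's/^[ ]*//'))
NUM=${#ARR[@]}
echo -e "$(date)\t${INFILE}\tfound ${NUM} blocks"
IFS=$OLDIFS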
for ((I=0; I<NUM; I++)); do
    # Fields are space-separated: 2 = bounding box, 5 = color
    COLOR=$(echo "${ARR[$I]}" | cut -d' ' -f5)
    BBOX=$(echo "${ARR[$I]}" | cut -d' ' -f2)
    # BBOX looks like WxH+X+Y; pull out the width and height
    DIMX=${BBOX%%x*}
    DIMY=${BBOX#*x}
    DIMY=${DIMY%%+*}
    echo "color=$COLOR; bbox=$BBOX"
    ACTION="process"
    # Ignore specks narrower or shorter than 50 pixels
    if [[ $DIMX -lt 50 ]] || [[ $DIMY -lt 50 ]]; then
        ACTION="skip"
    fi
    echo -e "$(date)\t${INFILE}\t${ARR[$I]}\t${COLOR}\t${ACTION}\t${BBOX}"
    # Only black regions (the non-white areas after thresholding) are cropped out
    if [ "${ACTION}" = "process" ] && [ "$COLOR" = "gray(0)" ]; then
        magick "$INFILE" -crop "$BBOX" +repage -fuzz 10% -trim +repage "$(dirname "${INFILE}")/${INNAME}_split_${I}.jpg"
    fi
done
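The per-region decision in the loop above can be tried out without ImageMagick by feeding it sample lines. This is a minimal sketch: classify_component is a hypothetical name, and the sample lines are made up in the "id: WxH+X+Y centroid area color" layout that the script parses. Unlike the script, it reports a large non-black region as "skip" directly instead of labeling it "process" and then not cropping it.

```shell
# Hypothetical helper: decide what to do with one connected-components line
# (leading spaces already removed). Prints "process WxH+X+Y" or "skip".
classify_component() (
    min_side=${2:-50}
    printf '%s\n' "$1" | {
        read -r _id bbox _centroid _area color
        # Pull the width and height out of the WxH+X+Y bounding box.
        w=${bbox%%x*}
        rest=${bbox#*x}
        h=${rest%%+*}
        if [ "$w" -lt "$min_side" ] || [ "$h" -lt "$min_side" ]; then
            echo "skip"                 # too small to be a photo
        elif [ "$color" = "gray(0)" ]; then
            echo "process ${bbox}"      # black region: crop it out
        else
            echo "skip"                 # white background region
        fi
    }
)

classify_component '1: 412x300+25+40 230.5,189.5 123600 gray(0)'
classify_component '2: 12x9+700+3 705.2,7.1 80 gray(0)'
classify_component '0: 1600x940+0+0 799.5,469.5 900000 gray(255)'
# prints:
# process 412x300+25+40
# skip
# skip
```

The optional second argument changes the minimum side length, which is handy when a page holds very small prints that the 50-pixel default would discard.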