Digital production guidelines and workflows
Lab Policies
These policies were written to protect both the historical materials we work with and our colleagues. If you have any questions or need clarification, contact your supervisor.
General
- Please notify your supervisor if you are unable to make your shift.
- Your education comes before work. You will not be penalized for having to skip a shift, come in late, or leave early for classes or studying, as long as you let your supervisor know.
- We can reevaluate your schedule at any time.
- Do not share passwords or door codes with anyone outside of the digital production team.
- Only digital production SLAs and library employees are allowed in the digital production lab.
- Only the lab supervisor or IT staff should install any software on the workstations.
- Please remember to take short breaks during your shift to maintain mental focus and clarity. It is always a good idea to look away from your computer screen for 30 seconds every half hour to give your eyes some rest.
- Do not come to work if you are not feeling well or are sick.
While at Work
- Make sure to clock in. Timeclock is your responsibility. If you miss a clock in or out, let your supervisor know and they’ll work it out with you.
- Sign into Teams chat and let your supervisor know you’ve arrived at work.
- Always wash your hands before handling materials, always. Avoid using hand sanitizer and/or lotion when handling materials or equipment.
- After cleaning the scanner glass, be sure to return the glass cleaner and wipes. Remember to dispose of your wipes so as not to contaminate any materials.
- Feel free to listen to music using headphones, but be mindful of the volume so the lab supervisor doesn’t have to shout or wave for you to hear them.
- Handle all materials appropriately and if you are unsure how to handle something, please ask the lab supervisor for assistance.
- Please be mindful of the arrangement of materials as they need to be kept in original order and not in danger of falling off a desk or scanner.
- If you have completed a task and are waiting for a new one, let your supervisor know by chat or email.
- Food and drink must not be consumed at a workstation; eat in the break room to avoid contaminating materials. Water is permitted at a workstation, but it must be kept on the floor in a closed container. It is good practice to drink water away from the workstation.
- Areas should always be kept clean and neat.
- Personal belongings should always be on the floor or in a chair.
Leaving Work
- Leave some time before the end of your shift to fill out the end of day report form. A link should be in the bookmarks bar.
- Please clean up your work area.
- At the end of a shift the staple removers and anti-static brushes should be placed back on the shelf.
- Make sure to update your spreadsheet before leaving.
- After completing your task, turn off your scanner and return materials to their storage. Computers should be left on.
- Make sure to clock out and sign out of accounts on the workstation. Log out or lock the workstation when leaving with: Windows key+L or the main Apple menu and “Lock Screen”.
Basic workflow
Scanning
- Lab manager assigns a set of materials, digitization settings, and handling instructions to scan technician.
- Scan tech uses Google Docs/OneDrive spreadsheet to track per-scan progress.
- This includes recording notes that came with the item, or anything written on the back of a photograph. Always include where the note is physically located, e.g. “On back: note” or “On envelope: note”.
- Date scanned and initials of scan tech.
- Any other metadata assigned.
- When finished scanning for the day (regardless of box/folder completion) the files are then uploaded to the collection’s folder on the server until final quality control confirms an entire project has been completed.
- Lab supervisor will walk through the box with scan tech upon completion.
Quality Control
- Once an entire box/oversize folder of materials is digitized and uploaded to server storage, the lab manager then goes in and checks the digital images for:
- Dust, specks, any foreign artifacts
- The entire image having been scanned, rather than being cut off
- Proper scan dpi/ppi and file format
- Naming inconsistencies
- Number of files in digital folder versus number of recorded images in spreadsheet
- Correctness of spreadsheet metadata (misspellings, misplaced data)
- The lab manager will highlight image rows yellow in the spreadsheet if they need to be re-scanned and then will assign that task to the scan tech.
- Rescans should be compared to the original before replacing on the server.
- All re-scanned images replace the former preservation file, and the new scan date is updated in the spreadsheet.
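The file-count check in the list above can be scripted. The following is only a sketch under stated assumptions: the tracking spreadsheet has been exported to CSV with a single header row and one row per image, and the scans are TIFF files sitting directly in one folder. The function names are made up for this example and are not part of any lab tooling.

```shell
#!/usr/bin/env bash
# Count TIFF files (case-insensitive extension) directly inside a folder.
count_scans() {
  find "$1" -maxdepth 1 -type f \( -iname '*.tif' -o -iname '*.tiff' \) | wc -l | tr -d ' '
}

# Count data rows in a CSV export, skipping the header row and blank lines.
# Assumption: no cell contains a line break.
count_rows() {
  tail -n +2 "$1" | grep -c '[^[:space:]]' || true
}

# Compare the two counts and report the result.
compare_counts() {
  local files rows
  files=$(count_scans "$1")
  rows=$(count_rows "$2")
  if [ "$files" -eq "$rows" ]; then
    echo "OK: $files files, $rows rows"
  else
    echo "MISMATCH: $files files, $rows rows"
    return 1
  fi
}
```

For example, compare_counts /path/to/box01 box01.csv prints either an OK line or a MISMATCH line with both counts. A mismatch only says that something is off; the spreadsheet still has to be checked row by row to find the missing or extra image.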
Photoshop/Capture One
- A scan tech processes their scanned batch of images in Photoshop or Capture One. First they must copy the images off of the server and onto their computer. This way any mistakes can easily be rectified by copying the images down again, and the server does not get overwhelmed by multiple stations editing on it at the same time.
- The scan tech ensures all images are photo-edited and copied into the collection’s Intermediate folder on the server, then proceeds to their next batch of materials for scanning if they did not spot any items which might need re-scanning.
Material handling
All materials must be handled with care
Much of the damage that books and archives sustain is due to poor handling. As a partner with cultural heritage institutions and archives across the state, we have a responsibility to ensure the best possible care of the materials in the collections we are working with.
Retrieving materials for digitization
- Follow the instructions of the archivists in Special Collections.
- Hands must be “clean” before handling items in any collection.
- “Clean” means to wash and dry your hands thoroughly. Do not use lotion or hand creams after washing.
- Some materials should be handled -only- with cotton gloves. This includes: negatives and photographs. Do not use cotton gloves for handling books, paper, or glass plate negatives.
- Some materials should be handled -only- with nitrile gloves. This includes: glass plate negatives.
- Collection items should be supported at all times (e.g. if a rare book is opened, its covers must be supported from beneath, using a cradle or blocks).
- Collection items should always be removed one at a time
- Use two hands to carry large volumes or boxes.
Safeguard the integrity of archival documents by maintaining their original order
- Use only one folder from a box at a time.
- “Maintain the existing order of material” within each folder and box
- Put all materials away correctly.
Do not
- Wear jewelry or watches while scanning.
- Keep any beverages or food in your work area when handling original materials.
- Make any marks on material or erase existing marks
- Write, lean or rest on top of materials.
- Use fountain pens, felt tipped pens, or similar writing instruments alongside materials
- Touch the text or image on the page. Handle all materials by their edges when possible.
- Fold, tear or cut documents
- Make tracings of any documents
- Rest any other objects on the surface of any items
- Apply paper clips, fasteners, tape, Post-it notes or rubber bands
- Stack books more than three high.
- Place items on the floor.
- Use hand lotions before handling materials.
- Allow books or pamphlets to dangle off scanners. Collection items should be supported at all times.
The following instructions are specific to our equipment setup: Phase One digital back with 120mm lens and 41mm extension tube, Kaiser light table, DT film scanning kit, and DT Element copy stand.
Notes and troubleshooting
- We measure color and luminance using LAB readouts (L = lightness, a = color scale of green to red, b = color scale of blue to yellow). A neutral mid-gray would have the LAB values of 50, 0, 0.
- Use nitrile gloves when handling negatives.
- Negatives should be scanned with the dull side (emulsion side) facing the camera (up) and the shiny side facing the light table.
- To protect anything under the camera, such as glass, do not remove the foam cover until the lens cap has been removed.
- Take care not to place anything to the left of the light table because of the cooling vent location.
- Use the blower to remove dust; the result doesn’t have to be perfect. Most dust and tiny scratches will not show up in the image.
- If the camera does not respond when clicking capture or using the foot pedal, turn the camera off and back on. If it persists, you may need to restart the computer.
- If your next capture isn’t in frame as you expected, click the Crop tool to see if the selection has moved. This can occur when the previous image was rotated or cropped and the adjustments are being applied to the latest capture.
- If images aren’t loading in the right side viewer, right-click the Capture folder in the Library tool and click show in Library. This should always fix the issue.
Camera setup
- Remove the lens cap and turn the camera and the light table on.
- Place the foam covering away from the scan area.
- Open a file explorer window on the computer (Windows key + E) and click on the DigiProdWorkng folder in the navigation bar. Log in with your UTK credentials.
- Open Capture One and the existing session for the project.
- Under the Camera tool, make sure the aperture is set to f8.0 (not 6.3)
- Autocolumn auto-ppi might not work because of extension tubes.
- For large negatives at 1200 ppi with only the 120mm lens, set the copy stand column to 118.5. You can skip steps 6c-e if setting the column manually.
- For 120mm lens and extension, set the copy stand column to 88.3 (This might be different for 35mm slides). You can skip steps 6c-e if setting the column manually.
- Open Live View and set target resolution to desired ppi.
- Use copy stand controls to move the camera into visual focus.
- Use auto-focus followed by micro auto-focus.
- Visually check results. Best option is to use a target or the film housing because a photograph may not have been taken in focus by the original photographer.
- Set Base Characteristics
- Photography (this will capture the film as is for preservation purposes, vs inverting colors at scan which can give false black levels)
- Phase One iXG 100MP Flat Art Reproduction - LED DT Photon (closest profile to light box)
- Linear Scientific
- LCC (remove vignetting or bellows effect)
- Remove the film target and any film holders/glass.
- Adjust the film scanning kit’s baffles so that the light source takes up the entire field of view.
- Capture an image and place LAB readouts across the image.
- Target value for LCC is 50-60 Luminance (The L number in LAB readout).
- Increase shutter speed (eg 1/250) to get within range.
- UTK’s lightbox may only go to 65 L. Adjust exposure tool in Capture One with arrow keys as necessary to reach the 50-60 target. Do not go over 0.5 or under -0.5 in exposure tool adjustments.
- Make sure to capture another image before verifying target value.
- Create LCC from an image with the correct luminance value.
- Set the White balance off of the light source (click anywhere within the image of just the light source).
- The luminance target value should now be 92-95. Change the shutter speed to reach that range (1/100 has been the standard for this range with just the 120mm lens, 1/80 for the 120mm lens and extension tube).
- You may need to set the White balance again for the A and B values (of LAB) to be close to zero.
- Replace any film holders/glass necessary for scanning and adjust the baffles to reduce excess light.
- The camera should be ready to start capturing at this point.
- If gaffer tape has been applied to the film holder, make sure the crop excludes all tape, otherwise the autocrop feature may not work as well.
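The two luminance targets used during camera setup (an L value of 50-60 before creating the LCC, and 92-95 after setting the white balance) can be written down as a small check. This is an illustrative sketch only; the function name and the stage labels "lcc" and "white" are invented here, and the L value is typed in by hand from the LAB readout.

```shell
#!/usr/bin/env bash
# lab_in_range STAGE L_VALUE
#   STAGE "lcc"   -> target L of 50-60 (before creating the LCC)
#   STAGE "white" -> target L of 92-95 (after setting white balance)
lab_in_range() {
  local stage=$1 value=$2 lo hi
  case "$stage" in
    lcc)   lo=50; hi=60 ;;
    white) lo=92; hi=95 ;;
    *) echo "unknown stage: $stage" >&2; return 2 ;;
  esac
  # awk is used because readouts can be decimals (e.g. 57.4),
  # which the shell's own integer tests cannot compare.
  if awk -v v="$value" -v lo="$lo" -v hi="$hi" 'BEGIN { exit !(v >= lo && v <= hi) }'; then
    echo "in range"
  else
    echo "out of range (target $lo-$hi)"
    return 1
  fi
}
```

For example, lab_in_range lcc 57.4 reports that the reading is in range, while lab_in_range lcc 65 reports that it is out of range and the shutter speed or exposure needs another adjustment.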
Scanning and resuming scanning
- If the column is moved by even a couple of millimeters, you will have to repeat all of the camera setup steps.
- Check that Next Capture Naming is correct. Use the top ellipsis in the floating tool to set the capture counter if needed.
- Place the 4x5” negative target (sitting on the small table to the left of the copy stand) under the glass carrier.
- Under the Tether tooltab on the left, click the Live View button located in the camera tool. This will open the Live View of the capture area (see example below).
- Look for the Camera Focus tool on the left side, and click the AF button. An orangish square should appear over the negative. If it is not on the negative, wait for the autofocus to finish and then drag it over the negative and click AF again. The camera will begin to autofocus. If it reads around 1200-1210ppi (for large negatives) once it has finished, then you should be good to go, but we’ll double check first.
- Close the live view window.
- In the very top menu, go to Camera → Composition Mode. Do not worry about the naming of this capture.
- Take a picture, zoom in to 100% only, and confirm a good focus.
- If it is blurry, open the live view again and click any of the arrows next to the autofocus button to force a focus change. Then redo the autofocus.
- Place the target carefully back into its protective case and sleeve.
- Super important step!!! Place your first project negative under the glass and click capture. Once you’ve captured that first negative, go back to Camera → Composition Mode to turn that feature off, otherwise it will overwrite every image you capture.
- Confirm one last time that your numbering is correct and note if there are any duplicates in the box that need to be skipped.
- Continue scanning, periodically checking your progress against the spreadsheet. Mistakes can happen, whether it’s during the inventory process or scanning. Numbers can be thrown off.
- Once you finish scanning a box, compare the last image number to the spreadsheet.
- Starting from the first image you scanned, go through each image making sure it is in frame and rotate it to be right reading (text isn’t upside down, faces and landscapes aren’t rotated incorrectly).
- Press C or go to the Crop Tool in the top middle toolbar to move the crop around.
- Use Ctrl+Alt+L (replace Ctrl with Command on macOS) to rotate the image counterclockwise, and Ctrl+Alt+R to rotate the image clockwise.
- Once you’ve confirmed all images are in frame and rotated, you can move on to scanning the next box.
- The production manager will move images to folders as needed.
- When you are finished, close Capture One first (it saves everything automatically) and then shut down the computer.
- Turn off the light table and the camera.
- Place the foam covering back before replacing the lens cap.
- Place the lens cap back on the lens with the foam covering the negative glass carrier. The camera may sink down a bit because it is no longer holding onto the focus.
Post-processing
Notes before starting or resuming
- All of the images should be rotated to be right-reading if the above steps were followed during scanning. If not, be sure to rotate the images before doing any of the steps below.
- When scanning negatives, we will preserve one version (variant) that has been rotated and cropped but no other settings applied.
- There should be an icon in the shape of a moon in the Capture One tooltab; it holds all of the tools necessary for processing transparencies.
- If all of the nested secondary variants aren’t showing, click the search icon on the far right of the browser window and check whether the ellipsis is orange. If it is, a search filter has been applied. Make sure the filter is set to 5 stars.
- Otherwise if only the primary variants are showing, go to Image in the top menu and click Expand All. You may need to select a different folder in the Library tool and then the desired folder for the display to switch to showing all of the variants.
Non-batch process
Auto Crop
- Setting the Method to Loose Material-Fixed Size might help with low contrast negatives.
- Straighten should be Average.
- Pre-pass can be none, but for dark negatives (which look white) setting this to Low Object-Background Contrast may help the system detect the edges better.
- Optimization should be None.
- More testing needs to be done on the Auto Levels on Interior Crop.
- Padding should be 20px.
- Select all the desired images and click Auto Crop.
- Double check each crop to make sure it surrounds the entire negative, not just the image area, and that the image is straight. This may need adjustment if the negative was cut differently.
Create variants
- Select all negatives (deselect any setup images of color targets by holding down Ctrl and clicking the thumbnail), right click on one and click Clone Variant. Each thumbnail should now have a second thumbnail nested under it.
- Right click one of the new variants, and click Select by Same and then Variant Position [2].
- Right click one again and select Rating and assign it a 5 star rating.
- Go to the search button, click the dropdown, and select 5 star rating. This way we are not editing the primary variants.
Rotation and Flip
- Skip if this was batch-applied.
- Set the Flip to Horizontal so the negative is right reading.
- Do not use the copy settings button to apply these settings to other negatives. It copies not just the flip but the angle as well, which will be different for other negatives.
Base Characteristics
- Skip if this was batch-applied.
- Mode should be set to Film Negative to convert the image to a positive image (looks like a photograph).
Black & White
- Skip if this was batch-applied.
- Click Enable Black & White for B&W negatives.
Levels
- If an image is over or underexposed, we will adjust the levels for users to be able to see the image.
- Make sure the RGB tab is selected.
- Click the magic wand tool in the Levels toolbox and evaluate the image. Move the black and white sliders if the auto-Level over or underexposed the negative.
Batch process
Flipping, converting to positive, and enabling B&W mode can be batch-applied after auto-crop and variant creation. Normally the production manager will handle this, otherwise follow the steps below.
Select one secondary variant image to edit.
- Set the Flip to Horizontal.
- Set the Base Characteristics Mode to Film Negative.
- Check the Enable Black & White option under the Black & White tool.
- Go to the Adjustments Clipboard (might be easier to drag to be free floating) and click the ellipsis button, go to Autoselect, and make sure it is set to None.
- Under Base Characteristics, select just Mode.
- Under Black & White, select just Enabled.
- Under Composition, select just Flip.
- Click Copy.
- Select all of the other images and click Apply.
Common film issues
- Safety films can provide a strong yellow cast.
- Adjust tint to remove magenta. Evaluate whether these adjustments are appropriate for a large number of images.
- Use the color wheel to remove saturation of certain colors.
- To remove red, move to cyan.
- To remove yellow, move to blue. Basically move the colors to the opposite side of the color wheel.
- White balance off the image. If it goes too far yellow or another color, then you can adjust your blue levels (again opposite of yellow).
- Negatives with orange hue – invert and pick white balance off of an object that is likely white/gray. Bring the levels in close to where the histogram lines start to increase. For the shadows, bring the level in just past the first small peaks.
- Negatives with red hue – invert and try to pick the white balance off of something white/gray and then bring the red shadows in slightly until the color normalizes.
Camera settings
- 118.5 column mark for 1200 ppi with 120mm lens, no extension tube
- 88.3 column mark for medium (2.25”) negatives with 120mm lens and extension tube
- f8.0 for transparencies
- f6.3 for most reflective
Camera setup
Camera setup can be skipped if both the camera and the strobe lights have not been moved, but you must always verify the focus. If using continuous lighting, camera setup must always be done.
We measure color and luminance using LAB readouts (L = lightness, a = color scale of green to red, b = color scale of blue to yellow). A neutral mid-gray would have the LAB values of 50, 0, 0.
- Grab materials for scanning (check spreadsheet).
- Unplug and replug in the copy stand USB cable before starting Capture One.
- Open Capture One.
- Turn on the camera (press the power button on the power strip) and lights. Remove the lens cap and place it to the side. Turn on modeling lights if not already on. Place the device level target on the table.
- Determine needed PPI, go to Live View, set the target PPI, hit enter, and then click Start.
- Reflective generally needs an aperture of 6.3.
- Shutter speed does not need to be adjusted because it is overridden by flash.
- Be mindful of cords when moving the camera.
- Adjust Orientation for most of the material to be right-reading. Always rotate material to be right-reading as you capture.
- Open a previous or new Capture One session and title it with the collection identifier. Eg, vols-football
- Turn on Live Focus and set a focus tool down on the target, then click AF. Double check the focus looks good. If it still doesn’t, manually adjust the focus out of range and then try the AF again.
- Capture an image and set Curve (under Base Characteristics) to Linear Response (this will darken the image slightly). Set the White Balance (WB) to Custom 1.
- If returning to a previous Capture One session, make sure to reset your adjustments by going to Adjustments -> Reset.
- Capture another image of the device target and confirm focus looks good.
- Place the grey board down on top of the target and capture an image. The grey board should cover the entire image.
- Set LAB readouts across the image.
- Adjust strobe power if readouts are above 65. You can determine how much by adjusting the exposure tool. Each tenth is one click of the strobe knob.
- Locate the Lens Cast Correction (LCC) tool and click Create LCC. Make sure no options are checked in the pop-up dialog box, and click Create. Capture One will alert if there is anything on the grey board to affect the LCC.
- Remove the board and capture another image of the device target.
- Select the eyedropper tool next to the WB in Camera Settings.
- Click on the grey background of the device target to set a new WB. (Alternatively, set the white balance based on Patch 13)
- Capture another image of the device target and check the color reading for Patches 10-15.
- Adjust strobe power first to get closest to patch 15’s color but try to get all patches close to their target.
- If within a couple of digits, go to the Exposure tool and adjust by only a few hundredths, e.g. +/- 0.03.
- Always capture another image after changing any settings. The software automatically adjusts how the image “should” appear but capturing an image is the only way to truly determine if the changes will result in the desired look.
- Once Patch 15 is at or very close to 65, capture one more image of the device target and delete all previous captures. Add the current date in ISO 8601 format (YYYY-MM-DD) to the name of the final device capture, e.g. vols-football_2021-09-08.
- Go to Next Capture Naming.
- Sequential naming: Click the ellipsis next to Format and type the assigned item identifier. For items with multiple pages, add the 3 or 4 Digit Counter. The digit counter may also be needed for single-paged items if the materials are sequentially named. Ask your supervisor if you’re unsure.
- Non-sequential naming: Clear all formats and add Clipboard Contents as the only format. This allows you to copy an identifier from a spreadsheet and then capture an image. The software automatically names the file from the copied identifier.
- Click on the thumbnail of an image you’ve captured. Sometimes the cursor will remain in the naming tool, which prevents the capture button from activating.
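The dated name for the final device target capture (collection identifier, underscore, ISO 8601 date) can be generated rather than typed. This is a minimal sketch; the function name is hypothetical, and it only builds the text to paste into Capture One, it does not rename anything.

```shell
#!/usr/bin/env bash
# target_capture_name COLLECTION_ID [DATE]
# DATE defaults to today and must be in ISO 8601 form (YYYY-MM-DD).
target_capture_name() {
  local id=$1 day=${2:-$(date +%Y-%m-%d)}
  case "$day" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;   # shape check only
    *) echo "date must be YYYY-MM-DD: $day" >&2; return 1 ;;
  esac
  printf '%s_%s\n' "$id" "$day"
}
```

For example, target_capture_name vols-football 2021-09-08 prints vols-football_2021-09-08. The check only looks at the shape of the date, so it will not catch an impossible date such as 2021-13-40.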
Capture
All objects must be captured at less than a 5 degree skew.
- Make sure the object level target is at the bottom of the frame. You can add Guides to help with lining material up. Open Live View.
- Place the first item down, lining it up to the guides. The object level target can also be used to help keep the item straight and for placing all other items down in relatively the same spot.
- A few seconds of careful placement will save a lot of time in the post processing phase.
- Use the foot pedal to capture the first item and confirm everything is within frame.
- If this project cannot use autocrop, move the crop close to the item but leave plenty of space in case your next item isn’t placed in exactly the same spot.
- Capture remaining items/pages in the exact same orientation.
- Troubleshooting tip: If you accidentally capture an image twice or there’s an issue with the image, delete the image and set the capture counter to the appropriate number and continue scanning.
- Once finished capturing a set of images (such as an athletics program), go through every image and check for accidental duplicates, stray items in the frame, and lighting or movement issues. This should take only 1-2 seconds per image to check.
- Once confirmed
- Update the spreadsheet
- Place the item back in its folder and label the folder with the number of images captured, the date, and your name.
- Begin capturing the next item making sure to change the next capture naming.
Quality Control and Processing
- This process should be started once every image has been checked. If any images need to be rescanned, do so before proceeding.
- Select All images (except the initial target) and go to Image -> Clone Variant. This might take a minute or two or five depending on the number of images.
- Move all of the images to the Selects folder: Image -> Move to Selects. Again, this might take a minute.
- Select one of the cloned variant images.
- Go to the Library tool and navigate to the Selects folder.
- Go to the Select menu and choose Select by Same Variant Position.
- Go to Adjustments -> Rating -> 5 Stars. You can also right click on any of the selected images and change the rating to 5 stars. Wait until the filter in the bottom left updates. There should be an equal number of unrated and 5-star images.
- In the filter window, select 5 Stars and set the autocrop to pad 100px on the edges. Then autocrop all of the images.
- Verify each one was cropped and deskewed correctly.
Output
- In the filter tool, select No Stars. Rotate all of these images to be right reading if not already done.
- Go to the Output menu and select the TIF output recipe
- Change the Fixed drop down to Long edge and depending on the material, pad the Preservation output with 500 pixels. So if the long edge is supposed to be 4000 pixels, set it to 4500.
- Set TIF as file output with the option to not embed thumbnail
- Under Output Location select the drop down and click Choose Folder.
- Create two folders: Preservation and Intermediate
- Select the Preservation folder as the output.
- In the Output Naming tool, add “_p” after the image name.
- Process the No Stars images.
- Change the Output Location to the Intermediate folder.
- In the filter tool, select 5 Stars and select all images.
- Change the long edge to the required pixels on the long edge (4000, 6000, or 8000).
- Change the _p in Output Naming to _i
- Process these images.
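After both output passes, every preservation file ending in _p should have an intermediate counterpart ending in _i. The sketch below lists any that are missing. It assumes the output files carry a .tif extension and that the suffix sits directly before the extension; the function name is made up for this example.

```shell
#!/usr/bin/env bash
# missing_intermediates PRESERVATION_DIR INTERMEDIATE_DIR
# Prints the base name of every *_p.tif that has no matching *_i.tif.
missing_intermediates() {
  local pres_dir=$1 inter_dir=$2 p base
  for p in "$pres_dir"/*_p.tif; do
    [ -e "$p" ] || continue                 # the folder has no *_p.tif files
    base=$(basename "$p" _p.tif)            # item_001_p.tif -> item_001
    [ -e "$inter_dir/${base}_i.tif" ] || echo "$base"
  done
}
```

Running missing_intermediates Preservation Intermediate from the output location prints nothing when the two folders line up, and one identifier per line when an intermediate file still needs to be processed.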
Server Upload
All collection directories should be organized as:
- Active Projects
  - Collection
    - Capture One session
    - Preservation files
    - Intermediate files
  - Collection (each additional collection follows the same layout)
- At the end of each scan session, copy the Capture One session into its respective folder on the server, overwriting the previous day’s scan session for the same collection.
- Once all of the collection has been digitized, copy the Preservation and Intermediate files to their respective folders on the server.
Troubleshooting
- Camera not responding
- Try going in and out of live view
- Unplug and replug the camera cable to the computer
- Power cycle the camera
- Restart the software
- Restart the computer
- Copy stand errors
- Restart the computer
- Last resort - power cycle the copy stand box
- If auto-focus doesn’t work, move the focus out of range and then try again.
- After capturing in live view, the software may not capture outside of live view. Go back into live view and then close it out; the software should then capture normally.
- No way for autocrop to keep the target in. May have to only use it for intermediate.
- 120mm no extension is max 2200 ppi.
- Stitch - select images, right click and select Stitch to Panorama. Returns dng files and isn’t as good as Photoshop.
- Moire - adjust contrast or focus. Can use GoldenThread to see what fails and needs adjustment
- Digital protractor will help with angling the lights.
Embedding or editing file metadata with Exiftool
- Remove thumbnails from within files
- $ exiftool -ifd1:all= -m fileName
- use a period for all files in the current directory, e.g. exiftool -ifd1:all= -m .
- add -overwrite_original to overwrite original files instead of making copies
- Embed title and author into PDF metadata. If a title has quote marks in it, replace each quote mark with three quote marks.
- PathToExiftool.exe -title="My Document" -author="Author's Name" filepathtomydocument.pdf -overwrite_original
- You can use a spreadsheet to build out multiple commands and copy and paste those into a text file with the file extension '.bat' (for Windows). Double click the bat file and it will run all of the commands.
- Count the pages of the PDFs in a directory and export the results to a text file (replace NameOfOutputFileToCreate.txt with the name you want, e.g. pages.txt)
- PathToExiftool.exe -T -r -filename -PageCount -s3 -ext pdf . > NameOfOutputFileToCreate.txt
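On macOS or Linux, the spreadsheet-to-commands approach described above for Windows can be sketched in bash instead of a .bat file. This is an illustration under stated assumptions, not the lab's established method: the spreadsheet is exported as a tab-separated file with no header and three columns (file path, title, author), and the function name is hypothetical. Only the -title, -author, and -overwrite_original options already shown above are used.

```shell
#!/usr/bin/env bash
# build_exiftool_commands LIST.tsv
# Prints one exiftool command per row so the output can be reviewed
# before it is saved to a script and run.
# Assumption: no cell is empty (consecutive tabs would be merged by read).
build_exiftool_commands() {
  local file title author
  while IFS=$'\t' read -r file title author || [ -n "$file" ]; do
    [ -n "$file" ] || continue            # skip blank lines
    # %q quotes each value so spaces and quote marks survive when the
    # generated line is later run by bash.
    printf 'exiftool -title=%q -author=%q -overwrite_original %q\n' \
      "$title" "$author" "$file"
  done < "$1"
}
```

For example, build_exiftool_commands list.tsv > embed.sh writes the commands to a file that can be read over and then run with bash embed.sh. Because bash does the quoting here, titles containing quote marks do not need the triple-quote workaround used in .bat files.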
File validation with FITS or JHOVE
FITS
- Single file
- fits.bat -i InputFile.tif -xc -o OutputFile.txt
- Run against a directory recursively
- fits.bat -r -i "InputPath" -xc -o "OutputPath"
JHOVE
- JHOVE will return: -not found; -not well-formed; -valid; -well-formed
- $ ./jhove -k -m jpeg2000-hul -o output.txt -h audit -krs "inputPath"
- replace jpeg2000-hul with desired format validation
Combining multiple folders of images into PDFs using img2pdf
Requires img2pdf to be installed. Recommend using Homebrew on macOS.
The following commands go through a directory full of folders, combine the JPEGs in each folder into a PDF named after that folder, and save the PDF in the same folder. You do not need to change any of the commands, but they should be run in the directory that contains the folder(s).
- One folder down:
- find . -type d | while read -r d; do img2pdf "${d}"/*.jpeg --output ./"${d}"/"${d##*/}.pdf"; done
- Multiple subdirectories:
- find . -type f -name "*.jpeg" -exec dirname {} \; | sort -u | while IFS= read -r dir; do img2pdf "$dir"/*.jpeg --output "$dir/$(basename "$dir").pdf"; done
Copying files using rsync or xcopy
Rsync
macOS can be frustrating when copying or moving files, especially over a connected server. Use rsync for faster copying and for tracking progress.
- This will scan all files and directories and make incremental changes. Some options can be combined, such as -rvt. For a full list, see https://ss64.com/bash/rsync_options.html
- $ rsync -rvt --progress --ignore-existing "/Source/Path" "/Destination/Path"
- -n option tells rsync to conduct a dry-run test (no files will be affected)
- -r option to copy directories recursively
- -t option to transfer modification times
- -v option increases the amount of information you are given during the transfer. By default, rsync works silently.
- A single -v will give you information about what files are being transferred and a brief summary at the end.
- Two -v flags will give you information on what files are being skipped and slightly more information at the end.
- More than two -v flags should only be used if you are debugging rsync.
- --progress option tells rsync to print information showing the progress of the transfer
- --ignore-existing option tells rsync to skip updating files that already exist at the destination
- --delete option tells rsync to delete extraneous files from the receiving side (ones that aren’t on the sending side), but only for the directories that are being synchronized
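The options above can be wrapped so that a dry run is easy to do before the real copy. The sketch below is one possible arrangement, with a hypothetical function name. It also pins down one rsync behavior that is easy to trip over: a source path without a trailing slash is copied as a folder into the destination, while a source path with a trailing slash copies only its contents.

```shell
#!/usr/bin/env bash
# sync_folder SOURCE_DIR DEST_DIR [--dry-run]
# Copies the contents of SOURCE_DIR into DEST_DIR, skipping files that
# already exist at the destination.
sync_folder() {
  local src=${1%/} dst=${2%/}
  # The trailing slash on the source means "the contents of this folder";
  # without it rsync would create DEST_DIR/<source folder name>/ instead.
  if [ "${3:-}" = "--dry-run" ]; then
    rsync -rvtn --progress --ignore-existing "$src/" "$dst/"   # -n: nothing is written
  else
    rsync -rvt --progress --ignore-existing "$src/" "$dst/"
  fi
}
```

A cautious routine is to run sync_folder "/Source/Path" "/Destination/Path" --dry-run first, read the list of files it would transfer, and then repeat the command without --dry-run.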
Xcopy
Xcopy can be used for incremental file updates
- This will copy all files and subdirectories but only if the source files are newer than the destination files or if the destination files do not exist.
- xcopy C:\source D:\destination /d /e /y
- /d option specifies the date comparison
- /e option copies all subdirectories even if they are empty
- /y option suppresses the confirmation prompt for overwriting files.
Tesseract OCR
- Main command. -l sets the language:
- $ tesseract 'file' 'output file name no extension required' -l eng hocr
- $ tesseract 'file' 'output file name no extension required' -l eng ocr
- You can create a .sh file on macOS, like a Windows .bat file, to run the command on multiple pages.
- Add #!/bin/bash as the first line of the file; the following lines can hold the commands.
- Then run chmod 755 'scriptFile' to make the script executable.
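A .sh file of the kind described above might look like the following. It is a sketch that assumes the pages are .tif files in a single folder and that hOCR output is wanted; the function name is invented for this example.

```shell
#!/bin/bash
# ocr_pages FOLDER
# Runs Tesseract on every .tif page in FOLDER and writes the hOCR output
# next to each page (page_001.tif -> page_001.hocr).
ocr_pages() {
  local folder=$1 page
  for page in "$folder"/*.tif; do
    [ -e "$page" ] || continue          # no .tif files in the folder
    # The second argument is the output name without an extension.
    tesseract "$page" "${page%.tif}" -l eng hocr
  done
}
```

To use it as a script, save the block as a file such as ocr.sh, add a final line reading ocr_pages "$1", run chmod 755 ocr.sh once, and then call ./ocr.sh /path/to/folder.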
Convert to JPEG2000
Federal Agencies Digital Guidelines Initiative (FADGI) lists JPEG2000 as an appropriate preservation file format for digitized content. This can result in a significant reduction in file storage use. A 100MB TIFF file can easily be converted to a 10MB JPEG2000 file. Harvard's Imaging Services department maintains a thorough guide for getting started converting TIFFs to JPEG2000. See the guide here.