How do you easily share memories captured in photo albums going back generations? Simple Scan is really simple: it scans a page, but extracting the photos from the scanned image is a very tedious task. So, this month's exploration is how to automate that extraction to reduce the drudgery. It is clearly an example of xkcd's Automation (http://xkcd.com/1319/)! However, it also falls into the category of spending time on “What Doesn't Seem Like Work?” (http://www.paulgraham.com/work.html).
There is a wonderful script, 'multicrop' (http://www.fmwconcepts.com/imagemagick/multicrop/), which uses ImageMagick tools to crop and straighten images.
The basic logic of the multicrop script is as follows:
A fuzz factor is used to select the background color. If the value is too high, part of the photo may be lost in the background and the photo may be split into multiple parts. If the value is too low, photos may not be extracted. However, even if a part of the photo is treated as background, as long as the enclosing rectangle is the size of the original photo, you don't have to worry.
The script worked very well with multiple loose photos scanned at a time as long as there was some gap between the photos and on the boundaries.
My problem was that the photos could not be removed from the album without damage. Furthermore, the background was not uniform; it consisted of multiple colors.
Hence, in the above logic, I decided to replace the first step with three:
It helped to reduce the drudgery, though it definitely did not save any time!
The following steps have been adapted from the 'multicrop' script referenced above, though I converted the steps into a Python script using os.system, subprocess.call and subprocess.check_output methods. For more details about the convert options, see http://www.imagemagick.org/script/command-line-options.php. Sample values are used where needed to simplify the examples.
Convert the image file (Fig 1) into ImageMagick's internal mpc format and use the mpc format for the intermediate steps for efficient processing:
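A minimal sketch of this step, assuming ImageMagick's convert is on the PATH. The input name scan.jpg is a hypothetical sample, and the +repage option is an assumption borrowed from the multicrop script rather than something this article confirms.

```python
def to_mpc_cmd(src, dst="out.mpc"):
    """Build a convert command that caches src in ImageMagick's MPC format."""
    # +repage drops any virtual-canvas offset stored in the source file.
    # Writing out.mpc also creates the companion file out.cache.
    return ["convert", src, "+repage", dst]


cmd = to_mpc_cmd("scan.jpg")  # "scan.jpg" is a hypothetical input name
print(" ".join(cmd))
```

Pass the list to subprocess.call to actually run the command.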
For each background colour (bgcolor, an (r,g,b) tuple), rename out.mpc and out.cache to in.mpc and in.cache, and floodfill the background colour with none (transparency). A 1x1 border of the background colour is added first so that the floodfill reaches in from all sides of the image; the border is then shaved off.
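One pass of this loop might look like the sketch below. The fuzz value of 10% is only a sample, and the draw primitive "matte 0,0 floodfill" is the ImageMagick 6 spelling used by multicrop; other versions may need a different primitive, so treat it as an assumption.

```python
import os


def rotate_cache(src="out", dst="in"):
    """Rename out.mpc/out.cache to in.mpc/in.cache before the next pass."""
    for ext in (".mpc", ".cache"):
        os.rename(src + ext, dst + ext)


def floodfill_cmd(bgcolor, fuzz=10):
    """Build the command that makes one background colour transparent."""
    colour = "rgb(%d,%d,%d)" % bgcolor
    return ["convert", "in.mpc",
            "-fuzz", "%d%%" % fuzz, "-fill", "none",
            # The 1x1 border lets the fill started at (0,0) reach every edge.
            "-bordercolor", colour, "-border", "1x1",
            "-draw", "matte 0,0 floodfill",
            "-shave", "1x1", "out.mpc"]


print(" ".join(floodfill_cmd((255, 255, 255))))
```

For each background colour, call rotate_cache() and then run the command returned by floodfill_cmd(bgcolor).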
The next step is to replace the remaining part of the image with red. You will get an image similar to Fig 2.
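A sketch of the red fill, assuming the floodfilled result is in out.mpc. The output name TMP2.mpc is taken from the later part of this article; whether the original command wrote to it at this point is an assumption.

```python
def fill_red_cmd(src="out.mpc", dst="TMP2.mpc"):
    """Build the command that paints everything except the background red."""
    # +opaque recolours every pixel that does NOT match the given colour,
    # so all pixels that are not transparent ("none") become red.
    return ["convert", src, "-fill", "red", "+opaque", "none", dst]


print(" ".join(fill_red_cmd()))
```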
You now need to find a cluster of red pixels. Since your photos will not be very small, rather than searching pixel by pixel, you can speed up the process by a factor of about 100 by checking only every 10th pixel in each row and column.
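The coarse search can be sketched as a simple generator. The function name grid_points is hypothetical, and the width and height would come from the scanned image.

```python
def grid_points(width, height, step=10):
    """Yield (x, y) for every step-th pixel in each row and column."""
    for y in range(0, height, step):
        for x in range(0, width, step):
            yield x, y


# A 1000x1000 scan needs 10,000 probes instead of 1,000,000.
print(sum(1 for _ in grid_points(1000, 1000)))
```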
To get the colour at pixel (x,y):
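A sketch using convert's %[pixel:...] format escape, with TMP2.mpc as the image being probed. The exact text printed (for example red, none or an srgba(...) string) varies between ImageMagick versions, so check what your installation reports before comparing.

```python
def pixel_colour_cmd(x, y, image="TMP2.mpc"):
    """Build the command that prints the colour of pixel (x, y)."""
    # info: writes the formatted text to standard output.
    return ["convert", image,
            "-format", "%%[pixel:u.p{%d,%d}]" % (x, y), "info:"]


print(" ".join(pixel_colour_cmd(30, 40)))
```

Run it with subprocess.check_output to capture the colour as text.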
If the color is not none but red, replace the contiguous red pixels by white:
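A sketch of the floodfill from the red pixel found at (x, y). The output name TMP3.mpc is a hypothetical choice.

```python
def whiten_cluster_cmd(x, y, src="TMP2.mpc", dst="TMP3.mpc"):
    """Build the command that turns the red cluster containing (x, y) white."""
    # "color x,y floodfill" recolours the connected region around (x, y)
    # with the current fill colour.
    return ["convert", src, "-fill", "white",
            "-draw", "color %d,%d floodfill" % (x, y), dst]


print(" ".join(whiten_cluster_cmd(120, 80)))
```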
Now, you want to get only the white part. So, fill all pixels that are not white with transparency and then turn transparency off so that all that is not white becomes black.
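A sketch of this step. The file names are hypothetical, and the "-channel rgba -alpha on" prefix is an assumption borrowed from multicrop so that the fill with none also writes the alpha channel on ImageMagick 6.

```python
def keep_white_cmd(src="TMP3.mpc", dst="TMP4.mpc"):
    """Build the command that leaves the white cluster on a black canvas."""
    return ["convert", src, "-channel", "rgba", "-alpha", "on",
            # Everything that is not white becomes transparent black ...
            "-fill", "none", "+opaque", "white",
            # ... and dropping the alpha channel leaves it plain black.
            "-alpha", "off", dst]


print(" ".join(keep_white_cmd()))
```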
The white part is not a rectangle. So, clone the image and trim it so that it bounds the white part. Now, replace all black by white in this trimmed image.
Now, flatten the second image on top of the previous one to get the mask for a photo (see Fig 3).
The above steps can be combined into a single convert command as follows:
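One way to chain the steps, as a sketch under the same assumptions as the individual steps: ImageMagick 6 syntax, a red pixel found at (x, y), and a hypothetical output name mask.mpc. Because the command is passed as a list, the parentheses need no shell escaping.

```python
def mask_cmd(x, y, src="TMP2.mpc", dst="mask.mpc"):
    """Build one command that produces the rectangular mask for a photo."""
    return ["convert", src, "-channel", "rgba", "-alpha", "on",
            # Red cluster at (x, y) -> white.
            "-fill", "white", "-draw", "color %d,%d floodfill" % (x, y),
            # Everything else -> black.
            "-fill", "none", "+opaque", "white", "-alpha", "off",
            # Clone, trim to the white part, and paint its black holes white.
            "(", "+clone", "-trim", "-fill", "white", "-opaque", "black", ")",
            # The trimmed clone keeps its offset, so it lands in place.
            "-flatten", dst]


print(" ".join(mask_cmd(120, 80)))
```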
The photo can now be extracted:
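A sketch of the extraction, assuming the mask is white over the photo and black elsewhere. All three file names are parameters because the original names are not known; the ones in the example call are hypothetical.

```python
def extract_cmd(original, mask, photo):
    """Build the command that cuts one photo out of the original scan."""
    return ["convert", original, mask,
            # Multiplying by the mask blacks out everything outside the photo.
            "-compose", "multiply", "-composite",
            # Trim the black surround and reset the canvas offset.
            "-trim", "+repage", photo]


print(" ".join(extract_cmd("scan.mpc", "mask.mpc", "photo-1.png")))
```

If a photo has very dark edges, the trim may cut into it, so check the results by eye.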
While extracting, you may want to add the logic to straighten the image as well. So, instead, use the following command:
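A sketch with straightening added, using convert's -deskew option with the commonly suggested 40% threshold. This is a simplification rather than multicrop's own logic; the corners exposed by the rotation may survive the second trim unless you add some fuzz.

```python
def extract_straight_cmd(original, mask, photo, threshold=40):
    """Build the command that cuts out one photo and straightens it."""
    return ["convert", original, mask,
            "-compose", "multiply", "-composite", "-trim",
            # Corners exposed by the rotation are filled with the background.
            "-background", "black", "-deskew", "%d%%" % threshold,
            "-trim", "+repage", photo]


print(" ".join(extract_straight_cmd("scan.mpc", "mask.mpc", "photo-1.png")))
```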
The multicrop script adds a border as well for a better presentation.
Now, you need to remove the white image area so that it is not used again.
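One possible way to do this, as a sketch: take the image in which the cluster was turned white (hypothetically TMP3.mpc) and make the white area transparent, writing the result back as TMP2.mpc. The "-channel rgba" setting is an assumption so that the fill also changes the alpha channel on ImageMagick 6.

```python
def clear_used_cmd(src="TMP3.mpc", dst="TMP2.mpc"):
    """Build the command that erases the photo that was just extracted."""
    # White marks the extracted cluster; turning it into "none" means the
    # grid search will not find it again.
    return ["convert", src, "-channel", "rgba", "-alpha", "on",
            "-fill", "none", "-opaque", "white", dst]


print(" ".join(clear_used_cmd()))
```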
You are now ready to find another red pixel in TMP2.mpc and extract the next photo.
Usually, you will want to discard small photos as there may be spurious small islands of red. At times, you may find that the extracted image is smaller than the photo, e.g. when a light sky is mistaken for the background. So, there is considerable scope for making the script a lot smarter!
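A size filter could be sketched as below. It assumes you capture the output of identify -format "%w %h" for each extracted file with subprocess.check_output; the 200-pixel minimum is a hypothetical sample value.

```python
def big_enough(size_output, min_side=200):
    """Return True if both sides of the extracted photo reach min_side.

    size_output is the text printed by: identify -format "%w %h" photo.png
    """
    width, height = (int(v) for v in size_output.split())
    return width >= min_side and height >= min_side


print(big_enough("640 480"))
print(big_enough("35 480"))
```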
Scanning a text page is never easy with Simple Scan, especially with old documents. Using the text mode, some folds show up as lines. If the text is faded or shaded, parts of the characters are missing. While the visual result of scanning in photo mode is much better, a printout normally has a distracting gray background and readability is often lost in the process.
A solution is to use the white-threshold option in ImageMagick's convert command line utility after scanning a text document as a photo, e.g.
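A sketch of such a command, built in Python like the earlier steps. The file names and the 70% threshold are hypothetical sample values; try a few thresholds to see what suits your document.

```python
def whiten_background_cmd(src, dst, threshold=70):
    """Build the command that pushes the light gray background to white."""
    # Pixels brighter than the threshold are forced to pure white.
    return ["convert", src, "-white-threshold", "%d%%" % threshold, dst]


print(" ".join(whiten_background_cmd("page.png", "page-clean.png")))
```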