This past Friday, Tom Henderson tweeted me a question:
@atduskgreg Can you think of any computational collage for newbies resources? FWIW I know a bit of ruby and a week of clojure.
— Tom Henderson (@mathpunk) January 12, 2013
Upon further questioning, Tom pointed to some inspirations for the type of thing he wanted to try to make: Sergio Albiac’s collages…
…and Dada collage (like this example by Hannah Höch):
Having gotten a sense of what he was going for, I suggested that Processing might be a good place to start, mainly because of how easy it makes working with images and how many resources there are out there for learning it. (Though I did suggest checking out Zajal as well since he already knows some Ruby.)
Further, I offered to put together a Processing example of “computational collage” specifically to help. While there are a lot of great texts out there for getting started with Processing (I especially recommend Dan Shiffman’s Learning Processing) it can be really helpful to have an in-depth example that’s approximately in the aesthetic direction in which you’re trying to proceed. While such examples might be a lot more complex and therefore much more difficult to read through, they can demonstrate how someone with more experience might approach the overall problem and also point at a lot of little practical tips and tricks that will come in handy as you proceed.
So, after a bit of thinking about it, I decided to write a Processing sketch that produces photomosaics. A photomosaic reproduces a single large image by combining many other smaller images. The smaller images act as the “pixels” that make up the larger image, their individual colors blending in with their neighbors to produce the overall image.
Here’s an example of the effect, produced by the sketch I ended up creating for Tom:
Check out the larger size to see the individual pictures that go into it.
Here’s another example based on a picture I took of some friends’ faces:
For the rest of this post, I’ll walk through the Processing (and a bit of Ruby) code I used to create this photomosaic. I’ll explain the overall way it works and point out some of the parts that could be re-usable for other projects of this sort (loading images from a directory, dividing up an image into a grid, finding the average color of an image, etc.). At the end, I’ll suggest some ways I’d proceed if I wanted to produce more work in this aesthetic of “computational collage”.
A Note of Warning
This post is far longer and more detailed than your usual “tutorial”. That is intentional. I wanted to give Tom (and anyone else in a similar position) not just some code he could use to create an effect, but a sense of how I think through a problem like this, along with a solid introduction to some conceptual tools that will be useful for doing work in and around this area. I hope the experience is a little like riding along in my brain as a kind of homunculus – but maybe a little better organized than that. This is exactly the kind of thing I wished people would do when I was first starting out, so I thought I’d give it a shot to see if it’s useful to anyone else.
Let’s start by talking about the overall plan: how I approached the problem of making a sketch that produced photomosaics. After thinking about how photomosaics work for a little while (and looking at some), I realized the basic plan was going to look something like this:
- Download a bunch of images from somewhere to act as the pixels.
- Process a source image into a grid, calculating the average brightness of each square.
- Go through each square in this grid and find one of the downloaded images that can substitute for it in the photomosaic.
- Draw the downloaded images in the right positions and at the right sizes.
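Before diving into the details, here is the whole plan above sketched as a toy program. It uses 2D lists of grayscale values (0–255) in place of real images, and everything in it – names and structure alike – is a hypothetical Python illustration, not the Processing code discussed below:

```python
# A toy version of the whole plan, using 2D lists of grayscale values
# (0-255) in place of real images. A hypothetical sketch, not the
# actual Processing code.

def avg_brightness(img):
    pixels = [p for row in img for p in row]
    return sum(pixels) // len(pixels)

def mosaic(source, tiles, rows, cols):
    """Return a grid of tile indices, one per cell of the source image."""
    cell_w = len(source[0]) // cols
    cell_h = len(source) // rows
    # Step 3 relies on the tiles being sorted by brightness.
    order = sorted(range(len(tiles)), key=lambda i: avg_brightness(tiles[i]))
    grid = []
    for row in range(rows):
        grid_row = []
        for col in range(cols):
            # Step 2: copy out one cell and measure its brightness.
            patch = [r[col * cell_w:(col + 1) * cell_w]
                     for r in source[row * cell_h:(row + 1) * cell_h]]
            b = avg_brightness(patch)
            # Step 3: pick the sorted tile whose position matches b.
            grid_row.append(order[b * (len(tiles) - 1) // 255])
        grid.append(grid_row)
    return grid
```

Step 4 (actually drawing the images at the right positions and sizes) is where Processing takes over, so this sketch stops at a grid of tile indices.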
In thinking through this plan, I’d made some immediate decisions/assumptions. The biggest one: I knew the photomosaics were going to be black and white and that I’d mainly use black and white images as my downloaded images. This choice radically simplified the process of matching a portion of the original image with the downloaded images – it’s much easier to compare images along a single axis (brightness) than along the three that are necessary to capture color (red, green, blue or hue, saturation, value). Also, aesthetically, most of Tom’s example images were black and white so that seemed like a nice trade-off.
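To make the single-axis point concrete: one standard way to collapse an (R, G, B) color to one brightness number is the Rec. 601 “luma” weighting, shown below as an illustration. (Processing also has a built-in brightness() function; I’m not claiming this is what it computes internally.)

```python
# Collapsing a color to a single brightness axis. The weighted "luma"
# formula is one standard choice (ITU-R Rec. 601); matching images on
# this one number is far simpler than matching on all three channels.

def luma(r, g, b):
    # Green contributes most because our eyes are most sensitive to it.
    return 0.299 * r + 0.587 * g + 0.114 * b
```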
After a first section in which I explain how to use some Ruby code to download a bunch of images, in the rest of the sections, I’ll mainly describe the thinking behind how I approached accomplishing each of the stages in Processing. The goal is to give you an overall sense of the structure and purpose of the code rather than to go through every detail. To complement that, I’ve also posted a heavily-commented version of the photomosaic sketch that walks through all of the implementation details. I highly recommend reading through that as well to get a full understanding. I’ve embedded a gist of that code at the bottom of this post.
The first step in making a photomosaic is to download all the images that are going to act as our tiles – the individual images that will stand in for the different grays in the original image. So, what we need is a bunch of black and white images with different levels of brightness ranging from pale white to dark black.
By far the easiest way to get these images is to download them from Flickr. Flickr has a great, rich API, which has been around for quite a long time. Hence there are libraries in tons of different languages for accessing its API, searching for images, and downloading them.
Even more conveniently, this is a task I’ve done lots of times before, so I already had my own little Ruby script sitting around that handles the job. Since Tom had mentioned he knew some Ruby, this seemed like the perfect solution. You can get my Ruby script here: flickr_downloader.rb. To use this script you’ll have to go through a number of steps to authorize it with Flickr.
- Apply for API access
- Enter the API key and shared secret they give you in the appropriate place in the flickr_downloader.rb script.
Now you need permission to log in as a particular user. This is done using an authentication process called “OAuth”. It is surprisingly complicated, especially for the relatively simple case of what we want to do here. For our purposes, we’ll break OAuth down into two steps:
- Give our code permission to log in as us on Flickr.
- Capture the resulting token and token secret for reuse later.
This example from the flickraw gem will take you through the process of giving our code permission to log in to Flickr: auth.rb. Download it and run it. It will guide you through generating an OAuth URL, visiting Flickr, and granting permission to your code.
At the end of that process, be sure to capture the token and token secret that the script will spit out. Once you’ve got those, go back to our flickr_downloader.rb script and paste them in the appropriate places, marked ACCESS_TOKEN and ACCESS_SECRET.
Now the last step is to select a group to download photos from. I simply searched for “black and white flickr group” and picked the first one that came up: Black and White. Once you’ve found a group, grab its group id from the URL. This will look something like “16978849@N00” and it’s what you need for the API to access the group’s images. When you’ve got the group id, stick it in the flickr_downloader.rb script and you’re ready to run it.
Make sure you have a directory called “images” next to the flickr_downloader.rb script – that’s where it wants to put the images it downloads. Start it running and watch the images start coming down.
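One detail worth knowing if you want to write your own downloader instead of using mine: the Flickr API doesn’t return image URLs directly. Each photo record includes farm, server, id, and secret fields, and you assemble the static image URL from them yourself. Here’s a sketch of that assembly (in Python rather than the script’s Ruby), following the URL scheme Flickr documents; the size suffixes are Flickr’s own codes:

```python
# Build a Flickr static-image URL from the fields the API returns for
# each photo. This follows Flickr's documented URL scheme; it does no
# network access itself.

def flickr_photo_url(farm, server, photo_id, secret, size="z"):
    # size suffixes: "s" small square, "z" medium 640, "b" large, etc.
    return "http://farm%s.staticflickr.com/%s/%s_%s_%s.jpg" % (
        farm, server, photo_id, secret, size)
```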
Process the Source Image into a Grid
Now that we’ve got the images that will populate each of our mosaic’s tiles, the next step is to process the source image to determine which parts of it should be represented by which of our tile images.
When you look at the finished sketch, you’ll see that the code that does this job actually comes at the end. However, in the process of creating the sketch it was one of the first things I wrote – while I was still thinking about the best way to match downloaded images to each part of the source image – and it was a very early version of the sketch that produced the screenshot above. This kind of change is very common when working through a problem like this: you dive into one part because you have an idea for how to proceed, regardless of whether that will be the first piece of code in the final version.
Creating this grid of solid shades of gray consisted of two main components:
- Loop through the rows and columns of a grid and copy out just the portion of the original image within each cell.
- Pass these sub-images to a function that calculates the average brightness of an image.
First I defined the granularity of the grid: the number of rows and columns I wanted to break the original image into. Based on this number, I could figure out how big each cell would be: just divide the width and height of the source image by the number of cells on each side.
Once I knew those numbers, I could create a nested for-loop that would iterate through every column in every row of the image while keeping track of the x- and y-coordinates of each cell. With this information in hand, I used Processing’s copy() function to copy the pixels from each cell, one by one, into their own image so that I could calculate their average brightness.
See the drawPhotomosaic() function in the full Processing code below for a detailed description of this.
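The coordinate math at the heart of that nested loop can be sketched in a stripped-down form. This is a hypothetical Python illustration with the actual copy() call left out – it just collects the (x, y, w, h) rectangle that would be handed to copy() for each cell:

```python
# For each cell of the grid, compute its pixel origin and size. In the
# real sketch these numbers feed Processing's copy() to pull the cell
# out into its own PImage; here we just collect the rectangles.

def grid_cells(img_w, img_h, cols, rows):
    cell_w = img_w // cols
    cell_h = img_h // rows
    cells = []
    for row in range(rows):
        for col in range(cols):
            cells.append((col * cell_w, row * cell_h, cell_w, cell_h))
    return cells
```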
I implemented a separate function to calculate the average brightness of each of these sub-images. I knew I’d need this function again when processing the downloaded tile candidates, since I’d want to find their brightness as well in order to match them with these cells. See the aveBrightness() function in the Processing code for the details of how to find the average brightness of an image.
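The idea behind such a function is simple: sum a brightness value for every pixel and divide by the pixel count. Here’s a hypothetical Python analogue operating on plain (r, g, b) tuples rather than a PImage; the per-pixel formula is a simple channel average, which may differ from what aveBrightness() actually computes:

```python
# Average brightness of an image, where the image is a flat list of
# (r, g, b) tuples. Sum a per-pixel brightness and divide by the count.

def average_brightness(pixels):
    total = 0
    for (r, g, b) in pixels:
        total += (r + g + b) / 3.0  # simple channel average; one choice among several
    return total / len(pixels)
```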
In my original experiments with this, I simply drew a solid rectangle in place of each of these cells. Once I’d calculated the average brightness of that part of the source image, I set fill() to the corresponding color and drew a rectangle with rect() at the x- and y-coordinates I’d just calculated. Later, after I’d figured out how to match the tile images to these brightness values, it was easy to draw the tile images at the same coordinates as these rectangles: the call to rect() simply got replaced with a call to image().
Matching Tile Images
In many ways, this is the core of the photomosaic process. In order to replace our original image with the many images we downloaded, we need a way to match each cell in the original image to one of them.
Before settling on the final technique, I experimented with a few different ways of accomplishing this. Each had a different aesthetic effect and different performance characteristics (i.e. each took a different amount of time to create the photomosaic, and that time grew at different rates depending on different attributes).
For example, early on, it occurred to me that the grid of grayscale cells (as shown in the screenshot above) didn’t look very different if I used all 256 possible shades of gray or if I limited it to just 16 shades. This seemed promising because it meant that instead of having to use (and therefore download and process) hundreds of tile images, I could potentially use a much smaller number, i.e. as few as 8 or 16.
So, my first approach was to divide the possible range of 256 grayscale values into large “bins”. To do 16 shades of gray, for example, each bin would cover 16 different adjacent grayscale values. Then, I started loading the source images, checking to see which of these 16 bins they fit into, and moving on if I already had an image in that bin. The goal being to select just 16 images to cover the full range of values in the original image.
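This binning approach is easy to sketch. The following is a hypothetical Python illustration of the abandoned strategy, not code from the sketch: each tile’s brightness maps to a bin by integer division, the first tile found for a bin claims it, and the search stops once every bin is filled (which, as described next, rarely happens in practice):

```python
# The binning idea: divide the 0-255 range into a fixed number of bins
# and keep the first tile image found for each bin. Returns a map from
# bin index to the index of the tile that claimed it.

def fill_bins(tile_brightnesses, num_bins=16):
    bin_size = 256 // num_bins
    bins = {}
    for i, b in enumerate(tile_brightnesses):
        bin_index = min(b // bin_size, num_bins - 1)
        if bin_index not in bins:      # move on if the bin is already taken
            bins[bin_index] = i
        if len(bins) == num_bins:      # stop once every bin has a tile
            break
    return bins
```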
However, when actually running this approach, I found that it was surprisingly hard to find images for all of the bins. Most of my tile images had similar brightnesses. So while I’d find five or six of the middle bins immediately, it would process a huge number of images while failing to find the most extreme bins.
I eventually did manage to produce a few photomosaics using this method:
However, I decided to abandon it since it required a really large set of tile images to search through – and then didn’t use 98 percent of them – and also created a distracting visual texture by repeating each tile image over and over (which could be a nice effect in some circumstances).
After trying a few other similar approaches, it eventually occurred to me: instead of starting with a fixed set of grayscale colors as the “palette” I was looking for, I should just organize the actual tile images I had on hand so that I could pick the best one available to match each cell in the source image.
Once I’d had that revelation, things proceeded pretty quickly. I realized that in order to implement this idea, I needed to be able to sort all of the tile images by their brightness. Then I could select the right image for each cell in the source image based on its position in the sorted list: if I needed a full-black image, I could grab one from the front of the list; if I needed one near full white, I could grab one from the end; and so forth for everything in between. The image I grabbed to correspond to a full-black pixel might not be all-black itself (in fact it almost definitely wouldn’t be – who posts all-black images to Flickr?), but it would be the best match available given the set of tile images I’d downloaded.
In order to make my tile images sortable, I had to build a class to wrap them. This class would hold both the necessary information to load and display the images (i.e. their paths) as well as their average brightness – calculated using the same aveBrightness() function I’d already written. Then, once I had one of these objects for each of my tile images, I could simply sort them by their brightness score and I’d have everything I needed to select the right image to correspond to each cell in the source image.
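A stripped-down analogue of that arrangement looks like this. The names and structure below are hypothetical Python, not the actual Processing classes; note too that in the real sketch the list is sorted once up front, whereas this sketch sorts on every call for simplicity:

```python
# A tile records its path and precomputed average brightness; a cell's
# brightness (0-255) then maps to a position in the brightness-sorted
# list of tiles.

class Tile:
    def __init__(self, path, brightness):
        self.path = path
        self.brightness = brightness  # 0-255, precomputed per image

def pick_tile(tiles, cell_brightness):
    """Map a cell's brightness to a position in the sorted tile list."""
    ordered = sorted(tiles, key=lambda t: t.brightness)
    index = cell_brightness * (len(ordered) - 1) // 255
    return ordered[index]
```

The key property is that the darkest available tile stands in for black and the lightest for white, however dark or light those tiles actually are, which is exactly the “best match I could get” behavior described above.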
The code to accomplish this makes up most of the sketch. See the PixelImage class, the PixelImageComparator class, and most of the setup() function in the full sketch for details. I’ve written lots of comments there walking you through all of the ins and outs.
Once it was in place, my sketch started producing pretty nice photomosaics, like this one based on Tom’s twitter icon:
I found the result works especially well with relatively high-contrast source images – like the black and white portrait I posted above or the one below, based on an ink drawing of mine. I think this is because the tiles cover only a limited range of grays. Hence, images whose legibility depends on fine distinctions among grays can end up looking a little muddled.
At this point, I’m pretty happy with how the photomosaic sketch came out. I think its results are aesthetically nice and fit into the “computational collage” category that Tom set out to start with. I also think the code covers a lot of the bases that you’d need in doing almost any kind of work in this general area: loading images from a directory, processing a source image, laying things out in a grid, etc.
That said, there are obvious improvements that could be made as next steps starting from this code:
- Use tile pictures that are conceptually related to the source image. To accomplish this I’d probably start by digging more into the Flickr API to make the downloader pick images based on search terms or person tags – or possibly I’d add some OpenCV to detect faces in images…
- Vary the size of the individual images in the grid. While the uniformity of the grid is nice for making the image as clear as possible, it would be compositionally more interesting (and more collage-like) to have the size of the images vary more, as Tom’s original references demonstrate. For a more advanced version you could even try breaking up the rectangular shape of each of the source images (Processing’s mask() function would be a good place to start here).
- Another obvious place to go would be to add color. To do this you’d need a different measure of similarity between each cell and the tile images. And you’d need one that wouldn’t involve searching through all of the tile images to match each cell. I’d think about extending the sorting technique we’re using in this version. If you figured out a way to translate each color into a single number in some way that was perceptually meaningful, you could use the same sorting technique to find the closest available tile. Or, you could treat the tile images as red, green, and blue pixels and then combine three of them in close proximity (and at appropriate levels of color intensity) to produce the average color of any cell in the image.
- One aspect of Tom’s references not covered here is the use of typography. Rune Madsen’s Printing Code syllabus is an amazing resource for a lot of computational design and composition tasks in Processing and his section on typography would be especially useful for working in this direction.
- Finally, one way to break off of the grid that structures so much of this code would be to explore Voronoi stippling. This is a technique for converting a grayscale image into a series of dots of different weights to represent the darkness of each region in a natural way, much like a stippled drawing created by hand. Evil Mad Science Laboratories recently wrote an extensive post about their weighted voronoi stippling Processing implementation to create art for their Egg Bot machine. They generously provide Processing code for the technique which would make an interesting and convenient starting point.