I was traveling this month, and had the unlucky situation of needing to take six different planes (three each way) for a trip to and from Illinois. My fault for booking late, but it was an opportunity to think about latency, throughput, optimization and reconfigurable computing.
Actually I didn’t think about any of those things while strolling through the various airports. I thought mostly about where to get a beer and a burrito between flights. But that’s the same problem.
The design of an airport is, or should be, about getting people and their bags from one place to another at a predictable speed, maximizing the number of individuals who pass through on their way from here to there. And for the most part, airports are well designed for that purpose.
Note that I said predictable speed, not high speed. The airport managers might not care very much if it takes you 5 minutes or 7 minutes to get from one gate to another. What they care about is that it takes everyone pretty much the same amount of time, and that the system doesn’t get clogged up and delayed during times of peak demand. The goal is high throughput: getting the maximum number of people through the system in a given time. This often requires a tradeoff in latency, which is the time taken for any one individual to make the journey. The same concept applies not only to airports but also to highways, to manufacturing, and to pretty much every complex system out there that involves long lines of people, animals or objects requiring one or more constrained resources.
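To put rough numbers on that tradeoff, here's a toy sketch of an idealized pipeline. All the figures (stage counts, stage times, passenger counts) are invented for illustration; the point is only that per-item latency and system throughput are different quantities.

```python
# Toy model: latency vs. throughput in an idealized pipeline.
# All numbers below are hypothetical, chosen only to illustrate the idea.

def pipeline_stats(num_stages, stage_time_s, num_items):
    """Return (latency, throughput) for a simple pipeline.

    Each item passes through every stage, so one item takes
    num_stages * stage_time_s to get through. But once the pipeline
    is full, a new item emerges every stage_time_s.
    """
    latency = num_stages * stage_time_s                  # time for ONE item
    total_time = (num_stages + num_items - 1) * stage_time_s
    throughput = num_items / total_time                  # items per second
    return latency, throughput

# A hypothetical 7-stage "escalator" with a 1-second step,
# moving 1000 passengers:
latency, throughput = pipeline_stats(num_stages=7, stage_time_s=1.0,
                                     num_items=1000)
print(f"per-passenger latency:  {latency:.0f} s")
print(f"steady-state throughput: {throughput:.2f} passengers/s")
```

Each passenger still spends 7 seconds in the system, but nearly one passenger per second comes out the other end, which is the airport manager's (and the hardware designer's) real concern.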
In places like Chicago O’Hare or Tokyo Narita you see this kind of throughput planning all over the place. Sometimes it doesn’t work very well, but mostly it does. You see it in the queues for check-in, in the lines for security, in the way passengers are sequenced when getting onto the planes, and in the way bags come down onto the carousels. You see it in the way you order, pay for and pick up your six-dollar venti nonfat soy frappe macchiatoccino. If you could go beyond what you can see as a mere passenger, then you would find it on the taxiways and in the skies overhead, as aircraft are sequenced for landing and departure, and handed off from controller to controller as they move from sector to sector.
And what is all this? A whole lot of parallel, pipelined and interconnected systems.
Consider airport escalators for a moment. When you are on one, especially a very long one, it might seem awfully slow. But what an amazing and beautiful thing an escalator is, when you think about throughput.
I have a specific example in mind. If you take the express train from Tokyo to Narita airport, you and hundreds of other people will simultaneously arrive at a platform deep underneath the airport, somewhere around basement level 5. Everyone on that train will have one or two bags, some will have small children, some will be old people who walk slowly, some will be impatient and fast… hence it should be total chaos when the doors of that train open and people try to make their way up to the check-in lines, seven levels up. But the Narita escalators practically eliminate that chaos. They do this by forcing people to get on at a constant rate, one person, or a pair of people, for every two or three steps. At capacity there are many hundreds of people simultaneously moving up from level to level. The escalators provide a smoothing function, delivering people in an orderly manner to the lines at the counters, and at a much higher effective rate, more people per minute than could possibly be provided by elevators or stairs.
How does this relate to reconfigurable computing? Well… a non-technical friend recently asked me to explain the difference between traditional processing, using an x86 processor, for example, and parallel processing in an FPGA. I was tossing around words like “pipelining” and “throughput” without relating those concepts to the real world. Then it dawned on me that traditional, sequential processing is like an elevator, and reconfigurable processing is like an escalator.
The traditional processor is a single elevator that carries a small number of passengers (data and instructions) from one floor to another. For decades, processor vendors have focused entirely on making their elevators run faster, working to get small sequences of data and instructions from the bottom floor to the top floor as quickly as possible. They have also increased the capacity of the elevator by increasing the word size (64 bits, for example) and by adding more complex instructions. But the elevator approach has inherent limitations. At busy times of the day, an elevator becomes a bottleneck no matter how fast it runs.
If we eliminate the elevator and instead build an escalator, or better yet multiple parallel escalators, then we can move a whole lot more passengers in the same amount of time, and probably do it using a whole lot less power. No doors to open and close, not as much shuffling for position in the queue, and no snot-nosed little kid pushing every button and jamming up the system. And if our airport or system is truly reconfigurable, we can deploy exactly the right combination of parallel escalators that are needed at any given time. Shut down or eliminate unused escalators in the middle of the night, and add more of them when we expect a lot of trains or planes to come in at the same time.
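The elevator-versus-escalator comparison can be sketched in a few lines of code. This is just a back-of-the-envelope model with made-up numbers (elevator capacity, round-trip time, escalator boarding interval), not a claim about any real airport or processor:

```python
# Toy model of the elevator (batch/sequential) vs. escalator
# (streaming/pipelined) analogy. All parameters are invented.

import math

def elevator_time(passengers, capacity, round_trip_s):
    """A single elevator moves people in batches: total time grows
    with the number of round trips needed."""
    trips = math.ceil(passengers / capacity)
    return trips * round_trip_s

def escalator_time(passengers, boarding_interval_s, ride_s):
    """An escalator streams people: after the first rider finishes,
    one more person arrives every boarding_interval_s."""
    return ride_s + (passengers - 1) * boarding_interval_s

# Hypothetical numbers: 600 passengers off one train.
p = 600
print(f"elevator:  {elevator_time(p, capacity=15, round_trip_s=60)} s")
print(f"escalator: {escalator_time(p, boarding_interval_s=2, ride_s=30)} s")
```

Any one passenger's ride may be slower on the escalator, but the crowd as a whole gets upstairs sooner, and the win grows as the crowd does. That is the same argument for pipelined, parallel hardware over a single fast sequential unit.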
And there’s something else to consider: parallel programming is sometimes described as difficult, exotic and unnatural. Something that most software programmers just aren’t ready for. Well… I dunno. Clever people have been designing lean production, package sorting and transportation systems, and shopping malls and airports, for an awfully long time and are getting pretty darn good at it. Maybe we just need to hire some of these people to write and optimize our parallel algorithms?