Update: Follow-up Discussion on Artima
What and how should we be measuring? A good benchmark is:
- Repeatable, so comparative experiments can be conducted relatively easily and with a reasonable degree of precision.
- Observable, so if poor performance is seen, the developer has a place to start looking. Nothing is more frustrating than a complex benchmark that delivers a single number, leaving the developer with no additional information as to where the problem might lie.
- Portable, so that comparisons are possible with your main competitors (even if they are your own previous releases). Maintaining a history of the performance of previous releases is a valuable aid to understanding your own development process.
- Easily presented, so that everyone can understand the comparisons in a brief presentation.
- Realistic, so that measurements reflect customer-experienced realities.
- Runnable, so that all developers can quickly ascertain the effects of their changes. If it takes days to get performance results, it won’t happen very often.
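To make the "repeatable" and "observable" criteria concrete, here is a minimal sketch of a harness that times each phase of a workload several times and reports per-phase statistics instead of a single number. The `run_benchmark` helper and the two phases are hypothetical, invented purely for illustration; a real benchmark would substitute phases drawn from its own workload.

```python
import statistics
import time


def run_benchmark(phases, repeats=5):
    """Time each named phase `repeats` times and return per-phase stats in seconds."""
    results = {}
    for name, fn in phases.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - start)
        # Keep the spread, not just one figure, so a slow phase stands out.
        results[name] = {
            "median": statistics.median(samples),
            "min": min(samples),
            "max": max(samples),
        }
    return results


# Hypothetical phases standing in for parts of a real workload.
phases = {
    "build_list": lambda: [i * i for i in range(10_000)],
    "sort_list": lambda: sorted(range(10_000, 0, -1)),
}

report = run_benchmark(phases)
for name, stats in report.items():
    print(f"{name}: median={stats['median']:.6f}s "
          f"min={stats['min']:.6f}s max={stats['max']:.6f}s")
```

Reporting the median alongside the minimum and maximum gives a rough sense of run-to-run noise, and the per-phase breakdown gives a developer somewhere to start looking when a number gets worse.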
Not every benchmark you select will meet all of these criteria, but it’s important that some of them do. Select enough benchmarks that important parts of your product’s performance envelope aren’t a surprise when it ships, and avoid benchmarks that don’t really represent your customers, because your team will end up optimizing for the wrong behavior. Resist the temptation to optimize for the benchmark. The recent discovery that some operating systems have “improved system call performance” by moving the getpid(2) system call into user-land is a perfect example: no real application calls getpid(2) often enough to matter.
Selecting a benchmark is asking for that aspect of performance to be optimized, probably at the expense of other aspects that are not being measured. If as an operating system developer you want faster system calls, design a benchmark that is a weighted average of the calls your customers’ applications make most frequently. Be careful what you ask for, because you’re likely to get it.
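As a rough sketch of what such a weighted average might look like, the snippet below scores a system by weighting each call's latency by how often applications make that call. The function name, the latencies, and the call mix are all made-up numbers for illustration, not measurements from any real system.

```python
def weighted_syscall_score(latency_ns, call_share):
    """Average latency per call, weighted by how often each call is made."""
    total = sum(call_share.values())
    if total <= 0:
        raise ValueError("call_share must contain positive weights")
    weighted = sum(latency_ns[name] * share for name, share in call_share.items())
    return weighted / total


# Hypothetical per-call latencies in nanoseconds.
latency_ns = {"read": 400.0, "write": 500.0, "getpid": 50.0}

# Hypothetical call mix: out of every 100 calls, how many are of each kind.
call_share = {"read": 60, "write": 39, "getpid": 1}

score = weighted_syscall_score(latency_ns, call_share)

# Making getpid free barely moves the score, because it is rarely called.
free_getpid = dict(latency_ns, getpid=0.0)
score_free_getpid = weighted_syscall_score(free_getpid, call_share)

print(score)              # 435.5
print(score_free_getpid)  # 435.0
```

With this invented mix, eliminating the cost of getpid entirely improves the score by about 0.1 percent, so a benchmark built this way gives no reward for optimizing a call that customers rarely make.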