In my last post I outlined the three addons that I will be designing. Recently I’ve focused on designing the census addon, because before I can write it I need to be able to determine the frequency and breadth of samples as well as the total time to run my experiment. This is a vexing problem. I commented a long time ago on the importance of finding appropriate search terms before you can get your project off the ground. Today I had a minor breakthrough after throwing up my hands and consulting a former colleague at the FTC. I had wondered whether there weren’t situations analogous to mine, where one can view a characteristic of interest directly, but not its underlying source. He gave me a vital moment of inspiration when he pointed out that web analytics faces a similar problem when attempting to identify unique visitors to a webpage. After 45 minutes of scratching in the googly dirt, I hit upon a search phrase that produced myriad valuable results. The phrase was: “Unique visitors” probabilistic.
WHAM! That did it. I uncovered a slew of papers on a matter that is mostly of interest in database design: probabilistic counting.
This is some intense stuff. It will take a while to comprehend. But the main theme of this post is not the discovery, but the search! Added to my previous strategies for uncovering search terms, one must include “Thinking of analogous situations”. If you’re stumped on a problem, chances are high that someone else has been in a similar spot and come up with a cool new way of dealing with it. Analogous situations in fields that are far-flung from one another show up all the time, but their solutions are grounded in the same mathematics. By searching for papers on those analogous problems, you may find some very helpful ideas in designing a methodology to deal with your own.