Part of my work is to weigh current and new features and functionality against what users truly need, ranging from things that are outright impossible to features that merely save a user a few clicks.
At a quick glance, we have a few issues:
Building a Database
How are all the cues to be standardized so that they can be referenced in a consistent manner, i.e. the computer's "memory" image of a cue that it consults when it looks through the camera? The initial time, collaboration, and trust necessary to cover every significant cue would be miles high. And what happens to every cue not included in the database? For every cue that is well documented enough to be included, there would be many times more that are unknown, either because they are one-off customs, because they were not made by a cue builder of any recognition (think of every one made in other countries), or because nobody was willing to contribute the information. Even smaller things such as 29" vs. 30" sizing, cosmetic damage, and aftermarket pieces (wraps, bumpers) would confuse the system. Crowdsourcing inputs would be possible, but at that point you are simply left with a huge image gallery, which runs into the next issue.
Camera Limitations
How do we ensure every significant cue is photographed in the same manner? As a former studio photographer, I can tell you that specular highlights (glints of light reflecting off a shiny surface) are a huge barrier, and they pose a significant stumbling point for a system that only learns by scanning pixels. Even more troublesome is the end user's input photo: inconsistent color balance would add significant noise to recognition, and camera tilt can produce keystoning that warps proportions and reduces accuracy in matching pattern designs. A cue is a very difficult object to photograph in its entirety.
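To make the keystoning point concrete, here is a minimal sketch of why tilt warps measured proportions. It applies a simple one-dimensional projective map, the kind of distortion a tilted camera introduces, to evenly spaced marks along a cue. The tilt coefficient and the positions are arbitrary illustrative values, not a calibrated camera model.

```python
# A minimal sketch of how camera tilt ("keystoning") warps proportions.
# We apply a 1-D projective map to three evenly spaced marks along a cue
# and watch the equal spacing disappear in the "photo".

def project(x: float, tilt: float) -> float:
    """Where a point at position x (arbitrary units along the cue)
    lands after a perspective tilt of the given strength."""
    return x / (1.0 + tilt * x)

marks = [0.0, 10.0, 20.0, 30.0]   # evenly spaced marks on the real cue
tilt = 0.02                       # assumed tilt strength (illustrative)

projected = [project(x, tilt) for x in marks]
segments = [round(b - a, 2) for a, b in zip(projected, projected[1:])]

# The equal 10-unit gaps are no longer equal after projection.
print(segments)
```

Any recognizer comparing inlay spacing or ring-work proportions against a reference photo has to undo this distortion first, which requires knowing the camera geometry that a casual user's snapshot does not provide.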
For example, say part of the cue recognition relies on the joint thread pitch. While the technology does exist to measure the distance between points to get a reading of size, e.g. to distinguish a 5/16x18 joint from a 5/16x14, it currently falters at that scale (reading feet would be easier than millimeters).
Another example would be identifiable components sitting on different portions of the cue. Take these few quick snaps I took of a real Mezz Power Break II against a fake from China. Aside from the obvious difference in the logo text, it takes a bit of magnification to spot the differences, meaning multiple input photos would be needed to build accurate feature mappings. At that point, the investment required to produce an accurate result is nearly identical to our original solution; the only real difference is the system (human response vs. computer response), and the computer's response is built entirely from human input anyway.
There have been a few circles in which this sort of thing has been accomplished successfully. For example, Magic: The Gathering (a fantasy trading card game) has a few implementations of this idea, and they only work because every piece has identical dimensions; standard font, icon, and text locations; documentation (e.g. #5 in a set of 320); production limited to one company; and a relative ease of photographing that is minimally subject to confusing reflections (more even proportions, matte surface). Even so, this took a fantastic amount of resources on the scale of years, and the results still are not considered particularly great.
Hope I haven't come off as too negative, as I too would like to see this sort of thing. But with the technology available to us in 2016 and the resources necessary to accomplish a task with such minuscule market demand, simply asking around remains the better solution today.
