Something I hear a lot when it comes to the recent AI stuff like Stable Diffusion, DALL-E, ChatGPT, etc. etc. etc. is some version of “this technology is just in its infancy, imagine what it will be able to do in a few years!” I’m not saying that these AI technologies won’t improve, but the thing is, it’s just not true that they are in their infancy. These technologies are all developments of technologies which have been worked on for decades.
At their core, these approaches are various ways of doing massive numbers of matrix multiplications in order to encode the relationships in data. The T in ChatGPT stands for “Transformer,” a successor to the previous generations of models which had generally had some form of “neural” in their name, such as Convolutional Neural Networks or Recurrent Neural Networks. In particular, Transformers (first described publicly by a team at Google in 2017) replaced RNNs as the model of choice in natural language processing by being simpler and by allowing pre-training to be parallelized, which made vastly larger training data sets feasible.
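To make the “it’s all matrix multiplications” point concrete, here’s a minimal sketch of scaled dot-product attention, the operation at the heart of a Transformer. The toy dimensions, random weights, and names here are my own illustrative choices; a real model learns the weights and runs this at vastly larger scale:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: nothing but matrix multiplies and a softmax."""
    d_k = K.shape[-1]
    # (tokens x tokens): every token is scored against every other token at once
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # each output is a weighted mix of the value vectors

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))  # 5 tokens, each a 16-dimensional embedding (toy sizes)
Wq, Wk, Wv = (rng.standard_normal((16, 16)) for _ in range(3))  # stand-ins for learned weights
out = attention(X @ Wq, X @ Wk, X @ Wv)
print(out.shape)  # (5, 16): one updated vector per token
```

Notice that every token’s scores against every other token come out of a single matrix product. That is exactly the property that lets a Transformer be trained in parallel, where an RNN has to step through the sequence one token at a time.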
Transformers were not a radically new idea that created a field which didn’t exist before; they were a new approach which emerged because a large number of well-funded, smart people had been working in the field for a long time on relatively similar approaches. It’s an innovation which yielded noticeably better results; it might even be a breakthrough. What it’s not is the first dipping of humanity’s toe into something no one had ever done before. It may be the first supersonic flight; it is not the first flight at Kitty Hawk.
Moreover, the hardware to execute these things has been under development for a very long time. A huge breakthrough in performance came when AI algorithms were adapted to run on GPUs (graphics processing units, the things that do all of the calculations for 3D graphics). This provided a relatively inexpensive source of incredibly high number-crunching performance that made the massive amount of processing involved in AI far more accessible. The thing is, this was like a decade ago. Since then, special-purpose GPUs have been created to do the work even more cost-effectively (one current example is the NVIDIA A100, which costs around $10,000). But wait, there’s more!
Cerebras developed the Wafer Scale Engine, an AI processing chip the size of an entire silicon wafer, and it went on sale back in 2019. It’s an impressive piece of technology, consuming about 22kW of electricity in a wafer that’s 300mm in diameter (basically, one foot wide). There will be newer and better ones, to be sure, but it’s not a new idea with completely untapped potential.
Don’t get me wrong. I’m not saying that this is the end of technological development, or that AI won’t get any better. It would be outright shocking if there were no further improvements. My point is that the improvements we’re going to see are likely to come much more slowly than people who don’t know the history of AI development expect. We’re not at the very beginning of an exponential curve.
I do strongly suspect that generative AI is going to be useful, just as classificational AI has proven useful. The thing is, classificational AI has been with us for a while: it’s things like face unlock on phones, de-noising of video and audio, and actually usable speech-to-text. It’s gotten better, and it continues to get better, but speaking as someone who develops technology: a technology becomes viable when it works all of the time for a given use, not merely when it can do an impressive demo under favorable circumstances. And in the real world, edge cases are often 99% of the work, and not being able to handle them often means that a tool is more work than it saves. The result is frequently a limited-use tool for the cases the new technology is good at, one more tool in the toolbox of a human being who can handle all the edge cases.
That’s why the end result of all our labor-saving devices is people being so busy all the time.