AI Terms: Transformers
Transformers – A 2017 Google research paper, “Attention Is All You Need,” introduced the deep learning architecture known as the transformer. Today's major AI models (including ChatGPT, GPT-4, and Midjourney) are built using these neural networks. Previously, recurrent neural networks (RNNs) processed data sequentially, one word at a time, in the order in which the words appear. An “attention mechanism” was later added so the model could weigh the relationships between the words. Transformers advanced this approach by dropping the step-by-step recurrence and analyzing all the words in a given body of text at the same time rather than in sequence. As a result, it became possible to create higher-quality language models that could be trained more efficiently and with more customizable features.
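To make the attention idea concrete, here is a minimal sketch in plain Python. It is a simplification, not how any production model is implemented: the function names (`softmax`, `self_attention`), the three example words, and their two-number vectors are all made up for illustration, and the learned weight matrices, multiple attention heads, and positional information that real transformers use are left out. Each word simply scores its similarity to every other word and then blends their vectors according to those scores.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Toy scaled dot-product self-attention with no learned weights.

    Assumption for illustration: each word vector serves as its own
    query, key, and value. Real transformers learn separate projections.
    """
    dim = len(vectors[0])
    outputs, weights = [], []
    # Each pass of this loop is independent of the others, so all words
    # could be processed at the same time; nothing is carried over from
    # one word to the next as it would be in an RNN.
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        mixed = [
            sum(w * value[i] for w, value in zip(row, vectors))
            for i in range(dim)
        ]
        weights.append(row)
        outputs.append(mixed)
    return outputs, weights


# Hypothetical two-number "embeddings" for three words.
tokens = ["the", "cat", "sat"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

outputs, weights = self_attention(embeddings)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one word "attends" to every word in the sentence, including itself. Because no row depends on the result of another, the whole table can be computed in one batch, which is the property that lets transformers train more efficiently than word-by-word RNNs.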