The term "generative AI" didn't enter mainstream tech vocabulary until 2022. But the capability — using models to generate text, copy, and creative content — was not invented by GPT-3. It was slowly assembled through the 2010s by researchers and practitioners who were using whatever tools the state of the art provided.
In 2015, we were generating marketing copy.
What We Had
The language model landscape in 2015 was limited by today's standards. Recurrent neural networks, LSTMs specifically, were the state of the art for sequence generation; the Transformer architecture was still two years away ("Attention Is All You Need" was published in 2017). Outputs were coherent within short spans and increasingly incoherent at longer range.
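Generation in that setting boils down to repeatedly sampling the next token from the LSTM's softmax output. The network itself is out of scope here, but the sampling step is worth sketching, because the temperature knob is what trades coherence against the diversity you need for candidate generation. This is an illustrative sketch, not our production code; `sample_with_temperature` and its signature are invented for the example.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample a token index from raw logits after temperature scaling.

    Lower temperatures concentrate probability on the top token (safer,
    more repetitive copy); higher temperatures flatten the distribution
    (more diverse, riskier copy).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the categorical distribution.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1  # guard against floating-point rounding
```

A candidate generator just loops this until it hits an end-of-sequence token, and runs the loop many times at a moderately high temperature to get a diverse pool for the ranking step.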
For short-form marketing text — headlines, calls to action, email subject lines — the quality was surprisingly usable when combined with the right selection and ranking step.
The System
We built a pipeline that:
- Generated candidates from an LSTM trained on a corpus of high-performing marketing copy across our platform. The training data was the A/B test corpus we'd been building since 2012 — thousands of campaigns, with conversion rates attached to every variant.
- Ranked candidates using a classifier trained on the same data. The classifier learned what linguistic patterns correlated with high conversion rates — specificity over vagueness, concrete numbers, active verbs, question-based CTAs for certain product types.
- Filtered candidates through brand guidelines — length constraints, disallowed terms, required includes — and a diversity check (don't return five variants that are semantically identical).
- Presented candidates to the marketer as suggestions, not mandates. The human was in the loop, selecting from AI-generated options rather than being replaced by them.
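The rank, filter, and diversity steps can be sketched as a single selection pass over the generated pool. Everything here is illustrative: `select_candidates`, the token-overlap (Jaccard) diversity check, and the scoring callback stand in for the real conversion classifier and brand-guideline rules, which the post doesn't detail.

```python
def jaccard(a, b):
    """Crude semantic-similarity proxy: word overlap between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def select_candidates(candidates, score, max_len=60, banned=(), k=5, max_sim=0.6):
    """Filter, rank, and de-duplicate generated copy.

    score: callable mapping a candidate string to a predicted-conversion
    score (stands in for the classifier trained on the A/B corpus).
    """
    # 1. Brand-guideline filter: length cap and disallowed terms.
    ok = [c for c in candidates
          if len(c) <= max_len
          and not any(t.lower() in c.lower() for t in banned)]
    # 2. Rank by predicted conversion score, best first (stable sort).
    ok.sort(key=score, reverse=True)
    # 3. Greedy diversity pass: skip near-duplicates of already-kept copy.
    kept = []
    for c in ok:
        if all(jaccard(c, chosen) < max_sim for chosen in kept):
            kept.append(c)
        if len(kept) == k:
            break
    return kept
```

The greedy diversity pass is the simplest way to enforce "don't return five variants that are semantically identical": because candidates are visited in score order, the highest-scoring member of each near-duplicate cluster survives.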
What the System Got Right
The AI-generated candidates were better than random but worse than expert copywriters. They were comparable to average copywriters and far faster.
More importantly, the system consistently generated combinations that humans wouldn't. A subject line packing in a number, a question, and a product-specific term would strike a copywriter as clunky, yet it repeatedly outperformed more elegant writing in split tests.
The pattern: AI-generated copy optimized for statistical performance metrics that human writers weren't explicitly targeting. The copy looked different. It worked better.
2015 vs. 2023
The difference between what we built in 2015 and what GPT-4 can do in 2023 is roughly the difference between a hand-cranked calculator and a supercomputer. The underlying idea — train a model on examples of good output, generate new examples, select the best — is the same.
We were early practitioners of an approach that would take the world by storm eight years later. We built the infrastructure for it from scratch: the training data, the evaluation framework, the human-in-the-loop workflow.
Generative AI for marketing copy predates the GPT era by most of a decade. Hanzo's research into language model applications for commerce started in 2014.
Read more
Hanzo ML v2: The Year the Models Actually Got Good
In 2018, the Hanzo ML stack crossed a threshold — the models were better than what clients could build in-house, not just faster. The shift from 'convenient AI' to 'essential AI' changed our product positioning.
Collaborative Filtering at Commerce Scale: v1.0 of the Recommendation Engine
The first version of the Hanzo recommendation engine used matrix factorization to find latent preference signals in purchase data. Here's how we built it.
K-Means and Random Forest: How We Found Ad Combinations That Worked
In 2013 we applied K-Means clustering and Random Forest feature selection to the problem of discovering which ad combinations actually convert. This was the foundation of the Earle system.