
Generative AI: Machine Learning Insights for Digital Leaders

Generative AI tools are transforming work, but they’re powered by core machine learning fundamentals. Learn 5 essentials every digital leader should know.
Reading Time: 14 minutes



Introduction

Generative AI tools are everywhere: summarising documents, drafting emails, writing code, and even generating images and video. For many leaders, they feel almost magical — type a prompt, get an impressive output. But beneath that smooth interface sit decades of machine learning research, from simple pattern recognition to trillion-parameter models trained on vast datasets.

If you’re responsible for AI strategy, product roadmaps, or digital transformation, you don’t need to become a data scientist. You do need a clear, non-hyped understanding of how generative AI tools actually work, what data they require, and where the real costs and risks lie.

In this article, we’ll unpack five machine learning fundamentals that sit behind modern generative AI tools and translate them into practical implications for your organisation.

The Three Machine Learning Strategies Behind Generative AI Tools

Most generative AI tools are built on three core machine learning strategies. They solve different types of problems and demand different kinds of data.

Supervised learning: when you know the answer

Supervised learning is like teaching by example. You feed the system historical cases where you already know the outcome:

  • Transactions labelled as fraudulent or legitimate

  • Customers tagged as churned or retained

  • Properties with known sale prices

The model learns patterns that link inputs (features) to these known outputs (labels). In practice, supervised learning usually takes two forms:

  • Classification – predicting categories

    • Will this customer churn? (Yes/No)

    • Should this ticket go to sales, support, or billing?

  • Regression – predicting numbers

    • What’s the likely sale price of this flat?

    • How many units will we sell next quarter?

Many familiar real-world systems are pure supervised learning:

  • Email spam filters learning from millions of messages users marked as spam or not spam

  • Medical image analysis systems trained on scans labelled by radiologists
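
The pattern behind these examples can be sketched in a few lines. The toy dataset, labels, and scoring rule below are purely illustrative — real spam filters use statistical models rather than raw word counts — but the shape is the same: learn from labelled history, then predict on new inputs.

```python
from collections import Counter

# Toy labelled dataset: (message, label) pairs -- the "known answers".
training_data = [
    ("win a free prize now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting moved to friday", "ham"),
    ("quarterly report attached", "ham"),
]

# "Training": count how often each word appears under each label.
word_counts = {"spam": Counter(), "ham": Counter()}
for message, label in training_data:
    word_counts[label].update(message.split())

def classify(message):
    """Predict the label whose training-set words best match the message."""
    scores = {
        label: sum(counts[word] for word in message.split())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

print(classify("free prize inside"))       # leans towards "spam"
print(classify("friday report attached"))  # leans towards "ham"
```

The point to notice is that the "intelligence" lives entirely in the labelled examples: change the labels, and the same code learns a different rule.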

Leadership takeaway: supervised learning is ideal when you have clean, labelled historical data and your future looks enough like your past. It will fail you if the world is shifting faster than your training data can keep up.

Unsupervised learning: discovering hidden patterns

Sometimes you don’t know what you’re looking for — you just suspect there are interesting patterns in the data. That’s where unsupervised learning comes in.

Instead of learning from labelled examples, unsupervised algorithms:

  • Find natural clusters of similar customers, behaviours, or products

  • Detect anomalies that don’t fit the usual pattern (e.g. potential fraud, system failures)

  • Reveal structures in high-dimensional data that aren’t obvious to the naked eye

Techniques like t-SNE and UMAP reduce complex data into simple 2D plots so you can literally see clusters and outliers. For example, a call centre might discover that customers who call exactly twice in their first month become their most loyal cohort — something no one thought to hypothesise beforehand.
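
One of the simplest clustering algorithms, k-means, shows the idea in miniature. The call-count data below is made up, and real systems work in many dimensions rather than one, but the loop is the same: group points by nearest centre, then move each centre to the middle of its group.

```python
# Monthly call counts for ten customers -- no labels, just raw behaviour.
data = [1, 2, 2, 3, 2, 11, 12, 10, 13, 11]

def kmeans_1d(points, steps=10):
    """Minimal 1-D k-means with two clusters: repeatedly assign each point
    to the nearest centre, then move each centre to the mean of its points."""
    # Start the two centres at the extremes (simple deterministic init).
    centres = [min(points), max(points)]
    for _ in range(steps):
        clusters = [[], []]
        for p in points:
            nearest = min(range(2), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)

print(kmeans_1d(data))  # two centres emerge: light callers vs heavy callers
```

No one told the algorithm what "light" or "heavy" means — the segments fall out of the data, and it's up to humans to decide whether they matter.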

Leadership takeaway: unsupervised learning is best for exploration, segmentation, and “unknown unknowns”. It generates insight, not ready-made decisions. Human judgement still decides what to do with the patterns it surfaces.

Reinforcement learning: optimisation through trial and error

Reinforcement learning is about learning by doing. An AI agent interacts with an environment, takes actions, receives feedback (rewards or penalties), and gradually learns which actions work best.

Typical examples include:

  • Optimising data centre cooling by trying different temperature and fan settings

  • Adjusting supply chain decisions in response to changing demand and constraints

  • Fine-tuning recommendation systems based on what users actually click or ignore

Crucially, reinforcement learning requires either:

  • A safe environment to experiment in (simulation, test environments), or

  • Strong safeguards in production, so bad actions can’t cause catastrophic harm

Modern generative AI tools also use a variant of this approach called reinforcement learning from human feedback (RLHF). People rate different responses (thumbs up/down, pairwise comparisons), and the model is nudged towards answers humans prefer.
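
The underlying trial-and-error loop can be sketched with a multi-armed bandit, the simplest reinforcement learning setting. The reward values and the epsilon-greedy policy below are illustrative: the agent never sees the true rewards, only noisy feedback from its own actions.

```python
import random

random.seed(42)

# Three possible actions with hidden average rewards (unknown to the agent).
true_rewards = [0.2, 0.5, 0.8]
estimates = [0.0, 0.0, 0.0]   # the agent's learned value for each action
counts = [0, 0, 0]

for _ in range(2000):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    if random.random() < 0.1:
        action = random.randrange(3)
    else:
        action = max(range(3), key=lambda a: estimates[a])
    # Noisy reward from the environment.
    reward = true_rewards[action] + random.gauss(0, 0.1)
    counts[action] += 1
    # Incremental average: nudge the estimate towards the observed reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

best = max(range(3), key=lambda a: estimates[a])
print(f"learned best action: {best}")  # typically converges on action 2
```

Note the safety implication: the agent learned by taking 2,000 real actions, including deliberately suboptimal ones — which is exactly why you need a simulator or strong guardrails before doing this in production.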

Leadership takeaway: reinforcement learning is powerful for continuous optimisation in dynamic environments – but only when you can experiment safely. It’s a poor fit if experimentation risks harming customers, revenue, or safety.

Deep Learning: Why Scale Changed the Game

All three strategies above can be implemented with different types of algorithms. The ones powering modern generative AI tools are usually deep neural networks – layers of mathematical functions loosely inspired by the brain.

For something simple like handwritten digit recognition (the classic MNIST dataset), a small network with a few thousand parameters is enough to reach 95–98% accuracy. But as problems become more complex – understanding natural language, writing code, reasoning across documents – scale becomes critical.
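
Parameter counts come straight from the architecture. For a fully connected network, each layer contributes (inputs × outputs) weights plus one bias per output; the layer sizes below are illustrative, sketching an MNIST-style network.

```python
def count_parameters(layer_sizes):
    """Each dense layer has (inputs x outputs) weights plus one bias
    per output unit."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# 784 input pixels, one hidden layer of 128 units, 10 output digits.
print(count_parameters([784, 128, 10]))  # 101,770 parameters
```

Scale that same arithmetic up to thousands of layers and tens of thousands of units per layer, and you arrive at the billions of parameters in modern LLMs.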

Over the last decade, we’ve seen:

  • Early neural networks with thousands of parameters

  • Large language models (LLMs) with billions of parameters

  • Frontier models with hundreds of billions of parameters or more

Each parameter is like a tiny dial that gets adjusted during training. More parameters mean more capacity to model subtle patterns in data – but also:

  • More compute required to train and run the model

  • Higher latency and energy costs

  • Bigger carbon footprint and infrastructure demands

The result is a genuine strategic trade-off:

  • Use smaller models when you need speed, low cost, and “good enough” answers for routine tasks

  • Use larger models when you need nuanced reasoning, handling of edge cases, or complex multi-step tasks

Leadership takeaway: “Bigger” isn’t always better. Choosing the right model size is a business decision, not a purely technical one.

Tokens, Parameters and Why Generative AI Tools Cost Money

When you interact with generative AI tools, you don’t pay by “question” or “document”. Under the hood, everything is measured in tokens.

A token is a small chunk of data:

  • A common word might be one token

  • A longer or technical term might be split into several tokens

  • Models have a maximum token window – an upper limit on how much they can “see” at once
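
A rough sense of token counts helps with budgeting. A commonly cited rule of thumb for English text is about four characters per token; real tokenisers vary by model and language, so treat this as a ballpark only.

```python
def estimate_tokens(text):
    """Rough rule of thumb for English text: ~4 characters per token.
    Real tokenisers (BPE-based) vary, so this is a ballpark estimate."""
    return max(1, len(text) // 4)

prompt = "Summarise the attached quarterly report in three bullet points."
print(estimate_tokens(prompt))
```

For anything cost-sensitive, use the provider's own tokeniser to count precisely rather than this heuristic.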

API pricing from most providers is based on:

  • Input tokens – what you send in (your prompt, context documents, system instructions)

  • Output tokens – what the model sends back (the generated answer)

Larger models:

  • Use more compute per token

  • Are priced higher per million tokens

  • Can handle more complex reasoning and longer contexts

Smaller models:

  • Are dramatically cheaper

  • Often perfectly adequate for classification, extraction, simple drafting and routing tasks
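
Put together, per-request unit economics are simple arithmetic. The prices below are hypothetical — not any provider's actual price list — but the structure matches how most APIs bill.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m):
    """Cost of one API call given per-million-token prices (USD)."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical prices for a large model vs a small one.
large = request_cost(3_000, 800, 10.0, 30.0)
small = request_cost(3_000, 800, 0.25, 1.25)
print(f"large model: ${large:.4f} per request")
print(f"small model: ${small:.4f} per request")
print(f"at 100k requests/month: ${large * 100_000:,.0f} vs ${small * 100_000:,.0f}")
```

The per-request difference looks like pennies; multiplied across organisation-wide adoption, it becomes a budget line worth designing for.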

Leadership takeaway: once you understand tokens and model size, you can have an informed conversation about unit economics:

  • Which use cases justify a large, expensive model?

  • Where can you standardise on a smaller, cheaper model without losing quality?

  • How will usage scale as adoption grows across teams?

Pre-Training, Fine-Tuning and Bespoke Generative AI Tools

Very few organisations train their own models from scratch. Instead, they build on pre-trained models from providers like OpenAI, Google, Anthropic, Meta, and others.

You can think of this in two stages:

Pre-training: the generalist education

In pre-training, a model ingests huge volumes of data – internet text, code, books, documentation – and learns broad patterns:

  • How language works

  • How code is structured

  • How facts and concepts relate

This is expensive, time-consuming, and mostly the domain of big labs and cloud providers.

Fine-tuning: the specialist training

Fine-tuning adapts a pre-trained model to your domain:

  • A law firm fine-tunes on contracts and case law

  • A bank fine-tunes on product documentation and policies

  • A retailer fine-tunes on product data, tone of voice, and customer support logs

You can also avoid fine-tuning altogether and use techniques like retrieval-augmented generation (RAG), where the base model stays frozen but looks up relevant documents from your own knowledge base at query time.
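
A minimal RAG sketch makes the flow concrete. The documents and the word-overlap scoring below are toy stand-ins for a real embedding-based vector search, but the pipeline is the same: retrieve relevant content first, then hand it to the frozen base model as context.

```python
# Toy internal knowledge base -- in practice, chunks of real documents.
documents = {
    "returns-policy": "customers may return items within 30 days for a refund",
    "shipping-info": "standard shipping takes three to five working days",
}

def retrieve(question):
    """Pick the document sharing the most words with the question
    (a crude stand-in for embedding similarity search)."""
    q_words = set(question.lower().split())
    return max(documents,
               key=lambda d: len(q_words & set(documents[d].split())))

def build_prompt(question):
    """Assemble the context-plus-question prompt sent to the base model."""
    doc_id = retrieve(question)
    return f"Context: {documents[doc_id]}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("how long does shipping take"))
```

Because the model itself never changes, updating the system is just updating the documents — which is why RAG is often easier to govern than fine-tuning.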

Leadership takeaway: the choice between base models, fine-tuned models, and retrieval-based systems is strategic:

  • Fine-tuning can improve performance but requires careful data curation and governance

  • Retrieval-based approaches are often easier to control and update

  • In both cases, your data quality and governance become the real differentiator

From Prediction to Creation: What Makes It “Generative”?

At heart, generative AI tools are still doing prediction – but applied in a clever, sequential way.

For text models:

  1. The model reads your prompt and context as tokens

  2. It predicts the most likely next token

  3. It appends that token to the sequence

  4. It repeats the process, one token at a time

This is called autoregressive generation. It’s the same underlying mechanism whether the model is:

  • Drafting an email

  • Translating a paragraph

  • Writing code

  • Summarising a fifty-page report

Importantly, the process is stochastic, not strictly deterministic. Instead of always picking the single most probable next token, the model:

  • Samples from among the top likely options

  • Uses a temperature parameter (set by developers) to control how adventurous it is

That controlled randomness prevents outputs from becoming dull and repetitive, and it’s why you can ask the same question twice and get slightly different answers.
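
The loop plus temperature sampling can be sketched with a toy bigram "model" — the hand-written scores below stand in for what a real neural network computes at every step.

```python
import math
import random

random.seed(1)

# Toy "language model": for each token, plausible next tokens and their
# raw scores (logits). A real model computes these with a neural network.
bigram_logits = {
    "the": {"cat": 2.0, "dog": 1.5, "report": 0.5},
    "cat": {"sat": 2.5, "ran": 1.0},
    "dog": {"ran": 2.0, "sat": 1.0},
}

def sample_next(token, temperature=1.0):
    """Softmax over logits, then sample one token.
    Lower temperature -> safer, more predictable choices."""
    logits = bigram_logits[token]
    weights = {t: math.exp(score / temperature) for t, score in logits.items()}
    r = random.random() * sum(weights.values())
    for t, w in weights.items():
        r -= w
        if r <= 0:
            return t
    return t  # floating-point edge case: fall back to the last option

def generate(start, length=3, temperature=1.0):
    """Autoregressive loop: append one sampled token at a time."""
    out = [start]
    for _ in range(length):
        if out[-1] not in bigram_logits:
            break
        out.append(sample_next(out[-1], temperature))
    return " ".join(out)

print(generate("the", temperature=0.2))  # low temperature: near-greedy
print(generate("the", temperature=1.5))  # high temperature: more varied
```

Run it several times and the high-temperature version wanders more — the same controlled randomness you see when a chatbot answers the same question two different ways.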

Leadership takeaway: generative AI isn’t magic. It’s pattern prediction plus scale, applied sequentially. Understanding that removes some of the mystique and helps you think more clearly about where it will and won’t work.

Conclusion: Turning Machine Learning Fundamentals into Better AI Decisions

Generative AI tools are the visible tip of a much deeper machine learning iceberg. Behind the chat interfaces and demos lie:

  1. Three learning strategies – supervised, unsupervised, and reinforcement learning

  2. Deep learning architectures that scaled from thousands to billions of parameters

  3. Tokens and parameters that drive both capability and cost

  4. Pre-training and fine-tuning that transform general-purpose models into domain experts

  5. Autoregressive prediction that turns pattern-matching into new content

As a digital or product leader, understanding these fundamentals changes the questions you ask:

  • Do we have the right data for this kind of learning?

  • Does this use case need a large, expensive model or a smaller, efficient one?

  • Is this a prediction, exploration, or optimisation problem?

  • Should we rely on fine-tuning, retrieval, or both?

The technology will keep evolving, but these foundations will remain. Get them right, and you’ll be able to evaluate generative AI tools with clear eyes – and deploy them where they create real, defensible value.

FAQs: Machine Learning and Generative AI Tools

1. Do I need to understand neural network maths to use generative AI tools effectively?

No. You don’t need to derive backpropagation on a whiteboard. What you do need is a conceptual grasp of how supervised, unsupervised, and reinforcement learning differ, what tokens and parameters are, and how model size affects cost and performance. That’s enough to make sound strategic decisions and challenge vendors intelligently.

2. Are bigger generative AI models always better?

Not necessarily. Larger models can handle more complex reasoning and edge cases, but they’re slower and more expensive. Many day-to-day tasks – classification, extraction, templated drafting – run perfectly well on smaller, cheaper models. A good AI strategy deliberately matches model size to business value.

3. What kind of data do we need to get value from machine learning and generative AI?

You’ll get the most value when you have:

  • Clean, representative historical data for supervised learning tasks

  • Rich behavioural or operational data for unsupervised exploration

  • Safe environments or simulations for reinforcement learning
For many generative AI applications, well-structured internal documents, knowledge bases, and logs are more valuable than exotic data sources.

4. Is fine-tuning always better than retrieval-augmented generation (RAG)?

No. Fine-tuning shines when you have lots of high-quality, domain-specific examples and stable requirements. RAG is often better when your content changes frequently, you need transparency over sources, or you want to avoid maintaining multiple fine-tuned variants. In practice, many mature organisations use a mix of both.

5. How should we choose our first generative AI use cases?

Start from problems, not from the technology. Identify where:

  • Knowledge work is repetitive and pattern-based

  • Teams are overwhelmed by information and would benefit from summarisation or search

  • Customers would value faster, more tailored responses
Then map those problems to the machine learning strategies above, assess your data readiness, and run small, tightly scoped experiments before scaling.
