AI SEO

Search optimization for the age of language models.

DEJAN AI is the most advanced Australian AI SEO agency with global recognition for industry-defining innovations in AI search visibility. Our approach features a sophisticated multi-step process grounded in state-of-the-art machine learning and real data science.

20+ years of SEO expertise

1,000+ campaigns since 2008

230K+ AI rank datapoints analyzed

100% of Gemini queries need grounding

Understanding & Control

Understanding

Our AI SEO discovery process draws on methods from mechanistic interpretability, an emerging field of machine learning that aims to understand the inner workings of deep learning models. We start with systematic model probing to surface how models perceive brands and entities.

Control

LLM and agent control is the ultimate goal of AI SEO. In machine learning this process is called model steering. We use the knowledge gained in the model probing stage to form an AI SEO strategy that addresses weaknesses in AI’s perception of our client’s products, services and brands.

What is AI SEO?

AI SEO is search engine optimization adapted for a world where search results are generated, not listed. Traditional SEO focuses on ranking web pages in a list of blue links. AI SEO focuses on being selected, cited, and accurately represented when language models generate answers.

When a user asks Google, ChatGPT, Perplexity, or any AI assistant a question, the model works through four stages:

  1. Interprets the query intent
  2. Retrieves relevant source material (grounding)
  3. Synthesizes an answer from multiple sources
  4. Selects which sources to cite

AI SEO optimizes for each stage of this process. The goal is not just visibility—it is accurate brand representation in AI-generated responses.

Why traditional SEO is no longer sufficient

The search paradigm has shifted. In 2013, DEJAN founder Dan Petrovic published “Conversations With Google,” predicting that search would evolve from query-response to conversational dialogue. That prediction has been realized.

The old model

User types query → Google returns ranked list → User clicks link

The new model

User asks question → AI retrieves and synthesizes sources → AI generates answer with optional citations

Brands that only optimize for traditional rankings may be invisible in AI-generated answers, or worse—misrepresented by models drawing on outdated or inaccurate sources.

The DEJAN AI SEO methodology

Our methodology is built on direct experimentation with language models. DEJAN founder Dan Petrovic trained a language model from scratch—not fine-tuning an existing model, but building from raw noise with a custom tokenizer and masked language modeling. This foundational work informs every aspect of our approach.

PHASE 1

Brand Knowledge Analysis

We begin by understanding what language models currently believe about your brand. This is not speculation—it is measurable. Token Probability Analysis examines how models complete sentences about your brand, and we analyze these probabilities to determine:

  • How strongly your brand is associated with key entities
  • Where associations are weak (high entropy) or strong (low entropy)
  • What the model “wants” to say about your brand versus what you want it to say

Tree Walker maps the branching paths of possible completions, revealing where models are confident about your brand and where they are uncertain or incorrect. Brand Relevance Scoring provides an exact probability score for the question: “Is this brand relevant for this entity?”
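
To make the entropy idea concrete, here is a minimal sketch of how uncertainty over brand completions could be scored. It is not the Tree Walker or Token Probability Analysis implementation; the brand, the prompt, and the probabilities are hypothetical, and it assumes you already have next-token (or next-phrase) probabilities from a model.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a completion probability distribution."""
    probs = list(probs)
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Normalise, since a top-k slice of probabilities rarely sums to exactly 1.
    normalised = [p / total for p in probs if p > 0]
    return -sum(p * math.log2(p) for p in normalised)

# Hypothetical completions for the prompt "Acme is best known for ..."
confident = {"running shoes": 0.90, "sandals": 0.05, "apparel": 0.05}
uncertain = {"running shoes": 0.25, "sandals": 0.25, "apparel": 0.25, "watches": 0.25}

# Low entropy suggests a strong association; high entropy suggests a weak one.
print("confident:", round(entropy_bits(confident.values()), 3), "bits")
print("uncertain:", round(entropy_bits(uncertain.values()), 3), "bits")
```

In this toy case the evenly split distribution scores exactly 2 bits, the maximum for four options, while the distribution dominated by one completion scores far lower.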

PHASE 2

Entity and Association Mapping

Language models understand the world through entities and their relationships. We map:

  • Core entities: Your brand, products, services, and key people
  • Associated entities: Topics, categories, competitors, and related concepts
  • Entity gaps: Associations that should exist but are weak or missing
  • Negative associations: Incorrect or undesirable entity relationships

This mapping uses our Query Fan-Out Model, available on Hugging Face, which generates expanded query variations to probe the full scope of model associations.

PHASE 3

Citation Mining

When AI systems generate grounded answers, they retrieve and cite sources. Citation Mining is our process for discovering which sources models actually select. Our Citation Mining Tool produces actionable data:

  • Top domains and exact URLs appearing in citations for your topic space
  • Confidence scores indicating citation strength
  • Position data showing where in source content the grounding occurs

Critically, we can also see what the model retrieved but chose not to cite. This reveals Selection Rate Optimization opportunities—content that is being seen but not selected.

PHASE 4

Grounding Prediction

Not every query triggers search grounding. Asking “what is 2+2” will not cause the model to search. Asking “what are the best project management tools in 2026” will. Our Query Deserves Grounding models predict whether a query will trigger grounding behavior in Google and OpenAI systems—preventing wasted optimization effort on queries that will never retrieve external sources.

PHASE 5

Optimization Execution

With complete diagnostic data, we execute targeted optimization across three fronts:

  • On-page: content restructuring for entity clarity, semantic completeness, and chunk optimization for retrieval (using our Chunk Norris tool)
  • Off-page: citation opportunity targeting, entity association building on high-authority sources, and selection rate improvement
  • Link: LinkBERT predicts natural linking opportunities, while Penguin ensures link patterns avoid penalty triggers
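
As an illustration of what chunk optimization for retrieval can involve, the sketch below groups whole sentences into chunks under a word budget so that no chunk cuts a sentence in half. It is an assumption-laden example, not how Chunk Norris works; the function name, the word budget, and the sentence-splitting rule are all hypothetical.

```python
import re

def chunk_by_sentences(text, max_words=40):
    """Group whole sentences into chunks of at most max_words words each.

    A single sentence longer than max_words is kept intact as its own chunk.
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks

sample = "Acme sells running shoes. Its flagship model is the Sprint. Returns are free for 30 days."
for chunk in chunk_by_sentences(sample, max_words=10):
    print(repr(chunk))
```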

PHASE 6

AI Visibility Tracking

Traditional rank tracking measures position in a list. AI visibility tracking measures presence in generated answers. AI Rank tracks your brand’s visibility across AI systems over time, and AI Flux measures volatility in AI search results. We track citation frequency, brand mention accuracy and sentiment, entity association strength, and competitive share of AI visibility.

DEJAN AI visibility optimization cycle

Proprietary tools and models

DEJAN has developed an extensive suite of machine learning tools for AI SEO. These are not wrappers around third-party APIs—they are purpose-built systems based on original research.

Analysis & Diagnostic Tools

Tool | Function
Tree Walker | Maps token probability distributions to find high and low entropy completions in model output
Brand Relevance Tool | Calculates exact probability scores for brand-entity relevance
Citation Mining Tool | Harvests citations from AI responses with confidence scoring and source attribution
Query Deserves Grounding (Google) | Predicts whether queries will trigger search grounding in Google AI
Query Deserves Grounding (OpenAI) | Predicts whether queries will trigger search grounding in OpenAI systems
Gemini Token Probabilities | Analyzes token-level probability data from Google’s Gemini models
Brand AI Sentiment | Measures sentiment polarity in AI-generated brand mentions

Query & Content Tools

Tool | Function
Query Fan-Out Generator | Expands seed queries into comprehensive query sets for testing
Query Fan-Out Model (Hugging Face) | Open model for generating query variations at scale
Universal Query Classification | Classifies query intent using machine learning
Oxy (Query Gap) | Identifies synthetic queries and gaps from Search Console data
Content Substance Classifier | Evaluates content depth and substance
AI Content Detection | Identifies AI-generated content

Link & Entity Tools

Tool | Function
LinkBERT | Predicts natural internal linking opportunities within text
Penguin | Link optimization tool for penalty avoidance
Google Entity Search | Explores Google’s entity understanding
Chunk Norris | Optimizes content chunking for retrieval systems

Tracking & Monitoring Tools

Tool | Function
AI Rank | Tracks brand visibility in AI-generated responses
AI Flux | Measures volatility in AI search results
Algoroo | Monitors traditional search volatility
Text Sentiment | General sentiment analysis

Machine Learning Infrastructure

Our workflow incorporates Facebook Prophet for time series forecasting, Logistic Regression for classification tasks, the mixedbread-ai embedding model for semantic similarity, and FAISS for efficient vector search.
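
The semantic similarity step can be pictured with a small stand-in. The sketch below ranks pages by cosine similarity to a query vector in pure Python; in a real pipeline the vectors would come from an embedding model and the search would be handled by a vector index such as FAISS. The three-dimensional vectors, page names, and function names here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """Return the k (score, name) pairs most similar to query_vec."""
    scored = [(cosine(query_vec, vec), name) for name, vec in index.items()]
    return sorted(scored, reverse=True)[:k]

# Toy embeddings; real ones have hundreds of dimensions.
index = {
    "pricing page": [0.9, 0.1, 0.0],
    "integration guide": [0.1, 0.9, 0.2],
    "company history": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.0]

for score, name in top_k(query, index):
    print(f"{name}: {score:.3f}")
```

A brute-force scan like this is fine for a handful of pages; a dedicated vector index becomes worthwhile once the collection is too large to compare against exhaustively.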

Selection Rate Optimization

When AI systems ground their responses, they often retrieve more sources than they cite. Selection Rate Optimization improves the likelihood that your content, once retrieved, is actually cited. We measure it as (times cited) / (times retrieved) × 100. Improving selection rate is often more efficient than pursuing new citation opportunities—you are optimizing content the model already knows about.
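
The selection rate formula above is simple enough to express directly. The sketch below applies it to a hypothetical log of retrieval and citation counts per URL and lists the pages that are seen most often but chosen least; the URLs, counts, and function name are illustrative assumptions.

```python
def selection_rate(times_cited, times_retrieved):
    """Selection rate as a percentage: (times cited) / (times retrieved) x 100.

    Returns None when the content was never retrieved, since the rate is undefined.
    """
    if times_cited < 0 or times_retrieved < 0:
        raise ValueError("counts must be non-negative")
    if times_cited > times_retrieved:
        raise ValueError("content cannot be cited more often than it is retrieved")
    if times_retrieved == 0:
        return None
    return times_cited / times_retrieved * 100

# Hypothetical counts: url -> (times cited, times retrieved)
log = {
    "/guides/project-management": (3, 12),
    "/pricing": (9, 10),
    "/blog/announcement": (0, 8),
}

rates = {url: selection_rate(cited, retrieved) for url, (cited, retrieved) in log.items()}

# Lowest selection rate first: retrieved often, cited rarely.
for url, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{url}: {rate:.1f}%")
```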

What industry leaders say

“Dan Petrovic made a super write up around Chrome’s latest embedding model with all the juicy details on his blog. Great read.”

Jason Mayes
Web AI Lead, Google

“We were given our very own bespoke internal link recommendation engine that leverages world-class language models and data science. It’s one thing to theorize about the potential of machine learning in SEO, but it’s entirely another to witness it first-hand. It changed my perspective on what’s possible in enterprise SEO.”

Scott Schulfer
Senior SEO Manager, Zendesk

“Dan was so crucial and critical to the leaked document blog post that I wrote, and that’s had such big impacts on our company. So Dan, I really thank you for that.”

Mike King
iPullRank

“There’s a man named Dan Petrovic who does a lot of testing, and he has pulled in data specifically from Gemini that shows Google’s AI Overviews and AI Mode are really looking at a 160-character block of text to find the answer to that question.”

Lily Ray
Amsive Digital

“Dan Petrovic built an entire vector model that maps out all the concepts on a website… That’s the kind of AI innovation I’m most excited about—not AI replacing our jobs, but AI making our jobs easier.”

Gianluca Fiorelli
ILoveSEO

“Dan Petrovic is putting out some of the best, most advanced, most well-researched content in the SEO field right now.”

Moz
Industry recognition

Who needs AI SEO

Brands competing in informational queries

When users ask AI assistants for recommendations, comparisons, or explanations, will your brand be mentioned? If competitors are cited and you are not, you lose visibility in a channel that is rapidly growing.

Companies in complex or technical industries

Language models struggle with nuance. If your industry involves technical distinctions, regulatory specifics, or complex product differentiation, you need to ensure models represent you accurately.

Businesses affected by AI-generated misinformation

Models can perpetuate outdated information, competitor narratives, or simply incorrect claims. AI SEO includes identifying and correcting these misrepresentations.

Organizations seeking to own entity definitions

If you created a methodology, coined a term, or developed a unique process, AI SEO ensures models attribute these to you rather than generic descriptions or competitors.

Why DEJAN

  • Practitioner-built methodology: Our approach is not derived from guesswork about how AI works. Dan Petrovic trained a language model from scratch to understand the mechanics firsthand.
  • Decade of foresight: Dan’s 2013 article “Conversations With Google” predicted the conversational search paradigm now realized. This is not trend-chasing—it is long-term strategic development.
  • Proprietary tooling: The tools above are built and maintained by DEJAN. They are not repackaged third-party services.
  • Open research contribution: We publish models on Hugging Face and provide free tools for the SEO community. This reflects genuine expertise, not gatekeeping.
  • AI is SEO: We do not treat this as a separate discipline or use invented acronyms. Search optimization now requires understanding language models, and that understanding is core to what we do.

AI SEO glossary

Grounding

The process by which a language model retrieves external information to support its response. A “grounded” response cites real sources rather than generating from training data alone.

Token

The basic unit of text that language models process. Tokens may be words, parts of words, or punctuation. Token probabilities indicate how likely the model considers each possible next token.

Entropy

A measure of uncertainty in model output. High entropy means the model is uncertain between many possible completions. Low entropy means the model is confident about what comes next.

Citation Mining

The process of systematically querying AI systems and extracting which sources they cite, enabling analysis of citation patterns and opportunities.

Selection Rate

The percentage of times content is cited when it is retrieved by an AI system. Low selection rate indicates content is seen but not chosen.

Query Fan-Out

Expanding a single query into many variations to comprehensively test model behavior across phrasings and intents.

Entity Association

The strength of connection between two entities in a model’s understanding. Strong associations cause consistent co-occurrence in model outputs.

Query Deserves Grounding

A prediction of whether a given query will cause an AI system to retrieve external sources or answer from training data alone.

Masked Language Modeling

A training technique where the model learns to predict hidden portions of text. Used in training models like BERT.

Double Descent

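The phenomenon where test error, after rising as a model approaches the point of fitting its training data exactly, falls again as the model grows larger still. Contrary to earlier assumptions, very large models can therefore continue improving rather than overfitting, an observation that supported the scaling of modern large language models.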

Ready to be the brand AI recommends?

Book a conference call with our senior strategy team to discuss your project in detail.