
Inside LLMs - Foundations and the novel approach of DeepSeek

Jun 23, 2025

This is the first article in a series examining the rise of new large language model (LLM) vendors and the evolving risks that come with them.

For security teams, understanding how these models are built is foundational. Before assessing risks—whether they relate to model misuse, data exposure, or adversarial manipulation—it's essential to grasp the principles that shape an LLM's capabilities and constraints. Knowing how the underlying systems work enables more informed threat modeling, detection strategy, and governance.

Our first step: understand how these models are built, and why DeepSeek might be redefining the economics of LLM development.

Key Takeaways:

  • LLMs are built through a well-defined multi-stage pipeline, including pretraining, instruction tuning, and preference tuning.

  • DeepSeek innovates at multiple layers—particularly in reducing cost, increasing alignment efficiency, and prioritizing reasoning.

  • DeepSeek has open-sourced not just its models but also its training methodology—a strong signal of transparency that sets it apart from many competitors.

  • Hosted LLM services—whether from DeepSeek (online), xAI (Grok), or others—rely on hidden pre-prompts and filtering layers that raise real data governance concerns (illustrated in the sketch after this list). These platforms may be influenced by geopolitical forces or individual personalities. In DeepSeek's case, operations are subject to Chinese state regulations. In xAI's case, alignment and moderation decisions may be shaped by the personal views of Elon Musk.
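
To make the pre-prompt point concrete, here is a purely illustrative sketch of server-side prompt assembly. The system text and message structure are invented for illustration; real providers' pre-prompts are proprietary and invisible to the caller.

```python
# Invented example of server-side prompt assembly; no real provider's
# pre-prompt or API is reproduced here.
hidden_preprompt = {
    "role": "system",
    "content": "You are a helpful assistant. Refuse to discuss <restricted topics>.",
}

user_message = {"role": "user", "content": "Summarize this vendor contract."}

# The provider sends both messages to the model, but the caller only
# ever observes user_message and the final answer: the filtering and
# governance layer is invisible from outside.
messages = [hidden_preprompt, user_message]
```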




How Large Language Models Work: A Layered Process

Modern LLMs follow a structured, three-stage development pipeline:

1. Pretraining

A massive neural network is trained on a vast corpus of text to predict the next word in a sentence. This forms a probabilistic language model—the core capability of the LLM.
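To make the objective concrete, below is a minimal sketch of one next-token-prediction training step in PyTorch. The tiny embedding-plus-linear model is a stand-in for a full transformer stack, and every size here is illustrative, not any vendor's actual configuration.

```python
import torch
import torch.nn.functional as F

vocab_size, d_model = 50_000, 512

# Stand-in for a transformer: in a real LLM, attention layers sit
# between the embedding and the output projection.
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, d_model),
    torch.nn.Linear(d_model, vocab_size),
)

tokens = torch.randint(0, vocab_size, (8, 128))  # a batch of token ids

logits = model(tokens[:, :-1])                   # predict position t+1 from t
loss = F.cross_entropy(
    logits.reshape(-1, vocab_size),              # (batch * seq, vocab)
    tokens[:, 1:].reshape(-1),                   # targets: sequence shifted by one
)
loss.backward()                                  # gradients for an optimizer step
```

Pretraining is essentially this step repeated over trillions of tokens, which is where the bulk of an LLM's training cost sits.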


2. Instruction Fine-Tuning

The pretrained model is further trained on structured question-answer or task-specific pairs (e.g., "Explain this algorithm"). This teaches the model to follow human instructions.

Open-source models like LLaMA or Mistral are often released at this stage, typically published with the “Instruct” suffix:

https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503
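As an illustration, the snippet below shows how one instruction pair is serialized with the Hugging Face transformers library before fine-tuning. The model id matches the link above; note that downloading it may require accepting the vendor's terms, and the exact chat template is model-specific.

```python
from transformers import AutoTokenizer

# Tokenizer of the instruct model linked above; access terms and
# template details are set by the vendor.
tok = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503"
)

# One instruction-tuning example: a task and the reference answer.
example = [
    {"role": "user", "content": "Explain this algorithm: binary search."},
    {"role": "assistant", "content": "Binary search repeatedly halves a sorted range..."},
]

# apply_chat_template wraps the pair in the model's instruction markers
# (Mistral-family models use [INST] ... [/INST] tags).
print(tok.apply_chat_template(example, tokenize=False))
```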


3. Preference Fine-Tuning

This stage is where the model starts to feel "smart."

Human annotators rank outputs, and a reward model is trained on these rankings. It is then used to align the LLM's responses with human preferences. This was the breakthrough that made ChatGPT feel dramatically more reliable than its predecessors.
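A common way to train that reward model is a pairwise, Bradley-Terry-style objective: the score of the response the annotator preferred is pushed above the score of the rejected one. The sketch below uses placeholder embeddings and a linear scorer; real reward models are full transformers with a scalar head.

```python
import torch
import torch.nn.functional as F

# Placeholder scorer: maps a response representation to a scalar reward.
# In practice this is a fine-tuned LLM with a scalar output head.
reward_model = torch.nn.Linear(768, 1)

def pairwise_loss(chosen, rejected):
    # Bradley-Terry objective: maximize P(chosen preferred over rejected)
    # = sigmoid(r_chosen - r_rejected).
    margin = reward_model(chosen) - reward_model(rejected)
    return -F.logsigmoid(margin).mean()

# Toy batch of 4 annotated pairs (chosen vs. rejected response embeddings).
loss = pairwise_loss(torch.randn(4, 768), torch.randn(4, 768))
loss.backward()
```

The trained reward model is then used to steer the LLM itself, classically via reinforcement learning from human feedback (RLHF), so that high-reward styles of response become more likely.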

Without this phase, models are far more likely to go off the rails, as we saw with Meta's Galactica, which launched just two weeks before ChatGPT and was quickly pulled after trolls exploited it to generate misinformation.

DeepSeek follows this same general architecture—but with some key twists.


January 2025 - A DeepSeek Moment

Read the rest of the article directly on ThreatLink