
LLM Gateway Compliance: Kosmoy’s SLM Enforces the EU AI Act in Every Chat Turn

  • Writer: Umberto Malesci
  • Sep 2
  • 4 min read

Europe’s AI Act has shifted from legal theory to binding law. As of February 2025, entire categories of AI applications are outright banned in the European Union. Others are classified as “high-risk” and will soon carry stringent obligations.


This creates a dilemma for enterprises: most compliance today happens at the project level. A use case is reviewed, approved, and filed away. But what happens when a chatbot, approved for harmless customer queries, is suddenly asked to screen job applicants or analyze facial images? On paper, the use case was compliant. In practice, the live system just broke the law.


Kosmoy’s answer is a small language model (SLM) designed specifically to enforce the AI Act in real time. It doesn’t generate text; it acts as a guardrail, blocking requests that drift into prohibited or high-risk territory.


What the AI Act actually forbids


The EU AI Act’s Article 5 bans several practices outright. Companies cannot deploy AI systems that:


  • Manipulate users in ways that materially distort their decisions and cause harm.

  • Exploit vulnerable groups such as children, the elderly, or people in difficult circumstances.

  • Score or rank people — the so-called “social scoring” seen in some parts of the world.

  • Predict criminal behavior based only on profiling or personality traits.

  • Scrape facial images from the internet or CCTV to build recognition databases.

  • Categorize people by sensitive attributes inferred from biometric data (e.g. race, sexual orientation, political views).

  • Use emotion recognition in workplaces or schools.

  • Deploy real-time remote biometric identification in public spaces, outside of very narrow exceptions.


These prohibitions are already enforceable today.


Beyond this, the European AI Act defines high-risk applications (Annex III). These aren’t illegal, but they carry extra duties: registration, risk management, human oversight, and logging. High-risk categories include:


  • Biometric systems (outside the banned list).

  • Education (e.g. grading and admissions).

  • Employment and HR (e.g. recruitment, promotion).

  • Access to essential services (e.g. credit, insurance).

  • Critical infrastructure, law enforcement, migration/asylum, and justice systems.


If your chatbot touches one of these areas, regulators expect far more than a policy PDF.


The compliance gap: one-time validation vs runtime monitoring


Most enterprises today focus on validating the use case. Before launch, a project is reviewed: What’s the purpose? Who will use it? What model is connected? Does it fall under banned or high-risk categories? If the answers check out, the project gets a green light.


That’s necessary — but not sufficient.


Why? Because compliance risk emerges during live interaction. A chatbot that starts as an FAQ assistant can be nudged into dangerous territory with just a few prompts:


  • An HR bot asked to “filter candidates by their emotional tone” — effectively running emotion recognition at work.

  • A sales bot asked to “rank prospects by likelihood to buy, based on their social media” — a drift toward social scoring.

  • A banking bot asked to “analyze whether this customer deserves credit, using their online history” — a high-risk creditworthiness assessment.


None of these scenarios were written into the initial project plan. But they happen at runtime. And under the AI Act, what matters is not just the design of the use case, but the behavior of the system in practice.


Kosmoy’s guardrail SLM: small, fast, purpose-built


Kosmoy has built a fine-tuned small language model — under one billion parameters — that sits inside the company’s LLM Gateway. Its job is simple: inspect each request and block those that would cross into prohibited or high-risk use.


Why an SLM and not another giant LLM?


  • Speed: enterprises can’t afford 500 ms of extra latency per call. A small model can decide in milliseconds.

  • Cost: guardrails should be cheap enough to run everywhere, on-prem or in private clouds.

  • Focus: a compliance model doesn’t need creativity; it needs consistency.



When the guardrail SLM detects a problematic request, it doesn’t argue. It blocks. That binary decision is what compliance teams need — a hard stop before the system violates the law.
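To make that flow concrete, here is a minimal sketch of what a gateway-side guardrail check could look like. Everything in it (the Category and Verdict classes, classify_request, handle_chat_turn, and the keyword stub standing in for the SLM inference call) is an illustrative assumption, not Kosmoy’s actual API; the point is the shape of the decision — a fast classify-then-block step that runs before any request reaches the backing LLM.

```python
# Illustrative sketch of a gateway-side guardrail check.
# All names are assumptions for this article, not Kosmoy's API.
import logging
from dataclasses import dataclass
from enum import Enum

logger = logging.getLogger("gateway.guardrail")


class Category(Enum):
    ALLOWED = "allowed"
    HIGH_RISK = "high_risk"    # Annex III: extra duties (logging, oversight)
    PROHIBITED = "prohibited"  # Article 5: hard stop


@dataclass
class Verdict:
    category: Category
    reason: str


def classify_request(text: str) -> Verdict:
    """Stand-in for the SLM's millisecond-scale classification call."""
    lowered = text.lower()
    if "emotional tone" in lowered or "social scoring" in lowered:
        return Verdict(Category.PROHIBITED, "Article 5 prohibited practice")
    if "creditworthiness" in lowered or "rank candidates" in lowered:
        return Verdict(Category.HIGH_RISK, "Annex III high-risk use")
    return Verdict(Category.ALLOWED, "no policy match")


def handle_chat_turn(prompt: str, forward_to_llm) -> str:
    verdict = classify_request(prompt)
    if verdict.category is Category.PROHIBITED:
        # Hard stop: the request never reaches the backing LLM.
        logger.warning("blocked: %s", verdict.reason)
        return "This request was blocked by the compliance guardrail."
    if verdict.category is Category.HIGH_RISK:
        # Not illegal, but logged so the compliance team can review it.
        logger.info("high-risk request flagged: %s", verdict.reason)
    return forward_to_llm(prompt)
```

The binary structure is deliberate: the guardrail returns a verdict, not a negotiation, which is exactly the hard stop described above.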


Continuous compliance, not checkbox compliance


Kosmoy’s philosophy is that compliance is not a one-off approval. It’s an ongoing process, embedded in every interaction. By placing the SLM in the runtime path, the system continuously enforces rules that map to the AI Act.


  • At the input stage, it inspects what the user is asking. If the request looks like it would create a prohibited use case, it’s stopped immediately.

  • At the output stage, it checks whether the system’s response drifts into territory that should not be delivered.

  • Across the session, it tracks context — not just single prompts — to catch violations that emerge over multiple turns (the sketch below shows how these three checkpoints fit together).
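The sketch below wires those three checkpoints into a single chat turn, reusing the illustrative Category and classify_request names from the earlier snippet. The session-level check is the important addition: the guardrail re-inspects the running transcript, not just the latest prompt, so a violation assembled across several innocuous-looking turns is still caught. Again, this is a hedged sketch of the idea, not Kosmoy’s implementation.

```python
# Builds on the Category / classify_request sketch above (same assumptions).
def guarded_turn(session: list[str], user_msg: str, forward_to_llm) -> str:
    # Input stage: inspect the incoming request on its own.
    if classify_request(user_msg).category is Category.PROHIBITED:
        return "This request was blocked by the compliance guardrail."

    # Session stage: inspect the request in the context of the whole
    # conversation, so multi-turn drift is caught as well.
    transcript = "\n".join(session + [user_msg])
    if classify_request(transcript).category is Category.PROHIBITED:
        return "This conversation was blocked by the compliance guardrail."

    # Output stage: check the model's answer before it is delivered.
    reply = forward_to_llm(user_msg)
    if classify_request(reply).category is Category.PROHIBITED:
        return "The response was withheld by the compliance guardrail."

    session.extend([user_msg, reply])
    return reply
```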


For enterprises, this means peace of mind. Compliance officers get logs showing what was blocked and why. AI leaders can scale deployments without fearing that one user prompt will trigger a regulatory violation.



Why this matters for executives


For a Chief AI Officer or Head of Compliance, the implications are clear:


  • A policy document cannot stop a rogue prompt.

  • A one-time review cannot anticipate every real-world use.

  • Regulators care about outcomes, not intentions.


The only practical solution is runtime enforcement, and that’s exactly what Kosmoy’s SLM delivers.


It’s part of a bigger trend: by 2027, enterprises are expected to rely on small, task-specific models three times more often than general-purpose LLMs. Guardrail SLMs are one of the first clear examples — not glamorous, but absolutely essential.


Bottom line


The EU AI Act is the first AI regulation with real teeth: fines for prohibited practices can reach €35 million or 7% of global annual turnover, whichever is higher. Enterprises cannot afford to treat compliance as a checkbox. Kosmoy’s SLM guardrail offers a different path: compliance at the speed of conversation, enforced every time an employee interacts with an AI system.


Sometimes the most valuable model isn’t the one generating dazzling prose. It’s the small, quiet one that knows when to say “no.”

