An analysis of the GPAI model guidelines published by the European Commission
Koen Holtman and Ze Shen Chin – AI Standards Lab. 29 July 2025.
On July 18, 2025, the European Commission published its first Guidelines on the scope of obligations for providers of general-purpose AI models under the AI Act. In this post, we provide an analysis of these guidelines. As these obligations for providers go into force on 2 August 2025, we decided that a timely publication of the issues found in this analysis was warranted, both to inform stakeholders affected by these guidelines, and to inform the broader societal debate about the AI Act.
We have found two problems in the guidelines, both related to the case where GPAI models are modified by an actor downstream in the value chain:
1. The guideline text has some ambiguities, which make the intent of the Commission hard to understand. We call on the Commission to resolve the ambiguities we identify below as soon as possible, because their existence affects the safety of the AI value chain.
2. If we move ahead with our analysis, based on some likely assumptions about how the ambiguities are resolved, then we find that the revealed intent of the Commission has some unexpected and unwanted consequences. Below, we will highlight specifically:
2.1 Unwanted consequences for the legal liabilities of GPAI model providers, and more specifically liabilities for providers making models that are above the 10²³ FLOP threshold for training but below the 10²⁵ FLOP threshold indicating systemic risk. These liabilities also affect open source projects and non-commercial academic efforts publishing models.
2.2 Unwanted consequences for copyright holders.
2.3 Unwanted consequences for the enforcement of the AI Act against bad or careless actors, including the ability of the national authorities to enforce high-risk AI provisions of the Act.
2.4 Unwanted consequences for trust and safety in the value chain, and for the growth of the AI industry in Europe.
With respect to point 2.1, we have included observations for GPAI model providers who want to limit their legal liabilities. But more generally, given point 2, we call on the Commission to update its guidelines as soon as possible, to avoid the unwanted consequences.
While we found problems, we also note that the remainder of the guidelines document does offer many clarifications on the AI Act, and welcome details about what enforcement by the AI Office will look like.
We now discuss the main ambiguities we found in the guidelines. All of these have to do with the case where GPAI models are modified by an actor downstream in the value chain.
Ambiguity: Is there a cross-referencing error in the guidelines document, where the references to paragraph 60 really intend to reference paragraph 63?
There are three references to paragraph 60 in the document: footnote 6 to paragraph (30) and paragraph (32) both mention ‘the threshold laid down in paragraph 60’, and paragraph (65) further mentions ‘the criterion set out in paragraph 60’.
However, if we examine the text in detail, then we consider it likely that the Commission intended to say ‘paragraph 63’ in these cases.
We have reached out to the Commission about this, and the Commission confirmed that there was a text formatting error that shifted the paragraph numbers in the published guidelines document. They indeed meant to reference paragraph 63.
Resolution of the ambiguity: should be paragraph 63, not paragraph 60.
The Commission told us they will correct this formatting error when they publish the guidelines in all languages. The Commission also shared that there are further cross-referencing errors in the guidelines document now on the web page we link to above: the references to paragraph 31 should go to 32, 115 to 118, and 116 to 119.
Ambiguity: Does the Commission consider that, if the indicative criterion in (63) is not met, then it is valid to conclude that the criterion in (62) is also never met?
Paragraph 62 says that
(62) Instead, the Commission considers a downstream modifier to become the provider of the modified general-purpose AI model only if the modification leads to a significant change in the model’s generality, capabilities, or systemic risk.
The text then proceeds to define an ‘indicative criterion’ as a threshold in (63), arguing that if this threshold is exceeded, the modifier meets the criterion in (62).
(63) Based on the considerations above, an indicative criterion for when a downstream modifier is considered to be the provider of a general-purpose AI model is that the training compute used for the modification is greater than a third of the training compute of the original model (see paragraph 115 for how ‘training compute’ should be understood in these guidelines).
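Read literally, the indicative criterion in (63) amounts to a simple numeric test. The following minimal sketch (the function name and the idea of expressing the test in code are ours, not the Commission’s) captures that reading:

```python
def meets_indicative_criterion(modification_training_compute_flop: float,
                               original_training_compute_flop: float) -> bool:
    """Indicative criterion of paragraph (63): the downstream modifier is
    considered the provider of the modified general-purpose AI model if the
    training compute used for the modification is greater than one third of
    the training compute of the original model."""
    return modification_training_compute_flop > original_training_compute_flop / 3
```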
However, the text is ambiguous about how the Commission intends to proceed if a downstream modifier does not meet the indicative criterion in (63). Will the Commission use other criteria to evaluate the condition in (62) as well? This question is all the more relevant because, technically speaking, there are many model modifications that do not meet the threshold in (63), but would definitely create ‘a significant change in the model’s generality, capabilities, or systemic risk’ as considered in (62).
Technical sidebar: impactful model modifications
Many GPAI models, especially LLMs, that are put on the market right now are more than just a set of weights.
The input-output behavior of the LLM as a product (its behavior when users or programs interact with it) is also often determined by:
The contents of the LLM’s system prompt,
Any input or output filters present, for example:
An output filter to detect and suppress bad outputs, either by re-running the LLM so that it produces a better output, or by simply ending the session with the model. The bad output detection function may itself be implemented by running a second LLM to evaluate the output of the first (see the sketch further below).
An output filter to stop the model from producing outputs that resemble copyrighted works too closely
The temperature parameters of the model
The nature of the autonomous access that the model has to any tools, or to the Internet
A change to any of the above elements can have a significant impact on the model’s input/output behavior, and therefore on its capabilities and systemic risk as mentioned in paragraph (62).
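As an illustration of the output-filter element in the list above, the sketch below shows the detect-and-suppress pattern in minimal form (all names are our own illustration; `generate` stands for the base model and `is_bad_output` for the detection function, which may itself be a second LLM):

```python
from typing import Callable, Optional

def filtered_generate(generate: Callable[[str], str],
                      is_bad_output: Callable[[str], bool],
                      prompt: str,
                      max_retries: int = 2) -> Optional[str]:
    """Run the base model; if the detector flags the output as bad, re-run the
    model in the hope of a better output, or end the session after too many tries."""
    for _ in range(max_retries + 1):
        output = generate(prompt)
        if not is_bad_output(output):
            return output
    return None  # end the session: no acceptable output was produced
```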
Moreover, GPAI providers are required by Article 53(1)(b) to provide up-to-date information to ‘enable providers of AI systems to have a good understanding of the capabilities and limitations of the general-purpose AI model’. A change in the system prompt can produce a major change in the model’s bias: if the change is significant, then downstream providers will need to be informed.
All of the above elements can be modified without using any training compute at all, so these changes would not trigger the indicative criterion in paragraph (63), which is that the model modification uses training compute of more than one third of the compute used to create the original model.
However, the AI literature is also full of reports where, by using very little training compute, the bias, capabilities, or safety of an LLM model have been changed significantly.
For example, the Badllama 3 paper by Volkov 2024 (https://arxiv.org/abs/2407.01376) details a method for removing safety fine-tuning from a medium-sized LLM (Llama 3 70B) in around half an hour of training on an A100, around 6×10¹³ FLOP, compared to the ~7-9×10²⁴ FLOP needed for training it. The literature is full of other examples, like https://arxiv.org/pdf/2406.11717, https://arxiv.org/pdf/2308.10248, https://arxiv.org/pdf/2412.05346, https://arxiv.org/abs/2310.03693, and https://arxiv.org/abs/2310.20624.
Model backdoors can also be inserted using little training compute; see https://arxiv.org/pdf/2401.05566 for a review. For an example of improving capabilities, see https://arxiv.org/pdf/2501.19393. Crucially, in these examples the reported tuning action stays far below the one-third threshold of paragraph (63).
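To make the size of the gap concrete, here is a back-of-the-envelope comparison of the Badllama 3 figures cited above with the one-third threshold (the 8×10²⁴ FLOP value is our own assumption, taken as the midpoint of the cited range for the original training compute):

```python
modification_flop = 6e13         # ~30 minutes of fine-tuning on a single A100 (Badllama 3)
original_training_flop = 8e24    # assumed midpoint of the ~7-9 x 10^24 FLOP estimate for Llama 3 70B
threshold_flop = original_training_flop / 3

print(f"fraction of threshold used: {modification_flop / threshold_flop:.1e}")
# about 2e-11, i.e. more than ten orders of magnitude below the one-third threshold
```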
So should upstream providers and downstream modifiers expect that the Commission will blindly apply the technically strange criterion in (63), or should they hedge their bets and expect frequent, or at least occasional, deviating judgements?
In question 8 of the consultation that preceded the writing of the guidelines, the Commission asked for comments on using the indicative criterion that has now been reflected in (63), despite it having ‘various shortcomings’:
Many downstream modifiers will have to assess whether they need to comply with the obligations for all providers of general-purpose AI models and the obligations for providers of general-purpose AI models with systemic risk. A pragmatic metric is thus highly desirable to limit the burden on downstream modifiers having to make this assessment, especially on smaller entities. Do you agree that training compute is currently the best metric for quantifying the amount of modification, despite its various shortcomings?
To resolve this ambiguity in the most likely way, we therefore reason that the Commission indeed intends to use this single indicative criterion alone, at least until further notice: even though it has various shortcomings, at least it creates some legal certainty. Still, we also believe that the word ‘indicative’ implies that the Commission might conclude differently in ‘exceptional’ cases, following the logic explained most clearly in paragraph (20) of the guidelines.
Assumed resolution of the ambiguity: for now, the Commission (the AI Office) will use the indicative criterion in (63) to assign the model provider role most of the time. But being ‘indicative’ only, the Commission does reserve the right to judge otherwise in exceptional circumstances.
While we consider the above assumed resolution reasonable, we believe it is entirely possible that some readers of the guidelines will walk away with the impression that the indicative criterion will not be applied almost all of the time, but only when it aligns with their technical common sense. Given this, we are concerned about the case where an upstream provider of a model comes to a different conclusion than the downstream modifier of the model. In that case, we may end up in a situation where neither party in the value chain believes they are responsible for meeting certain AI Act requirements with respect to the modified model, causing the value chain to become unsafe.
We therefore call on the Commission to resolve the above ambiguity in very clear writing as soon as possible, for example by writing examples in a frequently asked questions document.
Ambiguity: Consider the case where the Commission concludes that a downstream party modifying an upstream model will not in fact be the provider of that modified model, so that this party is not responsible for ensuring that the requirements in article 53, 54, and sometimes 55 are met by this modified model. Does this then mean that
a) the Commission believes that the provider of the original upstream model is responsible for making sure that the modified model meets the requirements in articles 53, 54, and sometimes 55, or
b) will, in fact, nobody be held responsible for ensuring that the modified model meets these requirements?
This ambiguity exists because, for example, paragraph (24) explains that providers modifying their own models are definitely responsible for ensuring that these modified models (‘throughout its entire lifecycle’) also meet the requirements in Article 53 and sometimes 55, while with regard to the model lifecycle, paragraph (23) says that ‘Different considerations apply if another actor modifies the model (see Section 3.2)’.
Now consider the case of upstream GPAI model providers making models that are above the 10²³ FLOP threshold for training but below the 10²⁵ FLOP threshold indicating systemic risk. These may be small for-profit companies, open source projects, or academic projects releasing their models in an open weight fashion. We believe it is entirely likely that such readers of the guidelines will walk away with the mistaken impression that, if somebody downstream modifies their model, b) above applies: that as an upstream provider they are not responsible for ensuring that the modified model will meet article 53 and sometimes 55 requirements. Again, common sense reasoning would point some of them in this direction.
To be clear, based on our reading of the AI Act, option a) above is the only one that creates regulatory outcomes compatible with the intent of the legislator: we read the AI Act as saying that every GPAI model brought to market or put into service must have a party who is the provider, a party that is made responsible for achieving certain outcomes with respect to that model. The Commission is empowered by the act to be somewhat creative in how they will assign provider status, but they are not empowered to use mere guidelines documents to define that certain GPAI models brought to market or put into service are exempt from the AI Act entirely.
Assumed resolution of the ambiguity: case a) applies. The rest of the analysis in this document is premised on this assumption.
Again, based on the value chain safety concerns above, we call on the Commission to resolve the above ambiguity in very clear writing as soon as possible, by explicitly describing the consequences for model providers who have released models in a way that allows for them to be modified by downstream actors.
As an example of how to clarify the ambiguity, in a way that would make sense to most software developers, we recommend that the Commission explicitly introduce the concept of model versions, where the model versions can be arranged in a family tree. We then recommend that the Commission clarify that if a downstream actor creates a modified version that satisfies the indicative criterion, this new version should no longer be considered part of the original model’s family tree: instead it becomes a ‘new model’ in the sense of recital 97 and the modifying party becomes the provider of this new model.
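To illustrate what we mean, here is a minimal sketch of such a version family tree (all class and function names are our own, purely for illustration; the one-third test mirrors the indicative criterion in (63)):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ModelVersion:
    name: str
    provider: str
    training_compute_flop: float               # compute used to create this version
    parent: Optional["ModelVersion"] = None
    children: List["ModelVersion"] = field(default_factory=list)

def register_modification(parent: ModelVersion, name: str, modifier: str,
                          modification_compute_flop: float) -> ModelVersion:
    """If the modification exceeds one third of the original model's training
    compute, the result starts a new family tree (a 'new model' in the sense of
    recital 97) with the modifier as provider; otherwise it remains a version
    in the original provider's tree."""
    root = parent
    while root.parent is not None:             # find the original model of this tree
        root = root.parent
    if modification_compute_flop > root.training_compute_flop / 3:
        return ModelVersion(name, provider=modifier,
                            training_compute_flop=modification_compute_flop)
    child = ModelVersion(name, provider=root.provider,
                         training_compute_flop=modification_compute_flop, parent=parent)
    parent.children.append(child)
    return child
```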
We now look at the consequences of the guidelines published by the Commission, under the assumption that the ambiguities above are all resolved in the way we expect them to be resolved.
As also noted by the Commission in paragraph (67), most modifications of GPAI models with systemic risk would remain below the paragraph (63) indicative criterion, so that the original provider remains the provider of the modified model even if the modification is made by a downstream actor. We consider this outcome to place a generally acceptable burden on the upstream provider.
One part of the burden is that the upstream provider is responsible for making sure that the modified model complies with Article 53(1) obligations, which include:
1. Drawing up some documentation for the modified model, to be provided when the AI Office or national competent authorities ask for it, according to Article 53(1)(a).
2. Ensuring that downstream users of the modified model who incorporate it into an AI system have enough information to enable them ‘to have a good understanding of the capabilities and limitations of the general-purpose AI model and to comply with their obligations pursuant to this Regulation’, according to Article 53(1)(b)(i).
3. Ensuring that the modified model has ‘publicly available a sufficiently detailed summary about the content used for training it’, according to Article 53(1)(d).
If we look, for example, at how OpenAI has structured its facilities, allowing downstream actors to create fine-tuned models (see e.g. https://platform.openai.com/docs/guides/supervised-fine-tuning, https://platform.openai.com/docs/guides/reinforcement-fine-tuning), then we can note that OpenAI already has much of the infrastructure needed for items 1 and 2 in place (see e.g. https://help.openai.com/en/articles/8554397-creating-a-gpt).
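For illustration, a downstream customer’s fine-tuning run happens entirely inside OpenAI’s hosted infrastructure, roughly along these lines (a sketch based on the public fine-tuning guide linked above; the file name and base-model identifier are placeholders, and parameters may differ from OpenAI’s current API):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the customer's training examples (a JSONL file of chat conversations).
training_file = client.files.create(
    file=open("customer_finetuning_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job; the resulting fine-tuned model remains hosted by
# OpenAI, so the modification happens within infrastructure overseen by the
# original model provider.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # placeholder base model identifier
)
print(job.id, job.status)
```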
Not all frontier model providers may want to offer such model modification facilities to their customers. But for those who do, the burdens of still being classified as the provider of the modified model generally seem to be acceptable, at least under the assumption that the customers making the modifications are honest customers who perform these modifications within the infrastructure under the oversight of the original model provider. We consider the case of downstream bad actors further below.
We now consider consequences for an open source project releasing a GPAI model that does not fall in the systemic risk category. Reading the AI Act, we see a clear intent by the legislator to make life as easy as possible for such an open source project.
However, in the case that a downstream actor modifies their model while staying below the paragraph (63) indicative criterion threshold, roughly the same logic as explored above applies. First, the upstream open source project is responsible for ensuring that the modified model has ‘publicly available a sufficiently detailed summary about the content used for training it’, according to Article 53(1)(d). This responsibility also extends to any content that the downstream actor may have used to train or fine-tune the model further. If publication somehow does not happen, then the open source or academic project might be subject to fines under the AI Act.
If the downstream modifier releases their modified model as non-open-source, then the open source project also becomes responsible for ensuring that the modified model has sufficient documentation under article 53(1)(a) and (b).
We believe that placing these burdens on open source projects providing non-systemic risk models is entirely inappropriate, given the intent of the legislator. Making them responsible puts open source projects in an impossible legal position, where the only way to protect project members from legal liability is to stop releasing GPAI-level open source models.
We call on the Commission to amend the indicative criterion with further criteria, so that the Article 53 burdens for downstream modified models fall clearly on the party doing the modification.
Some vendors will be selling GPAI models for profit, in a way where the customer gets the weights and is able to modify them, or to modify other parts of the model. In such cases, vendors would also be responsible for making sure that the modified models comply with Article 53(1), as long as the modification does not meet the indicative criterion.
These vendors could use contract law to bind their customers, so that they are obliged to help the vendor publish the necessary documentation, or to take full responsibility for publishing the necessary documentation themselves, on behalf of the vendor. So these vendors are left in a better position than open source projects.
However, we think the burdens are still somewhat unreasonable, if we consider the case of what happens when the model vendor customers are dishonest. We will explain these concerns as part of our broader analysis of regulatory pressure below.
We now consider a consequence that does not just affect GPAI model providers, but more fundamentally affects the question of how the AI Act will be enforced, and what the consequences of this are on the safety of the AI value chain.
We have seen above that model vendors can use technical measures, as well as contract law, to keep the AI value chain downstream of their model safer. We have also seen that the AI Act provides clear incentives for them to use both tools.
But beyond what these vendors can do, the AI Act also assigns duties and grants powers to the AI Office and to national competent authorities: the duty to oversee the market and the power to fine bad actors.
We now consider a case where a bad actor modifies a model and places it on the market in a dangerous or unethical way, such as training it on copyrighted data against the wishes of rightsholders, or lying about the technical specifications of the model in the documentation. In order for the AI Office to subject such a bad actor to fines, the Office would first have to make a determination that the party in question is, in fact, a provider. The Office has to find that they are either a model provider (so that fines for violating Articles 53 and 55 become possible), or an AI system provider, allowing fines based on other Articles.
Now, the Guidelines declare that, except maybe in exceptional circumstances, the Commission will not consider parties downstream from the original GPAI model provider to ever be GPAI model providers themselves, as long as they do not modify the model using more than a third of the original training compute. This implies that the AI Office will also not be willing to ever fine such parties under Articles 53 and 55, no matter what they get up to.
It also implies that the AI Office will not, except maybe in exceptional circumstances, apply Article 53(3) to these parties, compelling these parties to ‘cooperate as necessary with the Commission and the national competent authorities in the exercise of their competences and powers pursuant to this Regulation’.
We therefore believe that the current guidelines create a value chain that will not be regulated well enough. While we have seen that GPAI model providers also have some tools to constrain bad actors, and have some motivation to use them, we do not consider these tools themselves to be sufficient to create a trustworthy market that will produce safe outcomes, as considered by Article 1 of the AI Act.
We are worried that bad actors will read the guidelines as being an open invitation, extended by the Commission, to stop caring about the AI Act. An open invitation stating that, as long as they avoid triggering the indicative criterion in paragraph (63), they can do whatever they want without ever risking fines. If they violate the upstream model license agreement, they may have to worry about what the upstream model vendor could do to them using the courts, but they do not need to worry about being investigated by the AI Office or national competent authorities.
We call on the Commission to clarify these matters, so that potential bad actors will be sufficiently deterred from actions with negative safety consequences in the AI value chain. Specifically, the Commission could extend the indicative criterion in (63) with additional indicative criteria, acting as triggers under which bad actors would get model provider status.
Such criteria would not only deter bad actors, but also offer some reassurance to good actors. Specifically, we are looking for guidance that will reassure GPAI model providers that, if they make other ‘clear and unequivocal’ exclusions in their license agreements or in the safety documentation accompanying the model, the Commission will be willing to fine parties who ignore these exclusions, at least when ignoring them has safety implications, or implications for the rights of copyright holders.
Paragraph (59) gives an example where, if a downstream AI system provider ignores the fact that a GPAI provider of a model with systemic risk 'has excluded, in a clear and unequivocal way, the distribution and use of the model on the Union market', the Commission will in fact consider a party placing or using that model on the Union market anyway to be a model provider, which therefore makes them subject to fines. This example is a good first step at reassurance, but further guidance would be needed, because we believe that this single example also has an unwanted effect on the willingness of legally conservative GPAI model providers to enter the Union market. Right now, paragraph (59) almost reads as an invitation to the legal departments of such providers to conclude that, if their model is at all susceptible to modification, the only solid way to mitigate their legal risks under Article 53(1) is to stay out of the Union market entirely, by excluding both the distribution and the use of their model on the Union market.
The GPAI guidelines published on 18 July 2025, which we have reviewed here, complement the General-Purpose AI Code of Practice published on 10 July 2025. Beyond analysing the Guidelines, we also analysed the Code, and we are happy to report that all three Chapters of the Code look pretty good, and do not have the kinds of ambiguities or unintended consequences we report on here for the Guidelines.
Article 56(2)(d) of the AI Act says that the Code, which has been drawn up as a clarifying document, should ‘take into account the specific challenges of tackling [systemic] risks in light of the possible ways in which such risks may emerge and materialise along the AI value chain.’ This then leads to the question of whether the Code, in combination with these Guidelines, indeed adequately fulfills this Article 56(2)(d) criterion. We have argued above that this adequacy criterion is not fulfilled, and that more work needs to be done on the Guidelines:
1. We have identified unwanted ambiguities and call on the Commission to fix these, where one fast route to fixing them may be to update the FAQ documents accompanying the guidelines as quickly as possible.
2. We have also identified unwanted consequences for the value chain, stemming from the use of only one single indicative criterion to determine model provider status for modified versions of a model: we call on the Commission to define and publish additional criteria.
The parts of the AI Act covered by the Code and the guidelines go into force on 2 August 2025: we have therefore decided to publish this analysis quickly. We may follow up with an additional publication to make more specific suggestions for point 2, suggestions for a set of additional indicative criteria that would adequately remove the unwanted consequences we identified.