The field of AI is advancing rapidly, and large companies are constantly releasing new foundation models. However, the definition of a fully open AI model is not clear. Many models claim to be “open,” yet only a subset of their components is released openly while the rest are under restrictive licenses. This creates a spectrum of partial openness. For example:
- The model architecture and weights may be public, but not the training data or code.
- The trained weights may be released under licenses that prohibit commercial use or restrict derivative works.
- The weights may be released under a permissive license while the code is under a restrictive one.
This ambiguity about what is truly “open” impedes AI adoption and the creation of products and services for end users. It creates legal risk for entrepreneurs who may inadvertently violate the terms of partially open models. A clear framework for evaluating the openness of models is needed. Such a framework should help AI entrepreneurs, researchers, and engineers make informed decisions about which models to use, build derivative works from, or contribute to.
An example
Consider a fictional AI startup called “yet-another-chat-bot” that is developing an AI chatbot to improve customer support responses. To accelerate development, its engineers build on a fictional pre-trained language model called “llam-stral.” The authors of “llam-stral” have published a paper on arXiv describing its architecture and performance, and they make the pre-trained weights available for download.
The engineers of “yet-another-chat-bot” used “llam-stral” to prototype their chatbot, but later discovered that its license explicitly prohibits commercial use and the creation of derivative works. The training data and the code used to train it were never made public either. The startup is now exposed to legal risk and potential IP infringement claims.
The right thing to do would have been for the authors to make llam-stral compliant with the Model Openness Framework and use standard open licenses such as Apache 2.0 for the code and CC-BY-4.0 for the weights and dataset. It would then have been obvious to yet-another-chat-bot that it could use the model commercially and build on top of it.
To achieve genuine reproducibility, transparency, and usability in AI, we need a framework that defines model completeness and openness. A framework such as the Model Openness Framework (MOF), published by the Generative AI Commons, helps both model producers and consumers understand which key artifacts exist, which are open and which are not. A fully open model releases all of its components, including training data, code, weights, architecture, technical reports, and evaluation code, under permissive licenses.
Components of an AI model
By publishing all artifacts and components of a large language model under permissive licenses, authors can credibly claim that their model is truly and fully open, which promotes transparency, reproducibility, and collaboration in the development and application of these models.
Some of the required components are listed below, followed by a simple sketch of how a team might audit them:
- Training data: The dataset used to train the model.
- Data preprocessing code: The code used to clean, transform, and prepare the training data.
- Model architecture: The design and structure of the model, including layers, connections, and hyperparameters.
- Model parameters: The learned weights and biases of the trained model.
- Training code: The code used to train the model, including the training loop, optimization algorithm, and loss function.
- Evaluation code: The code used to evaluate the performance of the trained model on validation and test datasets.
- Evaluation data: The dataset used to evaluate the performance of the trained model.
- Model documentation and technical reports: Detailed documentation of the model, including its objectives, architecture, training process, and performance metrics, along with a scholarly paper or technical report describing its methodology, results, and contribution to the field.
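To make this concrete, here is a minimal, illustrative Python sketch of how a team might audit a model release against such a checklist. It is not an official MOF tool; the component names and the list of “permissive” licenses are assumptions chosen purely for demonstration.

```python
# Illustrative sketch only -- not an official Model Openness Framework tool.
# The component names and the "permissive" license list below are assumptions
# chosen for demonstration.
from dataclasses import dataclass
from typing import Optional

PERMISSIVE_LICENSES = {"Apache-2.0", "MIT", "BSD-3-Clause", "CC-BY-4.0"}

REQUIRED_COMPONENTS = [
    "training data",
    "data preprocessing code",
    "model architecture",
    "model parameters",
    "training code",
    "evaluation code",
    "evaluation data",
    "documentation and technical report",
]

@dataclass
class Artifact:
    name: str
    license: Optional[str]  # None means the artifact was not released at all

def audit(artifacts: list[Artifact]) -> None:
    """Print which required components are open, restricted, or missing."""
    released = {a.name: a.license for a in artifacts}
    for component in REQUIRED_COMPONENTS:
        lic = released.get(component)
        if lic is None:
            print(f"MISSING     {component}")
        elif lic in PERMISSIVE_LICENSES:
            print(f"OPEN        {component} ({lic})")
        else:
            print(f"RESTRICTED  {component} ({lic})")

# Example: the fictional "llam-stral" releases only its weights and a paper,
# with a non-commercial license on the weights.
audit([
    Artifact("model parameters", "CC-BY-NC-4.0"),
    Artifact("documentation and technical report", "CC-BY-4.0"),
])
```

Run against the fictional llam-stral release above, such a check would flag most components as missing and the weights as restricted, which is exactly the signal yet-another-chat-bot needed before building on it.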
The more of these components that are released under permissive licenses, the more open the model is.
A truly open model accelerates innovation
Access to truly open AI models levels the playing field for AI entrepreneurs and helps them innovate. Instead of building every component from scratch, they can leverage state-of-the-art models and datasets, prototype ideas faster, validate performance, and accelerate time to market.
Instead of spending time and resources reinventing the wheel or recreating baseline functionality, AI entrepreneurs can focus on domain-specific challenges and on finding ways to add value. The open licensing of models that comply with the MOF also gives entrepreneurs confidence that they can legally use those models in commercial products and services.
There is no need to worry about IP infringement claims or sudden changes in license terms: the training data and code are fully accessible under unrestricted licenses, allowing entrepreneurs to audit the provenance of their models and ensure regulatory compliance.
Additionally, engineers can inspect the datasets for potential biases, and developers have access to the entire codebase, allowing them to find bottlenecks and optimize performance. This access also makes models portable across environments and easier to maintain over time. Thus, fully open models lower the barriers to building AI-powered products and services and drive innovation.