8 Intellectual Property and Commercial Questions to Ask Your Generative AI Tool Provider

10 minute read | August 3, 2023

As a rapidly increasing number of generative artificial intelligence products come to market, companies interested in incorporating these technologies must evaluate such offerings by balancing speed of implementation and technical capabilities with financial and legal considerations. Providers of such tools are pushing the envelope on what’s possible from a technology perspective while also attempting to assess ethical considerations, tune the models to reflect concerns and build in user guardrails.

Companies are not immune from legal, regulatory and ethical concerns created by generative AI tools merely because they are “users” and not “developers.” In fact, companies using third-party generative AI tools to develop public-facing content may become the initial recipients of third-party demands and regulatory inquiries about generative AI-sourced content distributed to a largely unsuspecting general public.

Before using generative AI products or services, a company should consider addressing the questions below that focus on core intellectual property (“IP”) and commercial considerations. Other legal areas, including data privacy, cybersecurity, employment and international trade, will also likely be important to separately consider.

1. What terms apply to the planned use of the generative AI tool?

  • The same (or substantially similar) generative AI tool may be available pursuant to different types of agreements to account for different usage types, such as “enterprise” or “free” tiers. A tool also may be governed by several sets of legal terms, such as terms of use, an acceptable use policy, a privacy policy or a data processing agreement.
  • Depending on the provider, an enterprise license can provide for a materially different allocation of rights and risks that could propel or inhibit a customer’s business. While it may be more expensive, a customer should consider pursuing an enterprise agreement to provide more explicitly for its use case and to lock in key terms for the duration of the agreement (online terms of use typically may be updated by the provider from time to time).

2. What are the sources of training data for the generative AI tool?

  • Generative AI tools offered to date have been trained on a wide array of datasets, with some datasets fully and expressly authorized and others predominantly scraped from the internet. Such datasets often include some combination of first-party data, data licensed directly from third parties (either pursuant to a commercial license or open data license) and data scraped from the web and other sources.
    • Where the licensor of the tool is not also the owner or primary source of the training data (which is frequently the case), the licensor may not have a comprehensive understanding of the gaps and limitations within the training data used to develop its generative AI tool.
    • If detailed background information about training data is not included on any available model card, it’s valuable to ask detailed questions in this area to assess the strengths and weaknesses of the licensor’s training program and the tool’s veracity and reliability.
  • Since generative AI tool suppliers may resist providing specifics to prospective customers on this question, a customer ultimately may face a business decision about the extent to which the potential benefits outweigh possible legal or technical risks surrounding the development and training of a particular tool. The provider’s reputation and the customer’s level of trust in the provider represent other key factors in making this business determination.

3. Is the provider willing to agree to representations, warranties or indemnities concerning customer use of the generative AI tool?

  • A provider’s insight into the training process of its models and the sources of its underlying training data can significantly influence its willingness to agree to customer protections in any agreement (e.g., if the training data was “scraped” from a third-party website versus being obtained pursuant to a paid license that itself contains representations, warranties or indemnities regarding such training data).
  • Multiple class-action lawsuits have been filed challenging generative AI tools on intellectual property and privacy grounds. The technology industry has broadly taken the position that using copyrighted material to train machine learning models is a fair use and that scraping public data is not a privacy violation. Until the courts resolve such issues, users should pay particular attention to whether training datasets were authorized and the degree to which tool providers offer commercial protections.
  • With free licenses in particular (but also paid licenses in many instances), providers will likely resist providing representations, warranties or indemnities that customers may expect in the typical software license scenario. This resistance is likely due to the volume of third-party data dependencies, the challenge of quantifying this new type of risk and the difficulty (or impossibility) of tracking all elements that went or go into building and testing generative AI tools.

4. What steps has the provider taken to monitor the quality of its generative AI tool and for what applications has the tool been tested?

  • Despite being a known issue, incorrect responses confidently asserted as fact by a generative AI tool (frequently referred to as “hallucinations”) remain a key challenge. For some industries, such as healthcare or finance, this could limit the meaningful implementation of generative AI tools for the foreseeable future.
  • To evaluate the ongoing reliability of a generative AI tool and to maintain quality, it’s critical to keep a human in the loop, especially where the resulting content is shared externally.

5. What rights to customer data does the generative AI tool provider want or need?

  • Depending on the sensitivity of the input data and uniqueness of a customer’s use case, the customer may or may not want to contractually allow a generative AI tool provider to use input as training data to further develop and refine provider algorithms/models. Where the confidentiality of input data is necessary based on a customer’s separate contractual obligations, or where limiting use of the data is otherwise competitively valuable, customers should push for broad confidentiality terms with each generative AI tool provider and prohibit the use of customer data for further training of the provider’s algorithms or models.
  • Since the output of generative AI tools can closely emulate the associated training data, this may be of particular concern where a customer uses such tools for a niche use case or where data may be indirectly attributable to the customer. For instance, in some cases with tailored prompting, competitors may be able to reverse engineer the nature of a customer’s use of a given generative AI tool, including the input or output.
  • Where possible as a customer, avoid providing representations, warranties or indemnities as to input, particularly if the provider intends to train its generative AI tools on such input, as such terms could lead to broad liability exposure. At minimum, include a limitation of liability provision capping customer liability.
  • If the customer intends to use generative AI tools on behalf of its own customers or in ways that will otherwise relate to another third-party relationship, the customer should consider whether it will need to amend its agreements with those customers or third parties. It also should consider what obligations (including confidentiality standards) it would need to impose on the prospective tool provider to avoid the need to renegotiate terms for existing contracts.
  • Providers of generative AI tools may push back that they need a broader license to customer data to improve their offering, address security issues or for other regulatory purposes. A customer should assess such provider contracting requirements on a case-by-case basis and monitor them over time as customer internal uses of generative AI tools (and thereby the nature of the input) evolve.
  • In addition, providers may want broad rights to “feedback” or “suggestions” the customer provides, regardless of whether it includes confidential customer information. In that case, it’s important to expressly exclude confidential customer information and customer data from any use of feedback.

6. Who owns the IP rights associated with the output of the tool?

  • Depending on a customer’s use case, it may or may not be critical to be assigned sole ownership of tool output and associated IP rights. If it is crucial that a customer be able to prevent a competitor from using the output, then the customer should consider prohibiting the use of generative AI in such areas, or adopting special procedures to ensure that sufficient human-authored expression exists in the final product to warrant IP protection.
  • The U.S. Patent and Trademark Office, U.S. Copyright Office and U.S. federal courts have thus far declined to register, or to recognize protection for, any inventions or works created solely by generative AI tools.
    • The U.S. Copyright Office has recognized that works created using generative AI tools may be registrable at least in part, depending upon the degree of human-authored expression and the degree of human control over any such tool’s output.
    • Depending on the nature of a customer’s use and the terms of the agreement with the generative AI tool provider, it may be possible to claim trade secret protection over some output.
    • Additionally, as between the customer and the provider, the customer may wish to allocate any current or future IP rights associated with output to itself in case standards and IP rights in the United States evolve, and to account for foreign jurisdictions where registered IP rights for generative AI developments are available.
  • Due to the accessible nature of free generative AI tools, form agreements frequently include language that allows the same output to be provided to multiple customers. To the extent a customer considers any output to be material or proprietary, this lack of exclusive assignment of output creates a material risk from an IP ownership perspective, and customers cannot assume exclusive use.

7. How does the provider assess and implement rules and regulations related to generative AI and monitor for updates?

  • The speed of generative AI development means that only limited interpretations of current regulations exist. Regulators are only now discussing purpose-specific guidelines and regulations for generative AI.
  • A pragmatic provider of generative AI tools should already be considering the implications of current regulatory interpretations (such as with the GDPR) and proposed regulations (such as the EU’s AI Act) for its operations. Providers also should monitor industry and government statements, events and publications to anticipate probable regulatory areas, such as user notice, bias and sensitive applications, and minimize future product disruptions.

8. If a customer receives questions from a regulator or other third party around the operation of the generative AI tool, does the provider have an obligation to provide information and help the customer react?

  • Enforcement to date around algorithms and early regulator statements on generative AI issues indicate that understanding how algorithms were trained and operate will be critical to fully respond to these questions. The typical customer of generative AI tools may not have all relevant information at its disposal.
  • Customers should consider adding audit or information rights in any commercial agreement for a generative AI tool, as well as some obligation on the provider to assist in the customer’s compliance efforts.
  • Customers should avoid granting overbroad rights to the provider to share confidential customer information or proprietary data for purposes of a provider’s compliance. Such a permission may go beyond customary exceptions to confidentiality protection obligations and enable an overbroad use of such data by the provider.

Want to know more? Contact one of the authors. The Orrick team is happy to connect with your product, engineering and legal personnel to brainstorm ways to harness generative AI innovations and accelerate your business cases.