Use of Artificial Intelligence in the Processing of Confidential Information at the M&A Transaction Preparation Stage

Are you preparing for an M&A transaction and planning to use AI to process the counterparty’s documents? There may be unpleasant consequences!

Or maybe not? Let’s take a closer look.

M&A transactions are always associated with the exchange of large volumes of highly sensitive information: financial statements, client databases, contract terms, technical documentation on assets, and much more. Analyzing such documents requires significant time and financial resources. As a result, parties often seek to accelerate the process by uploading information into AI tools.

Undoubtedly, AI assistants can summarize hundreds of pages, verify title and rights documentation, extract financial indicators, and identify risks in a matter of minutes.

The question is no longer whether AI tools will be used in document-related work.

The question is how to do it without violating either NDAs or data protection laws.

When documents are uploaded into an AI tool, the data they contain may be used to train the model, stored on external servers, or transmitted via third-party APIs without any transparency for the user. There is a real risk that the provider will use your “confidential” data to train its models, and that the model will later disclose that information in responses to other users.

Such cases have already occurred in practice: according to the Society for Computers and Law, a partner at a UK law firm, under time pressure, uploaded confidential M&A transaction documents into a free AI tool. Several months later, a competing firm using the same platform unexpectedly received precise details of that transaction’s structure in a generated response.

However, not every AI tool will collect your transaction data or use it for training. The key criterion is not the provider (Perplexity, Claude, ChatGPT, Gemini, etc.), but the pricing plan under which you operate. The same provider may handle uploaded data differently depending on the selected plan.

Let’s look at the categories of plans, from the riskiest to the most secure:
  1. Free plans – the highest risk zone
    Available after simple registration, without payment or additional restrictions. By default, your prompts and uploaded documents (PDF, Word, Excel) may be added to datasets for fine-tuning or improving future versions of the provider’s models. There are no contractual confidentiality guarantees.
  2. Paid personal plans (ChatGPT Plus, Perplexity Pro) – better, but not sufficient
    These options are safer than free plans but are not suitable for directly handling confidential transaction data. By default, the policies often allow training on user data, and disabling it requires manually adjusting the settings. Moreover, such settings are not legally binding: they are technical configurations that the provider may change. The legally relevant documents remain the Privacy Policy and Terms of Service of the respective platform.
  3. Enterprise / Zero Data Retention – the optimal option for M&A preparation
    Tools in this category ensure that your data is not used to train models. This level of protection can be achieved in several ways:
    • API access (e.g., OpenAI API, Anthropic API, YandexGPT API) – providers apply stricter commercial terms and, as a rule, do not use client data for training by default. In practice, API access functions as a contractual arrangement with safeguards, though the applicable terms should still be checked.
    • Enterprise subscriptions: ChatGPT Enterprise, Microsoft Copilot for M365, Gemini for Google Workspace Enterprise.
    • Specialized LegalTech platforms: Datasite AI, Harvey, Spellbook, iDeals – these operate via secure APIs or isolated environments (VPCs) under a strict non-use-of-client-data principle.

Practical takeaway:

Using free AI tools in M&A preparation is strongly discouraged. Your relationship with the AI provider is governed only by public “terms of use,” while your relationship with the counterparty is subject to an NDA requiring specific protective measures.

When using paid personal plans, ensure that training on your data is disabled in the settings. However, the most reliable safeguard is anonymizing documents before uploading them, that is, removing or replacing identifying details (party names, amounts, dates, addresses).

Importantly, anonymization should not be performed using the same AI tool. The moment you upload the original document for anonymization, the provider already gains access to the full dataset and may store or use it for training. Anonymization must be completed in advance, before any interaction with external services, either manually or using internal corporate tools.
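As an illustration of what an "internal corporate tool" for this step might look like, here is a minimal Python sketch that runs entirely on a local machine and replaces party names, amounts, and dates with placeholders. Everything in it is an assumption made for the example: the function name `anonymize`, the sample party names, and the two simple patterns, which cover only amounts written like "USD 1,250,000" and dates written like "15.03.2025". It is a starting point for a first pass, not a substitute for manual review.

```python
import re

# Illustrative patterns only (assumptions): real transaction documents use many
# more formats, so the output should always be checked by a human.
AMOUNT_RE = re.compile(r"(?:USD|EUR|GBP)\s?\d+(?:,\d{3})*(?:\.\d+)?")
DATE_RE = re.compile(r"\b\d{1,2}[./]\d{1,2}[./]\d{4}\b")


def anonymize(text, party_names):
    """Replace party names, amounts, and dates with placeholders, locally.

    Returns the redacted text and a mapping placeholder -> original value.
    The mapping stays inside the company and is never uploaded.
    """
    mapping = {}

    def replace_all(pattern, label, source):
        seen = {}  # original value -> placeholder, so repeats stay consistent

        def repl(match):
            original = match.group(0)
            if original not in seen:
                placeholder = f"[{label}_{len(seen) + 1}]"
                seen[original] = placeholder
                mapping[placeholder] = original
            return seen[original]

        return pattern.sub(repl, source)

    # Party names (or any other literal strings, such as addresses) are
    # supplied by the deal team; each one gets a fixed placeholder.
    for index, name in enumerate(party_names, start=1):
        placeholder = f"[PARTY_{index}]"
        if name in text:
            mapping[placeholder] = name
            text = text.replace(name, placeholder)

    text = replace_all(AMOUNT_RE, "AMOUNT", text)
    text = replace_all(DATE_RE, "DATE", text)
    return text, mapping


if __name__ == "__main__":
    # Hypothetical sample sentence and party names, invented for this example.
    sample = "Acme Holdings Ltd will pay USD 1,250,000 to Borealis Capital on 15.03.2025."
    redacted, key = anonymize(sample, ["Acme Holdings Ltd", "Borealis Capital"])
    print(redacted)
    print(key)
```

Only the redacted text would then be uploaded to the AI tool. The placeholder mapping is kept internally so that the AI output can be translated back into real names and figures afterwards.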

In the world of AI, no one is 100% protected from data leaks. However, users are responsible for taking all reasonable measures to minimize these risks.

 


Authors: Inna Semenova, Yahor Kulazhenka.
