Fusion Protein Sequence Disclosure in Patent Applications: A Practical Guide

If you are working in the biotechnology or pharmaceutical space, you already know that fusion proteins are among the most commercially significant inventions of the last two decades. From antibody-cytokine fusions developed for oncology to receptor-Fc constructs used in autoimmune therapies, fusion proteins have reshaped modern medicine. But when it comes to protecting these inventions through patents, the sequence disclosure requirements can be surprisingly complex.

A fusion protein is created by joining two or more protein domains, each encoded by different genes, into a single continuous polypeptide chain. The resulting construct inherits functional properties from each component domain. Because of this multi-domain structure, disclosing fusion proteins in a patent application is not as straightforward as listing a single protein sequence. It requires careful planning, technical accuracy, and full compliance with international sequence listing standards.

This guide is designed to help inventors, patent attorneys, and technical writers understand exactly how to handle fusion protein sequence listing in patent applications, without getting lost in legal jargon or technical complexity.

Understanding the Legal Requirement for Sequence Disclosure

Before drafting your application, you need to understand why sequence disclosure is mandatory. Patent offices around the world, including the USPTO, EPO, WIPO, and IP India, require a formal sequence listing in any patent application that discloses a nucleotide sequence of 10 or more specifically defined nucleotides or an amino acid sequence of 4 or more specifically defined amino acids.
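As a rough illustration of the length threshold, the sketch below checks whether a sequence would need to appear in the listing. It assumes the ST.26 thresholds of 10 or more specifically defined nucleotides and 4 or more specifically defined amino acids, and it treats "n" and "X" as the symbols that do not count as specifically defined. The function name is hypothetical, and the check is a simplification of the standard's own definitions.

```python
def requires_listing(sequence: str, molecule: str) -> bool:
    """Return True if the sequence meets the assumed ST.26 length threshold."""
    if molecule == "AA":
        undefined, minimum = "X", 4
    elif molecule in ("DNA", "RNA"):
        undefined, minimum = "N", 10
    else:
        raise ValueError(f"unknown molecule type: {molecule}")
    # Count only specifically defined residues (comparison ignores case).
    defined = sum(1 for residue in sequence if residue.upper() != undefined)
    return defined >= minimum


print(requires_listing("GGGGS", "AA"))          # a five-residue linker
print(requires_listing("acgtnnnnnnnn", "DNA"))  # only four defined bases
```

A short linker such as GGGGS would therefore qualify on its own, while a two- or three-residue spacer would not.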

This requirement is governed by internationally recognized standards. On July 1, 2022, WIPO Standard ST.26 replaced the older ST.25 standard, making XML-based sequence listings the required format for all new international and most national filings. If your application contains a fusion protein, you are almost certainly required to submit a compliant fusion protein sequence listing as part of your application documents.

Failing to include a proper sequence listing, or submitting one with errors, can result in serious consequences including objections, delays, or even abandonment of your patent application. This is why getting the disclosure right from the beginning is so important.

How to Structure a Fusion Protein Sequence Listing

One of the most common mistakes applicants make is treating a fusion protein as a single, monolithic sequence without adequately explaining how it is constructed. Patent offices expect more than just a raw amino acid string. They want complete, structured, and informative disclosure.

Here is what a well-structured fusion protein sequence listing should include:

  • The complete amino acid sequence of the full-length fusion protein, listed as a single continuous sequence in the sequence listing file. Under ST.26, each residue is identified using the standard IUPAC single-letter code (the three-letter codes belonged to the older ST.25 format).
  • Individual component sequences listed separately, meaning each domain or functional unit of the fusion protein should ideally be disclosed as its own sequence entry. This helps examiners and future readers understand which part of the fusion construct originates from which source.
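  • The linker sequence, if any, used to join the domains. Linker sequences are often overlooked but they are part of the invention. Whether you use a flexible glycine-serine (GS) linker or a rigid alpha-helical linker, it must be included and annotated properly. A linker shorter than four specifically defined amino acids cannot be given its own sequence entry, so it is disclosed as an annotated region of the full-length sequence instead.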
  • The corresponding nucleotide sequences, particularly the coding DNA sequences that encode the fusion protein. These must be provided if they form part of the claims or the detailed description.
  • Proper annotations and feature keys within the XML file, such as source organism information, modified residues, signal peptides, and region boundaries between each domain.

Following this structure ensures that your fusion protein sequence listing is complete, examiner-friendly, and legally defensible.
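The structure described above can be sketched in a few lines of Python. The example assembles a full-length sequence from its parts and records the 1-based, inclusive residue range of each part, which is the information needed for region annotations. The `Part` class, the `assemble` function, and the toy sequences are all hypothetical placeholders, not real domains.

```python
from dataclasses import dataclass


@dataclass
class Part:
    name: str       # label to use in the feature annotation
    sequence: str   # single-letter amino acid codes


def assemble(parts):
    """Concatenate parts and return (full_sequence, [(name, start, end), ...]).

    Positions are 1-based and inclusive.
    """
    full = ""
    regions = []
    for part in parts:
        start = len(full) + 1
        full += part.sequence
        regions.append((part.name, start, len(full)))
    return full, regions


# Hypothetical toy construct: two short placeholder domains and a (GGGGS)3 linker.
parts = [
    Part("domain A", "MKTAYIAKQR"),
    Part("linker", "GGGGS" * 3),
    Part("domain B", "DIQMTQSPSS"),
]
full_sequence, regions = assemble(parts)
for name, start, end in regions:
    print(f"{name}: {start}..{end}")
# domain A: 1..10
# linker: 11..25
# domain B: 26..35
```

Deriving the boundaries from the parts, rather than typing them by hand, keeps the full-length entry and the component entries from drifting apart.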

Common Challenges in Fusion Protein Sequence Disclosure

Even experienced patent professionals encounter specific challenges when preparing a fusion protein sequence listing. Understanding these challenges ahead of time helps you avoid costly mistakes.

1. Numbering and Cross-Referencing

When you have multiple sequences in a listing, each one receives a unique sequence identifier (SEQ ID NO). Your patent specification and claims must reference these identifiers consistently. For example, if SEQ ID NO: 1 is the full-length fusion protein, SEQ ID NO: 2 is domain A, and SEQ ID NO: 3 is domain B, then your claims and description must use these numbers accurately throughout the document. Any mismatch between the specification and the sequence listing can trigger office actions.
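A simple consistency check of this kind can be automated. The sketch below scans specification text for SEQ ID NO references and compares them with the number of sequences in the listing. The function name and sample text are hypothetical, and the pattern is deliberately simple: for a phrase such as "SEQ ID NOs: 1 and 2" it captures only the first number, so treat it as a first-pass screen rather than a complete checker.

```python
import re


def check_seq_id_references(text: str, total_sequences: int):
    """Return (out_of_range, never_cited) SEQ ID numbers as sorted lists."""
    # Matches "SEQ ID NO: 3", "SEQ ID NO 3", "SEQ ID NO:3", "SEQ ID NOs: 3".
    pattern = r"SEQ\s+ID\s+NOS?[:.]?\s*(\d+)"
    cited = {int(n) for n in re.findall(pattern, text, re.IGNORECASE)}
    valid = set(range(1, total_sequences + 1))
    return sorted(cited - valid), sorted(valid - cited)


sample_text = (
    "The fusion protein of SEQ ID NO: 1 comprises domain A (SEQ ID NO: 2) "
    "joined by a linker (SEQ ID NO: 5)."
)
out_of_range, never_cited = check_seq_id_references(sample_text, total_sequences=3)
print("Cited but not in listing:", out_of_range)   # [5]
print("In listing but never cited:", never_cited)  # [3]
```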

2. Modified or Non-Standard Residues

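Fusion proteins sometimes incorporate non-natural amino acids, PEGylated residues, or chemically modified domains. These modifications need special annotation in the ST.26 XML file. You cannot simply skip them or silently substitute a standard residue. Under ST.26, a modified residue is represented by its unmodified counterpart where one exists, or by "X" where none does, and the modification itself is described in a feature annotation at that position. Failing to disclose modifications properly may weaken your patent protection or lead to rejection.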

3. Handling Signal Peptides and Propeptides

Many fusion proteins include signal peptides that are cleaved post-translationally. The question of whether to include the signal peptide in the sequence listing, and how to annotate it, is a recurring issue. The general guidance is to include the full precursor sequence and annotate the signal peptide region separately using appropriate feature keys within the XML.
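As a minimal sketch of that guidance, the function below takes a full precursor sequence and the length of its signal peptide and returns two annotations: one for the signal peptide and one for the mature chain. The feature keys "SIGNAL" and "CHAIN" and the "start..end" location format reflect my understanding of the ST.26 amino acid feature keys and should be confirmed against the current version of the standard. The function name is hypothetical.

```python
def precursor_features(precursor: str, signal_length: int):
    """Return [(feature_key, location), ...] for a precursor protein.

    Locations are 1-based and inclusive, written as "start..end".
    """
    if not 0 < signal_length < len(precursor):
        raise ValueError("signal peptide must be shorter than the precursor")
    return [
        ("SIGNAL", f"1..{signal_length}"),
        ("CHAIN", f"{signal_length + 1}..{len(precursor)}"),
    ]


# Hypothetical 50-residue precursor with a 20-residue signal peptide.
for key, location in precursor_features("M" * 20 + "A" * 30, signal_length=20):
    print(key, location)
# SIGNAL 1..20
# CHAIN 21..50
```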

4. Consistency Between Drawings and Sequences

If your application includes schematic diagrams showing the domain architecture of the fusion protein, the sequence boundaries shown in those figures must align with what is described in the text and what is listed in the sequence file. Inconsistencies here are a red flag for examiners and can complicate prosecution significantly.
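One way to catch such inconsistencies early is to test the boundaries stated in the figures against the sequences themselves. The sketch below takes the full-length sequence and, for each component, the residue range shown in the drawing together with the component's separately listed sequence, then reports any component whose range does not reproduce that sequence. All names and sequences are hypothetical placeholders.

```python
def find_boundary_mismatches(full_sequence, components):
    """Return names of components whose stated range does not match.

    components: list of (name, start, end, expected_sequence),
    with 1-based, inclusive start and end positions.
    """
    mismatches = []
    for name, start, end, expected in components:
        actual = full_sequence[start - 1:end]
        if actual != expected:
            mismatches.append(name)
    return mismatches


full = "MKTAYIAKQR" + "GGGGS" * 3 + "DIQMTQSPSS"
stated = [
    ("domain A", 1, 10, "MKTAYIAKQR"),
    ("linker", 11, 25, "GGGGS" * 3),
    ("domain B", 27, 35, "DIQMTQSPSS"),  # figure is off by one residue
]
mismatched = find_boundary_mismatches(full, stated)
print(mismatched)  # ['domain B']
```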

ST.26 Compliance: What You Must Know for 2024 and Beyond

Since the mandatory transition to WIPO Standard ST.26, every fusion protein sequence listing submitted to major patent offices for an application filed on or after July 1, 2022 must be in XML format. This is a significant change from the older plain-text ST.25 format, and it has introduced new requirements that directly affect how fusion proteins are disclosed.

Under ST.26, the following rules apply specifically to protein sequences in patent applications:

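  • Amino acid sequences must use the standard IUPAC single-letter codes and are entered in the sequence element of an entry whose molecule type is given as AA.
  • Every sequence entry must include a mandatory source feature with an organism qualifier. For synthetic fusion proteins, the organism is given as “synthetic construct” (the ST.25 term “Artificial Sequence” is no longer used), and a note in the feature annotation can explain how the construct was assembled.
  • The XML file must conform to the ST.26 structure. WIPO provides a free desktop tool, WIPO Sequence, for authoring and validating listings, and checking the file with it or another validator the office accepts before submission is strongly advisable.
  • The applicable standard generally follows the filing date. An international application filed before July 1, 2022 typically stays under ST.25 through the national phase, whereas a new application filed on or after that date, including one that claims priority from an ST.25 filing, needs its sequences converted to ST.26. Confirm the requirement with the specific national office.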

Understanding these requirements is critical for anyone preparing a fusion protein sequence listing for an international patent application filed on or after July 1, 2022.
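To make the XML structure more concrete, the sketch below builds a single protein entry with Python's standard xml.etree.ElementTree module. The element names (SequenceData, INSDSeq, INSDFeature and so on), the qualifier names, and the organism value "synthetic construct" follow my reading of the ST.26 structure and are assumptions to verify against the current DTD. The helper functions and the toy sequence are hypothetical, and the output is only a fragment: a real listing also needs the root element and general information section, and should be produced or validated with WIPO Sequence.

```python
import xml.etree.ElementTree as ET


def add_feature(table, key, location, qualifiers):
    """Append one feature with its qualifiers to a feature table."""
    feature = ET.SubElement(table, "INSDFeature")
    ET.SubElement(feature, "INSDFeature_key").text = key
    ET.SubElement(feature, "INSDFeature_location").text = location
    quals = ET.SubElement(feature, "INSDFeature_quals")
    for name, value in qualifiers:
        qualifier = ET.SubElement(quals, "INSDQualifier")
        ET.SubElement(qualifier, "INSDQualifier_name").text = name
        ET.SubElement(qualifier, "INSDQualifier_value").text = value


def build_protein_entry(seq_id, sequence, regions):
    """Build a SequenceData element for a synthetic protein (assumed layout)."""
    data = ET.Element("SequenceData", sequenceIDNumber=str(seq_id))
    seq = ET.SubElement(data, "INSDSeq")
    ET.SubElement(seq, "INSDSeq_length").text = str(len(sequence))
    ET.SubElement(seq, "INSDSeq_moltype").text = "AA"
    ET.SubElement(seq, "INSDSeq_division").text = "PAT"
    table = ET.SubElement(seq, "INSDSeq_feature-table")
    # Mandatory source feature spanning the whole sequence.
    add_feature(table, "source", f"1..{len(sequence)}",
                [("mol_type", "protein"), ("organism", "synthetic construct")])
    # One REGION feature per domain or linker, labeled with a note.
    for name, start, end in regions:
        add_feature(table, "REGION", f"{start}..{end}", [("note", name)])
    ET.SubElement(seq, "INSDSeq_sequence").text = sequence
    return data


toy_sequence = "MKTAYIAKQR" + "GGGGS" * 3 + "DIQMTQSPSS"
toy_regions = [("domain A", 1, 10), ("linker", 11, 25), ("domain B", 26, 35)]
entry = build_protein_entry(1, toy_sequence, toy_regions)
print(ET.tostring(entry, encoding="unicode"))
```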

Why Working With a Specialist Matters

Preparing a compliant fusion protein sequence listing is not just a technical task. It sits at the intersection of molecular biology, patent law, and data formatting. A single error in sequence numbering, annotation, or XML structure can delay your patent or create gaps in your protection.

Working with a specialist service that focuses exclusively on sequence listing preparation can make a real difference. These professionals understand both the scientific complexity of fusion protein architecture and the precise legal and formatting standards required by global patent offices. They can review your sequences for completeness, check your XML files for compliance, and ensure that everything in your application tells a consistent, accurate, and strategically strong story about your invention.

Final Thoughts

Fusion proteins represent some of the most innovative and valuable inventions in modern biotechnology. Protecting them through patents is absolutely worth the effort, but doing it correctly requires more than just submitting a sequence file. A properly prepared fusion protein sequence listing is the foundation of a strong patent application. It demonstrates scientific rigor, supports your claims, and satisfies the disclosure requirements that patent offices around the world expect.

Whether you are filing with the USPTO, EPO, or under the PCT system, taking the time to prepare a complete, accurate, and ST.26-compliant sequence listing will protect your invention and save you from unnecessary complications during prosecution.

If you need professional assistance with your fusion protein sequence listing, The Sequence Listing team is ready to help you navigate every step of the process with precision and expertise.

@2026 The Sequence Listing. All rights reserved.