The effective design of functional peptide sequences remains a fundamental challenge in biomedicine. For example, cell-penetrating peptides (CPPs) are capable of delivering macromolecular cargo to intracellular targets that are otherwise inaccessible. However, the design of novel CPPs with high activity and unique structure is still difficult. In this thesis, methods to design and characterize highly active CPPs for antisense oligonucleotide delivery were explored.
Machine learning is a promising method for de novo design of functional peptide sequences. A deep learning model inspired by directed evolution was used to optimize abiotic sequences that traffic antisense oligomers to the nucleus of cells. The model was able to predict activities beyond those in the training dataset, and simultaneously decipher and visualize sequence-activity predictions. The validated miniproteins (40-80 residues) were more effective than any previously known variant in cells. By augmenting the machine learning model to over-represent shorter sequence space, the model also predicted a short peptide (18 residues) with activity comparable to that of a positive control peptide. Empirical sequence-activity studies demonstrated reliance on the cationic residues as well as the C-terminal cysteine residue. These sequences were nontoxic, able to deliver other biomacromolecules to the cytosol, and efficiently delivered antisense cargo in mice.
A different approach to discovering and characterizing CPP sequences was also taken: peptides taken up by cells were extracted, and mass spectrometry was used either to analyze their relative quantities or to identify their sequences. First, several mirror-image D-peptides showed delivery activity similar to that of their native forms while demonstrating complete proteolytic stability. Mixtures of fully intact antisense-peptide conjugates could be recovered from whole cell and cytosolic lysates, and relative concentrations were quantified by MALDI-TOF. This method was then extended to the discovery of de novo sequences from a combinatorial library of antisense-peptide conjugates containing unnatural residues. Following cell treatment with the biotinylated antisense-peptide library, the cytosol of cells was extracted and internalized peptides were recovered via affinity capture. De novo sequencing was achieved by Orbitrap tandem mass spectrometry, and several unique, unnatural sequences were identified that could effectively deliver the antisense oligomer to the nucleus.
In summary, machine learning and mass spectrometry-based strategies to discover and characterize novel CPP sequences for antisense delivery were explored. In the future, we envision combining these methods, using lists of library hits to train a machine learning model to design sequences composed of fully unnatural amino acids.
Ph.D.