What is Tapioca?

Tapioca is an integrative, machine-learning-based computational pipeline for de novo prediction of dynamic protein-protein interactions (PPIs). Tapioca integrates mass spectrometry (MS) curve data from thermal proximity coaggregation (TPCA), co-fractionation (CF-MS), or ion-based proteome-integrated solubility alteration (I-PISA) with protein physical properties, domains (PFAM, now InterPro), and tissue-specific functional networks (HumanBase). Tapioca itself consists of eight distinct logistic regression models, each of which takes a unique combination of these features as input. The predictions of these eight models are combined into a final interaction score for each pair of proteins in a given dataset. To run Tapioca on your own data, please go to the Tapioca GitHub.

Protein-Protein Interaction Dynamics in Herpesvirus Infections

The Tapioca website contains Tapioca scores for protein-protein interactions among thousands of proteins detected throughout cell culture infections with herpes simplex virus 1 (HSV-1; Justice et al. 2021), human cytomegalovirus (HCMV; Hashimoto et al. 2020), and Kaposi’s sarcoma-associated herpesvirus (KSHV; unpublished).
Using the primary search bar, proteins can be searched by their UniProt ID or gene name, which brings up a table of all Tapioca scores calculated for that protein within a given viral infection. To speed up searching, only protein pairs with scores above 0.15 are displayed. However, all scores can be viewed by clicking “Download data”, which downloads every score for the currently searched protein.
In the display table, the Protein 1 and Protein 2 columns show the protein pair for which a Tapioca score has been computed. The hours post infection (HPI) column shows the number of hours after viral infection, and the Replicate column shows the biological replicate. The Score column shows the Tapioca score computed for the protein pair. The higher the score, the more likely the pair represents a true protein-protein interaction. Generally, a score above 0.5 is considered a likely interaction. Some proteins have many high-scoring interactions, while others have very few. To help sort through and prioritize Tapioca predictions, we calculated z-scores from the set of interaction scores relative to a single protein; these are shown in the Relative z-score column. Each protein pair has two relative z-scores, one computed against each protein's own set of scores, and only the larger of the two is displayed.
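The relative z-score described above can be illustrated with a short Python sketch. This is not the Tapioca implementation: the function name, the input layout, the use of the population standard deviation, and the handling of zero variance are all assumptions made for illustration, and the protein names and scores are invented. It also assumes the scores passed in belong to a single condition (for example one time point and one replicate).

```python
from collections import defaultdict
from statistics import mean, pstdev


def relative_z_scores(scores):
    """Return the maximal relative z-score for each protein pair.

    `scores` maps a (protein_1, protein_2) tuple to an interaction score.
    Hypothetical helper for illustration only.
    """
    # Collect, for every protein, all scores of the pairs it takes part in.
    per_protein = defaultdict(list)
    for (p1, p2), score in scores.items():
        per_protein[p1].append(score)
        per_protein[p2].append(score)

    # Mean and population standard deviation of each protein's score set
    # (assumption: the source does not say which estimator is used).
    stats = {p: (mean(v), pstdev(v)) for p, v in per_protein.items()}

    def z(protein, score):
        mu, sd = stats[protein]
        # Assumption: report 0.0 when a protein's scores have no spread.
        return (score - mu) / sd if sd > 0 else 0.0

    # Each pair has two relative z-scores; keep the larger one.
    return {
        (p1, p2): max(z(p1, score), z(p2, score))
        for (p1, p2), score in scores.items()
    }


# Invented example data: four placeholder proteins, six pairwise scores.
example_scores = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.1,
    ("A", "D"): 0.2,
    ("B", "C"): 0.8,
    ("B", "D"): 0.7,
    ("C", "D"): 0.3,
}
example_z = relative_z_scores(example_scores)
```

In this toy data, the pair A and B scores 0.9, which stands out more against the generally low scores of A than against the uniformly high scores of B, so the z-score relative to A is the one retained.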

Citation

Tapioca has recently been submitted for publication.