What is Tapioca?
Tapioca is an integrative, machine learning-based computational pipeline for
de novo prediction of dynamic protein-protein interactions (PPIs).
Tapioca integrates mass spectrometry (MS) curve data from thermal proximity
coaggregation (TPCA), co-fractionation (CF-MS), or ion-based
proteome-integrated solubility alteration (I-PISA) with protein physical
properties, domains (PFAM, now InterPro), and tissue-specific functional
networks (HumanBase).
Tapioca itself consists of eight distinct logistic regression models, each of
which takes a unique combination of these features as input. The predictions
of these eight models are combined into a final interaction score for each
pair of proteins in a given dataset. To run Tapioca on your own data, please
go to the Tapioca GitHub.
Protein-Protein Interaction Dynamics in Herpesvirus Infections
The Tapioca website contains Tapioca scores for protein-protein interactions
for thousands of proteins detected throughout cell culture infections with
herpes simplex virus 1 (HSV-1; Justice et al. 2021), human cytomegalovirus
(HCMV; Hashimoto et al. 2020), and Kaposi’s sarcoma-associated herpesvirus
(KSHV; unpublished).
Using the primary search bar, proteins can be searched by their UniProt ID or
gene name, bringing up a table of all Tapioca scores calculated for that
protein within a given viral infection. To speed up searching, only protein
pairs with scores above 0.15 are displayed. However, all scores can be
viewed by clicking “Download data”, which downloads all scores for the
currently searched protein.
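
The display rule described above can be sketched in a few lines of Python. This is only an illustration, not the website's actual code: the row layout, the function name rows_for_display, and the placeholder protein names are assumptions. Only the 0.15 cutoff comes from the text.

```python
# Illustrative sketch only: the field names and helper below are hypothetical,
# not the website's implementation. The 0.15 cutoff is the one stated above.
DISPLAY_THRESHOLD = 0.15


def rows_for_display(rows, threshold=DISPLAY_THRESHOLD):
    """Keep only rows whose score is strictly above the display threshold."""
    return [row for row in rows if row["score"] > threshold]


# Toy rows with placeholder protein names (not real data).
toy_rows = [
    {"protein_1": "P1", "protein_2": "P2", "hpi": 24, "replicate": 1, "score": 0.82},
    {"protein_1": "P1", "protein_2": "P3", "hpi": 24, "replicate": 1, "score": 0.31},
    {"protein_1": "P1", "protein_2": "P4", "hpi": 24, "replicate": 1, "score": 0.15},
    {"protein_1": "P1", "protein_2": "P5", "hpi": 24, "replicate": 1, "score": 0.04},
]

displayed = rows_for_display(toy_rows)  # what the table would show
downloaded = toy_rows                   # the download keeps every score
```

In this sketch the table would list two of the four toy rows, while the download would contain all four.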
In the display table, Protein 1 and Protein 2 columns show the protein pair
for which a Tapioca score has been computed. The hours post infection (HPI)
column shows the number of hours after viral infection, and the Replicate
column shows the biological replicate. The Score column shows the Tapioca
score computed for a protein pair. The higher the score, the more likely the
protein pair represents a true protein-protein interaction. Generally, a score
above 0.5 is considered a likely interaction. Some proteins have many
high-scoring interactions, while others have very few. To better sort through
and prioritize Tapioca predictions, we calculated z-scores from the set of
interaction scores relative to a single protein, which are shown in the
Relative z-score column. For a given pair of proteins, there are two unique
relative z-scores (one relative to each protein in the pair); here we display
only the larger of the two.
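
As a minimal sketch of how such a relative z-score could be computed, the Python example below standardizes each pair's score against all scores involving one of its proteins and keeps the larger of the two values. Several details are assumptions, since the text does not specify them: the use of the population standard deviation, the handling of zero variance, the function name max_relative_z, and the toy protein names and scores. Grouping by infection, time point, and replicate is also omitted for brevity.

```python
from collections import defaultdict
from statistics import mean, pstdev


def max_relative_z(scores):
    """Return the larger of the two per-protein z-scores for each pair.

    `scores` maps (protein_a, protein_b) to an interaction score.
    Assumption: z-scores use the population standard deviation of all
    scores involving a given protein; the actual choice is not stated.
    """
    # Collect every score in which each protein takes part.
    per_protein = defaultdict(list)
    for (a, b), score in scores.items():
        per_protein[a].append(score)
        per_protein[b].append(score)

    # Mean and standard deviation of each protein's own score set.
    stats = {p: (mean(v), pstdev(v)) for p, v in per_protein.items()}

    def z(score, protein):
        mu, sigma = stats[protein]
        # Assumption: define the z-score as 0 when all scores are identical.
        return 0.0 if sigma == 0 else (score - mu) / sigma

    return {
        (a, b): max(z(score, a), z(score, b))
        for (a, b), score in scores.items()
    }


# Toy scores with placeholder protein names (not real data).
toy_scores = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.2,
    ("A", "D"): 0.1,
    ("B", "C"): 0.8,
    ("B", "D"): 0.7,
}
relative_z = max_relative_z(toy_scores)
```

In the toy data, protein B has uniformly high scores, so its pair with A is less remarkable relative to B than relative to A, whose other scores are low. Taking the maximum keeps the view in which the pair stands out most.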
Citation
Tapioca has recently been submitted for publication.