Most cancer deaths are due to metastasis, and epithelial-mesenchymal transition (EMT) is thought to play a role in this process: during EMT, epithelial cells lose their attachment to neighboring cells and to the basement membrane and become motile, elongated cells able to intravasate into blood vessels and metastasize to remote locations. EMT can be induced by several stimuli, one of which is TGFβ.
Here, we derived a gene signature for TGFβ-induced EMT and then used it to identify a subset of cancer cell lines and breast cancer patients showing evidence of this signature. We applied two types of meta-analysis to ten published microarray datasets of TGFβ-stimulated EMT in cancer cell lines: a comparative analysis using the product of rank (PR) method, and an integrative analysis using ComBat and surrogate variable analysis. For both approaches we used the limma package to obtain test statistics. The functional and clinical significance of the signature genes was assessed in breast cancer patient data as well as in cancer cell line data from the TCGA database, Cancer Cell Line Encyclopedia, NCI60 and the GOBO online tool. Finally, we scored cancer cell lines and patients against our gene signature using Gene-Set Variation Analysis (GSVA) and single-sample Gene-Set Enrichment Analysis (ssGSEA).
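As an illustration of the comparative approach, the idea behind a product-of-rank statistic can be sketched as follows. This is a minimal sketch and not the pipeline used here: it ranks genes by a per-dataset statistic and takes the geometric mean of the ranks across datasets, whereas the analysis described above obtained its test statistics from limma. The function name `rank_product`, the gene labels and the toy values are hypothetical.

```python
import math

def rank_product(stats_by_dataset):
    """Geometric mean of per-dataset ranks for each gene.

    stats_by_dataset: dict mapping dataset name -> dict of gene -> statistic
    (e.g. a log fold change). Rank 1 is the most up-regulated gene.
    Only genes present in every dataset are scored. Ties are broken
    by gene name, which is a simplification.
    """
    genes = sorted(set.intersection(*(set(d) for d in stats_by_dataset.values())))
    log_rank_sum = {g: 0.0 for g in genes}
    for dataset in stats_by_dataset.values():
        # Rank genes by descending statistic within this dataset.
        ordered = sorted(genes, key=lambda g: dataset[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_rank_sum[g] += math.log(rank)
    k = len(stats_by_dataset)
    # exp(mean(log(rank))) is the geometric mean of the ranks.
    return {g: math.exp(log_rank_sum[g] / k) for g in genes}

# Hypothetical toy input: two datasets, three placeholder genes.
toy = {
    "dataset_1": {"GENE_A": 2.1, "GENE_B": 0.1, "GENE_C": -1.8},
    "dataset_2": {"GENE_A": 1.7, "GENE_B": -0.1, "GENE_C": -2.2},
}
scores = rank_product(toy)
top_gene = min(scores, key=scores.get)
print(top_gene, scores)
```

A small score indicates a gene that ranks consistently near the top across datasets; in practice, significance is usually assessed by permutation, which this sketch omits.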
Applying the PR technique, we obtained 268 differentially expressed genes (DEGs), while integrative analysis yielded 195 DEGs; 162 genes were common to the two sets. We included all DEGs obtained from the two methods to give our 301-gene signature of TGFβ-induced EMT. Using GSVA and ssGSEA, we identified ~15% of cell lines (161 out of 1053) and ~4% of patients (25 out of 597) with evidence for this TGFβ-EMT gene signature. If EMT is indeed playing a role in metastasis in these patients, druggable elements of the TGFβ signaling pathway represent attractive therapeutic targets. Following validation, these genes will be further analyzed for diagnostic and therapeutic purposes.
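The counts reported above are mutually consistent, as a short check shows. The sketch below uses only the numbers stated in this abstract; the variable names are illustrative.

```python
# Counts taken from the text above.
pr_degs = 268           # DEGs from the product of rank analysis
integrative_degs = 195  # DEGs from the integrative analysis
shared_degs = 162       # DEGs found by both methods

# Size of the union of the two gene sets (inclusion-exclusion).
signature_size = pr_degs + integrative_degs - shared_degs

# Fractions of cell lines and patients scoring for the signature.
cell_line_pct = 100 * 161 / 1053
patient_pct = 100 * 25 / 597

print(signature_size)            # 301
print(round(cell_line_pct, 1))   # 15.3
print(round(patient_pct, 1))     # 4.2
```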