5 Proteomic Landscapes of CCOC

5.1 Loading and Transforming RPPA data

Reverse Phase Protein Array (RPPA) samples were processed by the Functional Proteomics RPPA Core at MD Anderson Cancer Center. A total of 66 samples (2 replicates for each cell line + medium) were probed with 453 antibodies. rppa_ab_info.csv contains the antibody information, and rppa_normal.csv contains the log2-transformed normalized relative protein levels of the 453 antibodies for each sample.

library(stringr)  # str_replace_all(), str_extract(), str_detect()
library(dplyr)    # %>% and filter()

# Antibody annotation: shorten the "[PAR Modification]" gene label to "PAR"
ab.info.path <- 'data/RPPA/rppa_ab_info.csv'
ab.info.df <- read.csv(ab.info.path, stringsAsFactors = FALSE)
ab.info.df$Gene.Name <- str_replace_all(ab.info.df$Gene.Name, "\\[PAR Modification\\]", "PAR")

# Normalized log2 data: the first rows of the file hold antibody metadata
rppa.normal.path <- 'data/RPPA/rppa_normal.csv'
rppa.normal.df <- read.csv(rppa.normal.path, stringsAsFactors = FALSE)
# Drop columns that contain any NA
rppa.normal.df <- rppa.normal.df[, colSums(is.na(rppa.normal.df)) == 0]
# Row 4 holds the antibody validation status; columns 1-9 are skipped as non-antibody columns
ab.validation.status <- rppa.normal.df[4, -c(1:9)] %>% unlist %>% unname
# Drop metadata rows 1-8 and 10, then promote the remaining first row to column names
rppa.normal.df <- rppa.normal.df[-c(1:8, 10), ]
names(rppa.normal.df) <- str_replace_all(unname(unlist(rppa.normal.df[1, ])), " ", "_")
rppa.normal.df <- rppa.normal.df[-1, ]
# Drop rows and columns that are entirely empty
rppa.normal.df <- rppa.normal.df[!apply(is.na(rppa.normal.df) | rppa.normal.df == "", 1, all), ]
rppa.normal.df <- rppa.normal.df[, colSums(rppa.normal.df != "") != 0]

# Derive culture type (2D/3D), cell line, and sample ID (replicate suffix removed)
cell.culture.type <- str_extract(rppa.normal.df$Sample_description, "2D|3D")
cell.line <- str_replace_all(rppa.normal.df$Sample_description, "_[2-3]D_[1-2]$", "")
rppa.normal.df$Sample <- str_replace_all(rppa.normal.df$Sample_description, "_[1-2]$", "") %>%
  str_replace_all("-", "_")
rppa.normal.df <- cbind(cell.line = cell.line, cell.culture.type = cell.culture.type, rppa.normal.df)
# Exclude the NOSE16 samples
rppa.normal.df <- rppa.normal.df %>% filter(!str_detect(Sample_description, "NOSE16"))
rownames(rppa.normal.df) <- str_replace_all(rownames(rppa.normal.df), "-", "_")

5.2 Removing Samples from RPPA that failed QC

The following samples were excluded from further analysis because they did not meet the QC threshold set by the MD Anderson RPPA core, owing to an overall low protein concentration:

  • HCH1_3D_2

  • JHOC9_3D_1

  • JHOC9_3D_2

  • OVTOKO_3D_1
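
A minimal sketch of how this exclusion could be applied, assuming the replicate identifiers are stored in the Sample_description column used in Section 5.1. The toy data frame rppa.toy, its values, and the name qc.failed are hypothetical and only illustrate the filtering step:

```r
# Replicates that failed the RPPA core QC (listed above)
qc.failed <- c("HCH1_3D_2", "JHOC9_3D_1", "JHOC9_3D_2", "OVTOKO_3D_1")

# Toy stand-in for rppa.normal.df (hypothetical values, one antibody column)
rppa.toy <- data.frame(
  Sample_description = c("HCH1_3D_1", "HCH1_3D_2", "JHOC9_3D_1", "JHOC9_2D_1"),
  AB1 = c(0.12, -1.50, -2.10, 0.40),
  stringsAsFactors = FALSE
)

# Keep only replicates that are not on the QC-failed list
rppa.qc <- rppa.toy[!rppa.toy$Sample_description %in% qc.failed, ]
print(rppa.qc$Sample_description)
```

Because both JHOC9_3D replicates are on the list, that condition has no replicate left after filtering, whereas HCH1_3D and OVTOKO_3D each keep a single replicate.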

5.3 RPPA Experiment Summary

In this assay, 453 antibody probes were available to capture 392 proteins, of which 91 are phosphorylated.

5.4 Checking Technical Replicate Quality

5.4.4 PCA plot of all sample replicates

5.5 Merging replicates

To conduct differential protein expression analysis, the replicates of each sample were merged by taking the mean of their log2-transformed normalized relative protein levels. PCA and heatmaps were used to confirm that the merged samples were similar to the individual replicates.
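
The merging step can be sketched in base R as follows. This is an illustration under assumptions rather than the code used for the report: the toy values are hypothetical, the Sample column mirrors the one built in Section 5.1, and aggregate() and prcomp() stand in for whatever functions were actually used.

```r
# Toy log2-normalized levels: 2 samples x 2 replicates, 3 antibodies
# (hypothetical values standing in for rppa.normal.df)
rppa.toy <- data.frame(
  Sample = c("HCH1_2D", "HCH1_2D", "OVTOKO_2D", "OVTOKO_2D"),
  AB1 = c(1.0, 1.2, -0.5, -0.3),
  AB2 = c(0.2, 0.4, 0.9, 1.1),
  AB3 = c(-1.0, -0.8, 0.1, 0.3),
  stringsAsFactors = FALSE
)

# Merge replicates: mean of the log2 levels per sample and antibody
rppa.merged <- aggregate(. ~ Sample, data = rppa.toy, FUN = mean)
rownames(rppa.merged) <- rppa.merged$Sample

# Stack replicates and merged profiles so one PCA shows both
all.mat <- rbind(
  as.matrix(rppa.toy[, -1]),
  as.matrix(rppa.merged[, -1])
)
rownames(all.mat) <- c(
  paste0(rppa.toy$Sample, "_", c(1, 2)),   # replicate rows
  paste0(rppa.merged$Sample, "_mean")      # merged rows
)

# PCA on centered data; the first two components are what a PCA plot shows
pca <- prcomp(all.mat, center = TRUE, scale. = FALSE)
print(pca$x[, 1:2])
```

Because the mean is a linear combination of the replicates, each merged sample projects exactly onto the midpoint of its replicates in PCA space. The check is therefore mainly useful for spotting replicate pairs that lie far apart.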


5.5.2 PCA for merged RPPA data


5.6 Significance Analysis of Microarrays (SAM)

5.7 SAM Results

5.7.2 table

Delta  p0         False   Called  FDR        cutlow       cutup      j2   j1
0.1    0.4159848  371.17  420     0.3676217  -0.2962185   0.1596744  199  232
1.1    0.4159848  32.23   214     0.0626504  -1.9398355   1.7717970  97   336
2.1    0.4159848  0.94    108     0.0036206  -3.4676973   3.1939353  45   390
3.2    0.4159848  0.01    45      0.0000924  -4.8968783   5.3570293  38   446
4.2    0.4159848  0.00    27      0.0000000  -5.9973094   7.3867023  26   452
5.2    0.4159848  0.00    18      0.0000000  -7.0261086   Inf        18   453
6.2    0.4159848  0.00    8       0.0000000  -8.3942860   Inf        8    453
7.2    0.4159848  0.00    5       0.0000000  -9.6834131   Inf        5    453
8.2    0.4159848  0.00    2       0.0000000  -11.9922925  Inf        2    453
9.3    0.4159848  0.00    0       0.0000000  -Inf         Inf        0    453

5.8 SAM Results for ∆ = 1.21

The FDR threshold is set to less than 5%, which places the delta parameter between 1.207411 and 1.207412 (see the threshold output in Section 5.8.1); it is rounded up to 1.21 for further analysis. NB: The delta estimation method appears to be heuristic and may yield a different delta on each run, which changes the total number of DE proteins. However, the choice of delta does not affect the top-ranked DE proteins.
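
As a worked check of this thresholding, the sketch below copies rows from the SAM tables in Sections 5.7.2 and 5.8.2 and picks the smallest tabulated delta whose FDR is below 5%. In these rows the reported FDR agrees with p0 × False / Called. The variable names are illustrative, and the code only re-reads the reported summary; it does not rerun SAM.

```r
# SAM summary rows copied from the tables in Sections 5.7.2 and 5.8.2
sam.tab <- data.frame(
  Delta  = c(0.1, 1.1, 1.21, 2.1, 3.2),
  p0     = 0.4159848,
  False  = c(371.17, 32.23, 24.38, 0.94, 0.01),
  Called = c(420, 214, 207, 108, 45),
  FDR    = c(0.3676217, 0.0626504, 0.0489938, 0.0036206, 0.0000924)
)

# Recompute the FDR from the other columns and compare with the reported value
fdr.check <- sam.tab$p0 * sam.tab$False / sam.tab$Called
max.diff <- max(abs(fdr.check - sam.tab$FDR))
print(max.diff)  # small; limited by the rounding of the False column

# Smallest tabulated delta whose FDR is below the 5% threshold
fdr.cutoff <- 0.05
delta.pick <- min(sam.tab$Delta[sam.tab$FDR < fdr.cutoff])
print(delta.pick)
```

On the coarse grid of Section 5.7.2 alone, the 5% crossing falls somewhere between delta 1.1 (FDR about 6.3%) and delta 2.1 (FDR about 0.4%), which is why a finer search around the crossing is needed.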

5.8.1 SAM plot

## The threshold seems to be at 
##      Delta Called      FDR
## 5 1.207411    208 0.050158
## 6 1.207412    207 0.048994

5.8.2 SAM summary

Delta  p0         False  Called  FDR        cutlow     cutup     j2  j1
1.21   0.4159848  24.38  207     0.0489938  -2.063406  1.913195  95  341

5.8.3 SAM protein table

5.9 Top 10 SAM up/down-regulated proteins for ∆ = 1.21

5.9.2 Heatmap for top 10 SAM up/down-regulated proteins for ∆ = 1.21

Subsetting the genes of interest from the previous heatmap shows that the top 10 up-regulated proteins do not vary much across cell lines, except PARP (PAR-R-C). The top 10 down-regulated proteins, on the other hand, show strong differentiation.
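
The subsetting and the variance comparison can be sketched as below. All names and values here are hypothetical (sam.score, expr.mat, proteins P1 to P6, samples S1 to S3), and top.n is set to 2 only to keep the toy example small; the report uses 10.

```r
# Hypothetical SAM scores (positive = up-regulated, negative = down-regulated)
sam.score <- c(P1 = 4.2, P2 = 3.1, P3 = 0.4, P4 = -0.2, P5 = -2.9, P6 = -5.0)

# Hypothetical merged log2 levels: proteins in rows, samples in columns
expr.mat <- matrix(
  c( 1.0,  1.1,  0.9,
     0.8,  0.9,  1.0,
     0.1,  0.0, -0.1,
     0.0,  0.1, -0.1,
    -0.5, -1.5, -2.5,
    -1.0, -3.0, -0.2),
  ncol = 3, byrow = TRUE,
  dimnames = list(names(sam.score), c("S1", "S2", "S3"))
)

top.n <- 2  # the report uses 10
ranked   <- names(sort(sam.score, decreasing = TRUE))
top.up   <- head(ranked, top.n)        # largest positive scores
top.down <- rev(tail(ranked, top.n))   # most negative score first

# Subset the heatmap matrix and compare the per-protein spread across samples
up.var   <- apply(expr.mat[top.up, , drop = FALSE], 1, var)
down.var <- apply(expr.mat[top.down, , drop = FALSE], 1, var)
print(rbind(up = up.var, down = down.var))
```

A per-protein variance of this kind gives a number to go with the visual impression from the heatmap, so an outlier among the up-regulated proteins stands out as a single large value.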

5.9.2.2 Interactive heatmap


5.10 Figure 4. Targeted proteomic profiling of CCOC models
