An Overview of Analysis Methodology for Proteomic Profile Data

Richard Pelikan

Proteomic profiling by mass spectrometry is an emerging clinical tool, valued for its potential to diagnose patients noninvasively. In this work we investigate several aspects of the proteomic profile that can be interpreted and analyzed in different combinations. The central task is to reduce the high-dimensional proteomic profile to a low-dimensional representation that discriminates between patients with and without the disease, in this case pancreatic cancer. The choice of reduction strategy weighs heavily on the biological interpretation of the results. We discuss caveats of proteomic profile data that require attention and complicate the problem. We offer insight into why common heuristics and strategies work well (or poorly) for classifying healthy versus diseased patients based on their proteomic profiles. We also discuss how to properly validate the findings produced by these techniques, in order to verify that the learned discriminatory signal is genuine and statistically significant.