
1. Overview of cancer and survival analysis

Smruti Panda edited this page Jan 26, 2024 · 2 revisions

Cancer occurs when body cells proliferate out of control and invade other regions of the body. Because the human body is made up of trillions of cells, cancer can begin practically anywhere. Human cells normally grow and divide to create new cells as the body needs them, and new cells replace old ones that die as a result of ageing or injury. This controlled mechanism can occasionally malfunction, causing damaged or aberrant cells to proliferate when they should not. These cells can form tumors, which are lumps of tissue. Tumors can be cancerous (malignant) or benign. Cancerous tumors can infiltrate neighboring tissues and can spread to distant locations in the body.

Normal genes code for proteins that are necessary for typical cellular processes; they sustain regular cell division, growth, and differentiation. Cancer genes, or oncogenes, are mutated versions of normal genes (known as proto-oncogenes). When activated, they encourage unchecked cell division and proliferation, which results in the emergence of cancer.

Stages of Cancer

Cancer is broadly divided into five stages (stage 0 through stage 4), and staging plays a vital role in both diagnosing and treating cancer. Staging describes qualities of the cancer such as:

  • The location and size of the cancer

  • How far it has spread, and to where

  • How aberrant or aggressive the tumor cells appear when evaluated in a lab

In addition to assisting with treatment planning, this information can be used to estimate survival rates, identify clinical trials that may be an option, and assess the likelihood that the cancer will return after treatment. Staging is usually carried out after a series of diagnostic tests. The stage initially assigned to a cancer usually remains the same, even if the cancer later progresses or changes. A new classification may be added after therapy or if the cancer recurs.

Stage 1, also known as early-stage or localised cancer, is characterised by the absence of deep tissue invasion and of spread to lymph nodes or to areas beyond the primary tumour.

Stage 2, or early locally progressed cancer, is characterised by the tumour cells' deeper penetration into surrounding tissue without spreading to other parts of the body.

Stage 3, also known as advanced-stage or locally advanced cancer, is characterised by the disease's deeper penetration into nearby tissue and its spread to lymph nodes, but not to distant regions within the body.

Stage 4, often known as metastatic or advanced cancer, is characterized by the spread of cancer cells into lymph nodes and other body parts, including organs, possibly distant from the initial site of the tumor.

Figure: cancer stages (taken from https://www.indushealthplus.com/stomach-cancer-types-stages.html)

Significance of Random Walk with Restart (RWR)

When analyzing survival data, random walk scores can be quite useful, especially when examining the significance of certain genes or characteristics in predicting survival outcomes. Survival analysis is frequently employed in biological research to estimate the time until an event of interest, such as death or relapse, occurs. Random walk scores have the following applications in survival analysis:

  1. Identifying Prognostic Genes: Genes can be ranked according to their significance in biological networks using random walk scores. Genes that are key to the network and possibly important for the underlying biological processes may be indicated by high random walk scores. The correlation between random walk scores and survival times can be used to identify prognostic genes, which are linked to survival outcomes.

  2. Feature Selection for Survival Models: In survival models, random walk scores can be used as features. Traditional survival models can perform better when network-based scores are used because they capture the intricate relationships and interconnections between genes. The genes that provide the greatest information on survival outcomes can be found using feature selection strategies.

  3. Network-Based Stratification: Using the random walk scores of pertinent genes to stratify patients can result in more homogeneous groups with distinct survival patterns. By identifying patient subgroups with similar underlying biological traits, this could enhance personalized medicine and help customize treatment plans.

  4. Validation of Predictive Models: Using appropriate validation approaches, assess the predictive ability of survival models that include random walk scores. The generalizability of the models can be evaluated using split-sample validation or cross-validation, to check that they perform well on new, unseen data.

  5. Integration with Clinical Data: To create complete survival models, incorporate clinical and demographic data with random walk scores. A more comprehensive understanding of the variables impacting survival outcomes may be possible through the combination of molecular information with clinical data.

  6. Network-Based Biomarker Discovery: Focus on genes associated with both high random walk scores and survival outcomes to find possible biomarkers. These biomarkers may inform clinical decisions about prognosis, diagnosis, and treatment.

  7. Understanding Biological Mechanisms: Look into the physiological processes or roles linked to genes that have high random walk scores. This can direct further experimental research and offer insights into the underlying mechanisms impacting survival outcomes.

  8. Dynamic Assessment of Gene Importance: Examine how random walk scores change over time to get a sense of how the significance of genes for survival prediction may shift. Dynamic assessments may reveal genes with distinct effects at different phases of disease progression, and they can be especially pertinent in longitudinal studies.
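To make the idea of a random walk score concrete, here is a minimal sketch, assuming the scores come from a random walk with restart (RWR) on a gene interaction network. The five-gene network, the gene names, the seed gene, and the restart probability of 0.3 are all hypothetical and chosen only for illustration; they are not taken from this project. The walker repeatedly steps to a neighbouring gene with probability `1 - restart_prob` and jumps back to the seed genes with probability `restart_prob`; the steady-state visiting probabilities are the scores.

```python
import numpy as np


def random_walk_with_restart(adjacency, seeds, restart_prob=0.3,
                             tol=1e-10, max_iter=1000):
    """Return steady-state RWR scores, one per node of the network."""
    A = np.asarray(adjacency, dtype=float)

    # Column-normalise so each column holds the transition
    # probabilities out of one node.
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # guard against isolated nodes
    W = A / col_sums

    # Restart distribution: uniform over the seed nodes.
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)

    # Iterate p <- (1 - r) * W p + r * p0 until the scores stop changing.
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart_prob) * (W @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy network (undirected edges):
# GENE_A-GENE_B, GENE_A-GENE_C, GENE_B-GENE_C, GENE_C-GENE_D, GENE_D-GENE_E
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
adjacency = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

# Assume GENE_A (index 0) is the seed, e.g. a known cancer-related gene.
scores = random_walk_with_restart(adjacency, seeds=[0])

# Rank genes from highest to lowest score.
for gene, score in sorted(zip(genes, scores), key=lambda gs: -gs[1]):
    print(f"{gene}: {score:.4f}")
```

Genes that are close to the seed and well connected receive higher scores, which is what allows the scores to be used for ranking genes or as features in a survival model.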

How is survival analysis affected without random walk scores?

  1. Without incorporating random walk scores or network-based features, traditional survival analysis methods may focus solely on clinical and demographic variables or simple molecular features like gene expression levels or genetic mutations.

  2. Such analyses offer only a limited understanding of network dynamics.

  3. Traditional survival analyses might treat genes as independent variables, neglecting their interactions and dependencies.

  4. Survival analyses that do not incorporate network-based features may identify prognostic genes based solely on their individual association with survival outcomes.

  5. Without leveraging network information, the stratification of patient subgroups may be less precise.

  6. Ignoring random walk scores may limit the ability to tailor interventions based on the network properties of genes, potentially missing opportunities for more effective and personalized treatments.

Role of Survival Analysis in Cancer

Survival analysis is a statistical technique used to examine the time until an event of interest occurs. In cancer studies, it is frequently used to estimate how long it takes for such an event to occur and to evaluate the variables that could affect this time.

Kaplan-Meier: The Kaplan-Meier estimator, also referred to as the product-limit estimator, is a non-parametric statistic used to estimate the survival function from lifetime data. It is frequently used in medical research to estimate the percentage of patients who survive for a specific period of time following therapy.
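As an illustration of how the estimator works, the sketch below computes a Kaplan-Meier curve from scratch with NumPy. At each distinct event time t, the running survival probability is multiplied by (1 - d/n), where d is the number of events at t and n is the number of patients still at risk just before t. The follow-up times and event flags are made-up values for illustration only, and the function name `kaplan_meier` is a hypothetical helper, not part of any library.

```python
import numpy as np


def kaplan_meier(durations, events):
    """Return (event_times, survival) for the Kaplan-Meier estimate.

    durations: follow-up time for each patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)

    # The curve only steps down at times where an event was observed.
    event_times = np.unique(durations[events])

    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)            # still under observation
        deaths = np.sum((durations == t) & events)  # events exactly at t
        s *= 1.0 - deaths / at_risk                 # product-limit update
        survival.append(s)
    return event_times, np.array(survival)


# Hypothetical follow-up times (months) and event indicators (0 = censored).
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]

times, surv = kaplan_meier(durations, events)
for t, s in zip(times, surv):
    print(f"t = {t:g} months: S(t) = {s:.3f}")
```

Censored patients (those whose event was not observed) still count toward the at-risk group until their last follow-up, which is what distinguishes this estimate from a simple proportion of survivors.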