We are in a reproducibility crisis. We have been for almost a decade. According to multiple reports, 50% to 89% of published studies are irreproducible (Figure 1). The high rate of irreproducibility stimulated the formation of The Reproducibility Project: Cancer Biology (RP:CB) in 2014. The objective of this group was to reproduce 50 high-impact cancer biology papers (https://www.cos.io/rpcb). This ambitious project was supposed to take 1 year. Fast forward to 2018, and the Project had to reduce the number of studies to 18. Why did they give up trying to reproduce the other 32? Much of the published work lacked key methodological details. One detail that was often missing was the density of cells that were plated for each experiment. As a result, the group had to test multiple plating densities for each study. The studies that took too long to troubleshoot and optimize were dropped.
Figure 1. Studies reporting the prevalence of irreproducibility. Source: Freedman et al., PLoS Biol., 2015
Ten of the 18 studies have been completed, and the results are published in eLife (https://elifesciences.org/collections/9b1e83d1/reproducibility-project-cancer-biology). Of those 10 studies, only 5 were mostly repeatable. Other groups, however, were able to reproduce the original findings.
These conflicting reports provide an important lesson: inconsistency in culturing conditions has huge implications for the research that follows.
In this article, I will present two studies which demonstrate the impact of inconsistent cell culturing conditions on the results of cell metabolism assays, then end with a brief discussion of modern techniques that may help control for this confounding variable.
Mammalian cell culture is a versatile technique used in the biomedical sciences. In the field of cell metabolism, cell culture is utilized to delineate metabolic signaling pathways that are dysregulated in disease (e.g. cancer, diabetes) and to identify novel targets for therapy. Importantly, preclinical studies that evaluate the toxicity of new chemotherapeutic agents rely on in vitro cell viability assays, and the results obtained from those studies determine whether a drug is further tested in vivo and eventually in human trials. Thus, errors or inconsistencies during cell culture can derail investigations of the pathogenesis of human disease and drastically slow down the discovery of novel treatment strategies.
Study 1: Cell density changes optical readouts of metabolism
Metabolically active cells (i.e. cells that are proliferating) take up glucose and, in the presence of oxygen, break it down into pyruvate. Pyruvate is utilized by the mitochondria as fuel to drive ATP production by oxidative phosphorylation. Because the breakdown of glucose and pyruvate generates NADH, metabolically active cells have high levels of NAD(P)H (Figure 2).
Figure 2. The difference in metabolic programming between metabolically active and inactive normal cells, and cancer cells. In the presence of oxygen, normal cells metabolize glucose into pyruvate. Pyruvate functions as fuel for the electron transport chain, which produces CO2 and ATP. This process is called oxidative phosphorylation and is an indicator of a metabolically active cell. In the absence of oxygen, normal cells will use pyruvate to generate lactate in a process called anaerobic glycolysis. Tumor cells use the majority of pyruvate to generate lactate regardless of the availability of oxygen. This is called aerobic glycolysis or the Warburg effect.
In contrast, metabolically inactive cells (i.e. cells that are no longer proliferating, due to a deficit of nutrients) utilize pyruvate to generate lactate. This process, also known as anaerobic glycolysis, results in the production of NAD+. Therefore, metabolically inactive cells have higher levels of NAD+ (Figure 2).
Cancer cells, however, metabolize most of their pyruvate to generate lactate through a process called aerobic glycolysis, or the Warburg effect, regardless of the availability of oxygen (Figure 2).
Autofluorescence imaging (AFI) paired with fluorescence lifetime imaging microscopy (FLIM) is an optical assay used to evaluate the metabolic state of a cell or tissue. In preclinical studies, AFI-FLIM has been used to distinguish healthy from pathological tissues. It does so by relying on the fact that NAD(P)H (reduced) naturally emits a green fluorescence when exposed to blue light (340-390 nm), while NAD+ (oxidized) does not (Figure 3).
Figure 3. Reduced NAD(P)H autofluoresces green, while oxidized NAD+ does not.
In other words, healthy, metabolically active cells, which have high levels of NAD(P)H, will autofluoresce green, while tumor cells, which have high levels of NAD+, will not (Figure 4A-C).
To make the assay a bit more quantifiable, AFI-FLIM takes the fluorescence data and creates a measurement called the mean lifetime value. Simply put, the mean lifetime value of a cell is the length of time it takes for the autofluorescence (green) to decay. In other words, the more NAD(P)H a cell has, the longer it will autofluoresce and the greater its mean lifetime value will be. In contrast, cells that have more NAD+ will experience decay much faster and have a lower mean lifetime value (Figure 4D-E).
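To make the idea of a lifetime value concrete, here is a minimal sketch of how a decay time could be estimated from an intensity decay curve. It assumes a single-exponential decay, I(t) = I0 * exp(-t / tau), and fits tau with a log-linear least-squares line. The function names, the time grid, and the 2.5 ns and 0.5 ns lifetimes are illustrative assumptions rather than values from the study, and real FLIM analysis is typically more involved than this.

```python
import math

def simulate_decay(tau_ns, times_ns, i0=1000.0):
    """Noise-free single-exponential decay (hypothetical helper for illustration)."""
    return [i0 * math.exp(-t / tau_ns) for t in times_ns]

def fit_lifetime(times_ns, intensities):
    """Estimate tau by fitting a straight line to log(intensity) versus time."""
    logs = [math.log(i) for i in intensities]
    n = len(times_ns)
    mean_t = sum(times_ns) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ns, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_ns)
    slope = sxy / sxx          # slope of the log-intensity line equals -1 / tau
    return -1.0 / slope

# Assumed time grid: 20 samples, 0.5 ns apart.
times = [0.5 * k for k in range(20)]

# Two illustrative cells: one whose signal decays slowly, one quickly.
tau_slow = fit_lifetime(times, simulate_decay(2.5, times))
tau_fast = fit_lifetime(times, simulate_decay(0.5, times))

print(f"slow-decay cell: {tau_slow:.2f} ns")
print(f"fast-decay cell: {tau_fast:.2f} ns")
```

The cell whose signal takes longer to decay gets the larger lifetime value, which is the quantity the color scale in Figure 4E is meant to convey.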
Figure 4. Summary of how FLIM works. (A) Metabolically active cells, with high levels of NADH, will fluoresce green. (B) Metabolically inactive cells, with less NADH, will not fluoresce green. (C) Cancer cells, with high levels of NAD+, will fluoresce red. (D) The mean fluorescence lifetime value is calculated using an intensity decay curve. The more NADH cells have, the longer they will fluoresce, and the longer the mean lifetime value. (E) A color spectrum, red to blue, is used to qualitatively demonstrate the mean lifetime value of cells.
To demonstrate the influence of cell confluency on the mean lifetime value of a monolayer of cells, Chacko and Eliceiri plated cells into 12 wells in a logarithmically descending seeding number. They cultured these cells for 3 days without changing the medium to control for the amount of nutrients. They also controlled for the percentage of CO2 in the incubator, as well as the temperature and humidity. After 3 days of culture, they took images of the cells using AFI and calculated the mean lifetime value of the cells in each well (Figure 5A).
Within the same cell line, the mean lifetime value of cells plated at lower density was significantly higher than that of cells seeded at higher density. In other words, the more confluent the cells were, the less they autofluoresced green. This demonstrated that at higher seeding densities, the cells became metabolically inactive faster (Figure 5B).
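For readers who want to picture a logarithmically descending seeding series, the sketch below generates one for a 12-well plate. The starting number of 400,000 cells and the 2-fold step between wells are hypothetical values chosen for illustration; the article does not report the numbers the authors used.

```python
def seeding_series(top_cell_number, n_wells=12, fold=2.0):
    """Cell numbers for n_wells, each well seeded at 1/fold of the previous one."""
    return [round(top_cell_number / fold ** i) for i in range(n_wells)]

# Hypothetical starting point: 400,000 cells in the first well.
series = seeding_series(400_000)

for well, cells in enumerate(series, start=1):
    print(f"well {well:2d}: {cells:>7,d} cells")
```

Because every well differs from its neighbor by the same factor, the series is evenly spaced on a log scale and covers a wide range of confluency on a single plate.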
Figure 5. Study 1 design and summary of results. (A) Different densities of cells were seeded onto a 12-well plate, then imaged after 3 days in culture using AFI. The mean lifetime fluorescence was measured for each well of cells. (B) The higher the seeding density, the lower the mean lifetime fluorescence across all cell types.
This phenomenon was observed across cell types, in both normal and cancer cells. Thus, the researchers concluded that confluency is a key factor when comparing samples using optical assays of lifetime metabolism.
Take-home message of study 1: At higher cell density, cells become metabolically inactive faster. At lower cell density, cells stay metabolically active longer. You must control for confluency when designing experiments that assess cell metabolism over time.
Scenario in which this impacts reproducibility: You are testing the metabolic activity of a cell before and after treatment with some new chemical. For the first experiment, you seed cells at a high density, and they reach confluency before the treatment period ends. Your results show that the cells are metabolically inactive after treatment. For the second experiment, you seed cells at a lower density, and they don’t reach confluency by the end of the treatment period. Your results show that cells are metabolically active after treatment. Which result is the correct one?
Study 2: Cell density influences the outcome of drug efficacy studies
Cells in culture will go through various growth phases during the course of the culture period. The metabolic state of cells varies greatly as cells enter a new growth phase. Cells in the log or exponential phase of growth are metabolically active. They proliferate, absorb nutrients, and take up chemicals that are present in the culture medium. After the log phase, most cell lines will transition into the stationary phase, in which they no longer divide and are metabolically inactive.
Seeding density determines how quickly cells reach the stationary phase. At lower density, cells have ample nutrients and physical space to grow and divide. Hence, they will stay in log phase longer. In contrast, at higher density, cells run out of both space and nutrients at a faster rate and ultimately stop dividing due to contact inhibition, thus entering the stationary phase (Figure 6).
Figure 6. A schematic summarizing how seeding density impacts metabolism.
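The relationship between seeding density and time spent in log phase can be sketched with a simple exponential growth model. This is an idealized illustration, not a method from either study: it assumes cells double at a constant rate with no lag phase until they hit a fixed number of cells at confluence. The 24-hour doubling time, the 800,000-cell confluence limit, and both seeding numbers are hypothetical.

```python
import math

def hours_to_confluence(seeded_cells, cells_at_confluence, doubling_time_h):
    """Hours of exponential growth needed to go from seeding to confluence."""
    if seeded_cells >= cells_at_confluence:
        return 0.0  # already confluent at seeding
    doublings_needed = math.log2(cells_at_confluence / seeded_cells)
    return doubling_time_h * doublings_needed

# Hypothetical well and cell line: 800,000 cells at confluence, 24 h doubling time.
CONFLUENT = 800_000
DOUBLING_H = 24.0

high_density = hours_to_confluence(400_000, CONFLUENT, DOUBLING_H)
low_density = hours_to_confluence(50_000, CONFLUENT, DOUBLING_H)

print(f"high seeding density: confluent after {high_density:.0f} h")
print(f"low seeding density:  confluent after {low_density:.0f} h")
```

Under these assumptions, a treatment that lasts 48 hours would run entirely in log phase for the low-density well but mostly in stationary phase for the high-density well.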
A key step in developing therapeutic agents for cancer is to test drugs in human cancer cell lines. Using mammalian cell culture, the effects of the drug candidate on cell proliferation or survival are investigated over a period of time. These cell viability assays are impacted by the metabolic state of the cells. Considering that cell density affects the metabolic state of cells, drug efficacy assays may also be impacted by cell confluency.
To test this hypothesis, Muelas and colleagues designed a study in which cancer cells were seeded at varying densities then treated with the chemotherapeutic agent glutaminase-1-enzyme inhibitor (GLS1i). They utilized two types of cells, A549 and H358. A549 was previously reported to be sensitive to the cytotoxic properties of GLS1i, while H358 was reported to be resistant.
In the first experiment, they plated 8 x 10^5 cells/well and allowed the cells to grow for 24 hours. After 24 hours, the cells reached 100% confluency. At this point, cells were treated with GLS1i. Cell survival was assessed 48 hours after treatment with the drug. To their surprise, neither A549 nor H358 cells died in response to treatment (Figure 7).
Then they adjusted the cell culturing conditions to ensure that (1) cells were in their exponential growth phase 24 hours after seeding and (2) confluence was reached as late as possible. With experimentation, they determined that a lower seeding density for both cell lines (2 x 10^5 for A549 and 3 x 10^5 for H358) would fulfill those criteria.
When these culturing conditions were utilized, the cells appropriately responded to the drug. A549 cells died in response to GLS1i treatment, while H358 cells did not (Figure 7).
Figure 7. Study 2 design and summary of results. Cells were seeded at high density or at a lower density, then treated with GLS1i. Cells seeded at higher density were non-responsive. When cells were seeded at lower density, they responded appropriately to the drug. A549 cells, which were reported to be sensitive, died in response to treatment, while H358 cells, which were reported to be resistant, did not.
Thus, the investigators concluded that the cytotoxicity of GLS1i is dependent on cell confluency.
Take-home message of study 2: Inappropriate seeding densities can significantly alter the outcome of drug efficacy assays. Also, it is important to choose the appropriate time frame to conduct an experiment so that the growth phase does not shift during the assay, or between experimental replicates.
Scenario in which this impacts reproducibility: You are testing the toxicity of a new chemotherapeutic agent. For experiment 1, you treat 1 million cells with the drug. None of the cells die. For experiment 2, you treat 250,000 cells with the same amount of drug, and all of the cells die. Which result reveals the cytotoxicity of the drug?
So, where do we go from here?
Irreproducible studies cost an estimated $28.2 billion in the U.S. alone. More than a quarter of those irreproducible findings are a result of poor study design. This means that errors in experimental design cost approximately $7.8 billion (Figure 8). The studies presented above clearly demonstrate that cell confluency can impact the reproducibility of a study, and although cell culture methodologies shouldn’t differ between researchers or research groups, they often do. Therefore, we need to come up with a solution to ensure that cell culturing conditions are more consistent. Perhaps automating the various steps in the culture process (e.g. counting cells) or in the way we analyze the data (e.g. determining confluency) would reduce variability.
Figure 8. Estimated preclinical research spending in the U.S. and the types of errors that contribute to irreproducibility. (A) At least 50% of preclinical studies are not reproducible, costing an estimated $28.2 billion in the United States. (B) Irreproducible studies can be broken down into four categories. Errors in study design account for 27.6% of irreproducible results. Source: Freedman et al., PLoS Biol., 2015.
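The study-design figure follows directly from the two numbers in Figure 8. As a quick check of the arithmetic (the variable names are my own):

```python
# Figures quoted in the text and in Figure 8.
total_cost_billion = 28.2     # estimated U.S. cost of irreproducible preclinical research
study_design_share = 0.276    # fraction of irreproducibility attributed to study design

study_design_cost_billion = total_cost_billion * study_design_share

print(f"Cost attributable to study design: ${study_design_cost_billion:.1f} billion")
```

This prints a value of about $7.8 billion, matching the estimate above.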
So, what are our options?
Commonly used protocols to evaluate the confluency of adherent cells in vitro are invasive (they require exogenously added fluorophores), destructive (they are end-point assays in which cells are lysed or stained with dyes), and do not allow for continued, undisturbed growth of cells within flasks/plates.
Fortunately, recent developments in nondestructive image-based methods have greatly improved our ability to assess cell proliferation. In addition to AFI-FLIM, technologies like genetically encoded fluorescent sensors (LINK) allow us to monitor cell growth throughout the course of an experiment. Thus, utilization of these technologies may drastically improve the reproducibility of biomedical studies.
It is time we get ourselves out of this reproducibility crisis.
Christina Cho, PhD, is a postdoctoral researcher at the University of Pennsylvania, where she is studying the molecular pathogenesis of solid tumors.