A Risk-Management Approach to Cleaning-Assay Validation

By Brian W. Pack, Jeffrey D. Hofer

The authors recommend a strategy for classifying similar nonstainless-steel surfaces into three groups based upon the analytical recovery that was observed in this study.

Cleaning validation and verification are based on the premise of risk management. Several regulatory and guidance documents make this clear. The International Conference on Harmonization's (ICH) guideline on risk management outlines several approaches to making and documenting risk-based decisions (1). It clearly states that risk management should be based on scientific knowledge and that personnel should evaluate the effect of potential failures on the patient. In addition, it notes that the levels of effort, formality (e.g., use of tools), and documentation of the quality risk-management process should be commensurate with the level of risk.
The US Code of Federal Regulations states that equipment and utensils shall be cleaned, maintained, and sanitized at appropriate intervals to prevent malfunctions or contamination that would alter the safety, identity, strength, quality, or purity of the drug product (2). In accordance with 21 CFR 211.67, ICH issued recommendations on equipment maintenance and cleaning (Q7A, Sections 5.20–5.26) for compliance and safety that include similar, but more detailed requirements (3).
The US Food and Drug Administration's 1993 guidance on cleaning inspections states that for a swab method, recovery should be established from the surface (4). The guidance contains no specific requirements about how to establish these recovery estimates, or the acceptance limits. It is up to the manufacturer to document the cleaning rationale (i.e., process and acceptance limits) for maintaining the quality and purity of the drug product being manufactured.
Cleaning validation and verification
Cleaning verification consists of routine monitoring (e.g., swab analysis) of equipment-cleaning processes. Cleaning validation confirms the effectiveness and consistency of a cleaning procedure and eliminates the need for routine testing (5). For example, cleaning limits are established to determine the maximum allowance of Product A that can carry over to Product B. The calculation of these limits is well documented and includes factors that increase the margin of safety to protect the patient (6, 7). Because it is not feasible to swab every square inch of the equipment, swabbing locations are chosen based upon factors such as how difficult the area is to clean, the size of the equipment, and the areas where product buildup is likely. All product-contact surfaces must be considered during cleaning verification to demonstrate that equipment is clean, and a recovery value is expected to be established for each product-contact surface during method validation. The recovery is used to correct the submitted swab result for incomplete removal from the surface and to compare it with the acceptance limit. This last aspect of risk management (i.e., establishing the surface recovery) is the focus of this article.
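The recovery correction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names and all numbers are hypothetical, and it assumes the measured swab result and the acceptance limit are expressed in the same units (μg per swab).

```python
def corrected_residue(measured_ug: float, recovery_fraction: float) -> float:
    """Correct a swab result for incomplete removal from the surface.

    recovery_fraction is the validated surface recovery expressed as a
    fraction (e.g., 0.75 for 75%).
    """
    if not 0.0 < recovery_fraction <= 1.0:
        raise ValueError("recovery_fraction must be in the interval (0, 1]")
    return measured_ug / recovery_fraction


def passes_limit(measured_ug: float, recovery_fraction: float, limit_ug: float) -> bool:
    """Compare the recovery-corrected result with the acceptance limit."""
    return corrected_residue(measured_ug, recovery_fraction) <= limit_ug


# Hypothetical numbers: 3.0 ug measured, 75% recovery, 5.0 ug limit.
print(corrected_residue(3.0, 0.75))   # 4.0
print(passes_limit(3.0, 0.75, 5.0))   # True
```

A lower recovery factor inflates the corrected result, which is why underestimating recovery is the conservative direction for the patient.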


Figure 1: (ALL FIGURES ARE COURTESY OF THE AUTHORS)
Analysts have many ways to establish the swab-recovery value for a particular product-contact surface. Stainless steel is the most common material in a manufacturing environment (see Figure 1). Some companies therefore establish a recovery value for stainless steel and apply that standard to all swab submissions. Other companies attempt to establish a recovery value for each product-contact surface for every compound. From an analytical standpoint, supporting this activity becomes arduous, if not impossible to sustain. For example, equipment in a clinical-trial materials (CTM) manufacturing area is used for many compounds in the company's portfolio. New equipment might have different product-contact surfaces. Each compound in the portfolio manufactured on a new piece of equipment would require a method revalidation to add a recovery factor for the new product-contact surface. As the number of materials of construction increases, the difficulty of sustaining that approach also increases. Grouping materials of construction for analytical-method development in support of cleaning verification and validation activities is an excellent opportunity to apply a quality risk-management approach, especially when the total product-contact surface area is considered. Stainless steel accounts for approximately 95% of the surface area in a CTM manufacturing and packaging environment. Other product-contact surfaces account for only 5% of the total surface area. When polymer surfaces are considered in a CTM packaging environment, the number of minor product-contact surfaces can grow significantly. A risk-management approach allows the majority of the time and effort to be spent on activities that ensure the cleanliness of the stainless-steel area while identifying, analyzing, evaluating, and communicating the risks associated with the small fraction of remaining surfaces. 
This strategy does not ignore the surfaces other than stainless steel, but divides them into three recovery groups to support analytical-method validation. By choosing representative recovery surfaces for those nonstainless-steel materials, the effort proportionally addresses the risk.
Design of experiments
Several variables (i.e., roughness average, material of construction, active ingredient, and spiked amount) were evaluated in a randomized fashion to prevent systematic bias that could be introduced by going from the lowest to the highest acceptance limit, from the smoothest to the roughest surface, or from one material of construction to the next. The initial design of experiments included two active pharmaceutical ingredients (APIs), three spiked acceptance-limit levels (i.e., 0.5, 5.0, and 50 μg/swab), seven surface types, four target roughness averages (Ra < 25, 75, 125, and 150 μin.), and six replicates per surface. These Ras were targeted to evaluate whether surface recovery depended on the surface Ra. Coupons were divided into a group of polymers [i.e., Lexan (polycarbonate), acetal (Polyoxymethylene), and PTFE] and a group of metals (i.e., stainless steel 316L, bronze, Type III hard-anodized aluminum, and cast iron). These surfaces were chosen to represent a cross section of surfaces found in the CTM manufacturing and packaging areas and required 1008 swab determinations to complete the study. The remaining product-contact surfaces found in the clinical-trial manufacturing and packaging areas were evaluated according to the initial design of experiments. These surfaces included nickel, anodized aluminum, Rilsan (polyamide), Oilon (blended-oil nylon), and stainless steel 316L with a 4 × 4-in. area.
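The size of the initial design and a randomized run order can be reproduced with a short Python sketch. The factor levels are taken from the text above; the choice to randomize at the level of conditions (rather than individual swabs) and the fixed seed are assumptions made for illustration only.

```python
import itertools
import random

apis = ["Compound A", "Compound B"]
spike_levels_ug = [0.5, 5.0, 50.0]
surfaces = [
    "Lexan", "acetal", "PTFE",
    "stainless steel 316L", "bronze",
    "Type III hard-anodized aluminum", "cast iron",
]
ra_targets_uin = ["<25", "75", "125", "150"]
replicates_per_condition = 6

# Full factorial: every combination of API, spike level, surface, and Ra target.
conditions = list(itertools.product(apis, spike_levels_ug, surfaces, ra_targets_uin))
total_swabs = len(conditions) * replicates_per_condition

# Randomize the order in which conditions are run to avoid systematic bias.
# The seed is arbitrary and only makes this sketch repeatable.
rng = random.Random(2009)
run_order = conditions[:]
rng.shuffle(run_order)

print(len(conditions), total_swabs)  # 168 1008
```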
The authors chose two APIs for this evaluation on the basis of their solubility profiles to represent the most- and least-soluble compounds a company would likely manufacture. Compound A, the less soluble, is slightly soluble in methanol and insoluble across the pH range, but Compound B is soluble in all solvents. In addition, Eli Lilly (Indianapolis, IN) identified Compound A as one of the most difficult compounds to clean from equipment, based on its low solubility and staining properties. A control (i.e., stainless steel 316L, 0.5 μg/swab, Compound A) was run each day that data were generated.
Equipment and operating conditions


Table I: High-performance liquid chromatography (HPLC) operating conditions.
The authors used an Agilent 1100 high-performance liquid chromatography (HPLC) analyzer (Agilent, Santa Clara, CA) for all experiments. The HPLC operating conditions were validated according to ICH standards for precision, linearity, limit of detection (LOD), limit of quantitation (LOQ), and specificity (see Table I) (8). Precision was 1.85% and 3.13% for Compounds A and B, respectively, and was determined at 0.025 μg/mL (i.e., 25% of the lowest spike). The method was linear across the equivalent range of 0.5 μg/swab to 5 μg/swab (R = 0.999). The LOQ was calculated to be 0.005 μg/mL for Compound A and 0.008 μg/mL for Compound B. The LOD was calculated to be 0.001 μg/mL for Compound A and 0.0024 μg/mL for Compound B. Swabs and solvents did not result in interfering peaks. The authors performed swabbing consistently using Texwipe Alpha large swabs (ITW Texwipe, Kernersville, NC). First, 10 vertical swipes, then 10 horizontal swipes were performed for the 2 × 2-in. surfaces. For the 4 × 4-in. surfaces, 20 swipes were executed in each direction. Methanol was used as the swabbing solvent. Spike amounts were 0.5, 5, and 50 μg per surface and were extracted into 5 mL of mobile phase, which corresponded to 0.1-, 1.0-, and 10-μg/mL standard concentrations, respectively. The authors used a Quanta FEG 200F field-emission scanning electron microscope (SEM, FEI, Hillsboro, OR) to generate the surface images.
Results and discussion
In this study, a single analyst evaluated the analytical swab recovery from a representative set of surfaces found in the CTM manufacturing and packaging areas. The surfaces were manufactured specifically for this study to have a broad range of Ras. In addition to Ra, the effect of the material of construction, acceptance limit, compound, and method variability also were evaluated. Based upon these data sets, the authors used a strategy involving three groups of materials to represent all of the surfaces in CTM operations. Merck and Co. used a similar strategy to establish five recovery groups (9). The authors expanded on Merck's strategy by adding a detailed study supporting the groups and an approach for determining the appropriate placement of new surfaces into pre-established groups.
Roughness average (Ra). The Ra targets listed above were difficult to achieve. The intermediate Ra values were significantly lower than the target values given in the design of experiments section above. Both intermediate Ra values, initially targeted for 75 and 125 Ra, were measured to be approximately 40 μin. Although the machining process at each level yielded visually different surfaces, the measured Ra changed little from surface to surface. The authors decided to proceed with the surfaces and define smooth surfaces as Ra < 100 μin. and rough surfaces as Ra > 100 μin. This approach allowed for an assessment of the anticipated relationship between Ra and analytical recovery.


Figure 2
Although recovery was expected to improve with lower Ra, the Ra had little impact on the observed analytical swab recovery. Figure 2 shows recovery grouped by surfaces that had a measured Ra > 100 μin. and by surfaces that had a Ra < 100 μin. Only the 5- and 50-μg spikes are represented in Figure 2; the greater variability in the 0.5-μg spikes made those data slightly harder to interpret, but the results were consistent with those at the higher spike levels. As Figure 2 shows, the recovery within each roughness group was approximately the same for a given analyte on a given material and did not correlate to Ra. Therefore, Ra should not be used as a predictor of analytical recovery or as a grouping criterion.

Figure 3
Material of construction. Because Ra was eliminated as a factor contributing to recovery losses, the authors performed data analysis by combining all average recovery values and assessing the effect of the material of construction. The data in Figure 3 were first separated by API, and groups were generated to represent the logical separations in recovery. Figures 3(a–c) contain the data for the 0.5-μg spikes, the 5-μg spikes, and the 50-μg spikes, respectively. The data from the 0.5- and 5-μg spikes exhibited a trend similar to that of the 50-μg spikes. The variability in the results increased as the spiked amount decreased, and the variability of the 50-μg spike results was substantially lower than that of the other spike levels. For both compounds, the Type III hard anodized aluminum exhibited the poorest recovery. The next logical break point grouped bronze and cast iron. The recovery of Compound B from bronze suggested that the material was representative of Group 1. The recovery of Compound A on bronze was lower and more variable, however, so the authors placed bronze into Group 2. For the majority of the surfaces, the recovery of Compound A was lower than that for Compound B at a given limit. In some cases, the recovery was approximately the same (i.e., for the 5- and 50-μg spikes on cast iron, and for the 50-μg spike on Type III hard anodized aluminum). In addition, the predominant trend was that the average recovery of a compound increased as the spiked amount increased on a given material of construction. For example, the recovery of Compound B from stainless steel 316L was approximately 74%, 90%, and 95% at 0.5, 5, and 50 μg/swab, respectively.


Figure 4
Ra was originally considered a variable in the experiments previously outlined and did not affect swab recovery. To understand the surface attributes that might contribute to incomplete recovery for the different materials of construction, the authors acquired SEM images for Group 1, Group 2, and Group 3 surfaces (see Figure 4). Stainless steel is a relatively smooth surface with some striations from machining (see Figure 4a). Cast iron has a pitted surface that could provide opportunities for an API in solution to be trapped during a spiking experiment (see Figure 4b). The anodization process makes Type III hard anodized aluminum, the worst recovery surface, porous, thereby creating the greatest opportunity to lose analyte (see Figure 4c). Note that polymers were grouped together with metals and might not be considered to be similar on first pass. The SEM image of Lexan in Figure 4d, however, illustrated that the polymer surface was smooth, albeit with some surface debris, which prevented the loss of analyte. The polymer surface was grouped with stainless steel in Group 1. The SEM images were good supporting evidence that the groupings were logical based upon surface characteristics.


Table II: Grouping of material surface of construction.
Table II is based on the data shown in Figure 3. The top surface in Table II represents the surface that was validated for recovery in each group. This recovery value represented all others within a given group. The groupings were supplied to the CTM areas, and the group number was included on the swab submission to the analytical laboratory so that the correct recovery factor was applied to each surface. In addition, the table served as a tool for engineering to determine whether newly purchased equipment contained a new product-contact surface.
Analytical methods. The model and worst-case compound evaluation did not replace any analytical-method validation activities. Analytical recovery must be established for each compound in the portfolio, but not on all surfaces. If multiple limits are to be considered, or if a range of reporting is required, the lowest limit may be evaluated, and that recovery can be applied to all acceptance limits as a conservative estimate. In the analytical method, three recovery factors were presented: Group 1, Group 2, and Group 3. Methods could be validated for any surface within a group and could be considered representative. The authors chose stainless steel 316L because it is the most prevalent, and cast iron because it is a common material on a tablet press. Type III hard anodized aluminum is the only surface in Group 3. This strategy did not ignore any uncommon surfaces. It grouped them appropriately, swabbed them, and applied a representative recovery factor.
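As a rough sketch of how a group number on a swab submission could select the recovery factor, the lookup below maps surfaces to groups and groups to a method's recovery factors. Only the group assignments stated in the text are included; the recovery factors are hypothetical placeholders, and the function and variable names are assumptions made for illustration.

```python
# Group assignments stated in the text (Table II itself is not reproduced here).
GROUP_OF_SURFACE = {
    "stainless steel 316L": 1,
    "Lexan": 1,
    "cast iron": 2,
    "bronze": 2,
    "Type III hard anodized aluminum": 3,
}

# Hypothetical recovery factors for one compound's validated method,
# keyed by group number. Real values come from method validation.
example_method_factors = {1: 0.90, 2: 0.75, 3: 0.50}


def recovery_factor_for(surface: str, method_factors: dict) -> tuple:
    """Return (group, recovery factor) for a listed product-contact surface.

    An unlisted surface raises KeyError, signaling that the material must be
    evaluated and grouped before a recovery factor can be applied.
    """
    if surface not in GROUP_OF_SURFACE:
        raise KeyError(f"{surface!r} is not grouped yet; evaluate it first")
    group = GROUP_OF_SURFACE[surface]
    return group, method_factors[group]


print(recovery_factor_for("bronze", example_method_factors))  # (2, 0.75)
```

Raising an error for an unlisted material mirrors the role the table plays for engineering: it flags a new product-contact surface rather than silently applying a default factor.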
Variability. Method variability was evaluated by running a control sample (Compound A, 6 replicates, 0.5-μg swab, stainless steel 316L, Ra = 3.5) each day. The mean recovery of the entire experiment was 52%. These data suggested that the swabbing ability of the analyst did not change over time. The standard deviation within a day typically was less than 6. The pooled within-run standard deviation was 3.99 over the course of the experiments. This value was used as a criterion for grouping new surfaces. The day-to-day standard deviation was 15.34.
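A pooled within-day standard deviation of the kind reported above is commonly computed by weighting each day's variance by its degrees of freedom. The sketch below shows that standard calculation; it is an assumption that the authors pooled in exactly this way, and the recovery values are invented for illustration.

```python
import math
import statistics


def pooled_within_day_sd(daily_results):
    """Pool the within-day variances, weighted by degrees of freedom (n - 1)."""
    weighted_sum = 0.0
    degrees_of_freedom = 0
    for values in daily_results:
        n = len(values)
        if n < 2:
            continue  # a single replicate carries no within-day information
        weighted_sum += (n - 1) * statistics.variance(values)
        degrees_of_freedom += n - 1
    if degrees_of_freedom == 0:
        raise ValueError("need at least one day with two or more replicates")
    return math.sqrt(weighted_sum / degrees_of_freedom)


# Hypothetical control recoveries (%) on two days with a shift between days.
day_1 = [50.0, 52.0, 54.0]
day_2 = [60.0, 62.0, 64.0]

print(pooled_within_day_sd([day_1, day_2]))   # 2.0
print(statistics.stdev(day_1 + day_2))        # larger, because it includes the day shift
```

The pooled value ignores shifts between days, which is why it can be much smaller than the day-to-day standard deviation and why comparisons between surfaces are made on the same days.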


Figure 5
Incorporating new materials of construction into the grouping strategy. Periodically, new equipment will be introduced into the CTM area that incorporates a product-contact surface made of a material of construction that is not listed in Table II. This problem is often caused by alloys of metals that have already been evaluated and by polymers of proprietary composition. Because surface recovery must be evaluated, personnel need a way to incorporate new surfaces into the groupings outlined in Table II. When a new piece of equipment is purchased, CTM-engineering employees prepare the needed documentation to evaluate the equipment with regard to the cleaning program before use. If an identified sampling location is made of a new material of construction, the engineer asks the person responsible for the cleaning program and analytical development to perform the next steps, which are shown in Figure 5. Suppose that new equipment incorporated three new materials: a crystalline thermoplastic polyester marketed under the trade name of Ertalyte (Quadrant Engineering, Reading, PA), stainless steel 420, and stainless steel 630. Without the grouping strategy in place, method revalidation would have to occur for all compounds handled by this piece of equipment. With the strategy in place, these surfaces are placed into groups based on model-compound recovery. No method revisions are required.
The analytical recovery of Compound A was evaluated for Ertalyte, stainless steel 420, and stainless steel 630 at the 5.0-μg/in.² level. The authors used the validated method to evaluate the recovery of the three new surfaces compared with a representative surface from Group 1 (i.e., stainless steel 316L), Group 2 (i.e., cast iron), and Group 3 (Type III hard anodized aluminum). Recovery was evaluated for both the group representative and the new surfaces on the same two days with three replicates on each day. As an alternative, six replicates may be performed on the same day as the controls because the comparison of recovery is relative.



The new surfaces are placed in one of the three groups or define a new lower group, based on how close their average is to that of the group control. The authors obtained a cutoff value of 3.0% in the following manner. Based upon the data for the control, the within-day standard deviation was calculated to be 3.99 and was used in Equation 1. A series of one-sided hypothesis tests with an error rate of α = 0.10 was performed to assess whether the new surface mean was less than a specified control-surface mean. The confidence-limit half-width for the difference between the means of two surfaces was computed using the within-day standard deviation because the replicates for each surface had to be run on the same two days, with each day treated as a block. For these calculations, it was assumed that this standard deviation was known. The lower one-sided confidence limit for the difference in means was derived from these quantities (Eq. 1).
This calculation did not indicate a difference between a control surface and the new surface under evaluation if the recovery differed by less than 3.0%. This approach was conservative because Eq. 1 categorized a new surface into Group 2 if it differed from stainless steel by more than 3.0%, which could be viewed as a strict criterion. Because the results were obtained on the same days and runs, the authors believed that this approach was reasonable.
When evaluating the recovery of a new surface, this strategy helps personnel to place each material into the appropriate group. The grouping starts with a comparison of the new surface average to that of Group 1 (stainless steel 316L) and continues sequentially. If the new surface recovery (NSR) is more than 3.0% less than that of the group reference, it is compared with the reference surface in the next lower group until a group is found with which it does not differ by more than 3.0%. If no such group is found, then the new surface forms a new, lower group. The procedure is as follows:
  • If the NSR > mean stainless-steel recovery – 3.0%, the new surface belongs in Group 1.
  • If the NSR > mean cast-iron recovery – 3.0%, the new surface belongs in Group 2.
  • If the NSR > mean Type III anodized aluminum recovery – 3.0%, the new surface belongs in Group 3.
  • If the NSR < mean Type III anodized aluminum recovery – 3.0%, the new surface becomes a new group or becomes the worst-case surface for Group 3 and is used in all future method validations.
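The cutoff and the sequential comparison can be sketched in Python. Equation 1 is not reproduced in the text, so the half-width formula used here (z × σ × √(2/n), with a known σ = 3.99, n = 6 replicates per surface, and one-sided α = 0.10) is an assumption; it is offered because it reproduces a value of approximately 3.0%. The reference means in the usage example are hypothetical, not the Table III data.

```python
import math
from statistics import NormalDist


def cutoff_half_width(sigma: float, n_per_surface: int, alpha: float) -> float:
    """One-sided confidence half-width for the difference of two means,
    assuming a known within-day standard deviation (assumed form of Eq. 1)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return z * sigma * math.sqrt(2.0 / n_per_surface)


def assign_group(nsr: float, reference_means, cutoff: float = 3.0) -> str:
    """Walk the groups from best to worst recovery and return the first group
    whose reference mean the new surface recovery (NSR) is not more than
    `cutoff` below. reference_means is a list of (label, mean recovery %)."""
    for label, reference_mean in reference_means:
        if nsr > reference_mean - cutoff:
            return label
    return "new lower group"


print(round(cutoff_half_width(3.99, 6, 0.10), 1))  # 3.0

# Hypothetical reference recoveries (%), ordered Group 1 to Group 3.
references = [("Group 1", 90.0), ("Group 2", 70.0), ("Group 3", 50.0)]
print(assign_group(88.0, references))  # Group 1
print(assign_group(80.0, references))  # Group 2
print(assign_group(40.0, references))  # new lower group
```

The listed rules do not say what happens when the NSR equals a reference mean minus 3.0% exactly; this sketch uses a strict inequality, so such a surface falls to the next lower group, which is the conservative choice.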


Table III: Swab recovery of three new materials of construction compared with controls from each representative group.
The data for the three new surfaces are outlined in Table III. Based on this approach, Ertalyte was placed into Group 1 because the difference between its recovery and that of stainless steel 316L (i.e., Group 1) was less than 3.0% (i.e., 2.48%). The recovery from stainless steel 420 and stainless steel 630 was more than 3.0% less than that from stainless steel 316L, but greater than that from cast iron. The authors placed stainless steel 420 and stainless steel 630 into Group 2. The placement of these two grades of stainless steel into Group 2 highlighted the conservative nature of this approach because their recoveries were only 4% and 8% less than that from stainless steel 316L. The recoveries were 12% and 9% greater than that from cast iron for stainless steel 630 and 420, respectively. Table II was updated to reflect this placement. Because the grouping strategy was conservative, it prevented the underestimation of recovery factors when assay values were reported and prevented the formation of additional groups for method validation unless the recovery value for a new surface is sufficiently low to warrant such an addition.
Conclusion
The authors' data-driven risk-management approach to cleaning verification methods uses analytical-recovery values for a model compound to place product-contact surfaces into groupings for analytical-method validation. The data generated during the studies supported the formation of three recovery groups to validate analytical swab methods. Groups 1–3 were represented by stainless steel 316L, cast iron, and Type III hard anodized aluminum, respectively. This approach allowed all surfaces to be considered during analytical-method validation and provided an objective mechanism to incorporate new surfaces into the strategy.
The benefits of this strategy are numerous. First, only three surfaces must be validated on each compound, which drastically minimizes the number of recovery values established to support the entire portfolio. Second, the strategy includes a way to add new materials of construction to the cleaning program if new equipment is purchased. Traditionally, all swab methods must be revalidated to incorporate the new surface. With this strategy in place, a model compound is evaluated, the new surface is grouped, and no changes to existing methods are required. Third, the strategy allows for a constant state of compliance. A relative recovery value is known for any material of construction for all equipment.
Because the grouping strategy is applied to a small fraction of the total surface area, no surface material of construction is ignored, each molecule undergoes a typical method validation, and the strategy places surfaces into groups conservatively. The authors believe that the strategy controls risks appropriately and that the data set given in this study scientifically supports the strategy of grouping materials of construction to support analytical methods within the cleaning program.
Acknowledgments
The authors would like to acknowledge the following colleagues at Eli Lilly: Gifford Fitzgerald, intern, for generating the swab-recovery data; Ron Iacocca, research advisor, for the SEM data; Sarah Davison, consultant chemist; Mike Ritchie, senior specialist; Mark Strege, senior research scientist; Matt Embry, associate consultant chemist; Kelly Hill, associate consultant for quality assurance; Bill Cleary, analytical chemist; and Laura Montgomery, senior technician, for their contributions and insightful suggestions throughout the project. In addition, Leo Manley, associate consultant engineer, provided the roughness measurements in support of this project.
Brian W. Pack* is a research advisor for analytical sciences research and development, and Jeffrey D. Hofer is a research advisor for statistics, discovery and development, both at Eli Lilly and Company, Indianapolis, IN, tel. 317.422.9043, packbw@lilly.com.

*To whom all correspondence should be addressed.
Submitted: Oct. 12, 2009. Accepted: Dec. 22, 2009.
References
1. ICH, Q9 Quality Risk Management, Step 5 version (2005).
2. Code of Federal Regulations, Title 21, Food and Drugs (Government Printing Office, Washington, DC), Part 211.67.
3. ICH, Q7 Good Manufacturing Practice Guide for Active Pharmaceutical Ingredients, Step 5 version (2000).
4. FDA, Guideline to Inspection of Validation of Cleaning Processes (Rockville, MD, July 1993).
5. L. Ray et al., Pharm. Eng. 26 (2), 54–64 (2006).
6. PDA, Technical Report 29, "Points to Consider for Cleaning Validation" (PDA, Bethesda, MD, Aug. 1998).
7. G.L. Fourman and M.V. Mullen, Pharm. Technol. 17 (4), 54–60 (1993).
8. ICH, Q2 Validation of Analytical Procedures: Text and Methodology, Step 5 version (1994).
9. R.J. Forsyth, J.C. O'Neill, and J.L. Hartman, Pharm. Technol. 31 (10), 102–116 (2007).


GMPs for Method Validation in Early Development: An Industry Perspective (Part II)

IQ Consortium representatives explore industry approaches for applying GMPs in early development.

The authors, part of the International Consortium on Innovation and Quality in Pharmaceutical Development (IQ Consortium), explore and define common industry approaches and practices when applying GMPs in early development. A working group of the consortium aims to develop a set of recommendations that can help the industry identify opportunities to improve lead time to first-in-human studies and reduce development costs while maintaining required quality standards and ensuring patient safety. This article is the second in the paper series and focuses on method validation in early-stage development.
The International Consortium on Innovation and Quality in Pharmaceutical Development (IQ) was formed in 2010 as an association of over 25 pharmaceutical and biotechnology companies with a mission to advance science-based and scientifically-driven standards and regulations for medicinal products worldwide. In the June 2012 issue of Pharmaceutical Technology, a paper presented an overview of IQ's consolidated recommendations from the Good Manufacturing Practices (GMPs) in Early Development working group (WG) (1). The focus of this IQ WG has been to develop recommended approaches on how to apply GMPs in early phase CMC development activities covering Phase I through Phase IIa. A key premise of the GMPs in Early Development WG is that existing GMP guidances for early development are vague and that improved clarity in the definition of GMP expectations would advance innovation in small-molecule pharmaceutical development by improving cycle times and reducing costs, while maintaining appropriate product quality and ensuring patient safety.
A consequence of the absence of clarity surrounding early phase GMP expectations has been varied interpretation and application of existing GMP guidances across the industry, depending on an individual company's own culture and risk tolerance. Internal debates within a company have frequently resulted in inappropriate application of conservative "one-size-fits-all" interpretations that rely on guidelines from the International Conference on Harmonization (ICH) that are more appropriate for pharmaceutical products approaching the point of marketing authorization application. In many cases, erroneous application of these commercial ICH GMP expectations during early clinical development fails to recognize the distinct differences in requirements between early development and late-stage development (Phase IIb and beyond). A key objective of this IQ WG, therefore, has been to collectively define, within acceptable industry practices, some GMP expectations for early development that allow for appropriate flexibility and that are consistent with existing regulatory guidances and statutes (2).
As outlined in the previous introductory paper, the efforts of the GMPs in Early Development WG have focused on the following four areas of CMC activities: analytical method validation, specifications, drug-product manufacturing, and stability. The initial scope of these efforts has been limited to small-molecule drug development which supports First in Human (FIH) through Phase IIa (Proof-of-Concept) clinical studies. A series of papers describing a recommended approach to applying GMPs in each of these areas is being published within this journal in the coming months. In this month's edition, the authors advocate for a life-cycle approach to method validation, which is iterative in nature in order to align with the evolution of the manufacturing process and expanding product knowledge space.
A pharmaceutical industry collective perspective on analytical method validation with regard to phase of development has not been published since 2004 (3). Genesis of the 2004 paper occurred during a set of workshops sponsored by the Analytical Technical Group of the Pharmaceutical Research and Manufacturers of America (PhRMA) in September 2003. The referenced paper summarized recommendations for a phased approach to method validation for small-molecule drug substance and drug products in early clinical development. Although a few other reviews on method validation practices have been published (4), this paper provides a current, broad-based industry perspective on appropriate method validation approaches during the early phases of drug-product development.
This broad industry assessment of method validation also uncovered the need to clearly differentiate the context of the terms of "validation" and "qualification." Method qualification is based on the type, intended purpose, and scientific understanding of the type of method in use during the early development experience. Although not used for GMP release of clinical materials, qualified methods are reliable experimental methods that may be used for characterization work, such as reference standards and the scientific prediction of shelf-life.
A perspective on some recent analytical method challenges and strategies, such as genotoxic impurity methods, use of generic methods, and methods used for testing toxicology materials or stability samples to determine labeled storage conditions, retest periods, and shelf life of APIs and drug products, is also presented. The approach to method validation described herein is based on what were considered current best practices used by development organizations participating in the IQ consortium. In addition, this approach contains some aspects which represent new scientifically sound and appropriate approaches that could enable development scientists to be more efficient without compromising product quality or patient safety. These science-driven acceptable best practices are presented to provide guidance and a benchmark for collaborative teams of analytical scientists, regulatory colleagues, and compliance experts who are developing standards of practice to be used during early phases of pharmaceutical development. The views expressed in this article are based on the cumulative industry experience of the members of the IQ working group and do not reflect the official policy of their respective companies.
Early-phase method parameters requiring validation


Table I: Summary of proposed approach to method validation for early- and late-stage development.
In early development, one of the major purposes of analytical methods is to determine the potency of APIs and drug products to ensure that the correct dose is delivered in the clinic. Methods should also be stability indicating, able to identify impurities and degradants, and allow characterization of key attributes, such as drug release, content uniformity, and form-related properties. These methods are needed to ensure that batches have a consistent safety profile and to build knowledge of key process parameters in order to control and ensure consistent manufacturing and bioavailability in the clinic. In the later stages of drug development, when processes are locked and need to be transferred to worldwide manufacturing facilities, methods need to be cost-effective, operationally viable, and suitably robust such that the methods will perform consistently irrespective of where they are executed. In considering the purpose of methods in early versus late development, the authors advocate that the same amount of rigorous and extensive method-validation experiments, as described in ICH Q2 Analytical Validation, is not needed for methods used to support early-stage drug development (5). This approach is consistent with ICH Q7 Good Manufacturing Practice, which advocates the use of scientifically sound (rather than validated) laboratory controls for API in clinical trials (6). Additionally, an FDA draft guidance on analytical procedures and method validation advocates that the amount of information on analytical procedures and methods validation necessary will vary with the phase of the investigation (7). IQ's perspective regarding which method parameters should be validated for both early- and late-stage methods is summarized in Table I. In this table, identification methods are considered to be those that discriminate the analyte of interest from compounds with similar (or dissimilar) structures or from a mixture of other compounds to assure identity.
This category includes, but is not limited to, identification methods using high-performance liquid chromatography (HPLC), Fourier transform infrared spectroscopy (FTIR), and Raman spectroscopy. Assay methods are used to quantitate the major component of interest. This category includes, but is not limited to, drug assay, content uniformity, counter-ion assay, preservative assay, and dissolution measurements. Impurity methods are used for the determination of impurities and degradants and include methods for organic impurities, inorganic impurities, degradation products, and total volatiles. To further differentiate this category of methods, separate recommendations are provided for quantitative and limit test methods, which measure impurities. The category of "physical tests" in Table I can include particle size, droplet distribution, spray pattern, optical rotation, and methodologies such as X-ray diffraction and Raman spectroscopy. Although representative recommendations of potential parameters to consider for validation are provided for these physical tests, the specific parameters to be evaluated are likely to differ for each test type.
When comparing the method-validation approach outlined for early development versus the method-validation studies conducted to support NDA filings and control of commercial products, parameters involving inter-laboratory studies (i.e., intermediate precision, reproducibility, and robustness) are not typically performed during early-phase development. Inter-laboratory studies can be replaced by appropriate method-transfer assessments and verified by system suitability requirements that ensure that the method performs as intended across laboratories. Because of changes in synthetic routes and formulations, the impurities and degradation products formed may change during development. Accordingly, related substances are often determined using area percentage by assuming that the relative response factors are similar to that of the API. If the same assumption is used to conduct the analyses and in toxicological impurity evaluation and qualification, any subsequent impurity level corrections using relative response factors are self-corrective and hence mitigate the risk that subjects would be exposed to unqualified impurities. As a result, extensive studies to demonstrate mass balance are typically not conducted during early development.
In addition to a smaller number of parameters being evaluated in preclinical and early development, it is also typical to reduce the extent of evaluation of each parameter and to use broader acceptance criteria to demonstrate the suitability of a method. Within early development, the approach to validation or qualification also differs by what is being tested, with more stringent expectations for methods supporting release and clinical stability specifications, than for methods aimed at gaining knowledge of processes (i.e., in-process testing, and so forth). An assessment of the requirements for release- and clinical-stability methods follows. Definitions of each parameter are provided in the ICH guidelines and will not be repeated herein (5). The assessment advocated allows for an appropriate reduced testing regimen. Although IQ advocates for conducting validation of release and stability methods as presented herein, the details are presented as a general approach, with the understanding that the number of replicates and acceptance criteria may differ on a case-by-case basis. As such, the following approach is not intended to offer complete guidance.
Specificity. Specificity typically provides the largest challenge in early-phase methods because each component to be measured must be measured as a single chemical entity. This challenge is also true for later methods, but is amplified during early-phase methods for assay and impurities in that:
  • The chemical knowledge regarding related substances is limited.
  • There are frequently a greater number of related substances than in commercial synthetic routes.
  • The related substances that need to be quantified may differ significantly from lot-to-lot as syntheses change and new formulations are introduced.
A common approach to demonstrating specificity for assay and impurity analysis is based on performing forced decomposition and excipient compatibility experiments to generate potential degradation products, and to develop a method that separates the potential degradation products, process impurities, drug product excipients (where applicable), and the API. Notably, requirements are less stringent for methods where impurities are not quantified, such as assay or dissolution methods. In these cases, specificity is required only for the API.
Accuracy. For methods used in early development, accuracy is usually assessed but typically with fewer replicates than would be conducted for a method intended to support late-stage clinical studies. To determine the API in drug product, placebo-spiking experiments can be performed in triplicate at 100% of the nominal concentration and the recoveries determined. Average recoveries of 95–105% are acceptable for drug product methods (with 90–110% label claim specifications). Tighter validation acceptance criteria are required for drug products with tighter specifications. For impurities, accuracy can be assessed using the API as a surrogate, assuming that the surrogate is indicative of the behavior of all impurities, including the same response factor. Accuracy can be performed at the specification limit (or reporting threshold) by spiking in triplicate. Recoveries of 80–120% are generally considered acceptable, but will depend on the concentration level of the impurity. For tests where the measurements are made at different concentrations (versus at a nominal concentration), such as dissolution testing, it may be necessary to evaluate accuracy at more than one level.
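As an illustration only, the triplicate recovery check described above can be sketched in a few lines of Python. The function names and the spiking data are hypothetical, and the 95–105% window is the drug-product example from the text; an impurity method would pass its own limits (e.g., 80–120%).

```python
from statistics import mean


def percent_recovery(measured: float, spiked: float) -> float:
    """Amount found, expressed as a percentage of the amount spiked."""
    return 100.0 * measured / spiked


def evaluate_accuracy(measured, spiked, low=95.0, high=105.0):
    """Return the mean recovery (%) and whether it falls inside [low, high]."""
    recoveries = [percent_recovery(m, spiked) for m in measured]
    average = mean(recoveries)
    return average, low <= average <= high


# Hypothetical triplicate placebo-spiking results (mg found for 10.0 mg spiked)
average, passed = evaluate_accuracy([9.9, 10.1, 10.0], spiked=10.0)
print(f"mean recovery = {average:.1f}%, acceptable = {passed}")
```

The acceptance window is left as a parameter because the text notes that tighter criteria apply to products with tighter specifications.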
Precision. For early-phase methods, only injection and analysis repeatability is examined. Area % relative standard deviation (RSD) is typically determined from 5 replicates. Repeatability is determined at 100% of nominal concentration for the API with impurities being evaluated at the reporting threshold using the API as a surrogate. Acceptance criteria of 1% RSD (injection repeatability) or 2% RSD (analysis repeatability) for API are frequently targeted. For impurities, higher precision limits (e.g., 10–20%) are acceptable and should consider the level of the impurity being measured (injection and analysis repeatability). For tests where the measurements are made at different concentrations (versus at a nominal concentration), such as dissolution testing, it may be necessary to evaluate repeatability at more than one level.
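A minimal sketch of the repeatability calculation follows, assuming the usual definition of %RSD as the sample standard deviation divided by the mean. The peak areas are invented for illustration, and the 1% limit is the injection-repeatability target for API quoted above.

```python
from statistics import mean, stdev


def percent_rsd(values) -> float:
    """Relative standard deviation (%), using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical peak areas from 5 replicate injections of an API standard
areas = [99.0, 100.0, 101.0, 100.0, 100.0]
rsd = percent_rsd(areas)
print(f"RSD = {rsd:.2f}%, meets 1% injection-repeatability target: {rsd <= 1.0}")
```

For impurities evaluated at the reporting threshold, the same function applies with a wider limit (e.g., 10–20%).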
Limit of detection and limit of quantitation. A sensitivity assessment is necessary to determine the level at which impurities can be observed. Using the API as a surrogate, a "practical" assessment can be made by demonstrating that the signal of a sample prepared at the reporting threshold produces a signal-to-noise ratio of greater than 10. A limit of quantitation can be determined from this assessment by calculating the concentration that would be required to produce a signal-to-noise ratio of 10:1. Similarly, a limit of detection can be calculated as the concentration that would produce a signal-to-noise ratio of 3:1. However, it is emphasized that the "practical limit of quantitation," at which it is verified that the lowest level of interest (reporting threshold) provides a signal at least 10 times noise and thus can be quantitated, is of paramount importance.
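The calculation described above can be sketched as follows, assuming the detector response is proportional to concentration near the reporting threshold (an assumption not stated in the text, but needed to scale one measurement to the 10:1 and 3:1 levels). The function name and example values are hypothetical.

```python
def estimate_limits(concentration: float, signal_to_noise: float) -> dict:
    """Scale one measured signal-to-noise ratio to LOD (3:1) and LOQ (10:1).

    Assumes signal is proportional to concentration at these low levels.
    """
    if signal_to_noise <= 0:
        raise ValueError("signal-to-noise ratio must be positive")
    conc_per_unit_sn = concentration / signal_to_noise
    return {
        "lod": 3.0 * conc_per_unit_sn,
        "loq": 10.0 * conc_per_unit_sn,
        # "Practical" check: the reporting-threshold sample itself gives S/N >= 10
        "practical_loq_verified": signal_to_noise >= 10.0,
    }


# Hypothetical: API surrogate at a 0.05% reporting threshold gives S/N = 25
limits = estimate_limits(concentration=0.05, signal_to_noise=25.0)
print(limits)
```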
Linearity. Linearity can be determined from 3-point calibration curves at test concentrations of 70, 100, and 130% of nominal (API) for assay or from 3 points ranging from the reporting threshold to 130% of the specification limit for impurities. API is used as the surrogate analyte for impurities. For both analyses, a validation criterion can be established as R² > 0.995. For tests where the measurements are needed over broader concentration ranges, such as dissolution testing, a broader linear range may be examined using a 3-point calibration.
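For illustration, the coefficient of determination for a 3-point calibration can be computed directly from the sums of squares. This sketch assumes an ordinary least-squares straight-line fit, for which R² equals the squared correlation coefficient; the responses shown are hypothetical.

```python
def r_squared(x, y) -> float:
    """R² of an ordinary least-squares straight-line fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical assay calibration at 70, 100, and 130% of nominal
levels = [70.0, 100.0, 130.0]
responses = [705.0, 990.0, 1310.0]
r2 = r_squared(levels, responses)
print(f"R² = {r2:.4f}, meets criterion: {r2 > 0.995}")
```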
Range. As for late-phase methods, the range is inferred from the accuracy, precision, and linearity studies.
Robustness. Full robustness testing is not conducted during early development. However, an assessment of solution stability should be conducted to demonstrate the viable lifetime of standards and samples. Specifically, solutions should be considered stable when the following conditions are met:
  • The API assay changes by not more than 2%
  • No new impurities greater than the reporting threshold are observed
  • Impurities at the reporting threshold change by not more than 30%; impurities at levels between the reporting threshold and the specification limit change by not more than 20%; and impurities at or above the specification limit change by not more than 15%.
Notably, if validation is performed concurrently with sample analysis as an extended system suitability, solution stability must be assessed separately. This assessment is typically conducted as part of method development.
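The three solution-stability conditions listed above can be expressed as a simple check. This is a sketch under stated assumptions: changes are interpreted as relative to the initial value, an impurity that starts below the reporting threshold is treated as "new" if it later exceeds that threshold, and all names and data are hypothetical.

```python
def allowed_impurity_change(level, reporting_threshold, spec_limit) -> float:
    """Maximum relative change (%) allowed for an impurity at a given level."""
    if level >= spec_limit:
        return 15.0
    if level > reporting_threshold:
        return 20.0
    return 30.0  # at the reporting threshold


def solution_is_stable(assay_initial, assay_aged, impurities,
                       reporting_threshold, spec_limit) -> bool:
    """impurities maps a name to an (initial, aged) pair of levels."""
    # Condition 1: API assay changes by not more than 2% (relative, assumed)
    if abs(assay_aged - assay_initial) / assay_initial * 100.0 > 2.0:
        return False
    for initial, aged in impurities.values():
        if initial < reporting_threshold:
            # Condition 2: no new impurity above the reporting threshold
            if aged > reporting_threshold:
                return False
            continue
        # Condition 3: level-dependent limit on relative change
        change = abs(aged - initial) / initial * 100.0
        if change > allowed_impurity_change(initial, reporting_threshold, spec_limit):
            return False
    return True


# Hypothetical data: reporting threshold 0.05%, specification limit 0.15%
stable = solution_is_stable(100.0, 99.0, {"impurity A": (0.10, 0.11)}, 0.05, 0.15)
print(f"solution stable: {stable}")
```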
Early-phase methods requiring validation
During discussions held to develop this approach to early-phase method validation, it was evident that the context of the terms "validation" and "qualification" was not universally used within all the IQ member companies. To facilitate a common understanding of this approach, the authors will therefore refer to "validated methods" as those methods which perform as expected when subjected to the series of analytical tests described in this approach. "Qualified methods" are considered to be analytical methods which are subjected to less stringent testing to demonstrate that they are scientifically sound for their intended use. In the following sections, the authors recommend which types of methods typically employed in early development require either validation or qualification.
Methods for release testing and to support GMP manufacturing. In early development, specifications are used to control the quality of APIs and drug products. Consideration of specifications places great emphasis on patient safety since knowledge of the API or drug product process is limited due to the low number of batches produced at this stage of development. Specifications typically contain a number of different analytical tests that must be performed to ensure the quality of the API or drug product. Typical material attributes, such as appearance, potency, purity, identity, uniformity, residual solvents, water content, and organic/inorganic impurities, are tested against established acceptance criteria. The API and drug-product specific methods for potency, impurity, uniformity, and others should be validated as described above and demonstrated to be suitable for their intended use in early phase development prior to release. If compendial methods are used to test against a specification (e.g., FTIR for identification and Karl Fischer titration [KF] for water content), they should be evaluated and/or qualified to be suitable for testing the API or drug product prior to use without validation. Materials used in the manufacture of GMP drug substance and drug product used for early-phase clinical studies for which specifications are not outlined in a regulatory filing (e.g., penultimates, starting materials, isolated intermediates, reagents, and excipients) need only to be qualified for their intended use. Method transfer is less rigorous at this early stage of development and may be accomplished using covalidation experiments or simplified assessments.
As mentioned, method qualification is often differentiated from method validation. The experiments to demonstrate method qualification are based on the intended purpose of the method, the scientific understanding of the method gained during method development, and the method type. Qualification is an important step in ensuring that reliable data can be generated reproducibly for investigational new drugs in early development stages. Qualified methods should not be used for API or drug product release against specifications or for concurrent stability studies. However, reference material characterization may be done with qualified methods.
Generation of process knowledge in early development is rapidly evolving. Numerous samples are tested during early development to acquire knowledge of the product at various stages of the process. The results from these samples are for information only (FIO) and methods used for this type of testing are not required to be validated or qualified. However, to ensure the accuracy of the knowledge being generated, sound scientific judgment should be used to ensure the appropriateness of any analytical method used for FIO purposes.
"Generic" or "general" methods. A common analytical strategy often employed in early development is the use of fit-for-purpose generic or general methods for a specific test across multiple products (e.g., gas chromatography for residual solvents). These methods should be validated if they are used to test against an established specification. The suggested approach to validating these methods in early development is typically performed in two stages. Stage 1 involves validating the parameters that are common for every product with which the method can be used. Linearity of standard solutions and injection repeatability belong to this stage. Stage 2 of the validation involves identifying the parameters that are specific to individual product, such as accuracy. Specificity may be demonstrated at Stage 1 for nonproduct related attributes and at Stage 2 for product related attributes. Stage 1 validation occurs prior to GMP testing. Stage 2 validation can happen prior to or concurrent with GMP testing. This approach to validation of fit-for-purpose methods can provide efficiency for drug development by conserving resources in the early phases of development and can ensure reliability of the method's intended application.
Methods for GTI and tox batch qualification. The need for analytical methods to demonstrate the control of genotoxic impurities (GTI) has developed recently because of expectations and guidances provided by regulatory authorities (8, 9). Often, these methods require high sensitivity with limits of quantitation in the parts-per-million (ppm) range. Although the control levels for GTIs (referred to as the threshold of toxicological concern) are less stringent for early clinical studies (e.g., patient intake < 50 µg/day for clinical studies < 30 days vs. 1.5 µg/day for longer clinical studies), regulatory authorities expect that GTI control is demonstrated during early development. Depending on when a GTI is potentially generated during an API synthesis, GTIs may be listed in specifications. Validation of these methods is again dependent upon the intended use of the method. Methods used for assessment may be qualified unless they are used to test against a specification as part of clinical release. Method qualification is also considered appropriate if the method is intended for characterization or release of test articles for a toxicology study.
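To show why ppm-level sensitivity is needed, a permitted daily intake can be converted to a concentration limit in the API using the conventional relationship limit (ppm) = intake (µg/day) / dose (g/day). The sketch below uses the intake values quoted above; the 0.5 g daily dose is a hypothetical example, not a value from this article.

```python
def gti_limit_ppm(intake_ug_per_day: float, dose_g_per_day: float) -> float:
    """Concentration limit in the API (ppm, i.e., µg of impurity per g of API)."""
    if dose_g_per_day <= 0:
        raise ValueError("daily dose must be positive")
    return intake_ug_per_day / dose_g_per_day


# Hypothetical maximum daily API dose of 0.5 g
long_term_limit = gti_limit_ppm(1.5, 0.5)   # longer clinical studies
short_term_limit = gti_limit_ppm(50.0, 0.5)  # short early clinical studies
print(f"long-term: {long_term_limit} ppm, short-term: {short_term_limit} ppm")
```

The method's limit of quantitation would then need to sit at or below the resulting concentration limit.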
Methods for stability of APIs and drug products. Batches of API and drug product are typically exposed to accelerated stress conditions and tested at timed intervals to assess whether any degradation has occurred. The shelf-life of the API or drug product (that is, the time period of storage at a specified condition within which the drug substance and drug product still meet their established specifications) is based on analytical data generated from these studies. For this application, analytical methods need to be stability-indicating (e.g., capable of detection and quantitation of the degradants) to ensure quality, safety, and efficacy of a drug substance and drug product. Often, the analytical methods used to perform stability tests are the same methods used to test against a specification for release testing; these methods should be validated. However, if additional tests are performed that are not included in the established specification, they may be qualified for their intended use, rather than validated.
In-process testing methods. In-process testing (IPT) during manufacturing of drug substance and drug product can be done on-line, in-line, or off-line. The results generated from IPT are used to monitor processes involving reaction completion, removal of solvents, removal of impurities, and blend content uniformity. Manufacturing parameters may be adjusted based on IPT results. IPT methods are often very limited in scope. In early development, the primary benefit of performing IPTs is the generation of process knowledge, and not as a control or specification. As a result, even though IPT is essential for manufacture of drug substance and drug product, method qualification for an IPT method is appropriate in early-phase development.
Documentation and other requirements. The extent of documentation and associated practices in early development should be aligned with the appropriate level of method validation as discussed above. In this paper, the authors provide a perspective on the appropriate level of documentation, protocol and acceptance-criteria generation, instrument qualification, and oversight of the quality assurance unit for early-phase method validation and qualification. This approach provides development scientists with flexibility to efficiently adapt to the dynamic environment typical within early phase pharmaceutical development, while ensuring patient safety and the scientific integrity of the validation process.
With respect to documentation, it is the IQ perspective that the raw data generated during early phase method validation should be maintained in a compliant data storage format. The integrity of raw data should be controlled such that it can be retrieved to address future technical and compliance-related questions. Proper documentation of data and validation experiments should also be considered an important aspect of early phase validation. The availability of electronic notebook (ELN) systems has provided a viable, more efficient alternative to the use of traditional bound-paper notebooks. In developing policies to implement ELNs, the goal should not be that all documentation practices used with paper notebooks are replicated. Rather, the ELN should possess sufficient controls for the intended use of the data. In many cases, electronic systems such as ELNs will transform the work process, and the controls they provide will be achieved in a completely novel manner compared to the outdated system being replaced.
Although data needs to be documented as described above, it is the authors' position that formal, detailed method and validation reports are not required to ensure compliance in early development. Adequate controls need to be in place to ensure method parameters used to execute validated methods are equivalent to parameters used during validation. Generation of brief method and validation summary reports is required only when needed to fulfill regulatory filing requirements or to address requests or questions from health authorities. Validation summaries are not required to present all of the validation data, but rather a summary of the pertinent studies sufficient to demonstrate that the method is validated to meet the requirements of its intended use. Once reports are generated and approved internally, approved change control procedures should be available and followed to maintain an appropriate state of control over method execution and report availability.
Although the authors' perspective is that a validation plan needs to exist for early phase method validation, analytical organizations could consider different mechanisms to fulfill this need. For example, internal guidelines or best practice documents may sufficiently outline validation requirements such that a separate validation plan need not be generated for each method. In the absence of such a guideline or procedure, a validation plan could be documented in a laboratory notebook or ELN which includes a brief description of validation elements and procedures to be evaluated. Validation plans should ensure that the method will be appropriate for its intended use. The use of strict validation criteria within the validation plan should be limited at these early stages of development. Validation studies for early development methods may be performed on fit-for-purpose instruments which are calibrated and maintained, but not necessarily qualified or under strict change-control standards.
The role of the pharmaceutical quality system and the oversight over early phase method validation practices and documentation is another area for consideration. In the pharmaceutical industry, quality management is overseen by a "Quality Unit" that qualifies and oversees activities in the areas of GMP materials such as laboratory controls. In practice, the size and complexity of the Quality Unit overseeing GMP manufacturing varies based on a manufacturer's size and stage of drug development. Regardless, the basic aspects of a quality system must be in place. In early development, IQ's position is that, because API and drug-product manufacturing processes are evolving, the analytical methods do not yet require full validation as prescribed in ICH Q2. Correspondingly, the quality system implemented during early phases could consider that evolving analytical methods are intrinsic to the work being performed to develop the final API and drug product processes and could allow flexibility to readily implement method changes during early development. For example, the Quality Unit should delegate oversight for validation plan approval, change control, and approval of deviations and reports to the analytical departments prior to finalization and performing full ICH Q2 validation of the analytical methods. This approach would be consistent with Chapter 19 of ICH Q7A. However, analytical departments must ensure that early phase validation studies are conducted by qualified personnel with supervisory oversight who follow approved departmental procedures. Clearly, agreements between Quality Units and analytical departments to implement an appropriate strategic, phase-based quality oversight system would provide many benefits within the industry.
Conclusions
Within this paper, IQ representatives have presented an industry perspective on appropriate requirements and considerations for early phase analytical method validation. A suggested outline of acceptable experiments has been presented to ensure that analytical procedures developed to support API and drug product production of early phase clinical materials are suitable for their intended use. Additionally, the authors have provided a position on phased approaches to other aspects of method validation such as documentation requirements, generation of method validation plans, validation criteria, and the strategic involvement of quality unit oversight. When applied appropriately, this approach can help to ensure pharmaceutical development organizations provide appropriate analytical controls for API and drug product processes, which will serve the ultimate goal of ensuring patient safety. Although the extent of early-phase method validation experiments is appropriately less than that employed in the later stages of development, the authors' view is that any risks related to this approach will not be realized, especially when considering the overall quality and safety approach used by pharmaceutical companies for early phase clinical studies.
It is the authors' hope that providing such an approach to early-phase method validation, along with the approaches outlined in this series of early-phase GMP papers, will serve as a springboard to stimulate discussions on these approaches within the industry and with worldwide health authorities. To encourage further dialogue, this IQ working group is planning on conducting a workshop in the near future to promote robust debate and discussion on these recommended approaches to GMPs in early development. These discussions will ideally enable improved alignment between R&D development, Quality, and CMC regulatory organizations across the pharmaceutical industry, and most importantly with worldwide regulatory authorities. Agreement between industry and health authorities regarding acceptable practices to applying GMPs in the early phases of drug development would clearly be beneficial to CMC pharmaceutical development scientists and allow for a more nimble and flexible approach to better address the dynamic environment typical of the early phases of clinical development, while still guaranteeing appropriate controls to ensure patient safety during early development.
Donald Chambers is in analytical sciences at Merck Research Laboratories, Gary Guo is in analytical R&D at Amgen, Brent Kleintop* is in analytical and bioanalytical development at Bristol-Myers Squibb Co., Henrik Rasmussen is in analytical development at Vertex Pharmaceuticals, Steve Deegan is in GMP Quality Assurance Operations at Abbott, Steven Nowak is in NCE Analytical R&D, Global Pharmaceutical R&D, at Abbott, Kristin Patterson is in emerging markets R&D at GlaxoSmithKline, John Spicuzza is in analytical development at Baxter, Michael Szulc is in analytical development at Biogen Idec, Karla Tombaugh is in R&D/Commercialization Quality, Merck Manufacturing Division, at Merck & Co., Mark D. Trone is in analytical development, small molecule, at Millennium Pharmaceuticals, and Zhanna Yuabova is in analytical development, US, at Boehringer Ingelheim Pharmaceuticals.
*To whom all correspondence should be addressed.
References
1. A. Eylath et al., Pharm. Technol. 36 (5) 54–58 (2012).
2. FDA, Guidance for Industry: cGMP for Phase 1 Investigational Drugs (July 2008 FDA).
3. S.P. Boudreau et al., Pharm. Technol. 28 (11) 54–66 (2004).
4. M. Bloch, "Validation During Drug Product Development – Considerations as a Function of the Stage of Drug Development," Method Validation in Pharmaceutical Analysis, a Guide to Best Practice, Eds. J. Ermer, J.H. Miller (Wiley, 2005), pp. 243–264.
5. ICH, Q2 (R1) Validation of Analytical Procedures: Text and Methodology (Nov. 2005).
6. ICH, Q7A Good Manufacturing Practice Guidelines for Active Pharmaceutical Ingredients (Aug. 2001).
7. FDA, Draft Guidance for Industry: Analytical Procedures and Method Validation, Chemistry, Manufacturing and Controls Documentation (Aug. 2000).
8. EMA, Guideline on the Limits of Genotoxic Impurities (June 2006).
9. FDA, Draft Guidance for Industry: Genotoxic and Carcinogenic Impurities in Drug Substances and Products: Recommended Approaches (2008).

FDA's New Process Validation Guidance: Industry Reaction, Questions, and Challenges

By Mike Long,Hal Baseman,Walter D. Henkels

The authors describe the three-stage approach to validation that is outlined in the new guidance and discuss questions surrounding implementation.

FDA's 2011 Process Validation: General Principles and Practices guidance created a systemic shift in industry's approaches to validation programs.
In January 2011, FDA published its long-awaited guidance for industry on Process Validation: General Principles and Practices (1). For many in the pharmaceutical industry, the guidance created a systemic shift in the expectations of their validation programs. Although the new guidance aligns process-validation activities with the product life-cycle concept and with existing harmonized guidelines such as the International Conference on Harmonization's Q8(R2) Pharmaceutical Development, Q9 Quality Risk Management, and Q10 Pharmaceutical Quality System, it may have created as many questions for the industry as it has answered. In this article, the authors briefly describe the three-stage approach to validation that is outlined in the new guidance as well as implications for manufacturers regarding their current approaches to process validation. Specific emphasis is placed on questions surrounding industry implementation.
Design for assurance
The regulatory basis for process validation is contained in a number of places. Process validation is legally enforceable per the Federal Food, Drug, and Cosmetic Act. The requirements are called out in 21 CFR Parts 210 and 211 of the CGMP regulations, more specifically in Part 211.100 (a). This section is what FDA describes as the regulatory "foundation for process validation" (1). Here, manufacturers are required to have production and process-controls procedures in place that are "designed to assure" drug products have a certain level of quality and that their products are manufactured safely, effectively, and purely.
How is that assurance to be provided, however? How can industry design to assure these qualities? One way is through direct observation; another is through prediction. Direct observation may require the destruction of the products, meaning that industry generally must use predictive methods to determine, with a certain level of confidence, that its processes and controls are designed to assure quality. Validation is the predictive method for providing this assurance.
A brief history of validation
In the 1970s, there was a series of contamination issues in large-volume parenteral bottles that, with a handful of other significant adverse events, led regulatory authorities and manufacturers to focus more on process understanding and quality assurance. The idea that finished product testing was not enough to assure product quality began to grow. Industry needed a better system to determine whether a product was "good" (2).
In the mid-1970s, FDA's Ted Byers and Bud Loftus began to promote the idea that a focus on process design and an evaluation of support processes and functions would assist in assuring that processes were under control (rather than just "testing quality into the product"). At a 1974 conference, Byers called this new predictive approach "Design for Quality" (3).
In 1987, at the request of industry, FDA published its original guideline on General Principles of Validation (4). The agency established a more formal approach to process validation and created the assertion that multiple batches need to be run. This concept eventually translated into the three-batch approach, where the successful running of three consecutive product batches represents a validated process. Herein lie fundamental questions that persist more than 20 years after the publication of that guideline:
  • If companies were following the guidance, why would companies continue to see processes fail during commercial manufacture?
  • Were companies really providing high degrees of assurance their processes worked reliably?
  • Did companies understand how or why running three batches proved that their processes were adequately controlled and capable of consistently providing required results?
When these questions are asked of people in the industry, the answers, in the authors' experience, are invariably, "Well, stuff happens. Things happen. Things we didn't anticipate." "Stuff happens" tends to refer to "process variation," a key term and a key reason why the 1987 guidance was updated.
The new guidance
The new guidance ushers in a life-cycle approach to validation. This approach is most apparent in the guidance's new definition for process validation: "...the collection and evaluation of data, from the process design through commercial production, which establishes scientific evidence that a process is capable of consistently delivering quality products" (1).
The words used have been selected carefully. The new definition talks about establishing process capability using scientific evidence, while previous definitions used the phrase "documented evidence" (4). This earlier definition may have contributed to validation being viewed largely as a late-stage documentation exercise. The new definition, however, describes process validation as a continuous process of collection and evaluation of data, rather than as a three-batch static event. As one FDA representative commented at a recent PDA workshop, the number of batches is not an acceptance criterion; however, the results of the data obtained from the batches are (5). The new definition of validation prompted one industry member to lament at the workshop that for the past 30 years, industry has been told that process validation is a documentation exercise. Now, FDA expects industry to consider process validation as a scientific endeavor. That is quite a shift, and 30-year habits are hard to break, he noted (5).
This statement underscores the dichotomy that exists within the industry regarding the new guidance. For years, the industry argued to regulators that it owned the expertise, knew its processes better than regulatory agencies did, and should have flexibility in how it validated those processes. The agency listened and came back with a guidance that is deliberately nonprescriptive; it does not tell industry how many samples to pull or how many batches to run. The industry has to be able to answer these questions, and they are not always easy to answer.
Defining the new process-validation stages
As noted, the 2011 guidance enforces the life-cycle model described in ICH Q10 and in FDA's 2006 guidance for industry on Quality Systems Approach to Pharmaceutical CGMP Regulations, which states that, "quality should be built into the product, and testing alone cannot be relied on to ensure product quality" (6, 7). The method of building quality into a product or "designing for assurance" challenges manufacturers to better:
  • Understand the sources of variation along the supply chain in the manufacturing process
  • Detect the existence and amount of variation that is imparted into the product from sources within the supply chain, the equipment, and personnel
  • Understand the impact of the detected variation on the finished product
  • Control the detected variation, based upon the understanding and knowledge of its sources, in a manner that is proportional to its risk to product and patient (1).
The new guidance document describes validation activities in three stages using a life-cycle model (see Figure 1). Although the stages are explained as discrete, some activities can occur in multiple stages while others may overlap between stages. Risk assessments are a good example of such activities. The guidance describes these three stages as follows:
  • Stage 1–Process Design: The commercial-manufacturing process is defined during this stage based on knowledge gained through development and scaleup activities.
  • Stage 2–Process Qualification: During this stage, the process design is evaluated to determine whether the process is capable of reproducible commercial manufacturing.
  • Stage 3–Continued Process Verification: Ongoing assurance is gained during routine production that the process remains in a state of control (1).


Figure 1: The three stages of the validation life-cycle model based on the new FDA process validation guidance. (Authors)
These stages are defined in more detail in the following sections.
Stage 1–Process Design. Process validation is an ongoing process within the product life cycle. This cycle begins at Stage 1. Here, the commercial process starts to be defined using knowledge gained through development and scale-up activities (1). Process knowledge and understanding are captured in Stage 1 to determine initial process capability and to evaluate sources of variation. Design of experiments can be used to identify and establish process-parameter relationships with quality attributes. Risk-management efforts should begin in earnest at this phase to assist in capturing the product and process knowledge and to prioritize development efforts, which can be ranked by the relative importance of the quality attributes. Process-control strategies can begin to be developed during this stage through initial understanding of the risks and sources of variation (8).
Stage 2–Process Qualification. Although the 2011 guidance document's three-stage description of validation may be new to the industry, the content and concepts within Stage 2 should be the most familiar. During this stage, the process design is confirmed as being capable of reproducible commercial manufacturing. This phase involves evaluating the facility and equipment for their fitness for use. Utility systems and equipment are verified to be built and installed properly, and to operate within the intended and anticipated operating ranges. This stage concludes with the execution of a process-performance qualification (PPQ) exercise. PPQ is a significant milestone in the product life cycle. The decision to distribute product to the market will be determined by the successful completion of PPQ. The amount and usefulness of the product and process knowledge gained up to this point in the life cycle will determine the approach a company takes with its PPQ (1). The focus at this stage should not be on the number of batches needed to produce a successful PPQ run, but rather on whether enough data have been produced and evaluated to answer two basic questions:
  • What scientific evidence is there to provide the appropriate level of assurance that the manufacturing system has been designed to consistently deliver a quality product to the market?
  • Are there systems in place to properly monitor and control that manufacturing system?
Stage 3–Continued Process Verification. Stage 3 of the 2011 guidance is the continuous-improvement phase of the process-validation life cycle. In this phase, data obtained from routine production is used to provide ongoing assurance that the process remains in the state of control. This activity is more than an annual product review; rather, this action involves using a system or multiple systems of assuring control.
Control strategies originally conceived in Stage 1 are implemented during this phase as well. Tracking and trending of data allow for the detection of special-cause variability and the reduction of inherent common-cause variability. Determinations of the state of control of processes in commercial distribution should be made using appropriate statistics and appropriate confidence levels. These confidence levels should be based upon risk factors, experience, and attribute criticality (1).
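One common way to track and trend batch data for special-cause variability is an individuals control chart. The sketch below is illustrative only: the guidance does not prescribe this or any other charting method, the function names are hypothetical, and the assay values are invented for the example. Limits are set from a baseline of earlier batches at three estimated standard deviations around the mean, with the standard deviation estimated from the average moving range.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Return (center, lower, upper) limits for an individuals control chart.

    Sigma is estimated from the average moving range divided by 1.128,
    the standard d2 constant for ranges of two consecutive points.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline values")
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma_hat = mean(moving_ranges) / 1.128
    return center, center - 3 * sigma_hat, center + 3 * sigma_hat


def special_cause_signals(values, lower, upper):
    """Return the indexes of values that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Invented assay results (% of label claim) for illustration only.
baseline = [99.8, 100.4, 100.1, 99.6, 100.3, 99.9, 100.2, 99.7, 100.0, 100.5]
new_batches = [100.1, 99.9, 102.6, 100.2]

center, lcl, ucl = individuals_chart_limits(baseline)
signals = special_cause_signals(new_batches, lcl, ucl)
print(f"center={center:.2f}, limits=({lcl:.2f}, {ucl:.2f}), signals={signals}")
```

A point outside the limits is a prompt for investigation rather than proof of a failure. In practice, a company would choose its charting rules and limits based on the risk and criticality of the attribute being trended.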
A continuous process of evaluation
With the new life-cycle approach, process validation is no longer a one-time milestone event. Process validation should be considered a continuous process of evaluation. The five steps shown in Figure 2 represent stages of validation. These steps require industry to:
  • Know the process
  • Know the variables
  • Have confidence before going into commercial manufacture
  • Maintain vigilance, through monitoring and continuous improvement, in understanding variation and confirming that the process is under control.


Figure 2: The process-validation sequence. (Figures are courtesy of authors)
Initial process understanding for a given product is most often based upon a combination of early development work on the product along with prior knowledge of the existing product and manufacturing platforms that will be used to commercialize the product. It is quite rare that a product entering the validation life cycle will have a unique product or manufacturing platform. This initial process knowledge allows for a more robust process-design phase, which creates an enhanced understanding of the sources of variation and their impact on the product's quality attributes. Understanding the sources of variation provides confidence to go into commercial manufacture and to create an ongoing monitoring program that will track and trend data for continuous improvement.
Industry reaction and challenges
The industry reaction to the 2011 process validation guidance, in general, has been favorable. The document is nonprescriptive. It is relatively flexible. It leaves much to the companies to decide how they wish to meet its expectations. But, as the authors discovered through a series of industry workshops and seminars, there still is a need for further explanation and clarification regarding the guidance as well as a real need for detailed implementation suggestions. Although there are many questions to be answered, this paper is limited to three of the most frequently asked questions about implementing the new guidance.
Common questions. The classic and most frequently asked question about the new guidance is, How many batches do I have to run to validate my processes under the new guidance? The answer does not delight many people: "It depends."
The number of batches needed to validate a process depends on how much data are required to provide a company with an appropriate level of statistical confidence and scientific justification to begin commercial production. There is nothing wrong with a company choosing to run three batches for validation, but that company has to be prepared to explain why the data obtained from three batches are sufficient to determine that the process is ready to move into commercial manufacturing. Likewise, if a company chooses to run fewer than three batches, it has to be able to justify the amount of data created by those batches. In short, there is no longer a uniform number of validation batches that can be applied globally to the industry. The approach is now process- and product-specific, depending on a company's ability to test critical parameters, conditions, and attributes.
Overall, the batches are a means to an end, and in this case, that end is data (9).
The discussion of the number of batches required invariably leads to a second frequently asked question, How do we determine the amount of data we need in our validations? The question revolves around the amount of data needed in Stage 2–Process Qualification. FDA has been quite clear that statistical justification for sampling is required. Specifically, the guidance states that:
  • The number of samples should be adequate to provide sufficient statistical confidence of quality both within a batch and between batches.
  • The confidence level selected can be based on risk analysis as it relates to the particular attribute under examination.
  • Sampling during this stage should be more extensive than is typical during routine production (1).
A key to success here will be appropriate use of risk analysis to assist in selection of appropriate confidence levels for a product's critical quality attributes (CQAs). Not all CQAs are alike: the higher the risk for a given CQA, the higher the confidence level required during validation. The amount of data required might then increase for those CQAs that require higher confidence levels. This decision will depend on the test methods used and the type of data collected. Test methods that create dichotomous data (pass/fail, go/no go) will require many more samples than those that produce continuous (i.e., variable) data. This difference needs to be taken into consideration early in the validation life-cycle process because it also impacts how data will be trended in Stage 3.
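To illustrate how the confidence level drives the amount of dichotomous data needed, the sketch below uses the zero-failure (success-run) calculation, n = ln(1 − C) / ln(R), where C is the confidence level and R is the minimum proportion of conforming units to be demonstrated. This formula is a widely used attribute-sampling calculation, not something the guidance specifies, and the risk tiers and their confidence and reliability values are hypothetical assumptions chosen only for the example.

```python
import math


def zero_failure_sample_size(confidence, reliability):
    """Smallest n such that n passing results with zero failures demonstrate,
    at the stated confidence, that at least `reliability` of units conform.

    Uses the success-run relationship n = ln(1 - C) / ln(R), rounded up.
    """
    if not (0 < confidence < 1 and 0 < reliability < 1):
        raise ValueError("confidence and reliability must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


# Hypothetical mapping of CQA risk to (confidence, reliability) targets.
risk_tiers = {
    "low": (0.90, 0.90),
    "medium": (0.95, 0.95),
    "high": (0.95, 0.99),
}

for tier, (confidence, reliability) in risk_tiers.items():
    n = zero_failure_sample_size(confidence, reliability)
    print(f"{tier}-risk CQA: {n} pass/fail samples with zero failures")
```

Under these assumed targets, the required number of pass/fail samples rises from 22 to 59 to 299 as the risk tier increases, which is why the choice between attribute and variable test methods is worth making early.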
Another question brought up frequently is, Do the requirements in the guidance conflict with existing global regulatory requirements? This question has two components to it.
First, there is a specific concern that validation terms within the document do not align with EMA terms, specifically those in Annex 15 of the European Union Guidelines to Good Manufacturing Practice, or with those in ICH Q7 (10, 11). Examples of misalignment would be traditional terms such as installation qualification and operational qualification. These terms are not included in the new FDA process validation guidance, but the concept described in Stage 2 of the guidance does state that, "qualification refers to activities undertaken to demonstrate that utilities and equipment are suitable for their intended use and perform properly." This language is similar in nature to Sections 9–13 of EU GMP Annex 15 and Section 12.3 of ICH Q7 (1, 10, 11).
Second, there is a concern that certain concepts within the EU GMPs conflict with the new FDA guidance, specifically regarding the three-batch validation approach. Section 25 of the EU GMP Annex 15 states, "It is generally considered acceptable that three consecutive batches/runs within the finally agreed parameters, would constitute a validation of the process" (10). On the surface, this language would seem to conflict with the FDA's new focus on data rather than the number of batches. However, Section 25 of the EU GMP Annex 15 also states that, "...the number of process runs carried out and observations made should be sufficient to allow the normal extent of variation and trends to be established and to provide sufficient data for evaluation." So, while the three-batch approach to validation is embedded in Annex 15, Section 25 also provides for a more flexible approach based upon data (10).
Related to the question above are general questions regarding terminology. The most notable new term in the FDA guidance is process performance qualification (PPQ), described earlier in this paper. The guidance defines PPQ as "the second element of Stage 2, process qualification. The PPQ combines the actual facility, utilities, equipment (each now qualified), and the trained personnel with the commercial manufacturing process, control procedures, and components to produce commercial batches. A successful PPQ will confirm the process design and demonstrate that the commercial manufacturing process performs as expected" (1). PPQ is what most companies call their process validation (PV) and should not be confused with individual equipment process qualifications. Here again, FDA has stated that the concepts are important, not the terms (9). Companies are free to call these activities what they wish as long as the two parts of Stage 2 as outlined in the guidance are met.
Challenges. Changing the existing validation culture to meet the expectations of the new guidance may be the biggest challenge industry faces. The guidance requires companies to expand their current scope of validation by reaching further upstream into development and downstream into day-to-day manufacturing. Companies agree this expansion will foster better communication from the development groups through to manufacturing but are concerned that it also requires additional staffing. To better understand and meet this challenge, companies should perform gap analyses that compare their current state of validation with the future state based on the new guidance. These gap analyses will better situate companies to create action plans for the new policies and procedures that may need to be put into place. The analyses will also assist in identifying any resource gaps and training needs.
Conclusion
FDA's new process validation guidance has received generally favorable reviews for its nonprescriptive and flexible approach. With this flexibility, however, comes a concern about acceptable approaches for implementation at a practical level and a genuine need for a continued dialogue of explanation and clarification between the agency and industry. Because of space limitations, this article could only touch on a handful of the questions being asked at this time. The authors intend to create a series of articles in Pharmaceutical Technology to further address issues such as the use of risk management along the three stages of validation, collection and transfer of product and process knowledge along the validation life cycle, and continued process verification.
Dr. Mike Long, MBB,* is director of consulting services, and Walter D. Henkels, Sr., is a consultant, both at ConcordiaValsource. Hal Baseman is chief operating officer of ValSource LLC. mlong@concordiavalsource.com
*To whom all correspondence should be addressed.
References
1. FDA, Guidance for Industry: Process Validation: General Principles and Practices (Rockville, MD, Jan. 2011).
2. US vs. Barr Laboratories, 1993.
3. J.M. Dietrick and B.T. Loftus, "Regulatory Basis for Process Validation" in Pharmaceutical Process Validation, R.A. Nash and A.H. Wachter, Eds. (Marcel Dekker, New York, NY, 3rd ed., 2003) p. 43.
4. FDA, Guideline on General Principles of Process Validation (Rockville, MD, May 1987, reprinted May 1990).
5. PDA Annual Meeting Post-Conference Workshop, "The FDA's Process Validation Guidance: Meeting Compliance Expectations and Practical Implementation Strategies" (San Antonio, TX, April 13–14, 2011).
6. ICH, Q10 Pharmaceutical Quality System (2008).
7. FDA, Guidance for Industry: Quality Systems Approach to Pharmaceutical CGMP Regulations (Rockville, MD, September 2006).
8. M. Long and H. Baseman, presentation at the Elsevier Business Intelligence Webinar, May 18, 2011.
9. G. McNally, presentation at the PDA Annual Meeting Post-Conference Workshop (San Antonio, TX, April 13–14, 2011).
10. EU, Guidelines to Good Manufacturing Practice, Medicinal Products for Human and Veterinary Use, Annex 15: Qualification and Validation (Brussels, July 2001).
11. ICH, Q7 Good Manufacturing Practice Guide For Active Pharmaceutical Ingredients (November 2000).
