Articles on Standards of Evidence and Evidence Grading


Standards of evidence: KT has emerged in an era of evidence-based practice and knowledge, with an increased need for understanding of standards of evidence. Standards relate to quality of the research that develops the evidence, and issues such as rigor vs. relevance, quantitative vs. qualitative research studies, experimental vs. quasi-experimental designs, internal and external validity, replication, and generalizability (adapted from National Research Council, 2004).

Evidence grading: KT involves the quality assessment of research studies and research evidence. Evidence grading is a systematic method for assessing and rating the quality of evidence that is produced from a research study, a collection of studies, a systematic review, or expert opinion (NCDDR, 2006).

Following are some articles from the KT Library that address these topics. KTDRR staff reviewed a number of articles, developed a brief abstract, and assigned ratings based on strength of evidence, readability, and consumer orientation. For more information on these ratings, see KT Library Descriptor Scales.


American Dietetic Association. (2007). ADA evidence analysis manual (5th ed.)     

Abstract: The American Dietetic Association manual provides a step-by-step process for evidence analysis including specific actions to be taken at each step. Numerous charts, checklists and worksheets to guide the user’s process, as well as a glossary of terms related to research design, are found in the appendices.

Descriptor Scales

Evidence: 2 - Expert opinions
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


American Educational Research Association Task Force on Reporting of Research Methods. (2006). Standards for reporting on empirical social science research in AERA publications.

Abstract: The American Educational Research Association (AERA) adopted standards on reporting empirical research in AERA publications in June 2006. The standards were developed to assist researchers, editors, reviewers, and readers of AERA journals. They are based on two major principles: that empirical research should be warranted, and that the research reporting process should be transparent at every step. The standards are organized into the following areas: problem formulation, design and logic of the study, sources of evidence, measurement and classification, analysis and interpretation, extrapolation, ethics in reporting, and title.

Descriptor Scales

Evidence: 2 - Expert opinions
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Coalition for Evidence-Based Policy. (2003). Identifying and implementing educational practices supported by rigorous evidence: A user friendly guide. NCEE EB2003. Washington, DC: US Department of Education. Institute of Education Sciences, National Center for Educational Evaluation and Regional Assistance.

Abstract: The Coalition for Evidence-Based Policy document provides educators with guidelines for evaluating the random assignment process, outcome data, and reporting of results to determine whether the research is "evidence-based." The document includes a rationale for the guidelines as well as a checklist for use in the evaluation.

Descriptor Scales

Evidence: 2 - Expert opinions
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Demner-Fushman, D., Few, B., Hauser, S. E., & Thoma, G. (2006). Automatically identifying health outcome information in MEDLINE records. Journal of the American Medical Informatics Association, 13, 52-60.

Abstract: Demner-Fushman et al. target health care professionals with limited time to review research. The authors describe an automated evidence-based medicine model for quickly identifying relevant information in medical research without needing to analyze the entire document. The approach was compared against PubMed Clinical Queries, and the authors found that the outcome-based ranking provided significantly more accurate information.

Descriptor Scales

Evidence: 3 - Qual./Quant. research
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Dijkers, M. P. J. M., Brown, M., & Gordon, W. A. (2008). FOCUS Technical Brief (19). Getting Published and Having an Impact: Turning Rehabilitation Research Results Into Gold.

Abstract: The FOCUS authored by Drs. Marcel Dijkers, Margaret Brown, and Wayne Gordon of the Mount Sinai School of Medicine, Department of Rehabilitation Medicine, New York, suggests strategies that rehabilitation researchers can use to maximize the impact of their work, turning "research results into gold." In the disability and rehabilitation research community, it is important for researchers to be cognizant of how the published results of research studies can facilitate or limit their use in answering important evidence-based questions.

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Ebell, M. H., Siwek, J., Weiss, B. D., Woolf, S. H., Susman, J., Ewigman, B., & Bowman, M. (2004). Strength of recommendation taxonomy (SORT): A patient-centered approach to grading evidence in the medical literature. American Family Physician, 69, 548-556.

Abstract: Ebell et al. suggest using the Strength of Recommendation Taxonomy (SORT) scale to determine the quality of medical reviews. The SORT scale was developed by representatives of family medicine and primary care journals as well as the Family Practice Inquiries Network. SORT is based on evaluation of the quality of study design, the quantity of studies included in the review, and the consistency of reported outcomes. In addition, the scale includes a determination of whether the outcomes are patient-oriented or disease-oriented. Further, the authors link SORT to other compatible taxonomies. The authors suggest that use of a single scale across studies and journals helps to analyze outcomes for translation into practice.

Descriptor Scales

Evidence: 2 - Expert opinions
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Green, L. W., & Glasgow, R. E. (2006). Evaluating the relevance, generalization, and applicability of research: Issues in external validation and translation methodology. Evaluation & the Health Professions, 29(1), 126-153.

Abstract: Green and Glasgow suggest that current research does not include sufficient emphasis on external validation or generalizability. The authors propose criteria to evaluate the external validity of research, such as the inclusion of members of the target population in the study; use of intended settings; reporting the expertise and training of people providing implementation, as well as any adaptations made for different settings; effects beyond primary outcomes including quality of life issues; and reporting costs. The article recommends that external validity should be included in the planning process, thus making the research relevant to the people who will use the outcomes for setting policy or for decision making on an individual level. A companion article is available in this collection entitled, "Why don't we see more translation of health promotion research to practice? Rethinking the efficacy-to-effectiveness translation" (Glasgow, 2003).

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Harbour, R., & Miller, J. (2001). A new system for grading recommendations in evidence based guidelines. British Medical Journal, 323, 334–336.

Abstract: Harbour and Miller provide the rationale and framework for the development of guidelines by the Scottish Intercollegiate Guidelines Network (SIGN) for evaluating the quality of evidence based clinical research. The guidelines for determining levels of evidence and grades of recommendation are based on study design and quality of methodology. SIGN developed a hierarchy of study types as well as key stages in developing recommendations. The authors recommend the use of a checklist to ensure all aspects are considered. An additional checklist is suggested for the evaluation of diagnostic tests.

Descriptor Scales

Evidence: 2 - Expert opinions
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Haynes, R. B., Cotoi, C., Holland, J., Walters, L., Wilczynski, N., Jedraszewski, D., McKinlay, J., Parrish, R., & McKibbon, K. A. (2006). Second-order peer review of the medical literature for clinical practitioners. JAMA, 295(15), 1801-1808.

Abstract: Haynes et al. describe the McMaster Online Rating of Evidence (MORE) system, which uses practicing physicians to rate peer-reviewed journal articles in their discipline as the basis for inclusion in the McMaster Premium Literature Service (PLUS) Internet access program. Following a review by staff, volunteer physicians rate articles on whether the article is important to the field (relevance) and whether it presents new information (newsworthiness). The ratings serve as a screen for articles to be included in an Internet service that notifies physicians of recent research. The project demonstrated the value of discipline-specific peer review of published journal articles.

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Johnston, M. V., Sherer, M., & Whyte, J. (2006). Applying evidence standards to rehabilitation research. American Journal of Physical Medicine and Rehabilitation, 85, 292-309.

Abstract: Johnston et al. explain evidence-based practice standards used in systematic reviews. In addition, the authors apply these evidence-based methods to analyze the quality of research in spinal cord injury, traumatic brain injury, and burn rehabilitation. The article concludes that although the rehabilitation field has experienced a dramatic increase in systematic reviews published each year, the number of studies that met the highest level of criteria was very small in all three areas of research.

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Johnston, M. V., Vanderheiden, G. C., Farkas, M. D., Rogers, E. S., Summers, J. A., & Westbrook, J. D., for the NCDDR Task Force on Standards of Evidence and Methods. (2009). The challenge of evidence in disability and rehabilitation research and practice: A position paper. Austin, TX: SEDL.

Abstract: The Challenge of Evidence in Disability and Rehabilitation Research and Practice: A Position Paper was developed in November 2009 by the NCDDR's Task Force on Standards of Evidence and Methods. This task force position paper focuses on evidence for interventions in the field of disability and rehabilitation (D&R). The document's specific objectives are to clarify what is meant by the term evidence and to describe the contemporary systems used to identify and evaluate evidence in intervention research; to identify the challenges of meeting contemporary standards of evidence in D&R intervention research; and to propose next steps for examining related issues and for taking action to promote the availability of evidence-based services and information in the field of D&R interventions.

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


NCDDR. (2005). FOCUS Technical Brief (9). What are the standards for quality research?

Abstract: This issue of FOCUS discusses principles and standards for quality research, the basis for these standards, and strategies for reporting quality research. In the fields of disability and rehabilitation research, there is a healthy debate regarding the specific criteria for quality research, and the specific checklists to be used to standardize reporting. As the debate ensues, there are many ideas emerging in the public domain related to quality research and quality evidence that can be used to help guide the discussion.

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Schlosser, R.W. (2007). FOCUS Technical Brief (17). Appraising the Quality of Systematic Reviews.

Abstract: This issue of FOCUS, written by Ralf W. Schlosser, PhD, is part two of a three-part series on systematic reviews. This issue describes critical considerations for appraising the quality of a systematic review, including the protocol, question, sources, scope, selection principles, and data extraction. The author also describes tools for appraising systematic reviews.

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Schlosser, R. W. (2009). FOCUS Technical Brief (22). The Role of Single-Subject Experimental Designs in Evidence-Based Practice Times.

Abstract: This FOCUS, written by Ralf W. Schlosser, PhD, describes high-quality single-subject experimental designs (SSEDs) in terms of establishing empirically supported treatments and implementing evidence-based practice (EBP). The author also compares and contrasts SSEDs with n-of-1 randomized controlled trials (RCTs).

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Shadish, W. R., & Rindskopf, D. M. (2007). Methods for evidence-based practice: Quantitative synthesis of single-subject designs. New Directions for Evaluation, 113, 95-109.

Abstract: Shadish and Rindskopf describe the use of single-subject designs in meta-analyses. The article reviews methods for analyzing multiple single-subject designs, suggests methods for conducting a meta-analysis using single-subject designs, and includes a list of current meta-analyses.

Descriptor Scales

Evidence: 1 - Author(s) opinion
Consumer Orientation: C - No data
Readability: III - High (Grade 12 or above)


Task Force on Systematic Review and Guidelines. (2013). Assessing the quality and applicability of systematic reviews (AQASR). Austin, TX: SEDL, Center on Knowledge Translation for Disability and Rehabilitation Research.

Abstract: The basic purpose of the AQASR document and checklist is to help busy clinicians, administrators, and researchers ask critical questions that reveal the strengths and weaknesses of a systematic review, both in general and as relevant to a particular clinical question or other practical concern. The primary audience is clinicians, as most systematic reviews are designed to answer the kinds of clinical questions they face.

Descriptor Scales

Evidence: 2 - Expert opinions
Consumer Orientation: B - Some data
Readability: III - High (Grade 12 or above)