Systematic literature review education

Different types of systematic review are discussed in more detail later in this chapter. The majority of systematic review types share a common set of processes. These processes can be divided into distinct but interconnected stages, as illustrated in Fig. 1. Systematic reviews need to specify a research question and the methods that will be used to investigate the question. This is often written as a protocol prior to undertaking the review. Writing a protocol or plan of the methods at the beginning of a review can be a very useful activity. It helps the review team to gain a shared understanding of the scope of the review and the methods that they will use to answer the review's questions. Different types of systematic reviews will have more or less developed protocols. For example, for systematic reviews investigating research questions about the impact of educational interventions it is argued that a detailed protocol should be fully specified prior to the commencement of the review to reduce the possibility of reviewer bias [Torgerson 2003, p.26]. For other types of systematic review, in which the research question is more exploratory, the protocol may be more flexible and/or developmental in nature.
Fig. 1: The systematic review process

3.1 Systematic Review Questions and the Conceptual Framework

The review question gives each review its particular structure and drives key decisions about what types of studies to include; where to look for them; how to assess their quality; and how to combine their findings. Although a research question may appear to be simple, it will include many assumptions. Whether implicit or explicit, these assumptions will include epistemological frameworks about knowledge and how we obtain it, and theoretical frameworks, whether tentative or firm, about the phenomenon that is the focus of study.

Taken together, these produce a conceptual framework that shapes the research questions and the choice of an appropriate systematic review approach and methods. The conceptual framework may be viewed as a working hypothesis that can be developed, refined or confirmed during the course of the research. Its purpose is to explain the key issues to be studied, the constructs or variables, and the presumed relationships between them. The framework is a research tool intended to assist a researcher to develop awareness and understanding of the phenomena under scrutiny and to communicate this [Smyth 2004].

A review to investigate the impact of an educational intervention will have a conceptual framework that includes a hypothesis about a causal link between who the review is about [the people], what the review is about [an intervention and what it is being compared with], and the possible consequences of the intervention on the educational outcomes of these people. Such a review would follow a broadly aggregative synthesis logic. This is the shape of reviews of educational interventions carried out for the What Works Clearinghouse in the USA and the Education Endowment Foundation in England.

A review to investigate meaning or understanding of a phenomenon for the purpose of building or further developing theory will still have some prior assumptions. Thus, an initial conceptual framework will contain theoretical ideas about how the phenomena of interest can be understood and some ideas justifying why a particular population and/or context is of specific interest or relevance. Such a review is likely to follow a broadly configurative logic.

3.2 Selection Criteria

Reviewers have to make decisions about which research studies to include in their review. In order to do this systematically and transparently they develop rules about which studies can be selected into the review. Selection criteria [sometimes referred to as inclusion or exclusion criteria] create restrictions on the review. All reviews, whether systematic or not, limit in some way the studies that are considered by the review. Systematic reviews simply make these restrictions transparent and therefore consistent across studies. These selection criteria are shaped by the review question and conceptual framework. For example, a review question about the impact of homework on educational attainment would have selection criteria specifying who had to do the homework; the characteristics of the homework and the outcomes that needed to be measured. Other commonly used selection criteria include study participant characteristics; the country where the study has taken place and the language in which the study is reported. The type of research method[s] may also be used as a selection criterion but this can be controversial given the lack of consensus in education research [Newman 2008], and the inconsistent terminology used to describe education research methods.
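
The idea of explicit selection criteria that are applied in the same way to every study can be sketched in code. The following is a minimal illustration based on the homework example above. The record fields, the age range and the four rules are hypothetical and are not taken from any actual review; a real review would derive them from its own question and conceptual framework.

```python
from dataclasses import dataclass


@dataclass
class StudyRecord:
    """Minimal description of a candidate study (hypothetical fields)."""
    title: str
    participant_age_min: int
    participant_age_max: int
    intervention: str
    outcomes: tuple
    language: str


# Each criterion is a named rule, so every exclusion can be reported with a reason.
# The rules themselves are illustrative assumptions.
CRITERIA = {
    "participants are of school age (5-18)":
        lambda s: s.participant_age_min >= 5 and s.participant_age_max <= 18,
    "intervention is homework":
        lambda s: s.intervention == "homework",
    "measures educational attainment":
        lambda s: "attainment" in s.outcomes,
    "reported in English":
        lambda s: s.language == "en",
}


def screen(study):
    """Return (included, failed_criteria) for one study."""
    failed = [name for name, rule in CRITERIA.items() if not rule(study)]
    return (not failed, failed)
```

Recording which criteria a study failed, rather than only whether it was excluded, is what makes the restrictions transparent: the same list of rules can be published alongside the review.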

3.3 Developing the Search Strategy

The search strategy is the plan for how relevant research studies will be identified. The review question and conceptual framework shape the selection criteria. The selection criteria specify the studies to be included in a review and thus are a key driver of the search strategy. A key consideration will be whether the search aims to be exhaustive, i.e. aims to find all the primary research that has addressed the review question. Where reviews address questions about the effectiveness or impact of educational interventions, the issue of publication bias is a concern. Publication bias is the phenomenon whereby smaller studies and/or studies with negative findings are less likely to be published and/or are harder to find. We may therefore inadvertently overestimate the positive effects of an educational intervention because we do not find studies with negative or smaller effects [Chow and Eckholm 2018]. Where the review question is not of this type then a more specific or purposive search strategy, that may or may not evolve as the review progresses, may be appropriate. This is similar to sampling approaches in primary research. In primary research studies using aggregative approaches, such as quasi-experiments, analysis is based on the study of complete or representative samples. In primary research studies using configurative approaches, such as ethnography, analysis is based on examining a range of instances of the phenomena in similar or different contexts.

The search strategy will detail the sources to be searched and the way in which the sources will be searched. A list of search source types is given in Box 1 below. An exhaustive search strategy would usually include all of these sources using multiple bibliographic databases. Bibliographic databases usually index academic journals and thus are an important potential source. However, in most fields, including education, relevant research is published in a range of journals which may be indexed in different bibliographic databases and thus it may be important to search multiple bibliographic databases. Furthermore, some research is published in books and an increasing amount of research is not published in academic journals or at least may not be published there first. Thus, it is important to also consider how you will find relevant research in other sources including unpublished or grey literature. The Internet is a valuable resource for this purpose and should be included as a source in any search strategy.

Box 1: Search Sources

  • The World Wide Web/Internet

    • Google, Specialist Websites, Google Scholar, Microsoft Academic

  • Bibliographic Databases

    • Subject specific e.g. Education - ERIC: Education Resources Information Centre

    • Generic e.g. ASSIA: Applied Social Sciences Index and Abstracts

  • Handsearching of specialist journals or books

  • Contacts with Experts

  • Citation Checking

New, federated search engines are being developed, which search multiple sources at the same time, eliminating duplicates automatically [Tsafnat et al. 2013]. Technologies, including text mining, are being used to help develop search strategies, by suggesting topics and terms on which to search: terms that reviewers may not have thought of using. Searching is also being aided by technology through the increased use [and automation] of citation chasing, where papers that cite, or are cited by, a relevant study are checked in case they too are relevant.
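
When several sources are searched, the same study is often retrieved more than once. A minimal sketch of automatic duplicate removal is given below. It assumes each record is a dictionary with a `title` and an optional `doi`; these field names and the matching rule are illustrative assumptions, not a description of how any particular federated search engine works.

```python
import re


def normalise_title(title):
    """Lower-case a title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI or, where no DOI exists, each normalised title."""
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").lower()
        # Prefer the DOI as the identity of a record; fall back to the title.
        key = ("doi", doi) if doi else ("title", normalise_title(rec["title"]))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

This simple rule will miss a duplicate when one copy of a record carries a DOI and the other does not, so in practice some manual checking of near-duplicates is still likely to be needed.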

A search strategy will identify the search terms that will be used to search the bibliographic databases. Bibliographic databases usually index records according to their topic using keywords or controlled terms [categories used by the database to classify papers]. A comprehensive search strategy usually involves both a free-text search using keywords determined by the reviewers and a search using controlled terms. An example of a bibliographic database search is given in Box 2. This search was used in a review that aimed to find studies that investigated the impact of Youth Work on positive youth outcomes [Dickson et al. 2013]. The search is built using terms for the population of interest [Youth], the intervention of interest [Youth Work] and the outcomes of interest [Positive Development]. It used both keywords and controlled terms, wildcards [the * sign in this database] and the Boolean operators OR and AND to combine terms. This example illustrates the potential complexity of bibliographic database search strings, which will usually require a process of iterative development to finalise.

Box 2: Search string example. To identify studies that address the question "What is the empirical research evidence on the impact of youth work on the lives of children and young people aged 10-24 years?": CSA ERIC Database

[[TI = [adolescen* or [young man*] or [young men]] or TI = [[young woman*] or [young women] or [Young adult*]] or TI = [[young person*] or [young people*] or teen*] or AB = [adolescen* or [young man*] or [young men]] or AB = [[young woman*] or [young women] or [Young adult*]] or AB = [[young person*] or [young people*] or teen*]] or [DE = [youth or adolescents or early adolescents or late adolescents or preadolescents]]] and[[[TI = [[positive youth development ] or [youth development] or [youth program*]] or TI = [[youth club*] or [youth work] or [youth opportunit*]] or TI = [[extended school*] or [civic engagement] or [positive peer culture]] or TI = [[informal learning] or multicomponent or [multi-component ]] or TI = [[multi component] or multidimensional or [multi-dimensional ]] or TI = [[multi dimensional] or empower* or asset*] or TI = [thriv* or [positive development] or resilienc*] or TI = [[positive activity] or [positive activities] or experiential] or TI = [[community based] or community-based]] or[AB = [[positive youth development ] or [youth development] or [youth program*]] or AB = [[youth club*] or [youth work] or [youth opportunit*]] or AB = [[extended school*] or [civic engagement] or [positive peer culture]] or AB = [[informal learning] or multicomponent or [multi-component ]] or AB = [[multi component] or multidimensional or [multi-dimensional ]] or AB = [[multi dimensional] or empower* or asset*] or AB = [thriv* or [positive development] or resilienc*] or AB = [[positive activity] or [positive activities] or experiential] or AB = [[community based] or community-based]]] or [DE=community education]]
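
The structure of a search string like the one in Box 2 can be made easier to maintain by generating it from lists of terms: terms for the same concept are combined with OR across the title and abstract fields, and the concepts are then combined with AND. The sketch below illustrates this under stated assumptions. The function names are hypothetical, round brackets are used for grouping, and the controlled-term [DE] clauses of Box 2 are left out; the exact syntax accepted by a given database should be checked against its own documentation.

```python
def or_group(field, terms):
    """Combine terms with OR inside one field, e.g. TI = (a or (b c))."""
    # Multi-word phrases are wrapped in brackets so they are searched as a unit.
    grouped = [f"({t})" if " " in t else t for t in terms]
    return f"{field} = ({' or '.join(grouped)})"


def build_query(concepts, fields=("TI", "AB")):
    """AND together concepts; within a concept, OR across fields and terms."""
    blocks = []
    for terms in concepts:
        per_field = [or_group(field, terms) for field in fields]
        blocks.append("(" + " or ".join(per_field) + ")")
    return " and ".join(blocks)


# A much shortened version of the population and intervention concepts in Box 2.
population = ["adolescen*", "young people*"]
intervention = ["youth work", "resilienc*"]
query = build_query([population, intervention])
```

Keeping the terms in lists means that adding or removing a term during iterative development changes one line rather than two places in a long string.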

Detailed guidance for finding effectiveness studies is available from the Campbell Collaboration [Kugley et al. 2015]. Guidance for finding a broader range of studies has been produced by the EPPI-Centre [Brunton et al. 2017a].

3.4 The Study Selection Process

Studies identified by the search are subject to a process of checking [sometimes referred to as screening] to ensure they meet the selection criteria. This is usually done in two stages whereby titles and abstracts are checked first to determine whether the study is likely to be relevant and then a full copy of the paper is acquired to complete the screening exercise. The process of finding studies is not efficient. Searching bibliographic databases, for example, leads to many irrelevant studies being found which then have to be checked manually one by one to find the few relevant studies. There is increasing use of specialised software to support and, in some cases, automate the selection process. Text mining, for example, can assist in selecting studies for a review [Brunton et al. 2017b]. A typical text mining or machine learning process might involve humans undertaking some screening, the results of which are used to train the computer software to learn the difference between included and excluded studies and thus be able to indicate which of the remaining studies are more likely to be relevant. Such automated support may result in some errors in selection, but this may be less than the human error in manual selection [O'Mara-Eves et al. 2015].
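
The train-then-prioritise process described above can be illustrated with a deliberately naive word-weighting scorer. This is a toy sketch and not the method used by any of the cited tools: it assumes that screening decisions are available as pairs of abstract text and an include/exclude flag, and it ranks unscreened abstracts by the sum of smoothed log-odds weights of the words they contain.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Return the set of lower-cased words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def train(screened):
    """Learn word weights from (abstract_text, included) pairs screened by humans."""
    inc, exc = Counter(), Counter()
    n_inc = n_exc = 0
    for text, included in screened:
        if included:
            inc.update(tokens(text))
            n_inc += 1
        else:
            exc.update(tokens(text))
            n_exc += 1
    vocab = set(inc) | set(exc)
    # Smoothed log-odds: positive weights mark words typical of included studies.
    return {
        w: math.log((inc[w] + 1) / (n_inc + 2)) - math.log((exc[w] + 1) / (n_exc + 2))
        for w in vocab
    }


def rank(weights, unscreened):
    """Order unscreened abstracts from most to least likely to be relevant."""
    def score(text):
        return sum(weights.get(w, 0.0) for w in tokens(text))
    return sorted(unscreened, key=score, reverse=True)
```

In use, reviewers would screen the highest-ranked records first and retrain as more decisions accumulate; the ranking only suggests an order of work and does not replace human judgement about inclusion.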

3.5 Coding Studies

Once relevant studies have been selected, reviewers need to systematically identify and record the information from the study that will be used to answer the review question. This information includes the characteristics of the studies, including details of the participants and contexts. The coding describes: [i] details of the studies to enable mapping of what research has been undertaken; [ii] how the research was undertaken to allow assessment of the quality and relevance of the studies in addressing the review question; [iii] the results of each study so that these can be synthesised to answer the review question.

The information is usually coded into a data collection system using some kind of technology that facilitates information storage and analysis [Brunton et al. 2017b] such as the EPPI-Centre's bespoke systematic review software EPPI Reviewer. Decisions about which information to record will be made by the review team based on the review question and conceptual framework. For example, a systematic review about the relationship between school size and student outcomes collected data from the primary studies about each school's funding, students, teachers and school organisational structure as well as about the research methods used in the study [Newman et al. 2006]. The information coded about the methods used in the research will vary depending on the type of research included and the approach that will be used to assess the quality and relevance of the studies [see the next section for further discussion of this point].

Similarly, the information recorded as results of the individual studies will vary depending on the type of research that has been included and the approach to synthesis that will be used. Studies investigating the impact of educational interventions using statistical meta-analysis as a synthesis technique will require all of the data necessary to calculate effect sizes to be recorded from each study [see the section on synthesis below for further detail on this point]. However, even in this type of study there will be multiple data that can be considered to be results, and so which data need to be recorded from studies will need to be carefully specified so that recording is consistent across studies.

3.6 Appraising the Quality of Studies

Methods are reinvented every time they are used to accommodate the real world of research practice [Sandelowski et al. 2012]. The researcher undertaking a primary research study has attempted to design and execute a study that addresses the research question as rigorously as possible within the parameters of their resources, understanding, and context. Given the complexity of this task, the contested views about research methods and the inconsistency of research terminology, reviewers will need to make their own judgements about the quality of any individual piece of research included in their review. From this perspective, it is evident that using a simple criterion, such as "published in a peer-reviewed journal", as a sole indicator of quality is not likely to be an adequate basis for considering the quality and relevance of a study for a particular systematic review.

In the context of systematic reviews this assessment of quality is often referred to as Critical Appraisal [Petticrew and Roberts 2005]. There is considerable variation in what is done during critical appraisal: which dimensions of study design and methods are considered; the particular issues that are considered under each dimension; the criteria used to make judgements about these issues and the cut off points used for these criteria [Oancea and Furlong 2007]. There is also variation in whether the quality assessment judgement is used for excluding studies or weighting them in analysis and when in the process judgements are made.

There are broadly three elements that are considered in critical appraisal: the appropriateness of the study design in the context of the review question, the quality of the execution of the study methods and the study's relevance to the review question [Gough 2007]. Distinguishing study design from execution recognises that whilst a particular design may be viewed as more appropriate for a study it also needs to be well executed to achieve the rigour or trustworthiness attributed to the design. Study relevance is achieved by the review selection criteria but assessing the degree of relevance recognises that some studies may be less relevant than others due to differences in, for example, the characteristics of the settings or the ways that variables are measured.

The assessment of study quality is a contested and much debated issue in all research fields. Many published scales are available for assessing study quality. Each incorporates criteria relevant to the research design being evaluated. Quality scales for studies investigating the impact of interventions using [quasi] experimental research designs tend to emphasise establishing descriptive causality through minimising the effects of bias [for detailed discussion of issues associated with assessing study quality in this tradition see Waddington et al. 2017]. Quality scales for appraising qualitative research tend to focus on the extent to which the study is authentic in reflecting on the meaning of the data [for detailed discussion of the issues associated with assessing study quality in this tradition see Carroll and Booth 2015].

3.7 Synthesis

A synthesis is more than a list of findings from the included studies. It is an attempt to integrate the information from the individual studies to produce a better answer to the review question than is provided by the individual studies. Each stage of the review contributes toward the synthesis and so decisions made in earlier stages of the review shape the possibilities for synthesis. All types of synthesis involve some kind of data transformation that is achieved through common analytic steps: searching for patterns in data; checking the quality of the synthesis; and integrating data to answer the review question [Thomas et al. 2012]. The techniques used to achieve these vary for different types of synthesis and may appear more or less evident as distinct steps.

Statistical meta-analysis is an aggregative synthesis approach in which the outcome results from individual studies are transformed into a standardised, scale-free, common metric and combined to produce a single pooled weighted estimate of effect size and direction. There are a number of different metrics of effect size, selection of which is principally determined by the structure of outcome data in the primary studies as either continuous or dichotomous. Outcome data with a dichotomous structure can be transformed into Odds Ratios [OR], Absolute Risk Reductions [ARR] or Relative Risk Reductions [RRR] [for detailed discussion of dichotomous outcome effect sizes see Altman 1991]. More commonly seen in education research, outcome data with a continuous structure can be translated into Standardised Mean Differences [SMD] [Fitz-Gibbon 1984]. At its most straightforward, effect size calculation is simple arithmetic. However, given the variety of analysis methods used and the inconsistency of reporting in primary studies, it is also possible to calculate effect sizes using more complex transformation formulae [for detailed instructions on calculating effect sizes from a wide variety of data presentations see Lipsey and Wilson 2000].
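
The "simple arithmetic" case can be made concrete with a short sketch. The functions below compute a standardised mean difference in one common formulation, Cohen's d with a pooled standard deviation, and an odds ratio from the counts of a two-group study. The function and parameter names are illustrative, and the formulae are standard textbook versions rather than ones prescribed by the sources cited above.

```python
import math


def standardised_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d: difference in group means divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds of the outcome in the intervention group divided by the odds in the control group."""
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return odds_t / odds_c


# Hypothetical study: intervention mean 55, control mean 50, both SD 10, 30 pupils each.
d = standardised_mean_difference(55, 10, 30, 50, 10, 30)

# Hypothetical study: 20 of 50 pupils pass with the intervention, 10 of 50 without.
ratio = odds_ratio(20, 50, 10, 50)
```

With these made-up figures the standardised mean difference is 0.5, that is, the intervention group scores half a pooled standard deviation higher than the control group. The odds ratio function will fail with a division by zero if every participant in a group, or no participant in the control group, experiences the outcome, which is one reason real analyses apply corrections for such cases.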

The combination of individual effect sizes uses statistical procedures in which weighting is given to the effect sizes from the individual studies based on different assumptions about the causes of variance and this requires the use of statistical software. Statistical measures of heterogeneity produced as part of the meta-analysis are used to both explore patterns in the data and to assess the quality of the synthesis [Thomas et al. 2017a].
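
As an illustration of weighted pooling, the sketch below implements a fixed-effect, inverse-variance meta-analysis together with two widely used heterogeneity statistics, Cochran's Q and I-squared. The fixed-effect model is only one of the possible assumptions about the causes of variance; random-effects models add a between-study variance component to the weights. The function name and inputs are illustrative, and a real analysis would normally rely on established statistical software.

```python
def pool_fixed_effect(effects, variances):
    """Inverse-variance weighted mean effect, with Cochran's Q and I-squared (%)."""
    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    # Q sums the weighted squared deviations of each study from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Two hypothetical studies with equal precision.
pooled, q, i_squared = pool_fixed_effect([0.2, 0.4], [0.01, 0.01])
```

For these two made-up studies the pooled effect is 0.3, Q is 2.0 on one degree of freedom, and I-squared is 50%, suggesting that about half of the observed variation might reflect differences between the studies rather than chance.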

In configurative synthesis the different kinds of text about individual studies and their results are meshed and linked to produce patterns in the data, explore different configurations of the data and to produce new synthetic accounts of the phenomena under investigation. The results from the individual studies are translated into and across each other, searching for areas of commonality and refutation. The specific techniques used are derived from the techniques used in primary research in this tradition. They include reading and re-reading, descriptive and analytical coding, the development of themes, constant comparison, negative case analysis and iteration with theory [Thomas et al. 2017b].
