Part 2: Conducting a literature scan

Intro to Computational Studies in Education and the Social Sciences

Author
Affiliations

Nathan Alexander, PhD

School of Education

Center for Applied Data Science and Analytics

Using Posit to identify an initial set of literature

Most of us use Google Scholar to find research literature. However, there is a host of computational tools that we can use to conduct a literature scan or, going more in depth, a systematic literature review.

This process also matters because we use the literature to help us build theory.

Theory construction

To that end, good research requires knowledge of the peer-reviewed literature.

OpenAlex

We will use OpenAlex to support your literature identification process. OpenAlex is a comprehensive open catalog of the global research system that can help you find relevant publications for your research.

Once you are set up in R from part 1, we’ll start working with the code below:

#install.packages("openalexR")
library(openalexR)
# Search for works whose titles match your social justice topic
# (a vector of terms is sent as an OR search; see the request URL in the output)
works_search <- oa_fetch(
  entity = "works",
  title.search = c("BlackCrit", "youth"),
  from_publication_date = "2026-03-01",
  options = list(sort = "cited_by_count:desc"),
  verbose = TRUE
)
Requesting url:
<https://api.openalex.org/works?filter=title.search%3ABlackCrit%7Cyouth%2Cfrom_publication_date%3A2026-03-01&sort=cited_by_count%3Adesc>
ℹ Getting 14 pages of results with a total of 2613 records...

# Display the top results
works_search |>
  head(10) |>
  show_works() |>
  knitr::kable()
| id | display_name | first_author | last_author | is_oa | top_concepts |
|---|---|---|---|---|---|
| W578333627 | Youth Training and the Search for Work | Denis Gleeson | NA | FALSE | Training (meteorology), Work (physics) |
| W7133883343 | Faith, flight, and futures: religion and the Japa migration of Nigerian youth | Ntongha Eni Ikpi | Dodeye Uduak Williams | FALSE | Ethnic group, Immigration, Ethnography |
| W7133782334 | Faith, governance, and exodus: exploring the impact of religion and politics on youth migration in Africa | Dodeye Uduak Williams | NA | FALSE | Politics, Ethnic group, Immigration |
| W7135378591 | Vaping as Captured Homeostasis - Adverse Events, Trauma Response, and the Case for Stewardship in Australian Youth Nicotine Policy | John Richard Smith | SHAI/HATI | TRUE | Enforcement, Harm, Framing (construction) |
| W7134069619 | Transnational youth and social mobility: the role of family financial support in unsettled lives | Alexandra Lee | Loretta Baldassar | TRUE | Social support, Remittance, Politics |
| W7133207944 | Within-Person Association Between Daily Screen Use and Sleep in Youth | Matthew Bourke | George Thomas | FALSE | Bedtime, Sleep (system call), Association (psychology) |
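
The request URL printed in the output shows how the two `title.search` terms reached OpenAlex: they are joined with `|` (encoded as `%7C`), which the OpenAlex API reads as OR. That is why the results include any title containing "youth", not only titles about BlackCrit. The sketch below rebuilds the filter string in base R so the encoding is visible. `build_filter()` is a hypothetical helper written for this illustration; it is not part of openalexR.

```r
# Hypothetical helper: collapse several title terms into one OpenAlex filter value.
# Terms joined with "|" are treated as alternatives (OR) by the API.
build_filter <- function(terms, from_date) {
  paste0("title.search:", paste(terms, collapse = "|"),
         ",from_publication_date:", from_date)
}

raw_filter <- build_filter(c("BlackCrit", "youth"), "2026-03-01")

# Percent-encode reserved characters such as ":", "|" and ","
encoded_filter <- utils::URLencode(raw_filter, reserved = TRUE)

print(raw_filter)
print(encoded_filter)
```

If the goal is titles that mention both ideas, one option is to run a separate search for each term and keep only the works that appear in both result sets.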

Web of Science (WoS)

We can also use WoS data to conduct a full systematic literature review with the bibliometrix and quanteda packages.

Web of Science (WoS) is a comprehensive and highly respected citation database used for scholarly research. It is one of the main databases used to identify and compare collections of citations and bibliographic sources. Given the conceptual replication method selected for this study, we utilized the Web of Science Core Collection of bibliometric data. This decision provided us with an opportunity to understand differences across disciplinary boundaries, despite the increasingly interdisciplinary nature of research on racism in STEM. Prior studies have analyzed differences between Google Scholar and WoS. Unlike Google Scholar, the Web of Science Core Collection offers access to multiple citation indexes, covering a wide range of academic disciplines and publication types, making it an ideal resource for data collection and analysis.

For this section, I will walk us through a sample study conducted with two graduate students on notions of racism in STEM.

Results of title keyword search

| Keyword | EBSCO | Google Scholar | Scopus | Web of Science |
|---|---|---|---|---|
| anti-racism | 381 | 3,420 | 685 | 592 |
| anti-racist | 427 | 3,620 | 826 | 685 |
| race | 44,057 | ~313,000 | 65,203 | 69,537 |
| racial | 24,467 | ~212,000 | 39,930 | 41,527 |
| racialization | 519 | 2,740 | 1,108 | 1,077 |
| racialized | 958 | ~5,600 | 1,863 | 2,120 |
| racism | 8,669 | ~99,500 | 11,418 | 10,674 |
| racist | 1,965 | ~15,100 | 2,170 | 1,906 |
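
To make the gap between databases concrete, the sketch below enters the counts from the keyword table and computes how many Google Scholar hits there are for each Web of Science hit. The column names are my own, and the Google Scholar values marked with "~" in the table are approximate, so the ratios are rough.

```r
# Title-keyword hit counts copied from the table of keyword search results
hits <- data.frame(
  keyword = c("anti-racism", "anti-racist", "race", "racial",
              "racialization", "racialized", "racism", "racist"),
  ebsco = c(381, 427, 44057, 24467, 519, 958, 8669, 1965),
  google_scholar = c(3420, 3620, 313000, 212000, 2740, 5600, 99500, 15100),
  scopus = c(685, 826, 65203, 39930, 1108, 1863, 11418, 2170),
  wos = c(592, 685, 69537, 41527, 1077, 2120, 10674, 1906)
)

# Google Scholar hits per Web of Science hit, largest gap first
hits$gs_per_wos <- round(hits$google_scholar / hits$wos, 1)
print(hits[order(-hits$gs_per_wos),
           c("keyword", "google_scholar", "wos", "gs_per_wos")])
```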

Systematic review techniques offer a structured approach to synthesizing research literature. New computational methods have significantly advanced systematic reviews, making it easier to follow pre-defined protocols that document each methodological step for reproducibility. These approaches employ a comprehensive, detailed search strategy across multiple databases and sources to find relevant studies. Systematic reviews also apply specific, pre-defined inclusion and exclusion criteria, whereas more critical reviews may rely on selection criteria that are subjective or less transparent because they draw on content- or discipline-specific knowledge. An often ignored but critical step in a systematic review is a rigorous assessment of study quality and risk of bias in the included studies.

## Set up: load libraries
library(tidyverse)   # attaches dplyr, ggplot2, readr, stringr, and others
library(readtext)
library(here)
library(gt)
library(knitr)
library(kableExtra)
library(bibliometrix)
library(DiagrammeR)
library(DiagrammeRsvg)
library(rsvg)
library(quanteda)
library(tidytext)
library(quanteda.textmodels)
library(quanteda.textstats)
library(quanteda.textplots)
library(quanteda.corpora)
here::i_am("part02.qmd")

RESEARCH QUESTIONS

  1. What is the intellectual and conceptual structure of research on racism in science, technology, engineering, and mathematics (STEM)?

  2. How are notions of racism in the research on STEM distributed across different racialized social systems?

Scoping

The data for the study come from the Web of Science (WoS) Core Collection. Our initial scoping process included a set of iterative steps to make sense of the global research literature on the various notions of racism in STEM. We prioritized four citation indexes in our searches for the period from 2014 to 2024. Our analysis focused on journal articles written in English in education-related subject categories (Education, Special Education, and Education Scientific Disciplines).

  • Science Citation Index Expanded, SCI-EXPANDED (2002-present)
  • Social Sciences Citation Index, SSCI (2002-present)
  • Arts & Humanities Citation Index, A&HCI (2002-present)
  • Emerging Sources Citation Index, ESCI (2012-present)

Timespan: 2014-01-01 to 2024-12-31

Document Types: Article

Inclusion and Exclusion Criteria

Inclusion and exclusion criteria for the study

| Code | Criteria |
|---|---|
| IC1 | Article contains STEM and one of the notions in the title (TI) or abstract (AB): racism, “white supremacy,” colonialism, xenophobia, nationalism, antiasian, anti-Asian[*], antiblack, Anti-Black[*] |
| IC2 | Article published between 2014 and 2024 |
| IC3 | Article originally written in English |
| IC4 | Article is a journal article |
| IC5 | Article purpose or core questions center on the topical subjects of analysis |
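
Criteria IC1 through IC4 can be checked mechanically, while IC5 needs a person to read each article. The sketch below applies the first four to a small made-up data frame that uses bibliometrix-style field tags (`TI`, `AB`, `PY`, `LA`, `DT`). The records, the regular expressions, and the way the `[*]` wildcard is translated are all assumptions for illustration, not the screening code used in the study.

```r
# Made-up records using bibliometrix-style field tags (text is upper case)
records <- data.frame(
  TI = c("RACISM IN STEM DOCTORAL PROGRAMS",
         "STEM IDENTITY AMONG UNDERGRADUATES",
         "ANTI-BLACK RACISM AND STEM PERSISTENCE",
         "NATIONALISM AND STEM POLICY"),
  AB = c("WE EXAMINE STRUCTURAL RACISM IN ENGINEERING.",
         "WE SURVEY FIRST-YEAR STUDENTS ABOUT BELONGING.",
         "INTERVIEWS WITH BLACK STUDENTS IN PHYSICS.",
         "A POLICY ANALYSIS OF SCIENCE FUNDING."),
  PY = c(2019, 2021, 2012, 2020),
  LA = c("ENGLISH", "ENGLISH", "ENGLISH", "GERMAN"),
  DT = c("ARTICLE", "ARTICLE", "ARTICLE", "ARTICLE"),
  stringsAsFactors = FALSE
)

# IC1: STEM plus at least one notion in the title or abstract
notions <- "RACISM|WHITE SUPREMACY|COLONIALISM|XENOPHOBIA|NATIONALISM|ANTI-?ASIAN|ANTI-?BLACK"
text <- paste(records$TI, records$AB)
ic1 <- grepl("\\bSTEM\\b", text, perl = TRUE) & grepl(notions, text, perl = TRUE)

ic2 <- records$PY >= 2014 & records$PY <= 2024   # IC2: publication window
ic3 <- records$LA == "ENGLISH"                   # IC3: language
ic4 <- grepl("^ARTICLE", records$DT)             # IC4: journal article

# Records that pass IC1-IC4 still need a manual read for IC5
included <- records[ic1 & ic2 & ic3 & ic4, ]
print(included[, c("TI", "PY")])
```

Keeping each criterion as its own logical vector also makes it easy to report how many records each step removed, which is what a flow diagram of the screening process needs.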

Review

Key Columns of Interest:

  • AU: Authors of the publication

  • AB: Abstract text

  • TI: Title of the publication

  • AU_CO: Countries of the authors

  • SC: Subject categories (e.g., “Education & Educational Research”)

  • PY: Publication year

  • TC: Total citations

FINDINGS
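
The console output in this section appears to come from the import and summary steps in bibliometrix, but the code that produced it is not shown. A minimal sketch of what such a step might look like follows. The file path `data/savedrecs.txt` and the object names `M`, `results`, and `S` are assumptions; the later code in this section uses an object called `M4`, which is presumably a filtered version of the imported data.

```r
# Assumed location of a plain-text export from the Web of Science website
wos_file <- "data/savedrecs.txt"

if (requireNamespace("bibliometrix", quietly = TRUE) && file.exists(wos_file)) {
  # Convert the raw export into a bibliographic data frame
  M <- bibliometrix::convert2df(file = wos_file, dbsource = "wos",
                                format = "plaintext")
  print(dim(M))  # rows = documents, columns = field tags

  # Add the AU_CO column (countries of the authors)
  M <- bibliometrix::metaTagExtraction(M, Field = "AU_CO", sep = ";")

  # Descriptive analysis and a summary with the top 10 entries per table
  results <- bibliometrix::biblioAnalysis(M, sep = ";")
  S <- summary(results, k = 10, pause = FALSE)
} else {
  message("bibliometrix or the export file is not available: ", wos_file)
}
```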


Converting your wos collection into a bibliographic dataframe

Done!


Generating affiliation field tag AU_UN from C1:  Done!
[1] 320  69


MAIN INFORMATION ABOUT DATA

 Timespan                              2014 : 2025 
 Sources (Journals, Books, etc)        256 
 Documents                             320 
 Annual Growth Rate %                  -18.11 
 Document Average Age                  4.95 
 Average citations per doc             12.56 
 Average citations per year per doc    1.817 
 References                            18413 
 
DOCUMENT TYPES                     
 article                               295 
 article; early access                 19 
 article; proceedings paper            1 
 editorial material; early access      1 
 review; early access                  4 
 
DOCUMENT CONTENTS
 Keywords Plus (ID)                    703 
 Author's Keywords (DE)                1053 
 
AUTHORS
 Authors                               889 
 Author Appearances                    912 
 Authors of single-authored docs       134 
 
AUTHORS COLLABORATION
 Single-authored docs                  138 
 Documents per Author                  0.36 
 Co-Authors per Doc                    2.85 
 International co-authorships %        9.375 
 

Annual Scientific Production

 Year    Articles
    2014        9
    2015       11
    2016       14
    2017       13
    2018       14
    2019       22
    2020       21
    2021       41
    2022       47
    2023       57
    2024       70
    2025        1

Annual Percentage Growth Rate -18.11 
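
The negative growth rate looks odd next to a table in which output rises almost every year. The figure is consistent with a compound annual growth rate taken from the first year in the data (9 articles in 2014) to the last (1 article in 2025), so the single 2025 record drives the result. The sketch below reproduces the number and recomputes it for 2014 to 2024 only. `cagr()` is a helper written for this illustration; that bibliometrix uses exactly this formula is an assumption based on the numbers matching.

```r
# Annual article counts copied from the "Annual Scientific Production" table
production <- data.frame(
  year = 2014:2025,
  articles = c(9, 11, 14, 13, 14, 22, 21, 41, 47, 57, 70, 1)
)

# Compound annual growth rate (%) between the first and last value
cagr <- function(counts) {
  n_intervals <- length(counts) - 1
  ((counts[length(counts)] / counts[1])^(1 / n_intervals) - 1) * 100
}

growth_all <- cagr(production$articles)                                # 2014 to 2025
growth_complete <- cagr(production$articles[production$year <= 2024])  # 2014 to 2024

print(round(c(all_years = growth_all, through_2024 = growth_complete), 2))
```

When the final year of a collection is incomplete, it is usually more informative to report the growth rate for complete years only.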


Most Productive Authors

   Authors        Articles Authors        Articles Fractionalized
1     MCGEE EO           7   MCGEE EO                        4.70
2     DANCY M            3   SPENCER BM                      2.00
3     ADAMES HY          2   RUSSO-TAIT T                    1.33
4     BROCKMAN AJ        2   BROCKMAN AJ                     1.17
5     BROOKS E           2   BROOKS E                        1.14
6     KIM M              2   ARAMAYO RR                      1.00
7     LEYVA LA           2   ARBUCIAS D                      1.00
8     MCNEILL RT         2   ARMENGOL JM                     1.00
9     MICKELSON R        2   ARONS W                         1.00
10    MISRA DP           2   BABAII E                        1.00


Top manuscripts per citations

                           Paper                                      DOI  TC TCperYear  NTC
1  MCGEE EO, 2016, AM EDUC RES J          10.3102/0002831216676572        230     20.91 4.99
2  MCGEE EO, 2020, EDUC RESEARCHER        10.3102/0013189X20972718        222     31.71 8.05
3  CHAVEZ-DUEÑAS NY, 2019, AM PSYCHOL     10.1037/amp0000289              187     23.38 8.18
4  FUENFSCHILLING L, 2018, RES POLICY     10.1016/j.respol.2018.02.003    155     17.22 4.76
5  MCCOY DL, 2015, J DIVERS HIGH EDUC     10.1037/a0038676                 88      7.33 5.63
6  KIIK L, 2016, EURASIAN GEOGR ECON      10.1080/15387216.2016.1198265    78      7.09 1.69
7  SLAUGHTER-ACEY JC, 2016, ANN EPIDEMIOL 10.1016/j.annepidem.2015.10.005  76      6.91 1.65
8  MCGEE EO, 2019, TEACH COLL REC         NA                               68      8.50 2.97
9  LEE MGJ, 2020, INT J STEM EDUC         10.1186/s40594-020-00241-4       66      9.43 2.39
10 HARTMAN TK, 2021, SOC PSYCHOL PERS SCI 10.1177/1948550620978023         63     10.50 4.55


Corresponding Author's Countries

          Country Articles   Freq SCP MCP MCP_Ratio
1  USA                 192 0.6076 179  13    0.0677
2  UNITED KINGDOM       27 0.0854  24   3    0.1111
3  CANADA               20 0.0633  17   3    0.1500
4  AUSTRALIA            13 0.0411  10   3    0.2308
5  CHINA                 9 0.0285   9   0    0.0000
6  BRAZIL                7 0.0222   5   2    0.2857
7  GERMANY               4 0.0127   4   0    0.0000
8  KOREA                 4 0.0127   4   0    0.0000
9  SOUTH AFRICA          4 0.0127   4   0    0.0000
10 SWEDEN                4 0.0127   3   1    0.2500


SCP: Single Country Publications

MCP: Multiple Country Publications
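
The `MCP_Ratio` column in the country table can be read as the share of a country's articles that involve co-authors from more than one country. The sketch below recomputes it from the first four rows of that table. The column names are my own, and treating the ratio as MCP divided by Articles is an inference from the printed values.

```r
# First four rows of the "Corresponding Author's Countries" table
countries <- data.frame(
  country = c("USA", "UNITED KINGDOM", "CANADA", "AUSTRALIA"),
  articles = c(192, 27, 20, 13),
  scp = c(179, 24, 17, 10),   # single country publications
  mcp = c(13, 3, 3, 3)        # multiple country publications
)

# Every article is either SCP or MCP
stopifnot(all(countries$scp + countries$mcp == countries$articles))

# Share of each country's articles with international co-authorship
countries$mcp_ratio <- round(countries$mcp / countries$articles, 4)
print(countries)
```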


Total Citations per Country

     Country      Total Citations Average Article Citations
1  USA                       2754                     14.34
2  UNITED KINGDOM             304                     11.26
3  CANADA                     238                     11.90
4  SWEDEN                     178                     44.50
5  BRAZIL                      82                     11.71
6  ESTONIA                     78                     78.00
7  AUSTRALIA                   64                      4.92
8  ISRAEL                      60                     30.00
9  ECUADOR                     58                     58.00
10 CHINA                       45                      5.00


Most Relevant Sources

                                                      Sources        Articles
1  JOURNAL OF CHEMICAL EDUCATION                                            6
2  RACE ETHNICITY AND EDUCATION                                             6
3  CULTURAL STUDIES OF SCIENCE EDUCATION                                    5
4  JOURNAL OF RESEARCH IN SCIENCE TEACHING                                  5
5  ETHNIC AND RACIAL STUDIES                                                4
6  INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH        4
7  INTERNATIONAL JOURNAL OF STEM EDUCATION                                  4
8  JOURNAL OF DIVERSITY IN HIGHER EDUCATION                                 4
9  JOURNAL OF RACIAL AND ETHNIC HEALTH DISPARITIES                          4
10 NATIONS AND NATIONALISM                                                  4


Most Relevant Keywords

   Author Keywords (DE)      Articles Keywords-Plus (ID)     Articles
1          RACISM                  34         RACE                 43
2          RACE                    21         EXPERIENCES          31
3          STEM                    15         SCIENCE              29
4          NATIONALISM             14         WOMEN                29
5          HIGHER EDUCATION        13         EDUCATION            28
6          COLONIALISM             11         IDENTITY             23
7          COVID-19                11         HEALTH               22
8          GENDER                  11         STUDENTS             22
9          INTERSECTIONALITY       10         DISCRIMINATION       18
10         DIVERSITY                9         DISPARITIES          17

Descriptive (performance) analysis

Summary of the data set and documents.

Keywords and Keywords Plus

Author Keywords and Keywords-Plus

S[10] # Author Keywords and Keywords-Plus
$MostRelKeywords
   Author Keywords (DE)      Articles Keywords-Plus (ID)     Articles
1          RACISM                  34         RACE                 43
2          RACE                    21         EXPERIENCES          31
3          STEM                    15         SCIENCE              29
4          NATIONALISM             14         WOMEN                29
5          HIGHER EDUCATION        13         EDUCATION            28
6          COLONIALISM             11         IDENTITY             23
7          COVID-19                11         HEALTH               22
8          GENDER                  11         STUDENTS             22
9          INTERSECTIONALITY       10         DISCRIMINATION       18
10         DIVERSITY                9         DISPARITIES          17

Keyword Co-occurrence Network

# Classical keyword co-occurrences network
NetMatrix1 <- biblioNetwork(M4, analysis = "co-occurrences", network = "keywords", sep = ";")

# statistics for the network
netstat1 <- networkStat(NetMatrix1)
summary(netstat1, k=10)


Main statistics about the network

 Size                                  703 
 Density                               0.016 
 Transitivity                          0.286 
 Diameter                              6 
 Degree Centralization                 0.229 
 Average path length                   2.883 
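
To see what a keyword co-occurrence network contains, it helps to build one by hand. The sketch below uses three made-up records with semicolon-separated keywords, builds a document-by-keyword matrix, and multiplies it by itself to count how often each pair of keywords appears together. It then computes density as the share of possible keyword pairs that are linked. This is a conceptual illustration in base R under those assumptions, not the internals of `biblioNetwork()` or `networkStat()`.

```r
# Toy keyword field in a semicolon-separated style (made-up records)
keyword_field <- c("RACE; SCIENCE; WOMEN",
                   "RACE; IDENTITY",
                   "SCIENCE; WOMEN")

keyword_lists <- strsplit(keyword_field, ";\\s*")
vocab <- sort(unique(unlist(keyword_lists)))

# Document-by-keyword indicator matrix (1 = keyword appears in the document)
doc_term <- t(vapply(keyword_lists,
                     function(k) as.integer(vocab %in% k),
                     integer(length(vocab))))
colnames(doc_term) <- vocab

# Keyword-by-keyword co-occurrence counts; the diagonal holds each keyword's total
cooc <- crossprod(doc_term)

# Density: share of possible keyword pairs that co-occur at least once
adjacency <- cooc > 0
n_edges <- sum(adjacency[upper.tri(adjacency)])
density <- n_edges / choose(length(vocab), 2)

print(cooc)
print(round(density, 3))
```

A low density such as the 0.016 reported for the 703-keyword network means that most pairs of keywords never appear on the same article.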
 
# Plot the network
set.seed(3)

# Circle layout, top 25 keywords, no clustering
net1a <- networkPlot(NetMatrix1,
                     n = 25,  # Limit to top 25 keywords
                     normalize = "association",
                     Title = "Top Keyword Co-Occurrences",
                     type = "circle",
                     size = TRUE,
                     remove.multiple = FALSE,
                     labelsize = 0.7,
                     cluster = "none")

# Kamada-Kawai layout, top 30 keywords, Louvain clusters
net1b <- networkPlot(NetMatrix1,
                     n = 30,  # Limit to top 30 keywords
                     normalize = "association",
                     Title = "Keyword Network",
                     type = "kamada",
                     size = TRUE,
                     remove.multiple = TRUE,
                     labelsize = 0.5,
                     cluster = "louvain")

# Fruchterman-Reingold layout, top 30 keywords, no normalization
net1c <- networkPlot(NetMatrix1,
                     n = 30,  # Limit to top 30 keywords
                     #normalize = "association",
                     #weighted = T,
                     Title = "Keyword Co-Occurrence Network",
                     type = "fruchterman",
                     size = TRUE,
                     remove.multiple = TRUE,
                     labelsize = 0.5,
                     cluster = "louvain")
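
Two of the plots set `normalize = "association"`. Association strength is commonly defined as the co-occurrence count of two keywords divided by the product of their individual occurrence counts, which keeps very frequent keywords from dominating the picture. The sketch below applies that definition to a small made-up co-occurrence matrix. That `networkPlot()` applies exactly this formula is an assumption here, so treat the code as an illustration of the idea.

```r
# Made-up co-occurrence matrix; the diagonal is each keyword's total occurrences
kw <- c("RACE", "SCIENCE", "WOMEN")
cooc <- matrix(c(2, 1, 1,
                 1, 3, 2,
                 1, 2, 2),
               nrow = 3, byrow = TRUE,
               dimnames = list(kw, kw))

# Association strength: c_ij / (c_i * c_j)
occurrences <- diag(cooc)
assoc <- cooc / outer(occurrences, occurrences)

print(round(assoc, 3))
```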

Conceptual Structure Map

Conceptual Structure

suppressWarnings(CS1 <- conceptualStructure(M4,
                                            method="MCA", 
                                            field="ID", 
                                            minDegree=15, 
                                            clust=5, 
                                            stemming=FALSE, 
                                            labelsize=15,
                                            documents=20)
                 )

# Conceptual Structure using keywords (method="CA")
CS <- conceptualStructure(M4,
                           field="ID",
                           method="CA",
                           minDegree=4,
                           clust=5,
                           stemming=FALSE,
                           labelsize=10,  # Set to 0 to remove labels
                           documents=10)

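# Extract components by position. Run names(CS) to confirm which element
# is which, since the positions may differ from the labels assumed here.
coords <- CS[[1]]    # intended: coordinates
clusters <- CS[[2]]  # intended: cluster assignments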

CS[4]
$graph_terms

# Create a historical citation network
options(width=130)
histResults <- histNetwork(M4, min.citations = 5, sep = ";")

WOS DB:
Searching local citations (LCS) by reference items (SR) and DOIs...

Analyzing 19923 reference items...

Found 29 documents with no empty Local Citations (LCS)
# Plot the historical direct citation network
net <- histPlot(histResults, n=15, size = 8, labelsize=4)


 Legend

                                                                            Label
1                         MCCOY DL, 2015, J DIVERS HIGH EDUC DOI 10.1037/A0038676
2                      MCGEE EO, 2016, AM EDUC RES J DOI 10.3102/0002831216676572
3                           BROWN BA, 2016, J RES SCI TEACH DOI 10.1002/TEA.21249
4                           MCGEE E, 2018, AERA OPEN DOI 10.1177/2332858418816658
5                   LEE MGJ, 2020, INT J STEM EDUC DOI 10.1186/S40594-020-00241-4
6                    MCGEE EO, 2020, EDUC RESEARCHER DOI 10.3102/0013189X20972718
7                   DANCY M, 2020, INT J STEM EDUC DOI 10.1186/S40594-020-00250-3
8                        VAN DUSEN B, 2020, J RES SCI TEACH DOI 10.1002/TEA.21584
9                           SPENCER BM, 2021, SOCIOL FORUM DOI 10.1111/SOCF.12724
10 NISSEN JM, 2021, PHYS REV PHYS EDUC R DOI 10.1103/PHYSREVPHYSEDUCRES.17.010116
11                  ALLEN D, 2022, INT J STEM EDUC DOI 10.1186/S40594-022-00334-2
12                      RUSSO-TAIT T, 2022, J RES SCI TEACH DOI 10.1002/TEA.21775
13                MCNEILL RT, 2022, J LEARN SCI DOI 10.1080/10508406.2022.2073233
14                    KING GP, 2023, CBE-LIFE SCI EDUC DOI 10.1187/CBE.22-06-0104
15                    WILKINS-YEL KG, 2023, J RES SCI TEACH DOI 10.1002/TEA.21798
16           MCGEE EO, 2023, ETHNIC RACIAL STUD DOI 10.1080/01419870.2022.2159474
17             FORSYTHE D, 2024, J HIGH EDUC-UK DOI 10.1080/00221546.2023.2265285
                                                                                                                                                                 Author_Keywords
1                                                                                                                                COLORBLIND RACISM; MENTORING; STUDENTS OF COLOR
2                                                                    STEREOTYPE MANAGEMENT; STEM STUDENTS OF COLOR; RACIAL HOSTILITY IN ACADEMIA; STEM RACIAL GAP; CULTURAL BIAS
3                                                                                                           AFRICAN-AMERICAN; SCIENCE IDENTITY; MATRICULATION; ACCESS TO SCIENCE
4                                                     RACIAL STEREOTYPES; STEREOTYPE LIFT; STEREOTYPE THREAT; COLLEGE STEM OUTCOMES; RACIAL TRAUMA; BLACK; ASIAN; POLITICAL RACE
5                                                                                       RACIAL MICROAGGRESSIONS; HIGHER EDUCATION; STEM; EDUCATIONAL SETTING; DIVERSITY CONCERNS
6  CULTURAL ANALYSIS; DISPARITIES; DOCTORAL; ENGINEERING EDUCATION; ENTREPRENEURSHIP; HBCUS; HIGHER EDUCATION; MENTORING; MINORITIZED; RACE; STEM; STRUCTURAL RACISM; TECHNOLOGY
7                                                                                                                             RACE; GENDER; UNDERGRADUATES; QUALITATIVE RESEARCH
8                                        CRITICAL QUANTITATIVE INTERSECTIONALITY; EQUALITY; EQUITY; GENDER; HIERARCHICAL LINEAR MODEL; HIGHER EDUCATION; LEARNING; PHYSICS; RACE
9                                                  BLACK MEN; CRITICAL RACE THEORY; PSYCHOLOGICAL HEALTH AND WELL-BEING; RACISM; RESPECTABILITY POLITICS; STEM DOCTORAL PROGRAMS
10                                                                                                                                                                          <NA>
11                                                                                                     BLACK; MINORITY; WOMEN; STEM; COMMUNITY COLLEGE; TRANSFER; RACISM; SEXISM
12                                                                     COLLEGE SCIENCE FACULTY; COLOR-BLIND RACISM; CRITICAL RACIAL CONSCIOUSNESS; RACIALLY MINORITIZED STUDENTS
13                                                                                                                                                                          <NA>
14                                                                                                                                                                          <NA>
15                                                                                                                COUNTERSPACE; MENTAL HEALTH; PERSISTENCE; STEM; WOMEN OF COLOR
16                                                                     WOMEN OF COLOR; ENGINEERING EDUCATION; HIGHER EDUCATION; SALARY; IDENTITY TAXATION; STEREOTYPE MANAGEMENT
17                                                                                                                           WHITE SUPREMACY; ANTI-RACISM; STEM; WOMEN; ACTIVISM
                                                                                                                                                         KeywordsPlus
1                                                                                   COLLEGE-STUDENTS; AFRICAN-AMERICAN; GENDER; RACE; PERCEPTIONS; EXPERIENCES; WOMEN
2                                 CRITICAL RACE THEORY; STEREOTYPE THREAT; SCIENCE; EXPERIENCES; EDUCATION; MICROAGGRESSIONS; MATHEMATICS; DIVERSITY; STUDENTS; COLOR
3                                                                                                                                  EXPERIENCES; PERSISTENCE; STUDENTS
4                                              AFRICAN-AMERICAN; HIGHER-EDUCATION; IDENTITY; RACE; MICROAGGRESSIONS; MATHEMATICS; EXPERIENCES; CAREERS; IMPACT; WOMEN
5  AFRICAN-AMERICAN STUDENTS; CRITICAL RACE THEORY; PREDOMINANTLY WHITE; GENDER-DIFFERENCES; COLLEGE-STUDENTS; EXPERIENCES; CLIMATE; COLOR; STEREOTYPE; OPPORTUNITIES
6                                                                  MATHEMATICS EDUCATION; COLOR; BLACK; RACE; SCIENCE; IDENTITY; STUDENTS; HEALTH; WOMEN; EXPERIENCES
7                                                                                                                                   DOUBLE BIND; SCIENCE; WOMEN; RACE
8                                                                                                                                                              GENDER
9                                                                                                           RACE; COLLEGE; MASCULINITY; PERSISTENCE; EDUCATION; WOMEN
10                                                                              COLORADO LEARNING ATTITUDES; SELF-EFFICACY; WOMEN; GENDER; MODEL; IMPUTATION; BELIEFS
11               FEMALE TRANSFER STUDENTS; HIGHER-EDUCATION; TRANSFER SHOCK; ACADEMIC-PERFORMANCE; CHILLY CLIMATE; DOUBLE BIND; COLOR; GENDER; SCIENCE; UNDERGRADUATE
12                                                               STUDENTS; WOMEN; RACE; UNDERGRADUATE; EDUCATION; RACISM; GAPS; ACHIEVEMENT; PERFORMANCE; PERSISTENCE
13                                    CRITICAL RACE THEORY; MATHEMATICS; PERSISTENCE; IDENTITIES; BLACK; OPPORTUNITIES; PERCEPTIONS; EXPERIENCES; EDUCATION; IDEOLOGY
14                                                                      HIGHER-EDUCATION; RACISM; COLOR; MATHEMATICS; STUDENTS; EXPERIENCES; IDEOLOGY; CLIMATE; WOMEN
15                    CAMPUS RACIAL CLIMATE; CRITICAL RACE THEORY; HIDDEN CURRICULUM; BLACK-WOMEN; EXPERIENCES; SCIENCE; WHITE; MICROAGGRESSIONS; EDUCATION; IDENTITY
16                                                                          GENDER STEREOTYPES; STEM; RACE; BLACK; PREJUDICE; PROMOTION; EMOTION; EQUITY; WOMAN; NEED
17                                                                                                                                EDUCATION; STUDENTS; SCIENCE; COLOR
                                    DOI Year LCS GCS
1                      10.1037/a0038676 2015   5  88
2              10.3102/0002831216676572 2016  14 230
3                     10.1002/tea.21249 2016   2  62
4              10.1177/2332858418816658 2018   4  61
5            10.1186/s40594-020-00241-4 2020   3  66
6              10.3102/0013189X20972718 2020  10 222
7            10.1186/s40594-020-00250-3 2020   2  23
8                     10.1002/tea.21584 2020   1  55
9                    10.1111/socf.12724 2021   1   9
10 10.1103/PhysRevPhysEducRes.17.010116 2021   1  32
11           10.1186/s40594-022-00334-2 2022   2  14
12                    10.1002/tea.21775 2022   3  22
13        10.1080/10508406.2022.2073233 2022   3   8
14               10.1187/cbe.22-06-0104 2023   2  18
15                    10.1002/tea.21798 2023   1  10
16        10.1080/01419870.2022.2159474 2023   1   5
17        10.1080/00221546.2023.2265285 2024   1   2
# LEXICAL PATTERNS
# Build a corpus from the abstracts and tokenize it
# (the basis for keywords-in-context and frequency analysis)
M4_abstract <- corpus(M4$AB)
toks_M4_abstract <- tokens(M4_abstract)

# Drop punctuation, numbers, and English stopwords
toks <- toks_M4_abstract
toks_clean <- tokens(toks,
                     remove_punct = TRUE,
                     remove_numbers = TRUE) %>%
  tokens_remove(stopwords("english"))
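
Once abstracts are tokenized, a keywords-in-context (KWIC) view shows each match of a pattern with a few words on either side, which is a quick way to check how a term is actually used. The sketch below runs quanteda's `kwic()` on two made-up abstracts; the texts, the pattern `"racis*"`, and the window size are assumptions for illustration.

```r
library(quanteda)

# Two made-up abstracts standing in for M4$AB
abstracts <- c(
  doc1 = "Racism shapes the experiences of Black students in STEM.",
  doc2 = "We examine how structural racism and racist stereotypes affect STEM faculty."
)

toks_example <- tokens(corpus(abstracts), remove_punct = TRUE)

# Every token starting with "racis" (case-insensitive), with 3 words of context
kw <- kwic(toks_example, pattern = "racis*", window = 3)
print(kw)
```

Run KWIC on tokens that still contain stopwords, as here, so the surrounding context reads naturally.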

Top token frequencies

# Top token frequencies
top_tokens <- toks_clean %>%
  tokens_group() %>%
  dfm() %>%
  textstat_frequency(n = 20)
# Top tokens from abstracts, minus generic academic words
top_tokens %>%
  filter(!feature %in% c("research", "study", "article", "also", "can"))
       feature frequency rank docfreq group
1         stem       343    1     145   all
2       racism       330    2     186   all
3        black       323    3      88   all
4       health       249    4      67   all
5     students       208    5      60   all
6       social       202    6      98   all
8       racial       192    8      93   all
10       women       176   10      54   all
11       white       174   11      79   all
12 experiences       160   12      80   all
14   education       127   14      67   all
17        data       115   17      78   all
18  indigenous       114   18      33   all
19     science       113   19      50   all
20        race       109   20      66   all
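
The `frequency` and `docfreq` columns answer different questions: `frequency` counts every occurrence of a token, while `docfreq` counts the number of abstracts that contain it at least once. A token such as "black" (323 occurrences in 88 abstracts) is therefore concentrated in fewer documents than "racism" (330 occurrences in 186 abstracts). The sketch below shows the difference on three made-up texts using quanteda's `dfm()` and `docfreq()`.

```r
library(quanteda)

# Three made-up abstracts
abstracts <- c(
  d1 = "STEM students describe racism in STEM classrooms",
  d2 = "Racism and health",
  d3 = "STEM faculty"
)

dfmat <- dfm(tokens(abstracts))  # dfm() lowercases tokens by default

total_counts <- colSums(dfmat)   # total occurrences of each token
doc_counts <- docfreq(dfmat)     # number of documents containing each token

print(rbind(frequency = total_counts[c("stem", "racism")],
            docfreq = doc_counts[c("stem", "racism")]))
```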