TY - JOUR
T1 - Cascaded Semantic Fractionation for identifying a domain in social media
AU - Danowski, James
AU - Riopelle, Ken
AU - Yan, Bei
N1 - Publisher Copyright: Copyright © 2024 Danowski, Riopelle and Yan.
PY - 2024
Y1 - 2024
AB - Searching social media to find relevant semantic domains often results in large text files, many of which are irrelevant due to cross-domain content resulting from word polysemy, abstractness, and degree centrality. Through an iterative pruning process, Cascaded Semantic Fractionation (CSF) systematically removes these cross-domain links. The social network procedure performs community detection in semantic networks, locates the semantic groups containing the terms of interest, excludes intergroup links, and repeats community detection on the pruned intragroup network until the domain of interest is clarified. To illustrate CSF, we analyzed public Facebook posts, using the CrowdTangle app for historical data search, from February 3, 2020, to March 13, 2021, about the possible Wuhan lab leak of COVID-19 over a daily interval. The initial search using keywords located six multi-day bursts of posts of more than 500 per day among 95 K posts. These posts were network analyzed to find the domain of interest using the iterative community detection and pruning process. CSF can be applied to capture the evolutions in semantic domains over time. At the outset, the lab leak theory was presented in conspiracy theory terms. Over time, the conspiratorial elements washed out in favor of an accidental release as the issue moved from social to mainstream media and official government views. CSF identified the relevant social media semantic domain and tracked its changes.
KW - COVID-19
KW - CrowdTangle
KW - Facebook
KW - semantic networks
KW - start words
UR - http://www.scopus.com/inward/record.url?scp=85188067058&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85188067058&partnerID=8YFLogxK
U2 - 10.3389/frma.2024.1189099
DO - 10.3389/frma.2024.1189099
M3 - Article
AN - SCOPUS:85188067058
VL - 9
JO - Frontiers in Research Metrics and Analytics
JF - Frontiers in Research Metrics and Analytics
M1 - 1189099
ER -