Date of Award
8-6-2024
Document Type
Thesis
Publisher
Santa Clara : Santa Clara University, 2024
Degree Name
Master of Science (MS)
Department
Computer Science and Engineering
First Advisor
Yi Fang
Abstract
This thesis explores the potential of Large Language Models (LLMs) to automate the extraction of sourcing information from news articles, a crucial step towards enhancing transparency and ethical analysis in journalism. We evaluate two state-of-the-art LLMs, GPT-4 and Claude 3, on identifying and categorizing source types across four diverse news articles. We use a zero-shot approach with two prompt designs to assess the models’ ability to adapt to varying source structures and prompt instructions.
Our findings show that while LLMs show promise in extracting sourcing information, their performance varies considerably across article types and source structures. The research highlights the complex interplay between prompt design, source types, and model performance, with both LLMs demonstrating strengths and limitations in handling diverse journalistic contexts. This thesis contributes to the growing body of work on AI in journalism by providing initial insights into the current capabilities of LLMs in sourcing analysis and by outlining key areas for future research and development in automated ethical analysis of news content.
Recommended Citation
Wang, Jingsen, "A Step Towards Automated Ethical Analysis in Journalism: Measuring LLMs’ Performance in Extracting Sourcing Information" (2024). Computer Science and Engineering Master's Theses. 43.
https://scholarcommons.scu.edu/cseng_mstr/43