Taxonomy of Citation Contexts: A Framework for Systematic Analysis of Reference Text Extraction in Computational Linguistic

Authors

  • Afsheen Khalid
  • Dilawar Khan
  • Fazal Malik, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan
  • Ashraf Ullah

Keywords:

Citation context, reference extraction, computational linguistics, text categorization, fixed-window extraction, NLP.

Abstract

Citation contexts (CCs), the text surrounding citation markers, are helpful summaries of referenced materials but are challenging for machine analysis because they vary in structure and are inherently vague. Current methods largely apply fixed-window extraction, which tends to capture unnecessary information or miss key points. This weakness motivates a more formal analysis of CCs. Previous approaches lack a comprehensive framework for categorizing CCs by syntactic scope, information completeness, and ambiguity, which makes them less effective for computational linguistics applications such as reference extraction and sentiment analysis. To address this gap, our research constructs a detailed taxonomy that classifies CCs by their positional, syntactic, and contextual properties. We examined 100 research papers from the ACL Anthology Network, manually classifying CCs along four major dimensions: citation position (head, mid, tail), syntactic unit (phrase, clause, sentence), missing information, and ambiguity. Our results show that tail-position citations usually refer to whole statements, whereas head and mid citations need accurate scope identification. Notably, 95% of CCs are single-sentence, with phrases and clauses being most frequent in mid-position citations. Moreover, 15% of CCs showed ambiguity that challenged even human annotators. This taxonomy facilitates CC processing in applications such as reference extraction and opens up future research directions in multi-sentence CCs and machine learning-based analysis.
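
To make the four dimensions concrete, the following is a minimal Python sketch of how an annotated CC might be recorded, alongside a naive fixed-window extractor of the kind the abstract criticizes. It is not the authors' implementation: the class names, the author-year marker pattern, and the rule that a marker with no words before it is "head" and one with no words after it is "tail" are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import List


class Position(Enum):
    HEAD = "head"
    MID = "mid"
    TAIL = "tail"


class SyntacticUnit(Enum):
    PHRASE = "phrase"
    CLAUSE = "clause"
    SENTENCE = "sentence"


@dataclass
class CitationContext:
    """One annotated CC along the four dimensions named in the abstract."""
    sentence: str
    position: Position
    unit: SyntacticUnit
    missing_information: bool
    ambiguous: bool


# Assumed marker style: parenthetical author-year, e.g. "(Smith, 2010)".
MARKER = re.compile(r"\([A-Z][^()]*?\d{4}[a-z]?\)")


def _words(text: str) -> List[str]:
    # Keep only tokens containing a letter or digit, so trailing
    # punctuation such as "." does not count as content.
    return [t for t in text.split() if any(c.isalnum() for c in t)]


def citation_position(sentence: str) -> Position:
    """Assumed rule: no words before the marker is head, none after is tail."""
    match = MARKER.search(sentence)
    if match is None:
        raise ValueError("no citation marker found")
    if not _words(sentence[:match.start()]):
        return Position.HEAD
    if not _words(sentence[match.end():]):
        return Position.TAIL
    return Position.MID


def fixed_window(tokens: List[str], marker_index: int, width: int) -> List[str]:
    """Return `width` tokens on each side of the marker, ignoring syntax."""
    start = max(0, marker_index - width)
    return tokens[start:marker_index + width + 1]
```

Because `fixed_window` cuts at a token count and ignores phrase, clause, and sentence boundaries, it can return text outside the cited statement or truncate it, which is the limitation the taxonomy is meant to address.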

Published

2025-04-27

How to Cite

Afsheen Khalid, Dilawar Khan, Malik, F., & Ashraf Ullah. (2025). Taxonomy of Citation Contexts: A Framework for Systematic Analysis of Reference Text Extraction in Computational Linguistic. Dialogue Social Science Review (DSSR), 3(4), 873–887. Retrieved from http://www.thedssr.com/index.php/2/article/view/514

Section

Articles
