Cell Biology · JoVE (Open Access)

Cloud-Based Phrase Mining and Analysis of User-Defined Phrase-Category Association in Biomedical Publications

DOI: 10.3791/59108-v
What you'll learn
  • Implement cloud-based automated phrase mining from biomedical literature
  • Calculate phrase-category associations using CaseOLAP scoring
  • Analyze domain-specific concepts through text-cube data structures
  • Apply semantic online analytical processing to knowledge discovery
Protocol

We present a protocol, with associated programming code and metadata samples, for cloud-based automated identification of phrase-category associations that represent unique concepts in a user-selected knowledge domain of the biomedical literature. The phrase-category associations quantified by this protocol can facilitate in-depth analysis of the selected knowledge domain.

Difficulty
advanced
Total time
~4-8 hours (depending on literature dataset size and cloud processing)

Steps

1
Create text-cube data structure

Build a multi-dimensional text-cube representation from a corpus of biomedical publications. This foundational step organizes phrases, categories, and documents into a queryable three-dimensional data structure for downstream analysis.

▶ 01:13
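
As an illustration of the text-cube idea, the following minimal Python sketch groups documents into one cell per category. The function name `build_text_cube`, the descriptor-based assignment rule, and the input layout are assumptions made for this example; they are not taken from the protocol's own code.

```python
from collections import defaultdict


def build_text_cube(documents, category_terms):
    """Assign each document to every category whose terms it carries.

    documents: dict mapping doc_id -> set of descriptor labels (hypothetical layout)
    category_terms: dict mapping category name -> set of descriptor labels
    Returns a dict mapping category name -> set of doc_ids (one "cell" per category).
    """
    cube = defaultdict(set)
    for doc_id, descriptors in documents.items():
        for category, terms in category_terms.items():
            # A document joins a cell when it shares at least one descriptor with it.
            if descriptors & terms:
                cube[category].add(doc_id)
    return dict(cube)


if __name__ == "__main__":
    docs = {"d1": {"infant"}, "d2": {"adult"}, "d3": {"infant", "adult"}}
    categories = {"Infant": {"infant"}, "Adult": {"adult"}}
    print(build_text_cube(docs, categories))
```

A document may land in more than one cell, which is why each cell holds a set of document IDs rather than each document holding a single category.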
2
Count and enumerate entities

Quantify phrase and category occurrences across the document collection. This establishes baseline frequency metrics required for subsequent association scoring.

▶ 03:34
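
The counting step can be sketched as follows, assuming the cube is a mapping from category to document IDs and that phrases are matched by case-insensitive substring search. Both the layout and the naive matching rule are assumptions for illustration; a real pipeline would more likely use tokenization or an index.

```python
from collections import Counter


def count_entities(cube, doc_texts, phrases):
    """Count phrase occurrences per category cell.

    cube: dict mapping category -> iterable of doc_ids (assumed layout)
    doc_texts: dict mapping doc_id -> document text
    phrases: iterable of phrase strings to look for
    Returns a dict mapping category -> Counter of phrase -> total occurrences.
    """
    counts = {}
    for category, doc_ids in cube.items():
        cell = Counter()
        for doc_id in doc_ids:
            lowered = doc_texts[doc_id].lower()
            for phrase in phrases:
                # str.count tallies non-overlapping occurrences of the substring.
                n = lowered.count(phrase.lower())
                if n > 0:
                    cell[phrase] += n
        counts[category] = cell
    return counts
```

Keeping one counter per cell gives the per-category frequencies that a later scoring step can compare across cells.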
3
Update metadata attributes

Refresh and annotate phrase-category metadata with occurrence counts and dimensional coordinates. This step prepares data for semantic scoring algorithms.

▶ 04:55
4
Calculate CaseOLAP association scores

Apply context-aware semantic online analytical processing (CaseOLAP) to compute the strength and significance of phrase-category associations. This quantifies the relationships between domain-specific concepts.

▶ 05:50
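
To make the idea of an association score concrete, here is a simplified stand-in written in Python. It is not the published CaseOLAP formula: it only combines a log-damped popularity of a phrase within a cell with a distinctiveness term that compares the cell against all cells. The function name and both component definitions are assumptions chosen for illustration.

```python
import math


def association_scores(counts):
    """Score each phrase in each category cell (simplified, illustrative only).

    counts: dict mapping category -> dict of phrase -> occurrence count
    Returns a dict mapping category -> dict of phrase -> score in (0, 1].
    """
    scores = {}
    for category, cell in counts.items():
        total = sum(cell.values())
        scores[category] = {}
        for phrase, n in cell.items():
            if n <= 0:
                continue  # phrases that never occur receive no score
            # Popularity: log-damped frequency relative to the size of the cell.
            popularity = math.log(1 + n) / math.log(1 + total)
            # Distinctiveness: share of the phrase's occurrences found in this cell.
            overall = sum(other.get(phrase, 0) for other in counts.values())
            distinctiveness = n / overall
            scores[category][phrase] = popularity * distinctiveness
    return scores
```

Multiplying the two terms means a phrase scores highly only when it is both frequent in a cell and concentrated there; a phrase spread evenly across all cells is penalized by the distinctiveness term.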
5
Perform representative case analyses

Execute domain-specific queries and interpret the phrase-category associations in selected knowledge areas. This step demonstrates how the mining results apply to biomedical research questions.

▶ 07:00
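
One simple kind of query is to rank the phrases most strongly associated with a chosen category. The sketch below assumes scores are stored as a nested mapping from category to phrase to score; the function name `top_phrases` and this layout are hypothetical.

```python
def top_phrases(scores, category, k=3):
    """Return the k highest-scoring (phrase, score) pairs for one category.

    scores: dict mapping category -> dict of phrase -> score (assumed layout)
    Ties are broken alphabetically so that the ranking is deterministic.
    """
    ranked = sorted(scores[category].items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

Running the same query for each category and comparing the resulting lists is one way to see which phrases characterize one category but not another.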