Contextual Classification Methodology

General Overview

Sounder’s contextual category classification model identifies topics of conversation at the segment level of podcast episodes according to the IAB Content Taxonomy (3.0).

Methodology

Contextual Classification

Podcast episodes are transcribed using Sounder’s proprietary automatic speech recognition (ASR) engine. Using natural language processing (NLP), we further analyze the transcriptions to identify named entities, topics, sentiment, and tone, among other identifiers that support contextual category identification.

To illustrate the 90-day window used for show-level scores: if a show’s most recent processed episode was on April 15, then any episodes published and processed between January 15 and April 14 would be included in the show-level score calculation.
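The date arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration, not Sounder’s implementation: the function names are hypothetical, the year 2023 is assumed only to make the dates concrete, and the inclusive bounds are chosen to reproduce the January 15 to April 14 range given above.

```python
from datetime import date, timedelta
from typing import Tuple

WINDOW_DAYS = 90  # window length named in this document


def show_level_window(latest_processed: date) -> Tuple[date, date]:
    """Return inclusive (start, end) bounds of the window.

    Assumption: the window covers the 90 days before the most recent
    processed episode, matching the April 15 example.
    """
    start = latest_processed - timedelta(days=WINDOW_DAYS)
    end = latest_processed - timedelta(days=1)
    return start, end


def in_window(episode_date: date, latest_processed: date) -> bool:
    """True if an episode date falls inside the window."""
    start, end = show_level_window(latest_processed)
    return start <= episode_date <= end


start, end = show_level_window(date(2023, 4, 15))
print(start, end)  # 2023-01-15 2023-04-14
```

In a leap year the same 90-day subtraction would give a start date of January 16, so the exact bounds depend on the calendar year.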

Contextual Category Results

All contextual categories are classified at the segment level, summed within the same 90-day window, and then inherited at the show level.
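One way to picture the roll-up from segments to the show level is a simple tally per category. The sketch below assumes that "summed" means counting segment-level labels across the episodes in the window and that each segment carries one label; the source does not specify the actual aggregation, and the function name and category labels are placeholders.

```python
from collections import Counter
from typing import Iterable, List


def show_level_totals(episodes_in_window: Iterable[List[str]]) -> Counter:
    """Sum segment-level category labels across episodes.

    Each inner list holds the category label of every segment in one
    episode (assumed: one label per segment, episodes already limited
    to the 90-day window).
    """
    totals = Counter()
    for segment_categories in episodes_in_window:
        totals.update(segment_categories)
    return totals


episodes = [
    ["Sports", "Sports", "Sports", "Travel"],  # episode 1
    ["Sports", "Travel", "Travel"],            # episode 2
]
totals = show_level_totals(episodes)
print(totals["Sports"], totals["Travel"])  # 4 3
```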

Interpreting Contextual Category Confidence Scores

Sounder identifies contextual categories of podcast content at the segment level, then summarizes those results into episode-level and show-level classifications. For a category to be reported at the episode level, a single podcast episode must contain at least three segments of the same category.
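The three-segment rule can be expressed as a small filter. This is a sketch under the assumption that each segment carries a single category label; the function name and the sample labels are hypothetical.

```python
from collections import Counter
from typing import List

MIN_SEGMENTS = 3  # threshold stated in the text above


def episode_categories(segment_categories: List[str]) -> List[str]:
    """Return categories that appear in at least MIN_SEGMENTS segments."""
    counts = Counter(segment_categories)
    return sorted(cat for cat, n in counts.items() if n >= MIN_SEGMENTS)


segments = ["Sports", "Sports", "Travel", "Sports", "Travel"]
print(episode_categories(segments))  # ['Sports']
```

Here "Travel" appears in only two segments, so it is not surfaced for the episode even though it was detected at the segment level.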

The concentration and depth of conversation about an identified category determine its confidence score. For example, a discussion focused on a single topic for roughly 6 to 8 minutes will likely result in a 100% confidence score for the related contextual category. In contrast, a conversation about the same topic that lasts only a couple of minutes or less will likely result in a confidence score of 60% or less.

Because Sounder’s models are not keyword-based and rely on contextual signals such as sentiment and tone, a given number of identifiers does not guarantee a particular confidence score.