Triton Digital employs techniques based on identifiers, activity, and patterns per the data in the log files, in an attempt to identify and filter (exclude) invalid activity. Invalid activity includes, but is not limited to, known and suspected non-human activity and suspected invalid human activity. However, user identification and intent cannot always be detected or discerned by the network, advertiser, or their respective agents, and it is unlikely that all invalid activity can be identified and excluded from report results. Details on our techniques are described below.
One Minute Rule
Due to the nature of podcasting activity, and the general behavior of robotic/spider-related traffic, Triton Digital treats any download covering less than one minute of content as invalid and removes it from all collected data. The exception is an episode/file that is itself under a minute long, in which case the full file must be downloaded for the download to count. This rule reduces noise from extremely short sessions, robotic activity, and initial connectivity issues.
If the information required to measure the downloaded content length is not available, we apply adjustment factors to remove equivalent short sessions. This process has been audited by the IAB Tech Lab.
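The rule can be illustrated with a minimal sketch. It is not Triton Digital's implementation: the function name, its parameters, and the constant-bitrate assumption used to convert bytes into seconds are all hypothetical, chosen only to make the logic concrete.

```python
ONE_MINUTE_SECONDS = 60


def passes_one_minute_rule(bytes_downloaded: int, file_bytes: int,
                           episode_seconds: float) -> bool:
    """Return True if a download covers at least one minute of content.

    Assumption (not from the source): the file has a constant bitrate,
    so downloaded duration is proportional to downloaded bytes.
    """
    if file_bytes <= 0 or episode_seconds <= 0:
        raise ValueError("file size and episode length must be positive")

    # Episodes shorter than a minute count only when fully downloaded.
    if episode_seconds < ONE_MINUTE_SECONDS:
        return bytes_downloaded >= file_bytes

    # Cap at the file size so repeated bytes do not inflate the estimate.
    covered = min(bytes_downloaded, file_bytes)
    downloaded_seconds = episode_seconds * covered / file_bytes
    return downloaded_seconds >= ONE_MINUTE_SECONDS
```

Under these assumptions, a 30-minute, 30 MB episode needs 1 MB downloaded to count, while a 45-second episode counts only when the whole file is transferred.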
Specific Identification of Non-Human Activity
Triton Digital uses the IAB/ABCe International Spiders and Bots Blacklist to exclude site traffic associated with robotic activity from the collected data. For example, this filtering process allows us to exclude HTTP requests from search engine spiders such as those operated by Google, Bing, and Yahoo. This list is maintained by the Interactive Advertising Bureau (IAB) and updated monthly.
Triton Digital also maintains and updates additional lists to exclude invalid user agents, or to include known-valid ones, when those agents are not reflected in the IAB/ABCe International Spiders and Bots Blacklist.
Triton Digital also follows the IAB’s filtering guidance regarding Apple’s watchOS downloads, as the majority of watchOS downloads are automated duplicates of iPhone downloads and are not user-initiated. Specifically, we filter out:
- User Agents that begin with atc/ and include watchOS (for example: atc/1.0 watchOS)
- User Agents that contain (null)/(null) watchOS*
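As an illustration only, the two watchOS patterns above could be checked as follows. The function name is hypothetical, the match is assumed to be case-sensitive, and the trailing `*` in the second pattern is read as "anything may follow"; none of these details are confirmed by the source.

```python
def is_watchos_auto_download(user_agent: str) -> bool:
    """Return True if the user agent matches either watchOS filter pattern."""
    # Pattern 1: begins with "atc/" and includes "watchOS".
    if user_agent.startswith("atc/") and "watchOS" in user_agent:
        return True
    # Pattern 2: contains "(null)/(null) watchOS" followed by anything.
    return "(null)/(null) watchOS" in user_agent


# Illustrative user-agent strings, not taken from real logs.
examples = [
    "atc/1.0 watchOS/7.0",
    "(null)/(null) watchOS/5.0 model/Watch2,6",
    "atc/1.0 iOS/14.0",
    "Podcasts/1.0 iPhone",
]
flagged = [ua for ua in examples if is_watchos_auto_download(ua)]
```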
Data Center Exclusion
Triton Digital uses the TAG Data Center IP address list in order to exclude industry-identified non-human data center traffic. This list is maintained by the Trustworthy Accountability Group (TAG) and updated monthly.
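A minimal sketch of this kind of IP-based exclusion, using Python's standard `ipaddress` module. The format of the TAG list is not described here, so the sketch assumes it can be expressed as CIDR ranges; the ranges shown are reserved documentation addresses, not real list entries, and the function names are hypothetical.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), standing in for list entries.
EXAMPLE_DATA_CENTER_CIDRS = ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"]


def load_networks(cidrs):
    """Parse CIDR strings into IPv4/IPv6 network objects."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_data_center_ip(ip: str, networks) -> bool:
    """Return True if the address falls inside any listed network."""
    address = ipaddress.ip_address(ip)
    # Membership tests across IP versions simply evaluate to False.
    return any(address in network for network in networks)


networks = load_networks(EXAMPLE_DATA_CENTER_CIDRS)
```

A linear scan is enough to show the idea; a production filter over a large list would more likely use a prefix tree or sorted interval lookup.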
Bad or Unidentified Requests
Triton Digital only accepts valid file transfer requests, such as GET requests that return a 200 or 206 status code with a valid byte range. Requests for files that can’t be identified as part of a podcast/program will not be credited to any episode, podcast, or program.
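The request check can be sketched as below. What counts as a "valid byte range" is not defined in this document, so the sketch assumes a single `bytes=START-END` range (either bound optional, start not greater than end) and treats a missing Range header as a request for the whole file. All names are hypothetical.

```python
import re
from typing import Optional

# Single-range form only, e.g. "bytes=0-1023", "bytes=500-", "bytes=-500".
_RANGE_PATTERN = re.compile(r"bytes=(\d*)-(\d*)")


def has_valid_byte_range(range_header: Optional[str]) -> bool:
    """Check a Range header under the simplified assumptions above."""
    if range_header is None:
        return True  # no Range header: the whole file was requested
    match = _RANGE_PATTERN.fullmatch(range_header.strip())
    if match is None:
        return False
    start, end = match.groups()
    if not start and not end:
        return False  # "bytes=-" specifies nothing
    if start and end:
        return int(start) <= int(end)
    return True


def is_countable_request(method: str, status: int,
                         range_header: Optional[str] = None) -> bool:
    """Accept only GET requests answered with 200 or 206 and a valid range."""
    return (method.upper() == "GET"
            and status in (200, 206)
            and has_valid_byte_range(range_header))
```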
Duplication of Data
All duplicated data is removed from the dataset used to produce the final metrics.
Other Activity-Based Filtration
Triton may flag traffic as invalid based on abnormal or suspicious traffic patterns as determined by activity-based filtration rules. In cases where suspicious traffic has been removed, Triton may adjust the reporting period and corresponding weekly averages.