Log File Delivery and Frequency
Triton Digital pulls log files from the client's S3, FTP, or SFTP location (the pickup point). Publishers must provide the login credentials Triton Digital needs to retrieve the log files.
Log files should reach the pickup point at least once a day. More frequent delivery is acceptable, especially when the logs are large and split into multiple chunks. Triton Digital expects to receive session information within three days of the beginning of the session. If a session log is received after this three-day window, it will not be counted in the aggregated metrics.
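The three-day rule above can be sketched as a simple timestamp comparison. This is only an illustration: the function name is hypothetical, and treating the boundary as inclusive is an assumption, since the text does not say how the exact cutoff is evaluated.

```python
from datetime import datetime, timedelta, timezone

# Delivery window stated in the text: three days from the start of the session.
DELIVERY_WINDOW = timedelta(days=3)


def arrives_in_time(session_start: datetime, received_at: datetime) -> bool:
    """Return True if the log arrived within the window (inclusive boundary assumed)."""
    return received_at - session_start <= DELIVERY_WINDOW


start = datetime(2015, 8, 30, 12, 0, tzinfo=timezone.utc)
# Received 1 day 18 hours after the session started: inside the window.
on_time = arrives_in_time(start, datetime(2015, 9, 1, 6, 0, tzinfo=timezone.utc))
# Received 3 days 18 hours after the session started: outside the window.
too_late = arrives_in_time(start, datetime(2015, 9, 3, 6, 0, tzinfo=timezone.utc))
```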
File Name
Log files should be uniquely named, with the date as part of the filename. Place each file in the pickup location only once the log is complete; do not expose the live logging folder where files are still being written. Each unique filename is retrieved and processed once. An example of a good filename for this purpose is MSLT20150830-00.tsv.gz, where 20150830 is the date and -00 is an optional suffix for the hour, the file number on that day, or some other sequencing or uniquely identifying value.
For efficiency, compress each log file as a single-file “.gz” archive.
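As a minimal sketch of the naming and compression guidance above, the following builds a dated filename and gzips a finished log before moving it into place. The function names, the staging folder, and the use of a local folder as the pickup location are assumptions for illustration; an S3 or SFTP pickup point would need an upload step instead of a move.

```python
import gzip
import shutil
from datetime import date
from pathlib import Path


def log_filename(prefix: str, day: date, sequence: int) -> str:
    """Build a unique name such as MSLT20150830-00.tsv.gz."""
    return f"{prefix}{day:%Y%m%d}-{sequence:02d}.tsv.gz"


def publish_log(source: Path, staging_dir: Path, pickup_dir: Path, name: str) -> Path:
    """Gzip a finished log in a staging folder, then move it to the pickup folder."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    pickup_dir.mkdir(parents=True, exist_ok=True)
    staged = staging_dir / name
    with open(source, "rb") as src, gzip.open(staged, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Move only after compression has finished, so the pickup folder receives
    # complete archives (assuming both folders are on the same filesystem,
    # where the move is a rename).
    final = pickup_dir / name
    shutil.move(str(staged), str(final))
    return final


name = log_filename("MSLT", date(2015, 8, 30), 0)
```

Writing to a staging folder first is one way to avoid exposing a file that is still being written.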
Persistence
Create a clean-up task to remove log files that were created more than 60 days ago. For example, if you are using Amazon S3 storage, you can add a lifecycle rule to your bucket in order to accomplish this.
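For the Amazon S3 case, a lifecycle rule is expressed as a small configuration document. The sketch below only builds that document; the rule ID and the `podcast-logs/` prefix are hypothetical placeholders. The resulting dictionary has the shape accepted as the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`, but applying it to a bucket is left out here.

```python
import json


def expiration_rule(prefix: str, days: int = 60) -> dict:
    """Build an S3 lifecycle configuration that expires objects under a prefix."""
    return {
        "Rules": [
            {
                "ID": f"expire-logs-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # S3 counts expiration days from each object's creation date.
                "Expiration": {"Days": days},
            }
        ]
    }


config = expiration_rule("podcast-logs/")
print(json.dumps(config, indent=2))
```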
File Format
Log files should be in one of the following formats:
- Formatted in the standard output of the file server. The default access-log output of most current audio streaming services is often usable with no special changes to configuration.
- Formatted per the W3C extended log format (https://www.w3.org/TR/WD-logfile-960221.html). This is a format commonly used for streaming server output. It is basically a tab or space delimited file with a header that identifies field names for each column in the data.
- Formatted as tab-separated values (TSV). Fields are separated by a tab character (\t or 0x09), lines end with a line feed (\n or 0x0A), and commented lines begin with the # character. A header line is recommended but optional. If you use this format, please contact your Triton Digital Account Manager with details on your planned output format and fields so we can ensure it matches a log parsing scheme.
- For those using Akamai CDN, this Luna file format should be used:
Extended + Completion Flag Log record format:
date_YYYY-MM-DD \t time_HH:MM:SS \t client_ip \t http_method \t arl_stem \t status_code \t total_bytes \t transfer_time \t "referrer" \t "user_agent" \t "cookie" \t total_object_size \t byte_range \t last_byte_served_flag
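To check that generated lines match the record layout above, a publisher could split each line on tabs and map the values to the listed field names. This is a rough sketch, not Triton Digital's parser: the function name and sample values are invented for illustration, it assumes the separators are bare tab characters (without the spaces shown around \t in the layout), and skipping # lines borrows the comment convention from the TSV format.

```python
from typing import Optional

# Field order taken from the record layout above.
AKAMAI_FIELDS = [
    "date", "time", "client_ip", "http_method", "arl_stem", "status_code",
    "total_bytes", "transfer_time", "referrer", "user_agent", "cookie",
    "total_object_size", "byte_range", "last_byte_served_flag",
]


def parse_akamai_line(line: str) -> Optional[dict]:
    """Split one tab-delimited record into a dict; return None for blank or # lines."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    values = line.split("\t")
    if len(values) != len(AKAMAI_FIELDS):
        raise ValueError(f"expected {len(AKAMAI_FIELDS)} fields, got {len(values)}")
    record = dict(zip(AKAMAI_FIELDS, values))
    # The layout shows these three fields wrapped in double quotes.
    for key in ("referrer", "user_agent", "cookie"):
        record[key] = record[key].strip('"')
    return record


# Hypothetical sample line for illustration only.
sample = "\t".join([
    "2018-02-09", "10:15:30", "200.150.100.111", "GET",
    "/FolderABC/2018/02/20180209_pine0823.mp3", "206", "17561", "12",
    '"-"', '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"', '"-"',
    "766967", "2008-19568", "1",
])
record = parse_akamai_line(sample)
```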
Required and Optional Fields in the Log Lines
Required Fields (Logs) | Description |
---|---|
IP | Remote public IP address of the client device. It can be either IPv4 or IPv6. It can be either the full IP address (e.g., 200.150.100.111) or a partial (truncated) one with the last octet set to 0 (e.g., 200.150.100.0). If it’s a partial address, we must also receive the hashed IP address in an extra field. The hashed IP is required to maintain proper download and unique counts. |
Hashed-IP | IP address hashed with any standard hash function (e.g., MD5). The hashing method should minimize the risk of collisions and should not be shared. The hashed IP is used to properly count unique downloads and unique listeners when the IP address is truncated. (See IP field.) |
user-agent | Contents of "User-Agent" HTTP header. This user-agent HTTP header contains a characteristic string that allows network protocol peers to identify the application type, operating system, software vendor, or software version of the requesting software user agent. Example: Mozilla/5.0 (Windows NT 10.0; Win64; x64) |
date | Date on which the transaction completed. Format is YYYY-MM-DD. |
start-time | The timestamp of the session start. Format is HH:MM:SS. |
method | HTTP method. E.g.: GET |
status | HTTP Status Code. E.g.: 200 |
url | Complete URI (or URL), up to 2048 characters. This value should be populated with a unique publishing-point (URI) so the record can be applied to that station in our system. In other words, a part of this field will be used as the key to match with a station in Triton Digital’s database. Example: /FolderABC/2018/02/20180209_pine0823.mp3?siteplayer=true&episode=588472 |
bytes | Bytes transferred (response size), server to client. E.g.: 766967 |
object-size | Total size of the podcast file to be downloaded. |
byte-range | Byte range served. Should be populated when the status code is 206 (Partial Content). E.g.: 2008-19568 |

Optional Fields | Description |
---|---|
referrer | The address of the previous web page from which a link to the currently requested page was followed. |
time-taken (duration) | Numeric, up to nine digits. This is the duration of the session, in integer seconds. |
episode-guid | Episode GUID as represented in the RSS feed. If this value is present, it may substitute for the URI as the key to match with an episode in the RSS feed. |
podcast-id | Podcast (show) identifier where the listener has initiated a session. |
vid | A unique registration/visitor ID that can be used to identify a listener and must come from a listener registration mechanism. |
lsid | The Triton Digital LSID (a.k.a. UUID). This is the App/Cookie/Advertising ID as presented in the Listener ID Management topic in the Advertising Technical Specification. Typically, on a mobile device, this should be a Google “gaid” or Apple “idfa,” or if not available, an application-generated ID. On desktop, it should be a cookie ID. |
gender | Listener’s gender (M or F or U). U can be used for other or unknown gender. |
yob | Listener’s year of birth, using the YYYY format. |
age | Listener’s age. |
zip | Listener’s ZIP code (5 digits) or postal code (alphanumeric, no spaces). |
hasads | Flag that indicates if the listening session can receive advertising. Possible values are 0 and 1. Sending 0 indicates that the session cannot receive advertising. |
dev | Extra property used to specify on which device the session is initiated. Triton Digital can provide a non-exhaustive list of available devices, but clients may extend that list for their own usage. |
dist | Extra property that could be used to produce aggregation that shows on which Distributor/Partner the session has been initiated. For example, Publisher A shares their stream on Distributor/Partner B, so the Distributor property is "B". Triton Digital can provide a non-exhaustive list of available distributors, but clients may extend that list for their own usage. |
ss | Extra property that could be used to produce aggregation that shows the publisher of the stream. For example, Publisher A shares their stream on Partner B, so the ss property is "A". This is rarely used, since logs are typically produced by the publisher. |
ps | Extra property used to specify on which player the session is initiated. Triton Digital can provide a non-exhaustive list of all available players, but clients may extend that list for their own usage. |
Query Params | Any or all other URI Query param strings may be provided for potential future usage. For example, there could be IDs that are used to properly match episodes or podcasts. |
X-Forwarded-For | Used for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or load balancer. |
Customs / Others | Any other parameters sent will be ignored by our systems. |
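
The IP and Hashed-IP fields described in the required fields table can be produced together when a publisher truncates addresses. The sketch below is one possible approach under stated assumptions: the function names are hypothetical, truncation is shown for IPv4 only (the text gives no IPv6 example), and a keyed SHA-256 (HMAC) with a private secret is used as one interpretation of a hashing method that "should not be shared". The text itself only requires a standard hash function.

```python
import hashlib
import hmac
import ipaddress


def truncate_ipv4(ip: str) -> str:
    """Zero the last octet of an IPv4 address, e.g. 200.150.100.111 -> 200.150.100.0."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        # Assumption: IPv6 truncation is not specified in the text, so it is not sketched.
        raise ValueError("only IPv4 truncation is sketched here")
    network = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(network.network_address)


def hash_ip(ip: str, secret: bytes) -> str:
    """Hash the full IP with a private key so equal IPs give equal, non-reversible IDs."""
    return hmac.new(secret, ip.encode("ascii"), hashlib.sha256).hexdigest()


SECRET = b"replace-with-a-private-value"  # hypothetical placeholder, keep private
full_ip = "200.150.100.111"
ip_field = truncate_ipv4(full_ip)
hashed_ip_field = hash_ip(full_ip, SECRET)
```

Hashing the full address before truncation keeps distinct listeners in the same /24 block distinguishable for unique counts.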