How it Works

Datastream is a raw data pipeline that delivers real-time, user-level data from visitor interactions on a page, streamed to your Amazon S3 or Google Cloud Storage bucket.

Our feed contains over 50 fields of data. These can be categorized into four broad groups:

Engagement: Chartbeat’s best-in-class engagement metrics, such as engaged time, time on page, scroll depth, and page and browser geometry.
Data about the page: Data points that identify the page, such as the path, title, section, author, content type, platform, and sponsor data associated with each page view.
Data about the user: For user-level analysis, a unique ID, the browser’s user agent string, frequency, and recency.
Timestamp: The time the visitor visited the page, the time they left the page, and the user's time zone.
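
As a rough illustration of how the timestamp fields might be used, the sketch below converts a row's last ping time into the visitor's local time. It assumes that last_ping_timestamp (a column in the sample file on this page) is a Unix epoch value in seconds and that utc_offset_minutes is the visitor's offset from UTC in minutes. Both readings are inferred from the sample values rather than stated in the specification, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_last_ping(last_ping_timestamp, utc_offset_minutes):
    """Return the last ping as a datetime in the visitor's local offset.

    Assumption: the timestamp is epoch seconds and the offset is minutes
    from UTC. Both arrive as strings because they are read from CSV.
    """
    visitor_tz = timezone(timedelta(minutes=int(utc_offset_minutes)))
    return datetime.fromtimestamp(int(last_ping_timestamp), tz=visitor_tz)


# Values taken from the sample row on this page.
when = local_last_ping("1571314031", "-240")
print(when.isoformat())  # 2019-10-17T08:07:11-04:00
```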

Note: Unique IDs are unique to a given user, in a given browser, on a given website. Chartbeat only uses a first-party cookie, so our IDs cannot be used to track a user between sites. Customers who want to track users between sites can pass us an ID and perform user journey analysis on their backend systems.

Datastream supports platform data from sources including:

Web
Google AMP
Facebook Instant Articles
Apple News
Your own native app

Chartbeat’s Datastream Reporting supports exporting data to the following data storage platforms:

Amazon S3 (Amazon Web Services)
Google Cloud Storage

Datastream specifications and formats

File Format: CSV, one row per page view, logged by Chartbeat when the page session expires

Compression Type: GZIP

Delimiters: pipe-separated

Character Encoding: UTF-8

Example File Naming Convention: rawdata/YYYY/MM/DD/h/[00|30]/[epoch timestamp].[file hash].csv.gz

Data Batch Interval: every minute

Delivery Frequency: every minute

Delivery Destination: Amazon S3 or GCS bucket with shared read/write permissions

Note: Files are created every minute, with each minute’s files representing the users whose page views ended in that minute.
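
When listing delivered files in a bucket, it can help to split each object key into the parts named in the convention above. The following is a minimal sketch, not an official parser. It assumes the hour segment is one or two digits, the [00|30] segment is a literal 00 or 30, and the epoch timestamp is a run of digits. The example key and the names KEY_PATTERN and parse_key are hypothetical.

```python
import re

# Pattern for rawdata/YYYY/MM/DD/h/[00|30]/[epoch timestamp].[file hash].csv.gz
# Assumption: "h" is a 1- or 2-digit hour; the hash contains no dots or slashes.
KEY_PATTERN = re.compile(
    r"^rawdata/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/"
    r"(?P<hour>\d{1,2})/(?P<half_hour>00|30)/"
    r"(?P<epoch>\d+)\.(?P<file_hash>[^./]+)\.csv\.gz$"
)


def parse_key(key):
    """Split an object key into its named parts, or return None if it does not match."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return None
    parts = match.groupdict()
    return {
        "date": f"{parts['year']}-{parts['month']}-{parts['day']}",
        "hour": int(parts["hour"]),
        "half_hour": parts["half_hour"],
        "epoch": int(parts["epoch"]),
        "file_hash": parts["file_hash"],
    }


# Hypothetical key, made up for illustration only.
example = parse_key("rawdata/2019/10/17/12/00/1571314080.abc123.csv.gz")
print(example)
```

Keys that do not follow the convention return None, so unrelated objects in a shared bucket can be skipped.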

Download a sample data file

Download a sample data CSV file, or preview the header row and one row of page view data below.

distribution|last_ping_timestamp|host|cookie_id|page_session_id|domain|path|new_user|device|engaged_time_on_page_seconds|page_width|page_height|max_scroll_position_top|window_height|external_referrer|no_client_storage|city_name|region_name|country_code|country_name|continent_name|dma_code|utc_offset_minutes|user_agent|recency|frequency|internal_referrer|author|section|content_type|sponsor|utm_campaign|utm_medium|utm_source|utm_content|utm_term|account_id|page_title|virtual_page|scrolldepth|total_time_on_page_seconds|ga_client_id|login_id|id_sync|subscriber_acct|page_load_time
SITE|1571314031|mysite.com|M_qCECKGCqIcP9a3|Ffe3bT84JvqrIDfBDS+w5FgVyRY=|mysite.com|mysite.com/news/3977611002|false|desktop|5|1366|768.0|0.0|768||false|Brooklyn|New York|US|United States|North America|501|-240|Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36|1|16|mysite.com/|no author|news,local|how-to|||||||REMOVED|How to become a writer for your local newspaper|false|768|74|1236546315.5527916466||"{""clientId"":""62d8fbb2-0060-1cfd-a004-a6f56c0dc7a4"",""anonymousId"":""46e7a61c3a0d3208cf504ff859008b70"",""userMeterState"":""3""}"||784
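
The sample above can be read with Python's standard library. This is a minimal sketch under two assumptions: each delivered file begins with the header row, as the sample does, and quoted fields escape embedded double quotes by doubling them, as the id_sync value in the sample suggests. The function name read_datastream is hypothetical, and the file is built in memory here only so the example is self-contained.

```python
import csv
import gzip
import io
import json

# Header and data row copied from the sample on this page.
HEADER = 'distribution|last_ping_timestamp|host|cookie_id|page_session_id|domain|path|new_user|device|engaged_time_on_page_seconds|page_width|page_height|max_scroll_position_top|window_height|external_referrer|no_client_storage|city_name|region_name|country_code|country_name|continent_name|dma_code|utc_offset_minutes|user_agent|recency|frequency|internal_referrer|author|section|content_type|sponsor|utm_campaign|utm_medium|utm_source|utm_content|utm_term|account_id|page_title|virtual_page|scrolldepth|total_time_on_page_seconds|ga_client_id|login_id|id_sync|subscriber_acct|page_load_time'
ROW = 'SITE|1571314031|mysite.com|M_qCECKGCqIcP9a3|Ffe3bT84JvqrIDfBDS+w5FgVyRY=|mysite.com|mysite.com/news/3977611002|false|desktop|5|1366|768.0|0.0|768||false|Brooklyn|New York|US|United States|North America|501|-240|Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36|1|16|mysite.com/|no author|news,local|how-to|||||||REMOVED|How to become a writer for your local newspaper|false|768|74|1236546315.5527916466||"{""clientId"":""62d8fbb2-0060-1cfd-a004-a6f56c0dc7a4"",""anonymousId"":""46e7a61c3a0d3208cf504ff859008b70"",""userMeterState"":""3""}"||784'


def read_datastream(fileobj):
    """Yield one dict per page view from a gzip-compressed, pipe-delimited file.

    Assumption: the first line of the file is the header row.
    """
    with gzip.open(fileobj, mode="rt", encoding="utf-8", newline="") as text:
        yield from csv.DictReader(text, delimiter="|")


# Build a small .csv.gz payload in memory to stand in for a delivered file.
payload = gzip.compress((HEADER + "\n" + ROW + "\n").encode("utf-8"))
rows = list(read_datastream(io.BytesIO(payload)))

first = rows[0]
print(first["city_name"], first["engaged_time_on_page_seconds"])

# The id_sync column holds a JSON object once CSV quoting is removed.
id_sync = json.loads(first["id_sync"])
print(id_sync["clientId"])
```

All values come back as strings, so numeric columns such as engaged_time_on_page_seconds need an explicit conversion before aggregation.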

Map Chartbeat data with other data sources

With ID Sync, website owners can populate their Datastream feed with custom ID values via a few extra lines of JavaScript in our tracking snippet for standard websites. This custom metadata can be used to join a user’s Chartbeat engagement data to other data sources, or to enrich engagement data by specifying information about the current viewing session.
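
On the backend, joining Datastream rows to another data source might look like the sketch below. It assumes the id_sync column holds a JSON object, as in the sample row, and that one of its keys (clientId here) matches an identifier in your own system. The CRM records, the plan field, and the function names are hypothetical and stand in for whatever source you join against.

```python
import json


def extract_sync_id(row, key="clientId"):
    """Return the custom ID stored under `key` in the row's id_sync JSON, or None."""
    raw = row.get("id_sync") or ""
    if not raw:
        return None
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return payload.get(key) if isinstance(payload, dict) else None


def enrich(rows, records_by_id, key="clientId"):
    """Attach the matching external record (or None) to each page view row."""
    for row in rows:
        sync_id = extract_sync_id(row, key)
        yield {**row, "crm": records_by_id.get(sync_id)}


# Two simplified page view rows; the first reuses values from the sample row.
page_views = [
    {
        "path": "mysite.com/news/3977611002",
        "engaged_time_on_page_seconds": "5",
        "id_sync": '{"clientId":"62d8fbb2-0060-1cfd-a004-a6f56c0dc7a4","userMeterState":"3"}',
    },
    {"path": "mysite.com/", "engaged_time_on_page_seconds": "12", "id_sync": ""},
]

# Hypothetical external records keyed by the same custom ID.
crm = {"62d8fbb2-0060-1cfd-a004-a6f56c0dc7a4": {"plan": "trial"}}

enriched = list(enrich(page_views, crm))
print(enriched[0]["crm"], enriched[1]["crm"])
```

Rows without an id_sync value, or with a value that is not valid JSON, are kept and simply carry no external record.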


Last updated 4 years ago
