# How it Works

Datastream is a raw data pipeline that delivers real-time, user-level data from visitor interactions on a page, streamed to your Amazon S3 or Google Cloud Storage bucket.

Our feed contains over 50 fields of data. These can be categorized into four broad groups:

* **Engagement:** Chartbeat’s best-in-class engagement metrics, such as engaged time, time on page, scroll depth, and page and browser geometry.
* **Data about the page:** Data points related to the identity of the page, such as the path, title, section, author, content type, platform, and sponsor data associated with each page view.
* **Data about the user:** For user-level analysis, a unique ID, the browser’s user agent string, frequency, and recency.
* **Timestamp:** The time the visitor arrived on the page, the time they left it, and the visitor's time zone.

{% hint style="info" %}
**Note:** Unique IDs are unique to a given user, in a given browser, on a given website. Chartbeat only uses a first-party cookie, so our IDs cannot be used to track a user between sites. Customers who want to track users between sites [can pass us an ID](https://docs.chartbeat.com/datastream/implement-tracking/id-sync) and perform user journey analysis on their backend systems.
{% endhint %}

**Platforms whose data Datastream supports include:**

* Web
* Google AMP
* Facebook Instant Articles
* Apple News
* Your own native app

**Chartbeat’s Datastream Reporting supports exporting data to the following data storage platforms:**

* Amazon S3 (Amazon Web Services)
* Google Cloud Storage

## Datastream specifications and formats

**File Format:** CSV, with one row per Chartbeat-logged page view whose page session has expired

**Compression Type:** GZIP

**Delimiter:** pipe (`|`)

**Character Encoding:** UTF-8

**Example File Naming Convention:** rawdata/YYYY/MM/DD/h/\[00|30]/\[epoch timestamp].\[file hash].csv.gz

**Data Batch Interval:** by minute

**Delivery Frequency:** by minute

**Delivery Destination:** Amazon S3 or GCS bucket with shared read/write permissions

{% hint style="info" %}
**Note:** Files are created every minute, with each minute’s files representing the users whose page views ended in that minute.
{% endhint %}
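
The specifications above (gzip compression, pipe delimiter, UTF-8) can be handled with Python's standard library alone. The following is a minimal sketch, not an official client: the function name `read_datastream_file` is hypothetical, and it assumes each delivered file starts with a header row, as the sample file on this page does.

```python
import csv
import gzip
from typing import Dict, Iterator


def read_datastream_file(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per page view from a gzipped, pipe-separated file."""
    # Mode "rt" decompresses and decodes UTF-8 in one step; newline="" leaves
    # line-ending handling to the csv module, as its documentation recommends.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # Assumption: the first row holds the column names.
        reader = csv.DictReader(handle, delimiter="|")
        for row in reader:
            yield row
```

Every value comes back as a string, so numeric columns such as `engaged_time_on_page_seconds` need an explicit conversion before aggregation. The default CSV quoting rules also unwrap quoted fields, which matters for columns that carry JSON.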

## Download a sample data file

Preview a row of page view data below, or click the link that follows to download a sample data CSV file.

```
distribution|last_ping_timestamp|host|cookie_id|page_session_id|domain|path|new_user|device|engaged_time_on_page_seconds|page_width|page_height|max_scroll_position_top|window_height|external_referrer|no_client_storage|city_name|region_name|country_code|country_name|continent_name|dma_code|utc_offset_minutes|user_agent|recency|frequency|internal_referrer|author|section|content_type|sponsor|utm_campaign|utm_medium|utm_source|utm_content|utm_term|account_id|page_title|virtual_page|scrolldepth|total_time_on_page_seconds|ga_client_id|login_id|id_sync|subscriber_acct|page_load_time
SITE|1571314031|mysite.com|M_qCECKGCqIcP9a3|Ffe3bT84JvqrIDfBDS+w5FgVyRY=|mysite.com|mysite.com/news/3977611002|false|desktop|5|1366|768.0|0.0|768||false|Brooklyn|New York|US|United States|North America|501|-240|Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36|1|16|mysite.com/|no author|news,local|how-to|||||||REMOVED|How to become a writer for your local newspaper|false|768|74|1236546315.5527916466||"{""clientId"":""62d8fbb2-0060-1cfd-a004-a6f56c0dc7a4"",""anonymousId"":""46e7a61c3a0d3208cf504ff859008b70"",""userMeterState"":""3""}"||784
```
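
As an illustration of working with the timestamp columns in the row above, the sketch below converts `last_ping_timestamp` to the visitor's local time. It rests on two assumptions that this page does not state explicitly: that `last_ping_timestamp` is seconds since the Unix epoch in UTC, and that `utc_offset_minutes` is the visitor's offset from UTC. The helper name `visitor_local_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def visitor_local_time(last_ping_timestamp: str, utc_offset_minutes: str) -> datetime:
    """Convert a last-ping epoch value to the visitor's local time."""
    # Assumption: last_ping_timestamp is seconds since the Unix epoch (UTC).
    utc_time = datetime.fromtimestamp(int(last_ping_timestamp), tz=timezone.utc)
    # Assumption: utc_offset_minutes is the visitor's offset from UTC.
    visitor_zone = timezone(timedelta(minutes=int(utc_offset_minutes)))
    return utc_time.astimezone(visitor_zone)


# Values taken from the sample row above.
local = visitor_local_time("1571314031", "-240")
print(local.isoformat())
```

Both arguments are accepted as strings because that is how they arrive from a CSV reader.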

{% file src="<https://1982097824-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M5cm1lhNCKBuSy3rMXS%2F-MGkLXV7N3W07XaCkwo6%2F-MGkPhGYY_R2FweqIk7g%2FDatastream_sample_data.csv?alt=media&token=2238145f-7709-47a8-b573-78863198d787>" %}
Download sample data CSV file
{% endfile %}

## Map Chartbeat data with other data sources

With [**ID Sync**](https://docs.chartbeat.com/datastream/implement-tracking/id-sync), website owners can populate their Datastream feed with custom ID values via a few extra lines of JavaScript in our tracking snippet for standard websites. This custom metadata can be used to join a user’s Chartbeat engagement data to other data sources, or to enrich engagement data by specifying information about the current viewing session.
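
A join of this kind might look like the sketch below. It assumes the `id_sync` column holds a JSON object, as in the sample row on this page. The `crm_records` lookup table, the choice of `clientId` as the join key, and the helper name `extract_sync_ids` are hypothetical stand-ins for your own backend data.

```python
import json
from typing import Dict


def extract_sync_ids(id_sync_field: str) -> Dict[str, str]:
    """Decode the id_sync column; empty or malformed values yield {}."""
    if not id_sync_field:
        return {}
    try:
        decoded = json.loads(id_sync_field)
    except json.JSONDecodeError:
        return {}
    return decoded if isinstance(decoded, dict) else {}


# Hypothetical backend table keyed by the custom ID; not part of Datastream.
crm_records = {"62d8fbb2-0060-1cfd-a004-a6f56c0dc7a4": {"plan": "trial"}}

# id_sync value from the sample row, as it looks after CSV unquoting.
row = {
    "engaged_time_on_page_seconds": "5",
    "id_sync": '{"clientId":"62d8fbb2-0060-1cfd-a004-a6f56c0dc7a4",'
               '"anonymousId":"46e7a61c3a0d3208cf504ff859008b70",'
               '"userMeterState":"3"}',
}

ids = extract_sync_ids(row["id_sync"])
# Look up the matching backend record; None when the ID is unknown.
crm_match = crm_records.get(ids.get("clientId", ""))
joined = {**row, "crm": crm_match}
```

Returning an empty dict for blank or malformed values keeps the join tolerant of page views where no custom ID was passed.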

