Data Export Guide (Optimizely X)

Updates to Data Export file retention policy

Starting 25 May 2018, to comply with the upcoming GDPR privacy requirements and to enhance your control over your data, Optimizely will retain the files in your Data Export bucket for 30 days. To keep your file history over 30 days, update your import process to archive your files at least once every 30 days.


Data Export allows you to access all of your Optimizely event data. Data Export relies on a daily job that collects all events received in the last 24 hours for all experiments in your account. It then exports them securely to an S3 bucket, which can be programmatically accessed via Amazon's APIs with a secure set of credentials provided by Optimizely.

Availability

Data Export is only available to Enterprise customers. Please reach out to your Customer Success Manager if you wish to request access to this feature. If you do not have a CSM, submit a ticket to the developer support team to verify your plan and eligibility.

Technical Details

Data Export generates multiple files containing raw event data collected in the past 24 hours, starting from UTC midnight. The files are tab-delimited and compressed with gzip for faster download. The first file in each daily partition includes a header row.

Note: The data contains the raw list of the events we received separated by experiment ID. It is not a list of raw results data. Event data goes through an attribution process described in our knowledge base article on How Optimizely Counts Conversions. This raw data is pre-attribution, meaning it is unprocessed and comes from before results computation.

Raw event data contains events from users who may or may not count for an experiment. Events may exist in exports outside of the time frame in which an experiment ran. Recreating the results page's numbers from raw data is non-trivial, because it requires recreating our attribution model in your queries.

The S3 bucket name is: optimizely-export-ng

The S3 bucket location follows the format:
/optimizely-export-ng/{account_id}/{project_id}/2.0/yyyy/mm/dd/{experiment_id}/{file_name}

The file names follow the format:
experiment_id-filepartnum-yyyy-mm-dd-r-reducernum.gz

Example: 987654321-0-2017-03-06-r-00062.gz
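
As an illustration of the two formats above, the following Python sketch builds the key prefix for one experiment's daily partition and parses a file name into its parts. The function names are hypothetical, and the sketch assumes that the object key is the documented location without the leading bucket name and that every ID is numeric.

```python
import re
from datetime import date

BUCKET = "optimizely-export-ng"  # bucket name given in this guide

# Pattern for experiment_id-filepartnum-yyyy-mm-dd-r-reducernum.gz
FILE_NAME_RE = re.compile(
    r"^(?P<experiment_id>\d+)-(?P<file_part>\d+)-"
    r"(?P<day>\d{4}-\d{2}-\d{2})-r-(?P<reducer>\d+)\.gz$"
)


def export_prefix(account_id, project_id, day, experiment_id):
    """Return the key prefix for one experiment on one day.

    Assumption: the key is the documented location minus the bucket name.
    """
    return f"{account_id}/{project_id}/2.0/{day:%Y/%m/%d}/{experiment_id}/"


def parse_file_name(name):
    """Split an export file name into its documented components."""
    match = FILE_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected export file name: {name!r}")
    return {
        "experiment_id": match["experiment_id"],
        "file_part": int(match["file_part"]),
        "day": date.fromisoformat(match["day"]),
        "reducer": int(match["reducer"]),
    }
```

For the example file name above, parse_file_name returns experiment ID 987654321, file part 0, the date 2017-03-06, and reducer 62. The prefix can be passed to whichever S3 client you use to list the files in a partition.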

Notes:

Status File

A status file (status.yaml) is included within each daily partition to track the success or failure of the Data Export job. The status files contain the following information: failed_exports (list of experiment IDs), successful_exports (list of experiment IDs), and timestamp in UTC seconds since epoch. View a sample YAML file with and without failed export files.
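
An import process might check the status file before loading a partition. The Python sketch below assumes that status.yaml has already been parsed into a dictionary by a YAML library of your choice, and that the three fields listed above are top-level keys; the function name and the shape of the returned summary are hypothetical.

```python
from datetime import datetime, timezone


def summarize_status(status):
    """Summarize a parsed status.yaml mapping.

    Assumption: failed_exports, successful_exports, and timestamp are
    top-level keys; either list may be missing or empty.
    """
    failed = list(status.get("failed_exports") or [])
    succeeded = list(status.get("successful_exports") or [])
    # The timestamp is documented as UTC seconds since epoch.
    stamped_at = datetime.fromtimestamp(status["timestamp"], tz=timezone.utc)
    return {
        "ok": not failed,
        "failed": failed,
        "succeeded": succeeded,
        "timestamp": stamped_at,
    }
```

A partition whose summary has "ok" set to False could be retried or flagged for the experiment IDs listed under "failed".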

Optimizely X field descriptions

Definitions

timestamp

The timestamp of when the event occurred in the browser or app. The format is a number representing the number of seconds since Unix epoch.

project_id

Your Optimizely project ID on which the campaign and/or experiment lives.

campaign_id

The campaign ID (also known as layer ID). For Web Experimentation and Personalization, this value can be found in the API Names tab. For Full Stack, it is found in the project's JSON data file.

experiment_id

The experiment ID. For Web Experimentation and Personalization, this value can be found in the API Names tab. For Full Stack, it is found in the project's JSON data file.

variation_id

The ID Optimizely uses to identify the variation the visitor saw. For Web Experimentation and Personalization, this value is found in the API Names tab. For Full Stack, it is found in the project's JSON data file.

layer_holdback

Boolean value that indicates whether the visitor was placed in the campaign's or experiment's holdback group.

audience_names

An array containing the name of the audience for which the visitor qualified to be placed in the campaign and experiment. For Web Experimentation and Personalization, if your snippet masks descriptive names, this will be the audience ID (of the form [Aud 1234567890]). It can be mapped to Audience Name on the Campaign Overview screen, API Names tab. For Full Stack, this mapping is available in the project's JSON data file.

end_user_id

For Web Experimentation and Personalization, this is the anonymous optimizelyEndUserId value stored in a cookie and local storage. It represents a unique visitor. For Full Stack, this is the user ID provided by your app.

uuid

Ignore - null. uuid is not currently supported in Optimizely X.

session_id

A unique session identifier. For Web Experimentation and Personalization, it is set to AUTO by default. For Full Stack, this is null and can be ignored.

snippet_revision

For Web Experimentation and Personalization, the revision number of the Optimizely snippet that was served in this visitor's browser. For Full Stack, the revision number of your datafile that was compiled into the SDK at the time of event firing.

user_ip

IP address of the visitor associated with this tracking call. If you employ IP Anonymization, the last octet will be a 0 (zero) for all tracking calls made to Optimizely. The full IP address will not be stored anywhere and cannot be retrieved later.

user_agent

For Web Experimentation and Personalization, the userAgent header passed from the browser. For Full Stack, describes the package or code language that initiated this tracking call.

user_engine

Language or stack in which the Optimizely snippet or SDK was served. For example, a value of js will be shown for the web snippet.

user_engine_version

The Optimizely-internal version of the snippet or SDK.

referer

For Web Experimentation and Personalization, the referring URL in the browser. For Full Stack, this will be null and can be ignored.

global_holdback

Ignore - will always be 'false'. A global holdback is not currently supported in Optimizely X.

event_type

For Web Experimentation and Personalization, the type of event recorded by Optimizely. Values are view_activated or other. view_activated indicates the activation of a page (view), and other could be a click or custom event. Refer to the event_name column for more details. For Full Stack, this will be null and can be ignored. For all products, if the row represents a bucketing decision event, this field will be null.

event_name

The API name of the click or custom event. For Web Experimentation and Personalization, if event_type = view_activated this value will be the page ID. For all products, if the row represents a bucketing decision event, this field will be null.
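
Because event_type and event_name are both null for bucketing decision rows, a query can separate decisions from conversion events. The Python sketch below is one possible reading, not a documented rule: it assumes that a null event_name occurs only on decision rows, and that each row is a dictionary in which nulls have already been converted to None. The function name and labels are hypothetical.

```python
def classify_row(row):
    """Label a row as 'decision', 'view', or 'event'.

    Assumption: event_name is null only for bucketing decision rows.
    event_type is not used for that check because it is always null
    for Full Stack.
    """
    if row.get("event_name") is None:
        return "decision"
    if row.get("event_type") == "view_activated":
        return "view"  # event_name holds the page ID in this case
    return "event"  # click or custom event
```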

user_features

For Web Experimentation and Personalization, an array of JSON objects of Optimizely customer-defined behavioral attributes (if Personalization is enabled), custom dimensions and/or user attributes, and Optimizely standard segments. Each object will have a type, a name, and a value. These values are all optional. For Full Stack, this will be an array of JSON objects containing customer-defined attributes.

  • Optimizely default segments: first_session, browser_id, AdWords campaign value (if source_type is campaign), device, source_type (traffic source), timestamp (in seconds since UNIX epoch), and offset (number of minutes behind UTC, serves as an indicator of timezone in which the event was fired)
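
To work with user_features in code, the array can be flattened into a name-to-value mapping. The Python sketch below assumes the field arrives as JSON text in the TSV cell; the function name is hypothetical. Since type, name, and value are all optional, objects without a name are skipped.

```python
import json


def features_by_name(raw):
    """Map feature name to value for one user_features cell.

    Assumption: the cell holds a JSON array of objects; a null (empty)
    cell yields an empty mapping.
    """
    if not raw:
        return {}
    result = {}
    for item in json.loads(raw):
        name = item.get("name")
        if name is not None:
            # A missing value is kept as None rather than dropped.
            result[name] = item.get("value")
    return result
```

If two objects share a name, the later one wins; keep the full list instead if you need to tell them apart by type.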

active_views

Deprecated. For all products, this field is null.

event_features

For Web Experimentation and Personalization, an array of JSON objects of any page or event tags or categories defined for this event. For Full Stack, an array of JSON objects containing customer-defined tags. For all products, if the row represents a bucketing decision event, this field will be null.

event_metrics

If revenue is captured for this event, an array of JSON objects in which the name is revenue and the value is the amount in cents. For all products, if the row represents a bucketing decision event, this field will be null.
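
The Python sketch below totals the revenue in one event_metrics cell and converts cents to currency units. The exact JSON layout is an assumption: it takes each object to look like {"name": "revenue", "value": 1999}. Both function names are hypothetical.

```python
import json
from decimal import Decimal


def revenue_cents(raw):
    """Sum the revenue values (in cents) in one event_metrics cell.

    Assumption: the cell is a JSON array of objects with "name" and
    "value" keys; a null (empty) cell counts as zero revenue.
    """
    if not raw:
        return 0
    return sum(
        metric.get("value", 0)
        for metric in json.loads(raw)
        if metric.get("name") == "revenue"
    )


def cents_to_units(cents):
    """Convert cents to currency units without float rounding."""
    return Decimal(cents) / 100
```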

event_uuid

A unique identifier for this event. Clients usually set this value with any UUID-generating method. The field can be used to de-duplicate events that are accidentally or erroneously replayed.
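
De-duplication on event_uuid can be sketched in Python as follows. The sketch assumes each row is a dictionary keyed by column name, and it keeps rows that have no event_uuid rather than treating them as duplicates of each other; the function name is hypothetical.

```python
def dedupe_events(rows):
    """Yield rows in order, skipping repeats of an event_uuid already seen."""
    seen = set()
    for row in rows:
        key = row.get("event_uuid")
        if key is not None:
            if key in seen:
                continue  # replayed event, drop it
            seen.add(key)
        yield row
```

The set of seen identifiers grows with the number of distinct events, so for very large exports it may be preferable to de-duplicate in the database or warehouse that receives the data.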

Data formats

Because these files are tab-separated (TSV), null values appear as empty fields between consecutive tabs.
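
A reader for these files therefore needs to decompress gzip, split on tabs, and turn empty fields into nulls. The Python sketch below uses only the standard library. It assumes UTF-8 text and no tab characters inside a field; the function name is hypothetical. Because only the first file in each daily partition has a header row, column names can be passed in for the other files.

```python
import csv
import gzip


def read_export(source, fieldnames=None):
    """Yield one dict per row from a gzip-compressed export file.

    source may be a path or a binary file object. When fieldnames is
    None, the first row is read as the header.
    """
    with gzip.open(source, mode="rt", encoding="utf-8", newline="") as text:
        reader = csv.DictReader(
            text,
            fieldnames=fieldnames,
            delimiter="\t",
            quoting=csv.QUOTE_NONE,  # JSON cells contain double quotes
        )
        for row in reader:
            # Empty fields represent nulls in the export.
            yield {k: (v if v != "" else None) for k, v in row.items()}
```

Values are returned as strings; converting timestamp to a number or parsing the JSON columns is left to the caller.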

Primary Keys

Sample Files

Miscellaneous