This article describes what arrives in your environment with each Install delivery: the three file types, their formats, how to read them, and how sensitive attribute restrictions appear. It's reference material for engineers and data operators implementing or maintaining the integration.
Files delivered in each engagement
Every Install delivery contains the following. All files for a given delivery share a common wave identifier and date stamp ({WAVE} and {YYYYMMDD} in the filename patterns below).
- Output File — one per contracted ID type. Filename pattern: Resonate_{CLIENT}_{ID_TYPE}_Install_{STRUCTURE}_{WAVE}_{YYYYMMDD}.{format}.{compression}.
- Key Definitions File — one per engagement. Filename pattern: Resonate_{CLIENT}_DataDictionary_{WAVE}_{YYYYMMDD}.csv.
- Summary Report — one per engagement. Filename pattern: Resonate_{CLIENT}_SummaryReport_{WAVE}_{YYYYMMDD}.csv.
Partitioned deliveries
Large Output Files, typically HEM and MAID full-catalog files, are delivered in multiple parts so that each individual file stays at a tractable size for ingestion. Partitioned filenames follow this pattern:
Resonate_{CLIENT}_{ID_TYPE}_Install_{STRUCTURE}_{WAVE}_{YYYYMMDD}_part{NNN}.{format}.{compression}
Partitioned deliveries include a manifest file that lists every part. The manifest is written last, so your ingestion pipeline should wait for it before processing any part; its presence signals that the delivery is complete. The manifest filename follows this pattern:
Resonate_{CLIENT}_{ID_TYPE}_Manifest_{WAVE}_{YYYYMMDD}.csv
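A pipeline can gate processing on the manifest along the following lines. This is a sketch under stated assumptions: this article does not specify the manifest's columns, so the `filename` column name is hypothetical, as is the idea that all parts land in one local directory.

```python
import csv
from pathlib import Path


def delivery_is_complete(delivery_dir, manifest_name, filename_column="filename"):
    """Return True only if the manifest exists and every part it lists is present.

    Assumption: the manifest is a CSV with a header row and one row per part,
    with the part's filename in `filename_column` (hypothetical column name).
    """
    delivery_dir = Path(delivery_dir)
    manifest_path = delivery_dir / manifest_name
    if not manifest_path.exists():
        # The manifest is written last, so the delivery is still in flight.
        return False
    with manifest_path.open(newline="", encoding="utf-8") as handle:
        listed = [row[filename_column] for row in csv.DictReader(handle)]
    # An empty manifest is treated as incomplete rather than trivially complete.
    return bool(listed) and all((delivery_dir / name).exists() for name in listed)
```

Polling this function until it returns `True` avoids ingesting a delivery while parts are still being written.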
The Output File
The Output File contains one record for every ID in the Resonate universe for the corresponding ID type, with predictions for every attribute in your licensed catalog subset.
File format
Selected at engagement initiation:
- Parquet — recommended for cloud data warehouse ingestion. Smaller file size, faster query performance. Compressed with Snappy.
- CSV — available for environments that require a row-based text format. Gzip-compressed. UTF-8 encoded, comma-delimited, double-quoted per RFC 4180, with a header row and Unix line endings.
Output structure
Selected at engagement initiation. Smart defaults apply per ID type but may be overridden:
| ID type | Default Structure | Rationale |
|---|---|---|
| IP | Pivoted | Modest universe; one column per attribute is tractable. |
| ZIP11 | Pivoted | Modest universe; pivoted format is easier for direct-mail workflows. |
| HEM | Standard | Population scale makes pivoted CSV impractical (300–600 GB). |
| MAID | Standard | Same as HEM. |
- Standard structure — each row contains one ID with the set of Survey_Value_Keys that are TRUE for that ID, plus a separate column listing any Survey_Value_Keys redacted because the ID is associated with a restricted state.
- Pivoted structure — each row contains one ID with one column per Survey_Value_Key. A value of 1 indicates TRUE, 0 indicates FALSE, and NULL indicates the Survey_Value_Key is a sensitive attribute redacted for a resident of a restricted state. NULL has only this meaning — FALSE is always represented explicitly as 0.
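The three states of a Pivoted cell map naturally onto `True`, `False`, and `None`. A minimal sketch for CSV deliveries, where NULL arrives as an empty string; the `id` column name and the Survey_Value_Keys in the example row are hypothetical.

```python
def decode_pivoted_cell(raw):
    """Map a Pivoted-structure CSV cell to True, False, or None (redacted)."""
    if raw == "":
        return None   # NULL: sensitive attribute redacted for a restricted state
    if raw == "1":
        return True
    if raw == "0":
        return False
    raise ValueError(f"unexpected pivoted value: {raw!r}")


# Hypothetical row as produced by csv.DictReader; "id" is an assumed column name.
row = {"id": "abc", "000123": "1", "000456": "0", "000789": ""}
decoded = {
    key: decode_pivoted_cell(value) for key, value in row.items() if key != "id"
}
```

Raising on anything other than `1`, `0`, or empty surfaces ingestion problems early instead of silently treating unknown values as FALSE.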
The Key Definitions File
The Key Definitions File maps Survey_Value_Keys used in the Output Files to their human-readable names and metadata. It allows you to join the Output File to descriptive labels for analysis, reporting, and dashboarding.
It's delivered as a single CSV per engagement, regardless of how many ID types are contracted (the attribute catalog is the same across ID types). For each Survey_Value_Key in your licensed subset, it includes: cluster, subcategory, attribute name, attribute key, value name, Survey_Value_Key, and a flag indicating whether the attribute was derived from a single-select or multi-select question.
A new Key Definitions File is delivered with every wave, even if the catalog hasn't changed, so that your ingestion pipeline can treat each refresh as a self-contained delivery.
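Joining Output File keys to their labels amounts to a dictionary lookup on Survey_Value_Key. The sketch below is illustrative only: the header names `Attribute_Name` and `Value_Name` are assumptions (check the header row of your own file), and the sample rows are invented.

```python
import csv
import io

# Invented sample in the shape of a Key Definitions File; the header names
# other than Survey_Value_Key are assumptions, and the values are made up.
KEY_DEFINITIONS_CSV = """Survey_Value_Key,Attribute_Name,Value_Name
000123,Age Range,25-34
000456,Gender,Female
"""


def load_key_definitions(text, key_column="Survey_Value_Key"):
    """Index Key Definitions rows by Survey_Value_Key, keeping keys as strings."""
    return {row[key_column]: row for row in csv.DictReader(io.StringIO(text))}


def label_keys(keys, definitions):
    """Return (attribute name, value name) pairs for the keys that are defined."""
    return [
        (definitions[key]["Attribute_Name"], definitions[key]["Value_Name"])
        for key in keys
        if key in definitions
    ]


definitions = load_key_definitions(KEY_DEFINITIONS_CSV)
labels = label_keys(["000456", "000123"], definitions)
```

Because a fresh Key Definitions File ships with every wave, rebuilding this lookup per delivery keeps each refresh self-contained.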
The Summary Report
The Summary Report is a CSV that summarizes the contents of the Output File(s) for the wave. Use it to verify that your Output and Key Definitions Files are correctly aligned: after ingesting the files, compare the metrics computed from your ingested data against those in the Summary Report as a sanity check.
For each Survey_Value_Key in your licensed catalog, the Summary Report contains:
- The full attribute taxonomy (cluster, subcategory, attribute name, value name)
- SURVEY_VALUE_KEY — the key corresponding to the attribute as reflected in the Output File
- DISTINCT_COUNT — total number of IDs for which the attribute is TRUE
- PERCENTAGE — percentage of records for which the attribute is TRUE
Note: because sensitive data is redacted for individuals in restricted states, percentage values for mutually exclusive attribute groups (e.g., Age Range, Gender, Household Income) may no longer sum to 100%. This is expected behavior, not a data quality issue.
One Summary Report is delivered per engagement, covering all contracted ID types.
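The sanity check can be sketched as a count comparison on DISTINCT_COUNT. This is an illustration under assumptions: it takes one set of TRUE Survey_Value_Keys per ID as input, and it assumes the Summary Report rows passed in describe the same ID type as the ingested data, since this article does not describe how the report separates ID types. The sample values are invented.

```python
from collections import Counter


def true_counts(true_keys_per_id):
    """Count, per Survey_Value_Key, the IDs for which the attribute is TRUE."""
    counts = Counter()
    for keys in true_keys_per_id:
        counts.update(keys)
    return counts


def count_mismatches(observed, summary_rows):
    """Return {key: (expected, observed)} for keys whose counts differ."""
    mismatches = {}
    for row in summary_rows:
        key = row["SURVEY_VALUE_KEY"]
        expected = int(row["DISTINCT_COUNT"])
        actual = observed.get(key, 0)
        if actual != expected:
            mismatches[key] = (expected, actual)
    return mismatches


# Invented sample: three IDs and two Summary Report rows.
observed = true_counts([{"000123"}, {"000123", "000456"}, set()])
summary_rows = [
    {"SURVEY_VALUE_KEY": "000123", "DISTINCT_COUNT": "2"},
    {"SURVEY_VALUE_KEY": "000456", "DISTINCT_COUNT": "3"},
]
mismatches = count_mismatches(observed, summary_rows)
```

An empty result means every count agrees; here `000456` is reported because the report expects 3 and the data contains 1.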
How sensitive attribute restrictions appear in the file
Predictions for a defined set of sensitive attributes are not delivered for individuals residing in restricted states. The restriction is enforced at file generation and applies to the Output File only.
- In Standard structure: the record for a restricted-state resident is present in the file. The Predictions column contains the Survey_Value_Keys for non-sensitive attributes that are TRUE for that ID. A separate Redacted column lists Survey_Value_Keys that would have been predicted but were withheld due to the state-level restriction. You can distinguish absence-because-FALSE from absence-because-redacted by checking the Redacted column.
- In Pivoted structure: the record for a restricted-state resident is present in the file. Sensitive Survey_Value_Key columns contain NULL for that record. Non-sensitive columns contain 1 or 0 as usual. NULL appears in the file only as a redaction signal, so you can distinguish FALSE (0) from redacted (NULL) directly.
Records are never excluded from the Output File on the basis of state of residence. Record count consistency is preserved so that customer-side joins don't silently drop rows. See the Sensitive Data article for the full restricted state list and the categories of attributes affected.
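For Standard structure, the same three-state answer can be derived from the Predictions and Redacted columns. A sketch under one loud assumption: this article does not state how multiple Survey_Value_Keys are separated within a cell, so the `|` separator here is hypothetical, as are the example keys.

```python
def key_state(predictions_cell, redacted_cell, key, sep="|"):
    """Return True, False, or None (redacted) for one Survey_Value_Key.

    Assumption: both cells hold Survey_Value_Keys joined by `sep`; the real
    in-cell separator is not specified in this article.
    """
    true_keys = {k for k in predictions_cell.split(sep) if k}
    redacted_keys = {k for k in redacted_cell.split(sep) if k}
    if key in redacted_keys:
        return None   # withheld because of the state-level restriction
    return key in true_keys   # absent from both columns means FALSE


# Hypothetical record for a restricted-state resident.
predictions = "000123|000456"
redacted = "000789"
states = {k: key_state(predictions, redacted, k) for k in ("000123", "000789", "000999")}
```

Checking the Redacted column first keeps a withheld key from being misread as FALSE.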
Encoding, character handling, and edge cases
Both formats are UTF-8 encoded. CSV files follow RFC 4180 conventions: comma delimiter, double-quote quoting for fields containing commas, quotes, or line breaks (with embedded quotes escaped by doubling), Unix (LF) line endings, and a header row in the first line. NULL is represented as an empty string in CSV; in Parquet, NULL is the native null type. Survey_Value_Keys appear as numeric strings in CSV and as STRING-typed columns in Parquet to preserve leading zeros and prevent any inadvertent numeric coercion.
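When reading the gzip-compressed CSV, keeping every field as a string preserves leading zeros in Survey_Value_Keys and leaves NULL visible as an empty string. A minimal standard-library sketch; the function name is illustrative, and the path you pass in is your own.

```python
import csv
import gzip


def read_csv_gz(path):
    """Yield each row of a gzip-compressed CSV as a dict of strings.

    The csv module performs no type inference, so a key such as "000123"
    stays a string and an empty field (NULL) stays "".
    """
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)
```

If you load the file with a dataframe library instead, force string types on key columns explicitly, since many readers infer numeric types by default.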
If you have questions about your deliverables, contact your Customer Success Manager or reach out to resonatesupport@resonate.com.