Query API Overview - Emerge API

The Query API retrieves a user's data after they have granted consent through Emerge Link. Choose between synchronous (immediate response) and asynchronous (job-based) queries depending on your use case.

Available data types

Data Type	Sync Endpoint	Async Endpoint	Description
Search History	`/v1/sync/get_search`	`/v1/search`	Google search queries
Browsing History	`/v1/sync/get_browsing`	`/v1/browsing`	Chrome browser history
YouTube History	`/v1/sync/get_youtube`	`/v1/youtube`	Watched videos
Ad Interactions	`/v1/sync/get_ads`	`/v1/ads`	Ad clicks and views
Receipts	`/v1/sync/get_receipts`	`/v1/receipts`	Purchase receipts with items and brands

Sync vs Async

Sync (Immediate)

Best for: Real-time needs, single users, simple integrations

const response = await fetch(
  `https://query.emergedata.ai/v1/sync/get_search?uid=${uid}`,
  { headers: { 'Authorization': `Bearer ${token}` } }
);

const data = await response.json();
// Data returned immediately in JSON format

Characteristics:

Single uid parameter
Response includes data directly as JSON
30-second timeout
Supports pagination with cursor
Supports delta queries with ingested_begin/ingested_end
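
The sync parameters listed above can be combined into one request URL. The following is a minimal sketch, not the official client: `buildSyncUrl` is a hypothetical helper, and it assumes the pagination token is sent as a `cursor` query parameter and that `ingested_begin`/`ingested_end` accept ISO 8601 timestamps.

```javascript
// Hypothetical helper: builds a sync query URL.
// Assumptions: the pagination token is passed as `cursor`, and the delta
// bounds are ISO 8601 strings passed as `ingested_begin` / `ingested_end`.
function buildSyncUrl(endpoint, { uid, cursor, ingestedBegin, ingestedEnd }) {
  if (!uid) throw new Error('uid is required for sync endpoints');
  const url = new URL(endpoint, 'https://query.emergedata.ai');
  url.searchParams.set('uid', uid);
  // Optional parameters are only added when a value was supplied.
  if (cursor) url.searchParams.set('cursor', cursor);
  if (ingestedBegin) url.searchParams.set('ingested_begin', ingestedBegin);
  if (ingestedEnd) url.searchParams.set('ingested_end', ingestedEnd);
  return url.toString();
}
```

Using `URL` and `searchParams` rather than string concatenation keeps timestamps and cursor tokens correctly percent-encoded.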

Async (Job-based)

Best for: Large datasets, multiple users, batch processing

// 1. Start the query (up to 25 users)
const params = new URLSearchParams([
  ['user_ids', 'psub_a1b2c3d4e5f6789012345678901234ab'],
  ['user_ids', 'psub_b2c3d4e5f6789012345678901234abcd'],
  ['begin', '2024-01-01T00:00:00Z'],
  ['end', '2024-01-31T23:59:59Z']
]);

const startResponse = await fetch(
  `https://query.emergedata.ai/v1/search?${params.toString()}`,
  { headers: { 'Authorization': `Bearer ${token}` } }
);
const { job_id: taskId } = await startResponse.json();

// 2. Poll for results (one status check shown; repeat until status is COMPLETED)
const resultResponse = await fetch(
  `https://query.emergedata.ai/v1/job/${taskId}`,
  { headers: { 'Authorization': `Bearer ${token}` } }
);
const result = await resultResponse.json();

if (result.status === 'COMPLETED' && result.url) {
  // Download Parquet file from result.url
  console.log(result.url);
}

Characteristics:

user_ids query parameter (up to 25 users per request)
begin/end date range parameters
Returns job_id immediately
Poll /v1/job/{task_id} using the returned job_id
Completed jobs provide an S3 presigned URL to a Parquet file
No timeout concerns
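
Polling usually means repeating the job lookup until the status changes. The sketch below shows one way to do that, under stated assumptions: `pollJob`, `fetchImpl`, `intervalMs`, and `maxAttempts` are illustrative names and defaults, not part of the Emerge API, and any status other than `COMPLETED` is treated as "not ready yet" because this overview does not list failure statuses.

```javascript
// Sketch: poll /v1/job/{id} until the job reports COMPLETED with a URL.
// `fetchImpl`, `intervalMs`, and `maxAttempts` are illustrative options.
async function pollJob(jobId, token, { fetchImpl = fetch, intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetchImpl(`https://query.emergedata.ai/v1/job/${jobId}`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (!res.ok) throw new Error(`Job lookup failed with HTTP ${res.status}`);
    const job = await res.json();
    if (job.status === 'COMPLETED' && job.url) return job;
    // Any other status (for example PENDING) is treated as not ready yet.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job ${jobId} did not complete after ${maxAttempts} attempts`);
}
```

The attempt cap keeps a stuck job from polling forever; the right interval and cap depend on how large your queries are.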

When to use each

Use Case	Recommendation
User-facing dashboard	Sync - immediate JSON response
Background data processing	Async - batch multiple users
Real-time personalization	Sync - low latency
Batch analytics	Async - Parquet format for big data tools
Mobile app	Sync - simpler error handling
Multi-user reports	Async - query up to 25 users at once
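
Because async requests accept up to 25 users, a report covering more users has to be split into several jobs. A minimal sketch of that batching follows; `chunkUserIds` and `buildAsyncParams` are hypothetical helper names, and the repeated `user_ids` parameter mirrors the async example earlier on this page.

```javascript
const MAX_USERS_PER_REQUEST = 25; // limit stated in this overview

// Splits a list of user IDs into batches of at most `size` entries.
function chunkUserIds(userIds, size = MAX_USERS_PER_REQUEST) {
  const chunks = [];
  for (let i = 0; i < userIds.length; i += size) {
    chunks.push(userIds.slice(i, i + size));
  }
  return chunks;
}

// Builds the query string for one batch: one `user_ids` entry per user,
// plus the begin/end date range.
function buildAsyncParams(userIds, begin, end) {
  const params = new URLSearchParams();
  for (const id of userIds) params.append('user_ids', id);
  params.set('begin', begin);
  params.set('end', end);
  return params;
}
```

Each batch then becomes its own async request, with its own `job_id` to poll.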

Authentication

All Query API endpoints require a Bearer token:

curl 'https://query.emergedata.ai/v1/sync/get_search?uid=psub_c3d4e5f6789012345678901234abcdef' \
  -H 'Authorization: Bearer your_api_token'

Sync endpoints require uid. Use the callback uid you stored on your backend (or the same value you supplied in the Link URL). Never ask end users for the uid, and never call the Query API from the frontend: the API token and the user mapping must stay server-side.
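
One way to keep the token and uid mapping server-side is a small backend wrapper that accepts only your own account identifier. This is a sketch under assumptions: `getSearchForAccount`, `uidStore` (any object with a `get` method, such as a `Map` or a database adapter), and `fetchImpl` are hypothetical names, not part of the Emerge API.

```javascript
// Hypothetical server-side wrapper: the browser sends only your own account
// identifier; the Emerge uid and API token never leave the backend.
async function getSearchForAccount(accountId, { uidStore, token, fetchImpl = fetch }) {
  const uid = uidStore.get(accountId); // uid saved from the Link callback
  if (!uid) throw new Error(`No Emerge uid stored for account ${accountId}`);
  const url = `https://query.emergedata.ai/v1/sync/get_search?uid=${encodeURIComponent(uid)}`;
  const res = await fetchImpl(url, { headers: { Authorization: `Bearer ${token}` } });
  if (!res.ok) throw new Error(`Query API returned HTTP ${res.status}`);
  return res.json();
}
```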

Categories and schema

Categories are Google Topics taxonomy paths (for example /Shopping/Apparel/Footwear).
Use GET /v1/sync/categories?table=searches (or browsing, youtube, ads, receipts) to list available category paths.

Response example:

{
  "categories": [
    "/Shopping/Apparel/Footwear",
    "/Sports/Running & Walking",
    "/Computers & Electronics/Software"
  ]
}
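
Since category paths are slash-separated, the response above can be grouped or filtered with plain string handling. The helpers below are an illustrative sketch (`topLevelCategories` and `filterByPrefix` are hypothetical names) and operate only on the `categories` array.

```javascript
// Returns the distinct first-level segments, in order of first appearance.
function topLevelCategories(paths) {
  const seen = new Set();
  for (const path of paths) {
    const first = path.split('/').filter(Boolean)[0];
    if (first) seen.add(first);
  }
  return [...seen];
}

// Keeps paths equal to `prefix` or nested beneath it. Matching on
// `prefix + '/'` avoids treating /Shop as a parent of /Shopping.
function filterByPrefix(paths, prefix) {
  return paths.filter((p) => p === prefix || p.startsWith(prefix + '/'));
}
```

For the example response, `topLevelCategories` yields `Shopping`, `Sports`, and `Computers & Electronics`.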

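See Event Categories and Data Schema (listed under Next steps) for the first-level category list and the field-level schema.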

Sync response format

Sync endpoints return JSON directly:

{
  "data": [
    {
      "user_id": "psub_c3d4e5f6789012345678901234abcdef",
      "event_id": 12345,
      "query": "best restaurants nearby",
      "timestamp": "2024-01-15T10:30:00Z",
      "ingested_at": "2024-01-15T11:00:00Z"
    }
  ],
  "count": 1,
  "has_more": true,
  "next_cursor": "eyJsYXN0X2lkIjogMTIzNH0=",
  "applied_ingested_end": "2024-01-15T12:00:00Z"
}

Field	Description
`data`	Array of records
`event_id`	Unique identifier for deduplication
`ingested_at`	When the record was added to Emerge
`count`	Number of records in this response
`has_more`	Whether more records exist
`next_cursor`	Pagination token for next page
`applied_ingested_end`	Actual end time used (for delta sync)
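
The `has_more`, `next_cursor`, and `event_id` fields suggest a simple loop for collecting every page. The sketch below is an assumption-laden illustration: `fetchAllPages` is a hypothetical helper, and `fetchPage` is a callback you supply that requests one page for a given cursor (how the cursor is sent to the API is left to that callback).

```javascript
// Sketch: follow next_cursor until has_more is false, deduplicating by event_id.
// `fetchPage(cursor)` must resolve to a parsed sync response body.
// `maxPages` is a safety cap; when it is hit, the records collected so far
// are returned even if more pages exist.
async function fetchAllPages(fetchPage, maxPages = 100) {
  const byEventId = new Map();
  let cursor = null; // null means "first page"
  for (let page = 0; page < maxPages; page++) {
    const body = await fetchPage(cursor);
    for (const record of body.data) byEventId.set(record.event_id, record);
    if (!body.has_more || !body.next_cursor) break;
    cursor = body.next_cursor;
  }
  return [...byEventId.values()];
}
```

Keying on `event_id` means a record that appears on two pages is stored once, with the later copy winning.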

Async response format

Async endpoints return a job reference:

{
  "job_id": "82f80278-5e76-4d01-8f1d-b55e08f12a52",
  "status": "PENDING"
}

When completed, the job result includes a download URL:

{
  "task_id": "82f80278-5e76-4d01-8f1d-b55e08f12a52",
  "status": "COMPLETED",
  "url": "https://query-results.s3.amazonaws.com/82f80278-5e76-4d01-8f1d-b55e08f12a52.parquet",
  "created_at": "2026-02-12T09:10:11Z",
  "expire_at": "2026-02-19T09:10:11Z"
}

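The download URL points to a Parquet file containing all results. Note that this example job result names the identifier task_id; its value matches the job_id returned when the query was started.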
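
Before downloading, it can be useful to check that a job result is still usable. The sketch below assumes that `expire_at` marks the moment after which the result can no longer be downloaded; this overview does not state that explicitly, and `isResultAvailable` is a hypothetical helper.

```javascript
// Assumption: `expire_at` is the time after which the download is unavailable.
// Returns false when the job is incomplete, has no URL, or has no valid expiry.
function isResultAvailable(job, now = new Date()) {
  if (job.status !== 'COMPLETED' || !job.url) return false;
  const expiresAt = new Date(job.expire_at).getTime();
  if (Number.isNaN(expiresAt)) return false;
  return now.getTime() < expiresAt;
}
```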

Error responses

Status	Code	Description
401	`unauthorized`	Invalid or missing API token
404	`user_not_found`	No consent for this user
429	`rate_limited`	Too many requests
500	`internal_error`	Server error (retry with backoff)
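
The table recommends retrying 500 responses with backoff. A minimal sketch of exponential backoff follows; `fetchWithBackoff` and its options are hypothetical, the delays are illustrative defaults, and retrying 429 responses the same way is an assumption rather than documented guidance.

```javascript
// 500 is retryable per the table above; including 429 is an assumption.
const RETRYABLE_STATUSES = new Set([429, 500]);

// Retries retryable responses with delays of baseDelayMs * 2^attempt.
// After `maxRetries` retries the last response is returned to the caller.
async function fetchWithBackoff(url, options, { fetchImpl = fetch, maxRetries = 4, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetchImpl(url, options);
    if (!RETRYABLE_STATUSES.has(res.status) || attempt >= maxRetries) return res;
    const delayMs = baseDelayMs * 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```

Responses such as 401 and 404 are returned immediately, since repeating the same request would not change the outcome.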

Next steps

Pagination

Handle large datasets with cursors and delta queries

Event Categories

First-level category list and filter patterns

Data Schema

Field-level schema for all Query event types
