TalkIDE internal documentation

An authenticated user requests a full export of their personal data. The export is processed asynchronously and delivered as a ZIP archive through an emailed download link that stays valid for 7 days. Implements GDPR Article 15 (Right of Access).

  • Only one export request per user can be in PENDING or PROCESSING state at any time (idempotent per user).
  • The ZIP contains structured JSON files covering all data categories: user profile, projects, conversations, messages, usage events, budget, and top-ups.
  • The ZIP is stored in DO Spaces (platform/exports/ prefix) under a 7-day lifecycle policy; after expiry DO Spaces deletes the object automatically. The emailed download link points at the public download endpoint, which redirects to a short-lived pre-signed Spaces URL.
  • Email is sent via Mailgun. See ADR-025: Transactional Email via Mailgun. If Mailgun is unavailable the job fails loudly — no silent stub.
  • A new table gdpr_export_request tracks export lifecycle: PENDING → PROCESSING → READY | FAILED.
  • Export data is a point-in-time snapshot taken during the async processing job; it does not reflect changes made after the request was submitted.
  • The public download endpoint (GET /api/v1/gdpr/exports/{token}/download) is unauthenticated; security relies solely on the unpredictability of the download token. The token is a 256-bit random value encoded as hex (64 chars), generated at READY time rather than at request time.
sequenceDiagram
    actor User

    %% --- Request export ---
    User->>+FE: clicks "Request export" in Danger Zone
    FE->>FE: show confirmation modal <br> "We'll email a download link to {email} within 24 h. <br> Link expires in 7 days. Continue?"

    User->>FE: confirms modal

    FE->>+BE: POST /api/v1/users/me/gdpr/export <br> Authorization: Bearer {accessToken}

    BE->>BE: validate JWT access token
    alt access token invalid or missing
        BE-->>FE: 401 Unauthorized <br> ErrorResponse
    end

    BE->>DB: check existing PENDING or PROCESSING export for this user
    alt already has a pending/processing export
        BE-->>FE: 409 Conflict <br> ErrorResponse
    end

    BE->>DB: INSERT gdpr_export_request (status=PENDING)
    BE->>BE: enqueue async export job

    BE->>-FE: 202 Accepted <br> GdprExportResponse

    FE->>-User: show toast "Export request received. <br> Check your email within 24 hours."

    %% --- Async processing (background job) ---
    Note over BE,DB: Background job (Spring @Async / scheduled)

    BE->>DB: mark export status=PROCESSING
    BE->>DB: load all user data (profile, projects, conversations, <br> messages, usage_events, budget, topups)
    BE->>BE: serialize data to JSON files, compress to ZIP
    BE->>BE: upload ZIP to DO Spaces <br> platform/exports/{userId}/{exportId}.zip
    BE->>BE: generate signed download token (256-bit random, hex)
    BE->>DB: update export: status=READY, download_token, download_url, <br> expires_at=now+7d, file_size_bytes
    BE->>BE: send email via Mailgun <br> "Your TalkIDE data export is ready"

    alt Mailgun unavailable or upload failed
        BE->>DB: mark status=FAILED, store error_message
        Note over BE: Job fails loudly — logged at ERROR level
    end

    %% --- Status check ---
    User->>+FE: opens profile (optional status polling)

    FE->>+BE: GET /api/v1/users/me/gdpr/export/{exportId} <br> Authorization: Bearer {accessToken}

    BE->>BE: validate JWT access token
    alt access token invalid or missing
        BE-->>FE: 401 Unauthorized <br> ErrorResponse
    end

    BE->>DB: find export by id, verify user ownership
    alt export not found or belongs to another user
        BE-->>FE: 404 Not Found <br> ErrorResponse
    end

    BE->>-FE: 200 OK <br> GdprExportResponse

    FE->>-User: display export status (PENDING / PROCESSING / READY / FAILED)

    %% --- Download via email link ---
    User->>+FE: opens email, clicks download link <br> GET /api/v1/gdpr/exports/{token}/download

    FE->>+BE: GET /api/v1/gdpr/exports/{token}/download <br> (no auth — public signed URL)

    BE->>DB: find export by download_token
    alt token not found
        BE-->>FE: 404 Not Found
    end

    BE->>BE: check expires_at > now()
    alt token expired
        BE-->>FE: 410 Gone <br> ErrorResponse
    end

    BE->>BE: generate pre-signed DO Spaces redirect URL (short TTL, 5 min)
    BE->>-FE: 302 Found <br> Location: <pre-signed Spaces URL>

    FE->>-User: browser follows redirect, ZIP download starts
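
The background-job part of the diagram can be sketched in plain Java as follows. This is a minimal illustration, not the actual implementation: `ExportJobSketch`, `Storage`, `Mailer`, and `ExportRow` are hypothetical names, the row is an in-memory object rather than a JPA entity, and `java.util.logging` at SEVERE stands in for an ERROR-level log. One ordering choice is an assumption of the sketch: READY is written only after the email call returns, so a Mailgun failure moves the row from PROCESSING to FAILED.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch of the async export job; every type here is hypothetical. */
final class ExportJobSketch {
    interface Storage { void upload(String objectKey, byte[] zip); }
    interface Mailer { void sendReadyEmail(String email, String token, Instant expiresAt); }

    enum Status { PENDING, PROCESSING, READY, FAILED }

    /** In-memory stand-in for one gdpr_export_request row. */
    static final class ExportRow {
        final long id;
        final long userId;
        final String email;
        Status status = Status.PENDING;
        String downloadToken;
        String downloadUrl;
        Instant expiresAt;
        Long fileSizeBytes;
        String errorMessage;

        ExportRow(long id, long userId, String email) {
            this.id = id;
            this.userId = userId;
            this.email = email;
        }
    }

    private static final Logger LOG = Logger.getLogger("gdpr-export");
    private final Storage storage;
    private final Mailer mailer;

    ExportJobSketch(Storage storage, Mailer mailer) {
        this.storage = storage;
        this.mailer = mailer;
    }

    /** Runs one export; on any failure marks the row FAILED, logs, and rethrows. */
    void run(ExportRow row, byte[] zip, String token, Instant now) {
        row.status = Status.PROCESSING;
        String failureText = "Upload failed, please try again later";
        try {
            String key = "platform/exports/" + row.userId + "/" + row.id + ".zip";
            storage.upload(key, zip);
            Instant expiresAt = now.plus(Duration.ofDays(7));
            failureText = "email delivery failed";
            mailer.sendReadyEmail(row.email, token, expiresAt);
            row.downloadToken = token;
            row.downloadUrl = key; // internal Spaces path, not a pre-signed URL
            row.expiresAt = expiresAt;
            row.fileSizeBytes = (long) zip.length;
            row.status = Status.READY;
        } catch (RuntimeException e) {
            row.status = Status.FAILED;
            row.errorMessage = failureText; // fixed text only, no exception details
            LOG.log(Level.SEVERE, "GDPR export " + row.id + " failed", e);
            throw e; // fail loudly instead of swallowing the error
        }
    }
}
```

Both collaborators are single-method interfaces, so tests can pass lambdas that either succeed or throw.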

POST /api/v1/users/me/gdpr/export

POST /api/v1/users/me/gdpr/export (no request body)

202 Accepted GdprExportResponse:

{
  "data": {
    "exportId": 42,
    "status": "PENDING",
    "requestedAt": "2026-05-23T10:00:00Z",
    "expiresAt": null,
    "fileSizeBytes": null,
    "downloadAvailable": false,
    "errorMessage": null
  }
}

401 Unauthorized (missing or invalid access token) ErrorResponse:

{
  "status": 401,
  "code": "AUTHENTICATION_FAILED",
  "message": "Access token is missing or invalid"
}

409 Conflict (export already pending or processing) ErrorResponse:

{
  "status": 409,
  "code": "CONFLICT_GDPR_EXPORT",
  "message": "An export request is already pending for this account. Check your email or try again later."
}
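
The 202/409 decision for this endpoint can be illustrated with a small in-memory sketch. `ExportRequestGate` and its members are hypothetical; the list stands in for the gdpr_export_request table, and `synchronized` stands in for whatever concurrency control the database layer provides. Only PENDING and PROCESSING rows block a new request, as stated in the validation rule.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory sketch of the POST decision; all names here are hypothetical. */
final class ExportRequestGate {
    enum Status { PENDING, PROCESSING, READY, FAILED }

    /** Minimal stand-in for a gdpr_export_request row. */
    static final class Row {
        final long userId;
        Status status;

        Row(long userId, Status status) {
            this.userId = userId;
            this.status = status;
        }
    }

    private final List<Row> rows = new ArrayList<>();

    /** Returns 202 and records a PENDING row, or 409 if an export is already active. */
    synchronized int requestExport(long userId) {
        for (Row row : rows) {
            boolean active = row.status == Status.PENDING || row.status == Status.PROCESSING;
            if (row.userId == userId && active) {
                return 409; // CONFLICT_GDPR_EXPORT
            }
        }
        rows.add(new Row(userId, Status.PENDING));
        return 202;
    }

    /** Test hook: moves every row of the given user to the given status. */
    synchronized void setStatus(long userId, Status status) {
        for (Row row : rows) {
            if (row.userId == userId) {
                row.status = status;
            }
        }
    }
}
```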

GET /api/v1/users/me/gdpr/export/{exportId}

GET /api/v1/users/me/gdpr/export/{exportId} (no request body)

200 OK GdprExportResponse (READY):

{
  "data": {
    "exportId": 42,
    "status": "READY",
    "requestedAt": "2026-05-23T10:00:00Z",
    "expiresAt": "2026-05-30T10:00:00Z",
    "fileSizeBytes": 204800,
    "downloadAvailable": true,
    "errorMessage": null
  }
}

200 OK GdprExportResponse (FAILED):

{
  "data": {
    "exportId": 42,
    "status": "FAILED",
    "requestedAt": "2026-05-23T10:00:00Z",
    "expiresAt": null,
    "fileSizeBytes": null,
    "downloadAvailable": false,
    "errorMessage": "Upload failed, please try again later"
  }
}

errorMessage is populated only for FAILED status. It MUST NOT contain PII or internal details (e.g. stack traces). Only user-friendly descriptions are permitted: e.g. "Upload failed, please try again later", "Export size exceeds limit", "Aborted due to server restart".
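
One way to enforce this rule is to let the job record a failure reason from a closed set and derive errorMessage from it, so exception text never reaches the column. The enum below is a sketch under that assumption; `ExportFailureReason` and its methods are hypothetical names, and the three texts are the examples listed above.

```java
/** Hypothetical catalogue of failure reasons; only the fixed text reaches errorMessage. */
enum ExportFailureReason {
    UPLOAD_FAILED("Upload failed, please try again later"),
    SIZE_LIMIT_EXCEEDED("Export size exceeds limit"),
    SERVER_RESTART("Aborted due to server restart");

    private final String userMessage;

    ExportFailureReason(String userMessage) {
        this.userMessage = userMessage;
    }

    /** Text that is safe to store in error_message and return to the client. */
    String userMessage() {
        return userMessage;
    }

    /** errorMessage stays null unless the export status is FAILED. */
    static String errorMessageFor(String status, ExportFailureReason reason) {
        return "FAILED".equals(status) && reason != null ? reason.userMessage() : null;
    }
}
```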


401 Unauthorized (missing or invalid access token) ErrorResponse:

{
  "status": 401,
  "code": "AUTHENTICATION_FAILED",
  "message": "Access token is missing or invalid"
}

404 Not Found ErrorResponse:

{
  "status": 404,
  "code": "NOT_FOUND",
  "message": "Export not found"
}

GET /api/v1/gdpr/exports/{token}/download

GET /api/v1/gdpr/exports/{token}/download (public, no auth)

302 Found (redirect to pre-signed DO Spaces URL, TTL 5 min)

404 Not Found (token does not exist) ErrorResponse:

{
  "status": 404,
  "code": "NOT_FOUND",
  "message": "Download link not found"
}

410 Gone (token expired) ErrorResponse:

{
  "status": 410,
  "code": "GONE_GDPR_EXPORT",
  "message": "This download link has expired. Request a new export from your account settings."
}
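
The three outcomes of the download endpoint reduce to a lookup followed by an expiry check. The sketch below assumes a hypothetical `DownloadResolver` backed by a map keyed by download token; generating the pre-signed Spaces URL for the 302 case is left out because it depends on the storage client in use.

```java
import java.time.Instant;
import java.util.Map;

/** Sketch of the download decision; class and field names are hypothetical. */
final class DownloadResolver {
    /** The subset of a gdpr_export_request row this check needs. */
    static final class Export {
        final String objectKey;
        final Instant expiresAt;

        Export(String objectKey, Instant expiresAt) {
            this.objectKey = objectKey;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Export> exportsByToken;

    DownloadResolver(Map<String, Export> exportsByToken) {
        this.exportsByToken = exportsByToken;
    }

    /** Returns the HTTP status to answer with: 404, 410, or 302. */
    int resolve(String token, Instant now) {
        Export export = exportsByToken.get(token);
        if (export == null) {
            return 404; // NOT_FOUND: unknown token
        }
        if (!export.expiresAt.isAfter(now)) {
            return 410; // GONE_GDPR_EXPORT: expires_at > now() does not hold
        }
        return 302; // caller redirects to a pre-signed URL for export.objectKey
    }
}
```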

DB Schema

New table: gdpr_export_request

Liquibase migration: 0048-add-gdpr-export-request.xml

<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd">

  <changeSet id="0048-add-gdpr-export-request" author="talkide">
    <createTable tableName="gdpr_export_request">
      <column name="id" type="bigint" autoIncrement="true">
        <constraints primaryKey="true" nullable="false"/>
      </column>
      <column name="user_id" type="bigint">
        <constraints nullable="false"
                     foreignKeyName="fk_gdpr_export_user"
                     references="users(id)"
                     deleteCascade="true"/>
      </column>
      <column name="status" type="varchar(20)">
        <constraints nullable="false"/>
      </column>
      <column name="download_token" type="varchar(64)">
        <constraints nullable="true" unique="true"/>
      </column>
      <column name="download_url" type="varchar(1024)">
        <constraints nullable="true"/>
      </column>
      <column name="requested_at" type="timestamptz">
        <constraints nullable="false"/>
      </column>
      <column name="completed_at" type="timestamptz">
        <constraints nullable="true"/>
      </column>
      <column name="expires_at" type="timestamptz">
        <constraints nullable="true"/>
      </column>
      <column name="file_size_bytes" type="bigint">
        <constraints nullable="true"/>
      </column>
      <column name="error_message" type="text">
        <constraints nullable="true"/>
      </column>
      <column name="created_at" type="timestamptz" defaultValueComputed="now()">
        <constraints nullable="false"/>
      </column>
      <column name="updated_at" type="timestamptz" defaultValueComputed="now()">
        <constraints nullable="false"/>
      </column>
    </createTable>

    <createIndex tableName="gdpr_export_request" indexName="idx_gdpr_export_user_id">
      <column name="user_id"/>
    </createIndex>

    <createIndex tableName="gdpr_export_request" indexName="idx_gdpr_export_download_token">
      <column name="download_token"/>
    </createIndex>
  </changeSet>

</databaseChangeLog>

Status lifecycle: PENDING → PROCESSING → READY | FAILED

| Status | Description |
|---|---|
| PENDING | Request received, job not yet started |
| PROCESSING | Job is actively building the ZIP |
| READY | ZIP uploaded, email sent, download token set |
| FAILED | Processing error; error_message holds a user-facing description |
| EXPIRED | Logical state only (derived from expires_at < now()); row is not deleted, status column stays READY |
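
The stored statuses and their allowed transitions can be captured in an enum. This is a sketch based only on the lifecycle line above: `ExportStatus`, `next()`, and `canTransitionTo()` are hypothetical names, READY and FAILED are treated as terminal, and EXPIRED is absent because it is derived rather than stored.

```java
import java.util.EnumSet;
import java.util.Set;

/** Stored export statuses; EXPIRED is derived from expires_at and is not a member. */
enum ExportStatus {
    PENDING, PROCESSING, READY, FAILED;

    /** Statuses that may directly follow this one. */
    Set<ExportStatus> next() {
        switch (this) {
            case PENDING:
                return EnumSet.of(PROCESSING);
            case PROCESSING:
                return EnumSet.of(READY, FAILED);
            default:
                return EnumSet.noneOf(ExportStatus.class); // READY and FAILED are terminal
        }
    }

    /** True if moving from this status to the target follows the lifecycle. */
    boolean canTransitionTo(ExportStatus target) {
        return next().contains(target);
    }
}
```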

DataEntity note: GdprExportRequestEntity extends DataEntity, which provides id, createdAt, and updatedAt. The domain timestamps requested_at and completed_at are separate columns for audit clarity. requested_at marks the lifecycle entry point (when the user submitted the request), while created_at records when the row was inserted; the two may differ in retry or replay scenarios.

ZIP archive contents

| File | Content |
|---|---|
| user.json | User entity: id, email, name, salutation, teamBriefing, locale, createdAt, inviteGeneration |
| projects.json | List of user’s projects: slug, name, createdAt, archivedAt, status |
| conversations.json | List of conversations: id, projectId, title, createdAt, closedAt |
| messages.json | List of messages: id, conversationId, role, content (truncated at 10 000 chars per message), createdAt |
| usage_events.json | AI call ledger entries: id, projectId, model, inputTokens, outputTokens, costUsd, createdAt |
| budget.json | Budget snapshot: aiCreditUsd, aiCreditInitialUsd, spendingLimitUsd |
| topups.json | Credit top-up records: id, amountUsd, status, createdAt (note: bonusUsd is not exported; the BOGO 2× multiplier is applied directly to aiCreditUsd in budget.json and is not stored as a separate top-up field) |
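
Packing the files into the archive can be sketched with the JDK's `java.util.zip`. The helper below is hypothetical (`ExportZipWriter` is not a name from the codebase) and assumes the JSON text for each file has already been produced elsewhere. It also assumes the 10 000-character limit counts Java `char` values, which may split a surrogate pair at the boundary.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Sketch of the archive step; names are hypothetical. */
final class ExportZipWriter {
    static final int MAX_MESSAGE_CHARS = 10_000;

    /** Cuts message content to the per-message limit used in messages.json. */
    static String truncate(String content) {
        return content.length() <= MAX_MESSAGE_CHARS
                ? content
                : content.substring(0, MAX_MESSAGE_CHARS);
    }

    /** Writes each (file name, JSON text) pair as one UTF-8 entry and returns the ZIP bytes. */
    static byte[] zip(Map<String, String> jsonByFileName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> file : jsonByFileName.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Passing a `LinkedHashMap` keeps the entries in the order listed in the table.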

Frontend

UX Guidelines

Danger Zone section in DangerSection.vue:

The “Request export” card uses neutral styling (var(--bg-2) background).

Flow:

  1. User clicks “Request export” ghost button.
  2. Modal opens: title “Export your data”, body: “We’ll send a download link to {user.email} within 24 hours. The link expires in 7 days. Continue?”, two buttons: “Cancel” (ghost) and “Request export” (primary).
  3. On confirm: POST /api/v1/users/me/gdpr/export.
    • On 202 Accepted: close modal, show success toast “Export request received. Check your email within 24 hours.”
    • On 409 Conflict: close modal, show info toast “You already have a pending export request. Check your email.”
  4. Button is disabled while request is in flight (loading spinner).
  5. If a previous export is in READY status and not yet expired, the component may optionally display the existing expiresAt date instead of showing the “Request export” button — MVP can skip this optimization.

Validations

(No form inputs — the endpoint takes no request body.)

| Field | Constraints | Note |
|---|---|---|
| n/a | n/a | Confirmation is modal-based; no form fields |

Backend

Validations

| Field | Constraints | Note |
|---|---|---|
| JWT token | must be valid, non-expired | 401 AUTHENTICATION_FAILED otherwise |
| existing export | no PENDING or PROCESSING row for user | 409 CONFLICT_GDPR_EXPORT otherwise |

Async Job Invariants

  • Job must be restart-safe: on BE startup, CleanupStaleProcessingExportsBean runs in @PostConstruct, finds all rows with status=PROCESSING, and marks them FAILED with errorMessage='Aborted due to server restart'. The user then sees the FAILED status and can request a new export from the UI.
  • DO Spaces object key: platform/exports/{userId}/{exportId}.zip
  • DO Spaces lifecycle policy: automatic deletion after 7 days (configured at bucket level, not per object; an expires_at metadata tag is also set during upload). Note: the bucket-level lifecycle rule in DO Spaces deletes all objects under platform/exports/ after 7 days regardless of the per-object expires_at value in the DB. If the lifecycle policy is misconfigured, old ZIPs may persist. See limitations.md (GDPR Features).
  • Signed download token: generated using SecureRandom, 256 bits, hex-encoded to 64 characters. Stored in download_token column, indexed for fast lookup.
  • Email template: plain-text + HTML. Subject: “Your TalkIDE data export is ready”. Body includes download link and expiry date.
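
The token invariant above can be sketched with the JDK alone. `DownloadTokenGenerator` is a hypothetical name; the hex encoding is written out by hand so the example does not depend on a particular Java version.

```java
import java.security.SecureRandom;

/** Sketch of download-token generation; the class name is hypothetical. */
final class DownloadTokenGenerator {
    private static final int TOKEN_BYTES = 32; // 256 bits
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private final SecureRandom random = new SecureRandom();

    /** Returns 32 cryptographically random bytes as 64 lowercase hex characters. */
    String newToken() {
        byte[] bytes = new byte[TOKEN_BYTES];
        random.nextBytes(bytes);
        char[] out = new char[TOKEN_BYTES * 2];
        for (int i = 0; i < bytes.length; i++) {
            int value = bytes[i] & 0xFF;
            out[2 * i] = HEX[value >>> 4];       // high nibble
            out[2 * i + 1] = HEX[value & 0x0F];  // low nibble
        }
        return new String(out);
    }
}
```

The result fits the varchar(64) download_token column exactly.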

Security Considerations

  • Download token must be at least 256 bits (32 bytes) of cryptographic randomness — not derived from user ID or timestamp.
  • The public download endpoint does NOT check JWT — security is entirely based on token unpredictability.
  • Tokens must not appear in server access logs at INFO level; use DEBUG or redact.
  • The DO Spaces pre-signed URL generated at download time has a short TTL (5 min) to limit how long a shared redirect target stays usable. The token itself is the durable credential.
  • download_url stored in DB is the internal Spaces path, not the pre-signed URL — pre-signed URL is generated on each download request.
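
Keeping tokens out of INFO-level logs can be done by masking the token segment of the download path before a request line is logged. The helper below is a sketch: `TokenRedactor` is a hypothetical name, and how it is wired into the access log depends on the logging setup, which is not specified here.

```java
import java.util.regex.Pattern;

/** Sketch of token masking for log lines; the class name is hypothetical. */
final class TokenRedactor {
    // Matches the public download path with a 64-character hex token.
    private static final Pattern DOWNLOAD_PATH =
            Pattern.compile("(/api/v1/gdpr/exports/)[0-9a-fA-F]{64}(/download)");

    /** Replaces the token in any download path found in the text with a placeholder. */
    static String redact(String text) {
        return DOWNLOAD_PATH.matcher(text).replaceAll("$1[REDACTED]$2");
    }
}
```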

Test Cases

| GIVEN | WHEN | THEN | Scope |
|---|---|---|---|
| authenticated user, no existing export | POST /gdpr/export is called | 202 Accepted, PENDING row created, async job enqueued | integration |
| authenticated user, existing PENDING export | POST /gdpr/export is called | 409 CONFLICT_GDPR_EXPORT returned | integration |
| authenticated user, existing PROCESSING export | POST /gdpr/export is called | 409 CONFLICT_GDPR_EXPORT returned | integration |
| authenticated user, existing READY export (not expired) | POST /gdpr/export is called | 409 CONFLICT_GDPR_EXPORT returned | integration |
| authenticated user, existing FAILED export | POST /gdpr/export is called | 202 Accepted, new PENDING row created | integration |
| no Authorization header | POST /gdpr/export is called | 401 AUTHENTICATION_FAILED returned | unit |
| PENDING export row exists | async job runs | status transitions PENDING→PROCESSING→READY, download_token set, email sent via Mailgun | integration |
| DO Spaces upload fails | async job runs | status=FAILED, error_message populated, no email sent | integration |
| Mailgun unavailable (returns 5xx) | async job runs | status=FAILED, error_message=“email delivery failed”, job fails loudly (ERROR log) | integration |
| authenticated user, owns exportId | GET /gdpr/export/{exportId} | 200 OK with correct status and fields | integration |
| authenticated user, exportId belongs to different user | GET /gdpr/export/{exportId} | 404 NOT_FOUND returned | integration |
| no Authorization header | GET /gdpr/export/{exportId} | 401 AUTHENTICATION_FAILED returned | unit |
| valid token, export not expired | GET /gdpr/exports/{token}/download | 302 redirect to pre-signed Spaces URL | integration |
| valid token, export expired (expires_at < now()) | GET /gdpr/exports/{token}/download | 410 Gone returned | integration |
| non-existent token | GET /gdpr/exports/{token}/download | 404 Not Found returned | integration |
| export row with status=READY but download_token is NULL | GET /gdpr/exports/{token}/download | defensive: log ERROR, return 500 Internal Server Error | unit |
| user clicks “Request export” and confirms | modal submit | POST called, toast shown on 202 | e2e |
