One monolithic CRM. 77 distinct businesses. 700,000+ records that needed to land in exactly the right place, with no silent data loss and only 32 failed records, every one traceable to bad source data.
A franchising group had been running all 77 of its member businesses out of a single HighLevel CRM subaccount for years. Contacts, notes, pipeline opportunities, and calendar bookings were all commingled: 188,000 contacts with no clean way to tell which business owned which record.
The decision was made to migrate every business onto its own isolated subaccount with its own API key, pipelines, automations, and integrations. Each business needed to receive only its own data — no cross-contamination.
This project delivered exactly that: a complete extraction, transformation, and re-import of over 700,000 CRM records, finished with only 32 failed records against a dataset of that size, all traceable to bad source data.
HighLevel's API cannot create custom fields with options (radio buttons, checkboxes, multi-select dropdowns). The API silently accepts the request but never creates the field. This is not documented anywhere.
// SOLUTION
The bug was identified, documented, and reported. The 451 option-based custom fields were created manually via the UI for each subaccount. The remaining field types were fully automated. The discovery was turned into an API reference document to prevent future projects from hitting the same wall.
"New Lead" in Subaccount A and "New Lead" in Subaccount B have the same name but completely different UUIDs. GHL snapshots clone the workflow structure but do not remap stage IDs — leaving every imported workflow silently pointing to the wrong subaccount's pipeline stages.
// SOLUTION
During setup, every subaccount's pipeline stages were fetched via API and stored in a SQLite lookup table keyed by (subaccount, stage name). At import time, every workflow trigger, action, and branch condition was rewritten against the correct destination IDs — roughly 8 remappings per workflow, 770 total.
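A minimal sketch of that remapping, assuming a `stage_lookup` SQLite table keyed by (subaccount, stage name) and a `pipelineStageId` key on workflow steps. The table name, column names, and workflow JSON shape here are illustrative, not HighLevel's actual schema:

```python
import copy
import sqlite3

def build_stage_lookup(conn: sqlite3.Connection) -> dict:
    """Load the (subaccount_id, stage_name) -> stage_id pairs cached during setup."""
    rows = conn.execute(
        "SELECT subaccount_id, stage_name, stage_id FROM stage_lookup"
    ).fetchall()
    return {(sub, name): sid for sub, name, sid in rows}

def remap_stage_ids(workflow: dict, lookup: dict, dest_subaccount: str,
                    source_stage_names: dict) -> dict:
    """Rewrite every stage reference in a workflow to the destination's UUIDs.

    source_stage_names maps the source account's stage UUIDs back to their
    human-readable names: the name is the only join key shared by both sides.
    """
    remapped = copy.deepcopy(workflow)
    for step in remapped.get("steps", []):
        old_id = step.get("pipelineStageId")
        if old_id is not None:
            stage_name = source_stage_names[old_id]
            step["pipelineStageId"] = lookup[(dest_subaccount, stage_name)]
    return remapped
```

The deep copy keeps the source workflow intact so a failed import can be retried against the original.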
During large batch fetches, the source HighLevel API returned intermittent 5xx errors for 375 contacts across 2 businesses — not because the data was bad, but because the API was under load. A naive run-and-hope approach would have silently dropped those records.
// SOLUTION
Every API call wrote its outcome to a progress JSON file in real time. A dedicated retry_failed.py script read those files, identified failed contacts, and re-fetched each one individually with exponential backoff. 100% of transient failures were recovered.
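The recovery loop can be sketched as follows. `fetch_one` stands in for the real single-contact fetch, and the progress-file layout (a contact-ID-to-status map) is an assumption, not the project's exact format:

```python
import json
import time

def load_failed_ids(progress_path: str) -> list:
    """Collect contact IDs whose last recorded outcome was a failure."""
    with open(progress_path) as f:
        progress = json.load(f)
    return [cid for cid, entry in progress.items() if entry["status"] == "failed"]

def retry_with_backoff(fetch_one, contact_id, max_attempts=5, base_delay=1.0):
    """Re-fetch a single contact, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return fetch_one(contact_id)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # exhausted: surface the error instead of dropping the record
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, 8s...
```

Because every outcome is persisted as it happens, the retry script is idempotent: it can be re-run until the failed list is empty.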
GHL snapshots clone workflow steps — including Slack and Google Sheets integrations — but the OAuth tokens embedded in those steps still reference the source account. The result: cloned workflows that look correct but will silently fail on every execution.
// SOLUTION
Each integration step was manually disconnected and reconnected in each destination subaccount, forcing a fresh OAuth handshake. This was documented as a non-automatable step and built into the project timeline as explicit manual work.
HighLevel's contact search API uses a searchAfter cursor rather than page numbers. ~376 API calls with a 500-contact page size extracted all 188K contacts. Checkpoint files ensured no re-work on interruption.
Every contact receives a new UUID in its destination subaccount. A SQLite mapping table (old ID → new ID) was populated during contact import, then used to rewrite all foreign-key references in opportunities and notes.
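In sketch form, assuming a `contact_id_map` table (an illustrative name for the translation table described above):

```python
import sqlite3

def create_mapping_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contact_id_map ("
        "  old_id TEXT PRIMARY KEY,"
        "  new_id TEXT NOT NULL)"
    )

def record_mapping(conn: sqlite3.Connection, old_id: str, new_id: str) -> None:
    """Called right after each contact is created in its destination subaccount."""
    conn.execute(
        "INSERT OR REPLACE INTO contact_id_map VALUES (?, ?)", (old_id, new_id)
    )

def translate(conn: sqlite3.Connection, old_id: str):
    """Resolve a source contact UUID to its destination UUID, or None if unmapped."""
    row = conn.execute(
        "SELECT new_id FROM contact_id_map WHERE old_id = ?", (old_id,)
    ).fetchone()
    return row[0] if row else None
```

Opportunities and notes are only imported after the mapping is complete, so a `None` result signals a genuinely orphaned reference rather than a race.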
ThreadPoolExecutor with 10 workers per subaccount achieved 2–3× throughput over sequential imports. Subaccounts were processed sequentially to prevent aggregate rate-limit spikes against the destination API.
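A sketch of that concurrency pattern; `import_one` is a placeholder for the real per-contact API call:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def import_contacts_concurrently(contacts, import_one, max_workers=10):
    """Import one subaccount's contacts with a bounded worker pool.

    Subaccounts themselves are processed one at a time, so the destination
    API never sees more than max_workers in-flight requests overall.
    """
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(import_one, c): c for c in contacts}
        for fut in as_completed(futures):
            contact = futures[fut]
            try:
                results.append(fut.result())
            except Exception as exc:
                failures.append((contact, exc))  # fed to the retry script later
    return results, failures
```

Collecting failures instead of raising keeps one bad record from aborting a batch, which is what makes the later retry pass possible.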
451 fields mapped using a name-based lookup table built during setup; 70–103 fields translated per contact. Option-based fields (blocked by the API bug) were created manually, then folded back into the automated mapping.
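The per-contact translation might look like the following sketch; the two lookup dicts and the field shape here are assumptions for illustration, not HighLevel's exact payload:

```python
def translate_custom_fields(contact, source_field_names, dest_field_ids,
                            dest_subaccount):
    """Rewrite a contact's customFields from source field IDs to destination IDs.

    source_field_names: source field UUID -> field name (built during setup)
    dest_field_ids: (subaccount, field name) -> destination field UUID
    Fields with no destination counterpart are dropped (and can be logged).
    """
    translated = []
    for field in contact.get("customFields", []):
        name = source_field_names.get(field.get("id"))
        dest_id = dest_field_ids.get((dest_subaccount, name))
        if dest_id:
            translated.append({"id": dest_id, "value": field.get("value")})
    return translated
```

Joining on field *name* rather than UUID is what lets the manually created option-based fields participate in the same automated pass.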
837 source calendars assigned to 76 businesses based on name prefixes. The matching logic handled abbreviations, & vs and, multi-location variants, and punctuation differences — 280 calendars created with zero failures.
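A sketch of the normalization-plus-prefix matching described above (the helper names are illustrative):

```python
import re

def normalize(name: str) -> str:
    """Collapse the variations seen in real calendar names:
    '&' vs 'and', punctuation, case, and runs of whitespace."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    return re.sub(r"\s+", " ", name).strip()

def match_calendar_to_business(calendar_name, businesses):
    """Return the business whose normalized name prefixes the calendar name,
    preferring the longest (most specific) match; None if nothing matches."""
    norm_cal = normalize(calendar_name)
    best = None
    for biz in businesses:
        norm_biz = normalize(biz)
        if norm_cal.startswith(norm_biz):
            if best is None or len(norm_biz) > len(normalize(best)):
                best = biz
    return best
```

Preferring the longest prefix keeps "Smith" from stealing calendars that belong to "Smith and Sons".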
When any lead reaches "Responded (Hot)" in any of the 77 pipelines, a webhook fires to Hot Prospector's power dialer API. The lead enters a live dialing queue within seconds. Remapped across all 77 subaccounts.
import time

import requests  # third-party HTTP client (pip install requests)

BASE_URL = "https://services.leadconnectorhq.com"  # HighLevel API 2.0 host

def extract_all_contacts(api_key, location_id):
    """Extract all contacts using cursor-based pagination."""
    contacts = []
    search_after = None
    page = 0
    while True:
        payload = {
            "locationId": location_id,
            "limit": 500,
            "sort": "dateAdded",
            "sortDirection": "asc",
        }
        if search_after:
            payload["searchAfter"] = search_after
        response = requests.post(
            f"{BASE_URL}/contacts/search",
            headers={"Authorization": f"Bearer {api_key}", "Version": "2021-07-28"},
            json=payload,
        )
        data = response.json()
        batch = data.get("contacts", [])
        if not batch:
            break
        contacts.extend(batch)
        search_after = data.get("meta", {}).get("nextPageAfter")
        page += 1
        print(f"Page {page}: {len(contacts):,} contacts fetched")
        if not search_after:
            break
        time.sleep(0.3)  # Rate limiting
    return contacts
The cursor (searchAfter) is a composite sort key returned by the API in meta.nextPageAfter. Unlike offset-based pagination, cursors stay stable even if records are added during extraction — critical when working with an actively used CRM. The 0.3s delay and deterministic sort order prevent duplicate or skipped records across the ~376 pages required to exhaust the dataset.
The project was structured into discrete milestones to allow safe checkpointing. Each phase was independently verifiable before the next began — ensuring that a failure in any step could not corrupt the output of a prior one.
Fetched all 77 destination subaccounts, created API key mapping, initialized SQLite schema for contact ID translation, custom field maps, and stage lookups.
Cursor-based extraction of all 188,000 contacts from the source account. Checkpoint files written after each page. Took ~376 paginated API calls.
Parsed business-identifying tags and fields to assign each of the 188K contacts to one of the 77 destination subaccounts. Built the per-subaccount import batches.
Imported 83,286 contacts using concurrent workers. Built old-ID → new-ID translation table in SQLite as each contact was created. Automated retry recovered 15 subaccounts.
Extracted 36,339 opportunities. Translated contact IDs and pipeline stage IDs. Imported 27,747 opportunities across 76 subaccounts.
482,081 notes extracted. Contact ID references rewritten. 427,911 notes imported at 88.8% success rate — all failures attributed to invalid source data.
837 source calendars matched to 76 businesses via fuzzy name-prefix logic. 280 calendars created across destinations with zero failures.
~770 workflow configurations rewritten. Pipeline stage IDs, Slack steps, Google Sheets steps, and Hot Prospector webhooks all remapped per subaccount. OAuth integrations manually reconnected.
Record counts verified, failure reports generated. Reusable FastAPI migration web app built and documented for future client use.
Each business now operates in an isolated HighLevel subaccount with its own contacts, pipelines, automations, and calendar. Zero data leakage between businesses. Each subaccount owner can see and work with only their own leads.
The Hot Prospector webhook fires in seconds when any lead hits "Responded (Hot)" in any of the 77 pipelines. Leads enter a live dialing queue automatically — no manual routing, no CSV exports, no delays.
The migration web app (FastAPI + HTMX + SQLite + Server-Sent Events) was engineered as a generic tool — no hardcoded account IDs or API keys. Any future HighLevel migration can be started by pasting two API keys and clicking through a five-step guided workflow.