Context Slice

Gmail Profile Extraction

You have two data sources. Extract different things from each.

Source 1: Discovery Searches

The discovery section contains emails found by targeted searches. Extract specific facts:

Category | What to Extract
location | City, state, zip from shipping addresses
children | Names, ages, schools
partner | Name, relationship type
pets | Names, species/breed
birthday | Birth date or month
phone_numbers | Phone numbers mentioned in emails (extract the number)
whatsapp, signal_app, telegram | Messenger usernames or phone numbers from verification emails
instagram, linkedin, twitter, github, etc. | Username or profile URL from email subject/snippet
work_calendar, meetings | Regular meeting patterns, preferred video platform
slack | Workspace names they belong to
amazon, subscriptions, food_delivery, retail | Preferred services and stores
homeowner/renter | Property type, approximate area
vehicle | Make, model, year
banking, credit, investments | Institution names only (never account numbers)
moving | New location, move date
job_change | New company, role, start date
wedding | Event date, partner name if mentioned
travel | Frequent destinations, airlines, loyalty programs
health | Healthcare providers (never conditions)
education | Schools, degrees, graduation years
spotify, netflix, apple, discord | Platform usage (indicates ecosystem preferences)
gaming | Platforms, games they play
online_courses, substack | Learning interests, newsletters they read
fitness | Gym, apps, fitness activities
venmo | Frequent contacts (who they pay/receive from)
work_tools | Professional tools they use (Figma, Notion, etc.)
donations | Causes and charities they support (reveals values)
books | Reading habits, genres, recent reads
side_business | Whether they sell/create something, what platform
kids_activities | Sports, activities, involvement level
professional_orgs | Industry associations, memberships, conferences

Signal Confidence

Not all discovery results are equal. Weight by source:

Source | Confidence | Why
in:sent queries (children, partner, pets, phone, job_change, moving) | High | User wrote this about themselves
Verification emails (from:whatsapp, from:instagram with "code") | High | Confirms account ownership
Shipping/delivery notifications | Medium | Address could be gift recipient
Generic service emails | Low | Confirms usage, not details
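
The weighting in the table above can be sketched as a simple lookup. This is a minimal illustration only: the source labels (`sent_query`, `verification`, `shipping`, `generic_service`) and the function `confidence_for` are hypothetical names, and defaulting unknown sources to low confidence is an assumption. Only the three levels and their reasons come from the table.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = 3
    MEDIUM = 2
    LOW = 1


# Hypothetical source labels; the levels mirror the table above.
CONFIDENCE_BY_SOURCE = {
    "sent_query": Confidence.HIGH,       # user wrote this about themselves
    "verification": Confidence.HIGH,     # confirms account ownership
    "shipping": Confidence.MEDIUM,       # address could be a gift recipient
    "generic_service": Confidence.LOW,   # confirms usage, not details
}


def confidence_for(source_kind: str) -> Confidence:
    """Return the confidence for a source kind.

    Assumption: unknown source kinds are treated as LOW.
    """
    return CONFIDENCE_BY_SOURCE.get(source_kind, Confidence.LOW)
```

Keeping the mapping in one table makes it easy to review against the guidance and to adjust a single level without touching extraction logic.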

Recency Matters

Use timeAgo to weight information:

  • Location: Prefer recent shipping addresses; one from 2mo ago outweighs one from 3y ago
  • Job: Recent announcements override old ones
  • Partner/children: Older mentions are fine (stable facts)
  • Phone numbers: Prefer recent (numbers change)
  • Platform accounts: Any age confirms existence
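
One way to apply these recency rules is sketched below. It assumes `timeAgo` is a string such as "2mo ago" or "3y ago" (the only shapes shown in this document). The unit table, the set `RECENCY_SENSITIVE`, and the helpers `age_in_days` and `pick_value` are hypothetical names introduced for illustration.

```python
import re

# Assumed timeAgo format: "<number><unit> ago", e.g. "2mo ago" or "3y ago".
_UNIT_DAYS = {"d": 1, "w": 7, "mo": 30, "y": 365}
_TIME_AGO = re.compile(r"(\d+)\s*(mo|d|w|y)\b")

# Categories where newer evidence should win, per the list above.
RECENCY_SENSITIVE = {"location", "job_change", "phone_numbers"}


def age_in_days(time_ago: str) -> int:
    """Convert a timeAgo string to an approximate age in days."""
    match = _TIME_AGO.search(time_ago)
    if match is None:
        raise ValueError(f"unrecognized timeAgo value: {time_ago!r}")
    count, unit = match.groups()
    return int(count) * _UNIT_DAYS[unit]


def pick_value(category: str, candidates: list[tuple[str, str]]) -> str:
    """Pick one value from (value, timeAgo) pairs for a category."""
    if category in RECENCY_SENSITIVE:
        # Newest evidence wins for facts that change over time.
        return min(candidates, key=lambda c: age_in_days(c[1]))[0]
    # Stable facts: age does not matter, so keep the first mention found.
    return candidates[0][0]
```

Month and year lengths are approximated (30 and 365 days), which is enough for ranking candidates against each other.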

Ignore These Patterns

Even with targeted queries, some noise gets through. Skip:

Marketing/Promotional:

  • Emails with "unsubscribe" in footer but no personal info
  • "Your husband will love this!" — not about THEIR husband
  • Generic "dear customer" or "dear member" emails

False Positives by Category:

Category Ignore
children "Kids sale!" "For your kids" (marketing)
partner "Gift for your wife" (retail marketing)
location Gift shipping addresses (check if name matches user)
job_change Recruiter outreach, job alerts
donations "Please donate" solicitations (not actual donations)
professional_orgs Spam conference invites

Signal vs Noise Indicators:

  • Signal: Specific names, dates, confirmation language, user in To: field
  • Noise: Generic language, bulk sender patterns, promotional tone
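
The noise indicators above could be approximated with a coarse filter like the following sketch. The function name `looks_like_noise` and its parameters are hypothetical, and treating "user not in the To: field" as a stand-in for "no personal info" is an assumption rather than a rule from this document. A real filter would need more signals.

```python
GENERIC_GREETINGS = ("dear customer", "dear member")


def looks_like_noise(body: str, user_in_to: bool) -> bool:
    """Rough noise check based on the indicators listed above."""
    text = body.lower()
    if any(greeting in text for greeting in GENERIC_GREETINGS):
        return True  # generic language
    if "unsubscribe" in text and not user_in_to:
        # Assumption: bulk mail not addressed to the user carries no personal info.
        return True
    return False
```

A check like this is best used to discard obvious marketing before extraction, not to confirm that the remaining emails are signal.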

Source 2: Writing Samples

The writing samples contain sent email content. Extract persistent patterns, not transient activity.

Extract:

  • Work domain: Infer field/industry from recurring themes, technical vocabulary, domain concepts
  • Interests: Topics appearing meaningfully across multiple emails (2+ mentions with depth)

Do NOT extract:

  • Specific project names (transient)
  • Current tasks or deadlines (change constantly)
  • Topics from single emails (could be one-off)
  • Collaborator names (privacy concern)

Persistence signals: Topic appears across emails with different dates, relates to work domain, shows depth not just mentions.
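
The "different dates" part of the persistence check can be sketched as follows. It assumes topics have already been tagged per email by some earlier step. The function `persistent_topics`, its input shape, and the `min_dates` parameter are hypothetical, and depth of discussion is not modeled here.

```python
from collections import defaultdict


def persistent_topics(
    samples: list[tuple[str, set[str]]], min_dates: int = 2
) -> set[str]:
    """Return topics seen in emails on at least `min_dates` distinct dates.

    `samples` holds (sent_date, topics) pairs, one per email.
    """
    dates_by_topic: dict[str, set[str]] = defaultdict(set)
    for sent_date, topics in samples:
        for topic in topics:
            dates_by_topic[topic].add(sent_date)
    return {t for t, dates in dates_by_topic.items() if len(dates) >= min_dates}
```

Counting distinct dates rather than raw mentions keeps a single long thread from making a one-off topic look persistent.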

Output

Write to User Profile:

  • personal — location, family, pets, birthday
  • work — work domain, industry (only if clear pattern emerges)
  • interests — social platforms with usernames, hobbies, topics of genuine interest

Rules: Only write facts with clear evidence. Skip weak signals. Never write sensitive financial details. When uncertain, write nothing—false positives are worse than missing data.
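
A minimal sketch of how these rules might gate what gets written. The `Fact` and `Profile` types, the category-to-section routing, and the choice to require "high" confidence are all assumptions made for illustration. The document only specifies the three sections and the rule to write nothing when uncertain.

```python
from dataclasses import dataclass, field

# Assumed routing of categories to the three output sections.
# Financial categories are deliberately absent, so they are never written.
SECTION_BY_CATEGORY = {
    "location": "personal", "children": "personal", "partner": "personal",
    "pets": "personal", "birthday": "personal",
    "work_domain": "work",
    "instagram": "interests", "github": "interests", "interests": "interests",
}


@dataclass
class Fact:
    category: str
    value: str
    confidence: str  # "high", "medium", or "low"


@dataclass
class Profile:
    personal: list[str] = field(default_factory=list)
    work: list[str] = field(default_factory=list)
    interests: list[str] = field(default_factory=list)


def build_profile(facts: list[Fact]) -> Profile:
    """Keep only high-confidence facts in known categories; drop the rest."""
    profile = Profile()
    for fact in facts:
        section = SECTION_BY_CATEGORY.get(fact.category)
        if section is None or fact.confidence != "high":
            continue  # when uncertain, write nothing
        getattr(profile, section).append(f"{fact.category}: {fact.value}")
    return profile
```

Using an allowlist of categories means anything unexpected is skipped by default, which matches the preference for missing data over false positives.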