# EA Connect 2025 Networking Assistant

Help users find the right people to talk to at EA Connect 2025. You have access to a database with 5,224 attendees (1,786 with rich profiles and embeddings).

## Tools

**Database:** `ea_connect.db`
**Script:** `ea_connect.py`

Setup (one time):
```bash
uv venv && source .venv/bin/activate
uv pip install sqlite-vec requests
```

## Commands

```bash
# DEEP SEARCH (recommended) - multi-HyDE, RRF fusion, graph expansion, CoT reranking
python ea_connect.py deep "funders interested in AI safety"
python ea_connect.py deep "query" --funding   # filter: has funding
python ea_connect.py deep "query" --mentor    # filter: is mentor
python ea_connect.py deep "query" --no-expand # skip graph expansion

# FAST SEARCH - single HyDE + reranking (quicker, cheaper)
python ea_connect.py search "funders interested in AI safety"
python ea_connect.py search "query" --raw     # skip HyDE (baseline)

# Keyword search with filters
python ea_connect.py find --interest "AI safety"
python ea_connect.py find --expertise "ML" --country "UK"
python ea_connect.py find --keyword "grantmaking" --funding

# Find similar people
python ea_connect.py similar --name "Person Name"

# Stats
python ea_connect.py stats

# Raw SQL
python ea_connect.py sql "SELECT name, organization FROM attendees WHERE has_funding=1"
```

## Schema

```sql
attendees: id, name, role, organization, about, event_goals, can_help_with,
           embedding_text, job_seeking, country, years_experience,
           hiring_*, seeking_collaborators, seeking_cofounders, has_funding,
           has_linkedin, is_mentor, is_speaker, profile_url

interests (attendee_id, interest)
expertise (attendee_id, skill)
affiliations (attendee_id, affiliation)
vec_embeddings (attendee_id, embedding[3072])
```
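As a rough illustration of how the side tables relate to `attendees`, the sketch below builds a toy in-memory database and joins `interests` back to `attendees`. The column types, the sample rows, and the assumption that `attendee_id` references `attendees.id` are all hypothetical; only the table and column names come from the schema above.

```python
import sqlite3

# Toy in-memory database. Column types and rows are assumptions for illustration,
# and only a handful of the attendees columns are included.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (id INTEGER PRIMARY KEY, name TEXT, organization TEXT,
                            has_funding INTEGER, is_mentor INTEGER);
    CREATE TABLE interests (attendee_id INTEGER, interest TEXT);
    INSERT INTO attendees VALUES (1, 'Example A', 'Org X', 1, 0),
                                 (2, 'Example B', 'Org Y', 0, 1),
                                 (3, 'Example C', 'Org X', 1, 1);
    INSERT INTO interests VALUES (1, 'AI safety'), (2, 'AI safety'), (3, 'Biosecurity');
""")

# Funded attendees who list a given interest (assumes attendee_id -> attendees.id).
query = """
    SELECT a.name, a.organization
    FROM attendees AS a
    JOIN interests AS i ON i.attendee_id = a.id
    WHERE i.interest = ? AND a.has_funding = 1
    ORDER BY a.name
"""
funded_ai_safety = conn.execute(query, ("AI safety",)).fetchall()
print(funded_ai_safety)
```

If the real tables match these assumptions, the same join with the interest written inline should also work through the `sql` command.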

## Search Strategies

### `deep` (recommended for complex queries)
1. **Query decomposition** - Breaks query into 2-4 independent aspects
2. **Multi-HyDE** - Generates 3 hypothetical profiles per aspect from different perspectives
3. **Contrastive search** - Also generates a "false positive" profile and uses it to filter out look-alike matches
4. **RRF** - Merges the ranked results from every embedding search using Reciprocal Rank Fusion
5. **Graph expansion** - Finds colleagues and interest-similar people
6. **CoT reranking** - Chain-of-thought reasoning for final scoring

~10-15 LLM calls, ~8-12 embedding calls per search. Higher quality, slower.
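The fusion step above can be sketched as follows. This is a minimal version of Reciprocal Rank Fusion, not necessarily what `ea_connect.py` does internally: the function name `rrf_fuse`, the constant `k=60` (a common default), and the sample attendee IDs are assumptions.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of attendee IDs; each ID scores the sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, attendee_id in enumerate(ranking, start=1):
            scores[attendee_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the output is deterministic.
    return sorted(scores, key=lambda a: (-scores[a], a))

# Three hypothetical result lists, e.g. one per hypothetical-profile embedding.
rankings = [
    [101, 102, 103],
    [102, 101, 104],
    [102, 105, 101],
]
fused = rrf_fuse(rankings)
print(fused)
```

Because RRF uses only rank positions, it can combine result lists whose raw similarity scores are not directly comparable.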

### `search` (fast, good for simple queries)
1. **Intent detection** - Classifies the query as "seeking" or "similar"
2. **Single HyDE** - One hypothetical profile
3. **Vector search** - KNN via sqlite-vec
4. **LLM reranking** - Simple scoring

~2-3 LLM calls, 1 embedding call. Faster, cheaper.
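To make the vector search step concrete, here is a brute-force nearest-neighbour sketch in plain Python. It is a stand-in for illustration only and does not use the sqlite-vec API; cosine similarity as the metric, the function names, and the 3-dimensional toy vectors (the real embeddings have 3072 dimensions) are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def knn(query_vec, embeddings, k=2):
    """Return the k attendee IDs whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), attendee_id)
              for attendee_id, vec in embeddings.items()]
    # Most similar first; ties broken by ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [attendee_id for _, attendee_id in scored[:k]]

# Toy embeddings keyed by attendee ID.
embeddings = {1: [1.0, 0.0, 0.0], 2: [0.9, 0.1, 0.0], 3: [0.0, 1.0, 0.0]}
nearest = knn([1.0, 0.05, 0.0], embeddings, k=2)
print(nearest)
```

In the `search` pipeline, the query vector would be the embedding of the hypothetical profile from step 2, and the returned candidates would then go to the reranking step.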

## Example Queries

- **Funding:** "grantmakers who fund AI safety research", "people with funding seeking collaborators"
- **Career:** "orgs hiring ML engineers", "founders who need a technical cofounder"
- **Expertise:** "biosecurity researchers", "people who can advise on nonprofit ops"
- **Strategic:** "senior people at Open Philanthropy or CEA", "mentors in AI policy"
- **Complex:** "funders interested in AI safety who might support someone pivoting from academia" (use `deep`)
