
Web Scraping Engineer
Quick Summary
Alphamatician is hiring a Web Scraping Engineer for an individual-contributor role at the heart of our data operations. The work is technical, operational, and quietly important: keeping a mature alternative-data product running reliably for institutional investors and decision makers. The role is structured as contract-to-hire, with W-2 conversion after a successful initial engagement.
What you will do
- Morning (7am ET). Review overnight pipeline logs. Identify failures, anomalies, or coverage gaps. Triage fixes and follow-ups.
- Engineering work. Fix PHP bugs. Update scrapers as target sites change their structure or defenses. Update cron logic. Ship incremental improvements to collection coverage and quality.
- Data quality. Run checks against current and historical baselines to confirm coverage and accuracy.
- Client questions. Respond to client inquiries about coverage, methodology, or anomalies as they come in.
- Strategy. Periodically discuss collection strategies, help scope and stand up new datasets, and contribute to new products and features.
This is operational work with a steady rhythm. The reward is in keeping an important data product running well, and in mastering a specific kind of hard problem (scraping hard sites at scale) that few people do well.
Hard requirements
- Production PHP experience. CodeIgniter 4 is a strong plus. You must be able to point to public PHP work: a repo, contributions to a project, a blog post, or similar.
- Python proficiency with modern scraping libraries. Working fluency in Playwright, Scrapy, Selenium, Requests, httpx, BeautifulSoup, or comparable. Real scraping work lives in this toolkit.
- Demonstrated experience scraping hard targets at scale. Sites with active anti-bot defenses, dynamic rendering, rate-limit walls, or aggressive blocking. You must include a link to public scraping work in your application, or describe a specific scraper you built in detail (target, defenses encountered, how you solved them).
- MySQL competence. Reading and writing non-trivial queries against tables with hundreds of millions of rows.
- Schedule. 7am US Eastern start.
- US-based with verifiable employment history.
Strong pluses
- Direct experience with anti-bot evasion: residential proxies, TLS fingerprint matching, JA3, header rotation, CAPTCHA strategy.
- Comfort with mature, incrementally maintained codebases.
- Background in financial data, alternative data, or equity research.
- Node.js, Puppeteer, or additional automation tooling.
Scope
This role focuses on day-to-day data collection, quality, and client-facing operations. It is not an architecture, modernization, or infrastructure-ownership role. If you want to do excellent scraping work, solve real problems, and own the process of keeping data products reliable, please apply.
How to apply
You must apply at https://alphamatician.com/careers
Submissions must include:
- Your resume.
- A link to your LinkedIn profile.
- A link to at least one piece of public scraping work. A repo, blog post, conference talk, portfolio piece, or similar. If all your scraping work is proprietary, describe in 150 words or fewer: one specific scraper you built, the target, the anti-bot challenges you faced, and how you solved them.
- A one-line answer to: what is the hardest site you have ever scraped at scale, and what made it hard?
Verification
All application information is subject to full verification prior to employment. Misrepresentation or unverifiable history is disqualifying.
Pay: $120,000.00 - $140,000.00 per year
Benefits: Flexible schedule
Experience:
- production PHP: 3 years (Required)
- production web scraping: 2 years (Required)
Location: United States (Required)
Work Location: Remote
