
Data Pipeline Engineer
Alphamatician
Application ends: 2026-04-19
Quick Summary
This remote, US-based role involves taking end-to-end ownership of data collection infrastructure for an alternative data platform. You will monitor and improve production pipelines that scrape, clean, and map data from dozens of web sources using PHP, Python, and Node.js. Candidates need 3–6 years of experience in ETL or web scraping, strong MySQL optimization skills, and proficiency with AWS RDS and EC2.
Alphamatician is looking for a Data Pipeline Engineer to take ownership of key components of the data collection infrastructure that powers our alternative data platform.
This is a hands-on engineering role. You will monitor, maintain, and improve production systems that collect data daily from dozens of web sources, clean and map it to company identifiers, and deliver it to institutional clients. You will own multiple data collection processes end to end.
What You Will Do:
- Monitor the data pipeline and error logs across multiple collection processes
- Diagnose and resolve data quality issues by working across the codebase, the database, and the underlying infrastructure
- Own the full lifecycle of data collection processes, from source scraping through cleaning, mapping, and loading
- Maintain and improve scraping and parsing logic as source websites change
- Work with production MySQL databases on AWS and manage data integrity across large-scale datasets
- Contribute to platform documentation and operational processes
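The collection processes described above follow a scrape → clean → map → load cycle. As an illustrative sketch only (the production system is CodeIgniter/PHP with Python and Node.js tooling; the sample HTML, company IDs, and function names below are hypothetical), a minimal Python version of one cycle using only the standard library:

```python
from html.parser import HTMLParser

# Hypothetical sample of scraped HTML; a real process would fetch this
# from an external source whose markup can change at any time.
SAMPLE_HTML = """
<table>
  <tr><td>Acme Corp</td><td>$1,204.50</td></tr>
  <tr><td>Globex Inc.</td><td>$98.00</td></tr>
</table>
"""

# Hypothetical mapping from raw source names to internal company IDs.
COMPANY_IDS = {"acme corp": 101, "globex inc": 102}

class CellCollector(HTMLParser):
    """Collect <td> text into rows of cells."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_td = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)

    def handle_data(self, data):
        if self._in_td and data.strip():
            self._row.append(data.strip())

def clean_amount(raw: str) -> float:
    """Normalize a scraped currency string like '$1,204.50' to a float."""
    return float(raw.replace("$", "").replace(",", ""))

def map_company(raw_name: str):
    """Map a raw source name to an internal ID; None flags an unmapped entity."""
    key = raw_name.lower().rstrip(".").strip()
    return COMPANY_IDS.get(key)

def run_process(html: str):
    """One scrape -> clean -> map cycle; loading to MySQL is omitted."""
    parser = CellCollector()
    parser.feed(html)
    return [
        {"company_id": map_company(name), "amount": clean_amount(amount)}
        for name, amount in parser.rows
    ]
```

In practice the interesting work is in the failure modes this sketch glosses over: source markup drifting, amounts changing format, and names that no longer map to a known company.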
Tech Stack:
- Core application: CodeIgniter 4 (PHP)
- Sub-projects and tooling: Python and Node.js
- Database: MySQL on AWS RDS
- Infrastructure: AWS (EC2, RDS)
What We Are Looking For:
- 3–6 years of experience with data pipelines, ETL processes, or web scraping infrastructure in a production environment
- Strong working knowledge of at least two of: PHP, Python, Node.js
- Solid MySQL skills, including query optimization and troubleshooting at scale
- Experience with AWS, particularly RDS and EC2
- Comfort with web scraping and the unpredictability of external data sources
- Self-directed work style with the ability to manage priorities independently
Nice to Have:
- Experience in financial data, alternative data, or fintech
- Familiarity with CodeIgniter or other PHP frameworks
- Understanding of data mapping, entity resolution, or securities identifiers (tickers, ISINs)
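Entity resolution in this context means reconciling raw company names from a source with canonical securities identifiers such as tickers and ISINs. A toy sketch of the idea (the security-master contents and normalization rules here are illustrative, not the platform's actual logic):

```python
import re

# Hypothetical security master: normalized name -> (ticker, ISIN).
SECURITY_MASTER = {
    "apple": ("AAPL", "US0378331005"),
    "microsoft": ("MSFT", "US5949181045"),
}

# Common corporate suffixes stripped during normalization.
SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc"}

def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing corporate suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", "", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def resolve(raw_name: str):
    """Return (ticker, isin) for a raw name, or None if unresolved."""
    return SECURITY_MASTER.get(normalize(raw_name))
```

Real entity resolution also has to handle renames, fuzzy matches, and identifier changes over time, which is why it is called out separately from plain data mapping.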
Details:
Competitive compensation based on experience. Remote, US-based.
