Airbyte vs custom ETL: when to use each
Off-the-shelf connectors vs custom Python: how we pick between Airbyte, Fivetran, and a hand-rolled ETL for SMB clients.
The honest framing
If a hosted connector exists for your source, use it. The time you save is almost always worth more than the SaaS bill. We reach for a custom pipeline only when the off-the-shelf options have a real gap.
When Airbyte / Fivetran wins
- Source is one of the top 100 SaaS apps (Stripe, HubSpot, Shopify, Salesforce, ad platforms)
- You need it loaded into a standard warehouse (Postgres, BigQuery, Snowflake, Redshift)
- The default schema is "good enough" and you will do final transformation in SQL
Airbyte is open source and can be self-hosted (a managed cloud version also exists). Fivetran is fully hosted and usually the more expensive of the two. Both are dramatically faster to set up than custom code.
When custom wins
- Source is a partner-specific API with auth nobody else uses
- You need real-time or near-real-time updates (most off-the-shelf connectors run as batch syncs, typically every 1–24 hours)
- The data needs significant transformation before it lands (PII redaction, tenant separation)
- You need to control the cost ceiling — Fivetran in particular can get expensive at volume
- The off-the-shelf connector is "supported" but buggy and the vendor is slow to fix it
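To make the "transformation before it lands" case concrete, here is a minimal sketch of redacting PII and separating tenants in Python before anything is written to the warehouse. The field names (`email`, `phone`, `tenant_id`) and the salted-hash approach are assumptions for illustration, not a description of any particular client pipeline. Note that a salted hash is pseudonymization rather than full anonymization.

```python
import hashlib
from collections import defaultdict

# Hypothetical field names; a real source defines its own.
PII_FIELDS = ("email", "phone")


def redact(record: dict, salt: str) -> dict:
    """Return a copy of the record with PII fields replaced by salted SHA-256 digests."""
    clean = dict(record)  # copy so the caller's record is not mutated
    for field in PII_FIELDS:
        if field in clean:
            # Normalize first so "X@Example.com" and "x@example.com" hash identically.
            value = str(clean[field]).strip().lower()
            clean[field] = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return clean


def split_by_tenant(records, salt: str) -> dict:
    """Group redacted records by tenant_id so each tenant can load into its own schema."""
    by_tenant = defaultdict(list)
    for record in records:
        by_tenant[record["tenant_id"]].append(redact(record, salt))
    return dict(by_tenant)
```

Because the raw values never reach the warehouse, this kind of step has to run inside the pipeline itself, which is why it tends to push a source toward custom code.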
We have built custom pipelines for clients whose Fivetran bill was approaching four figures a month even though the actual data volume was small. In those cases the custom build paid for itself within two months.
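The payback claim is simple arithmetic: build cost divided by the monthly saving. A rough sketch follows; the function name and all the figures used with it are hypothetical and are not client numbers.

```python
import math


def payback_months(build_cost: float, saas_monthly: float, custom_monthly: float) -> float:
    """Months until a one-off build cost is recovered from the monthly saving.

    custom_monthly should include hosting and an allowance for maintenance.
    Returns math.inf when the custom pipeline does not actually save money.
    """
    saving = saas_monthly - custom_monthly
    if saving <= 0:
        return math.inf
    return build_cost / saving
```

With illustrative inputs, `payback_months(1600, 900, 100)` evaluates to `2.0`. If the result is more than a year or so, the hosted connector is probably still the better deal.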
The hybrid approach
Often the right answer:
- Airbyte for the 8 common SaaS sources
- Custom Python for the 2 weird ones
- dbt in the warehouse for transformation
This gives the team one tool to learn for transformation (SQL + dbt), one tool to operate for common sources (Airbyte), and a small, well-tested custom codebase for the rest.
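One way to keep the custom part small and well tested is to separate the paging loop from the HTTP call. The sketch below assumes a cursor-paginated API. `fetch_page` is a hypothetical callable you would write per source, and injecting it means the loop can be tested without touching the network.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical contract: given a cursor (None for the first page), return
# (records, next_cursor), where next_cursor is None on the last page.
FetchPage = Callable[[Optional[str]], Tuple[List[dict], Optional[str]]]


def extract_all(fetch_page: FetchPage, max_pages: int = 1000) -> Iterator[dict]:
    """Yield every record from a cursor-paginated source, one page at a time."""
    cursor = None
    for _ in range(max_pages):
        records, cursor = fetch_page(cursor)
        yield from records
        if cursor is None:
            return  # last page reached
    # Guard against an API that keeps returning a cursor forever.
    raise RuntimeError(f"gave up after {max_pages} pages; possible cursor loop")
```

In a test, `fetch_page` can be a lambda over a dict of canned pages. In production it wraps the real API client, and the loading step consumes the iterator.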
What we will not do
We will not build a "framework" for pipelines. Every team that does this ends up with a worse Airflow. Use the tools that exist. If you need orchestration beyond cron, use Prefect or Dagster. Do not build it yourself.
The five-minute decision
Ask: "Is the source on Airbyte's supported list, and does the default schema work for me?" If yes, use Airbyte. If no, build it custom.
Do not spend a week evaluating. The answer is usually obvious after 10 minutes of looking at the connector docs.
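Written as code, the question is a two-condition check applied to each source. The sketch below is only an illustration: the `Source` class is hypothetical, and the flags are whatever you conclude from reading the connector docs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    on_airbyte_list: bool    # is there a supported Airbyte connector?
    default_schema_ok: bool  # does the connector's default schema work for you?


def choose(source: Source) -> str:
    """Both answers must be yes to pick Airbyte; anything else means custom."""
    if source.on_airbyte_list and source.default_schema_ok:
        return "airbyte"
    return "custom"


def plan(sources) -> dict:
    """Split a list of sources into the hybrid plan: Airbyte vs custom."""
    result = {"airbyte": [], "custom": []}
    for source in sources:
        result[choose(source)].append(source.name)
    return result
```

Running `plan` over your full source list gives the hybrid split directly: the sources that go through Airbyte and the few that need custom Python.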