Airbyte vs custom ETL: when to use each

Off-the-shelf connectors vs custom Python: how we pick between Airbyte, Fivetran, and a hand-rolled ETL for SMB clients.

Airbyte, ETL, data

The honest framing

If a hosted connector exists for your source, use it. The time you save is worth more than the SaaS bill, almost always. We reach for a custom pipeline only when the off-the-shelf options have a real gap.

When Airbyte / Fivetran wins

  • Source is one of the top 100 SaaS apps (Stripe, HubSpot, Shopify, Salesforce, ad platforms)
  • You need it loaded into a standard warehouse (Postgres, BigQuery, Snowflake, Redshift)
  • The default schema is "good enough" and you will do final transformation in SQL

Airbyte is open-source and self-hostable. Fivetran is a hosted service and typically costs more. Both are dramatically faster to set up than custom code.

When custom wins

  • Source is a partner-specific API with auth nobody else uses
  • You need real-time or near-real-time updates (most off-the-shelf connectors sync in batches, every 1–24 hours)
  • The data needs significant transformation before it lands (PII redaction, tenant separation)
  • You need to control the cost ceiling; Fivetran in particular can get expensive at volume
  • The off-the-shelf connector is "supported" but buggy and the vendor is slow to fix it
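
To make the "transformation before it lands" case concrete, here is a minimal sketch of redacting PII and separating tenants before anything is written to the warehouse. The field names (`email`, `phone`, `tenant_id`) and the keyed-hash approach are assumptions for illustration, not a description of any particular client pipeline.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical PII field names; adjust to the real source schema.
PII_FIELDS = {"email", "phone"}


def redact(record: dict, secret: bytes) -> dict:
    """Replace PII values with a keyed SHA-256 hash so joins still work downstream."""
    out = {}
    for key, value in record.items():
        if key in PII_FIELDS and value is not None:
            digest = hmac.new(secret, str(value).encode("utf-8"), hashlib.sha256)
            out[key] = digest.hexdigest()
        else:
            out[key] = value
    return out


def split_by_tenant(records, secret: bytes) -> dict:
    """Group redacted records by tenant_id so each tenant can land in its own table or schema."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record["tenant_id"]].append(redact(record, secret))
    return dict(buckets)


rows = [
    {"tenant_id": "acme", "email": "a@example.com", "phone": None, "plan": "pro"},
    {"tenant_id": "globex", "email": "b@example.com", "phone": "555-0100", "plan": "free"},
]
# The secret is a placeholder; in practice it would come from a secrets manager.
batches = split_by_tenant(rows, secret=b"rotate-me")
```

The point is that raw PII never reaches the warehouse, which is hard to guarantee when a hosted connector loads the default schema first and you clean up afterwards.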

We have built custom pipelines for clients where the Fivetran bill was approaching four figures per month even though the actual data volume was small. The custom build paid for itself in two months.
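
The payback claim is simple arithmetic: one-off build cost divided by the monthly saving. A small sketch, with made-up numbers chosen only to illustrate a two-month payback (they are not client figures):

```python
import math


def payback_months(build_cost: float, saas_monthly: float, custom_monthly: float) -> float:
    """Months until the one-off build cost is recovered by the monthly saving."""
    saving = saas_monthly - custom_monthly
    if saving <= 0:
        return math.inf  # custom never pays for itself
    return build_cost / saving


# Illustrative inputs only: a 900/month SaaS bill, 100/month to run the custom job,
# and a 1600 one-off build cost.
months = payback_months(build_cost=1600.0, saas_monthly=900.0, custom_monthly=100.0)
```

Remember to include hosting and maintenance in `custom_monthly`; leaving them out makes custom look better than it is.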

The hybrid approach

Often the right answer:

  • Airbyte for the 8 common SaaS sources
  • Custom Python for the 2 weird ones
  • dbt in the warehouse for transformation

This gives the team one tool to learn for transformation (SQL + dbt), one tool to operate for common sources (Airbyte), and a small, well-tested custom codebase for the rest.
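
For the "2 weird ones", the custom code can stay very small if it only does extract and load and leaves modelling to dbt. The sketch below is one way to do that, under assumptions: `sqlite3` stands in for the warehouse, `fetch_since` is a hypothetical function wrapping the partner API, and records carry `id` and ISO 8601 `updated_at` fields.

```python
import json
import sqlite3
from typing import Callable, Iterable


def sync(conn: sqlite3.Connection, fetch_since: Callable[[str], Iterable[dict]], source: str) -> int:
    """Pull records newer than the stored high-water mark and upsert them as raw JSON."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        "source TEXT, id TEXT, updated_at TEXT, payload TEXT, PRIMARY KEY (source, id))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sync_state (source TEXT PRIMARY KEY, last_seen TEXT)"
    )
    row = conn.execute(
        "SELECT last_seen FROM sync_state WHERE source = ?", (source,)
    ).fetchone()
    last_seen = row[0] if row else ""

    count = 0
    for record in fetch_since(last_seen):
        # INSERT OR REPLACE keeps reruns idempotent: same id, same row.
        conn.execute(
            "INSERT OR REPLACE INTO raw_records (source, id, updated_at, payload) "
            "VALUES (?, ?, ?, ?)",
            (source, record["id"], record["updated_at"], json.dumps(record)),
        )
        # ISO 8601 UTC timestamps sort correctly as plain strings.
        last_seen = max(last_seen, record["updated_at"])
        count += 1

    conn.execute(
        "INSERT OR REPLACE INTO sync_state (source, last_seen) VALUES (?, ?)",
        (source, last_seen),
    )
    conn.commit()
    return count


# Stand-in for the partner API (hypothetical data).
FAKE_API = [
    {"id": "1", "updated_at": "2024-01-01T00:00:00Z", "status": "new"},
    {"id": "2", "updated_at": "2024-01-02T00:00:00Z", "status": "new"},
]


def fake_fetch(since: str):
    return [r for r in FAKE_API if r["updated_at"] > since]


conn = sqlite3.connect(":memory:")
first = sync(conn, fake_fetch, "partner_api")   # loads both records
second = sync(conn, fake_fetch, "partner_api")  # nothing new, loads none
```

Landing the payload as raw JSON mirrors what the off-the-shelf connectors do, so the dbt models downstream look the same whether a source came from Airbyte or from custom code.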

What we will not do

We will not build a "framework" for pipelines. Every team that does this ends up with a worse Airflow. Use the tools that exist. If you need orchestration beyond cron, use Prefect or Dagster. Do not build it yourself.

The five-minute decision

Ask: "Is the source on Airbyte's supported list, and does the default schema work for me?" If yes, use Airbyte. If no, build it custom.
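
Written as code, the rule is two checks. The `SUPPORTED` set below is a hypothetical stand-in built from the sources named earlier in this post, not Airbyte's actual connector catalog; check the real list in their docs.

```python
# Hypothetical stand-in for the connector catalog.
SUPPORTED = {"stripe", "hubspot", "shopify", "salesforce"}


def choose_pipeline(source: str, supported_sources: set, default_schema_ok: bool) -> str:
    """Return 'airbyte' only when a connector exists AND its default schema is usable."""
    if source.lower() in supported_sources and default_schema_ok:
        return "airbyte"
    return "custom"


choice_a = choose_pipeline("Stripe", SUPPORTED, default_schema_ok=True)
choice_b = choose_pipeline("Stripe", SUPPORTED, default_schema_ok=False)
choice_c = choose_pipeline("partner-x", SUPPORTED, default_schema_ok=True)
```

Both conditions have to hold. A supported connector with an unusable schema still ends up in the custom bucket.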

Do not spend a week evaluating. The answer is usually obvious after 10 minutes of looking at the connector docs.
