Hi, I'm Vytautas.
Data engineer with hands-on experience building ELT pipelines and lakehouse-style analytics stacks in regulated environments. I like owning pipelines end to end, improving reliability, and delivering measurable outcomes.
Highlights:
- S3‑based lakehouse (Apache Iceberg + Trino) with a dbt layer for analytics and regulatory reporting.
- Airflow DAGs for multi‑source ingestion (retries/backoff, on‑failure alerts) and coding/security standards.
- Data quality checks and documentation aligned with EIOPA/ESMA/DORA requirements.
Experience
Lakehouse architectures, Airflow orchestration, and practical data quality.
- Automated an incremental pipeline from MySQL to Apache Iceberg via Trino, orchestrated with Airflow (incremental-load pattern sketched after this list).
- Reduced query latency and streamlined analytics workflows.
- Led projects with strict compliance requirements; mentored peers and managed timelines and documentation.
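A minimal sketch of that incremental-load pattern, assuming an `updated_at` watermark column and the `trino` Python client; the catalog, table, and column names here are hypothetical, not the production ones.

```python
import os
from datetime import datetime, timezone

import trino  # Trino Python client (pip install trino)

# Hypothetical names for illustration; the real pipeline's tables differ.
SOURCE_TABLE = "mysql.app.orders"          # MySQL catalog exposed through Trino
TARGET_TABLE = "iceberg.analytics.orders"  # Iceberg table in the lakehouse

def incremental_load(conn, watermark: datetime) -> None:
    """Copy rows changed since the last run from MySQL into Iceberg."""
    cur = conn.cursor()
    # Trino federates both catalogs, so the incremental copy is a single
    # INSERT ... SELECT filtered on the stored watermark.
    cur.execute(
        f"""
        INSERT INTO {TARGET_TABLE}
        SELECT * FROM {SOURCE_TABLE}
        WHERE updated_at > TIMESTAMP '{watermark:%Y-%m-%d %H:%M:%S}'
        """
    )
    cur.fetchall()  # drive the statement to completion

if __name__ == "__main__":
    conn = trino.dbapi.connect(
        host=os.environ["TRINO_HOST"],
        port=int(os.environ.get("TRINO_PORT", "8080")),
        user=os.environ.get("TRINO_USER", "etl"),
    )
    # The watermark would normally be read from state; fixed here for brevity.
    incremental_load(conn, datetime(2024, 1, 1, tzinfo=timezone.utc))
```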
Skills
Languages
Data Engineering
Infrastructure / Cloud
Data Quality / Governance
Nice to have
Projects
ELT architecture
Focus: Airflow, dbt, S3 data lake
This ELT project loads source files from object storage, stages them in an analytical database, and then transforms them into standardized, history-aware dimensional tables. Raw data is first copied into a dated backup area, then ingested into staging schemas with minimal structural changes. Next, transformation models build SCD2-style dimensions, applying data quality checks to ensure consistency and traceability. The entire pipeline (backup → load → transform → test → document) runs on a Kubernetes cluster and is orchestrated by Airflow.
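A condensed sketch of how those five stages could be wired as an Airflow DAG, assuming dbt is invoked through its CLI; the dag_id, script names, and selectors are illustrative rather than the project's actual code.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Illustrative DAG; the bash commands stand in for the real backup/load scripts.
with DAG(
    dag_id="elt_scd2_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    backup = BashOperator(
        task_id="backup_raw",
        bash_command="python backup.py --date {{ ds }}",  # copy raw files to a dated folder
    )
    load = BashOperator(
        task_id="load_staging",
        bash_command="python load.py --date {{ ds }}",  # ingest into staging schemas
    )
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --select dimensions",  # build SCD2-style dimensions
    )
    test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test",  # data quality checks
    )
    docs = BashOperator(
        task_id="dbt_docs",
        bash_command="dbt docs generate",  # refresh documentation
    )

    backup >> load >> transform >> test >> docs
```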
Backup/load CLI
Focus: explaining the code logic
Python CLI that orchestrates three data maintenance modes over S3 and Trino/Iceberg:
backup (copy CSVs between S3 accounts into date-based folders),
load (normalize columns and write into Trino *_staging tables),
and cleanup (delete old backups by retention period). All behavior is
configured via environment variables and CLI arguments.
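A minimal sketch of the CLI's shape, assuming argparse subcommands; the flag names, defaults, and env-var keys are illustrative.

```python
import argparse
import os

def backup(args: argparse.Namespace) -> None:
    """Copy CSVs from the source S3 account into a date-based folder on the target."""
    ...

def load(args: argparse.Namespace) -> None:
    """Normalize column names and write files into Trino *_staging tables."""
    ...

def cleanup(args: argparse.Namespace) -> None:
    """Delete backup folders older than the retention period."""
    ...

def main() -> None:
    parser = argparse.ArgumentParser(description="S3/Trino data maintenance")
    sub = parser.add_subparsers(dest="mode", required=True)

    p_backup = sub.add_parser("backup")
    p_backup.add_argument("--date", help="backup folder date, defaults to today")
    p_backup.set_defaults(func=backup)

    p_load = sub.add_parser("load")
    p_load.set_defaults(func=load)

    p_cleanup = sub.add_parser("cleanup")
    # Retention defaults come from the environment, overridable on the CLI.
    p_cleanup.add_argument(
        "--retention-days",
        type=int,
        default=int(os.environ.get("RETENTION_DAYS", "30")),
    )
    p_cleanup.set_defaults(func=cleanup)

    args = parser.parse_args()
    args.func(args)

if __name__ == "__main__":
    main()
```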
Static portfolio website: S3 hosting
Focus: HTTPS, cost-efficiency
Simple website deployment automation with GitHub Actions and Terraform.
Weather Data System: Serverless ingestion
Focus: serverless simplicity and cost control (2024)
Automated weather data ingestion into Postgres with subsequent analysis.
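A hedged sketch of the ingestion step, assuming the public Open-Meteo API and the psycopg2 driver; the coordinates, table name, and DSN variable are placeholders for the project's real configuration.

```python
import os

import psycopg2
import requests

def ingest(event=None, context=None):
    """Serverless-style handler: fetch one observation and append it to Postgres."""
    # Placeholder endpoint and coordinates; the real project's API may differ.
    resp = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={"latitude": 54.69, "longitude": 25.28, "current_weather": True},
        timeout=10,
    )
    resp.raise_for_status()
    current = resp.json()["current_weather"]

    conn = psycopg2.connect(os.environ["POSTGRES_DSN"])
    with conn, conn.cursor() as cur:  # context manager commits the transaction
        cur.execute(
            "INSERT INTO weather_observations (observed_at, temperature, windspeed) "
            "VALUES (%s, %s, %s)",
            (current["time"], current["temperature"], current["windspeed"]),
        )
    conn.close()
```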
Machine Learning Models: ingestion + price prediction
Focus: experimentation pipeline (2024)
Pipelines for ML experiments and price prediction.
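A toy sketch of the experiment-pipeline pattern, assuming scikit-learn; the synthetic features and target stand in for real price data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data: two numeric features and a price-like target.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2))
y = 100 + 30 * X[:, 0] - 10 * X[:, 1] + rng.normal(scale=5, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Scaling + model in one Pipeline keeps experiments reproducible and swappable.
model = Pipeline([
    ("scale", StandardScaler()),
    ("gbr", GradientBoostingRegressor(random_state=42)),
])
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```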
Recommendation Engine: front‑end & back‑end
Focus: end‑to‑end demo (2024)
Prototype with data preparation and a web UI.
Data Processing Pipeline: Docker + Airflow
Focus: reproducibility (2024)
Containerised example with Airflow orchestration.
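A short sketch of the reproducibility idea, assuming Airflow's Docker provider; the image tag and command are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator

# Illustrative: each task runs inside a pinned image, so reruns are reproducible.
with DAG(
    dag_id="containerised_processing",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    process = DockerOperator(
        task_id="process_batch",
        image="my-pipeline:1.0",                   # hypothetical pinned image tag
        command="python process.py",               # entrypoint inside the container
        docker_url="unix://var/run/docker.sock",   # local Docker daemon
    )
```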
Certifications
Status: achieved
Status: in progress