Hi, I'm Vytautas.
Data engineer with hands-on experience building ELT pipelines and lakehouse-style analytics stacks in regulated environments. I like owning pipelines end-to-end, improving reliability, and delivering measurable outcomes.
Highlights
- S3-based lakehouse (Apache Iceberg + Trino) with a dbt layer for analytics and regulatory reporting.
- Airflow DAGs for multi-source ingestion (retries/backoff, on-failure alerts), plus coding and security standards.
- Data quality checks and documentation aligned with EIOPA/ESMA/DORA requirements.
Experience
Lakehouse architectures, Airflow orchestration and practical data quality.
- Automated an incremental pipeline from MySQL to Apache Iceberg via Trino, orchestrated with Airflow (a minimal DAG sketch follows this list).
- Reduced query latency and streamlined analytics workflows.
- Led projects with strict compliance requirements; mentored peers and managed timelines and documentation.
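A minimal sketch of what such an incremental DAG can look like, assuming Airflow 2.x; the table names, task bodies, and watermark handling here are placeholders, not the production code:

```python
# Hypothetical sketch of an incremental MySQL -> Iceberg DAG (illustrative only).
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@dag(
    schedule="@hourly",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={
        "retries": 3,                         # retry transient source/engine failures
        "retry_delay": timedelta(minutes=5),  # simple backoff between attempts
    },
)
def mysql_to_iceberg():
    @task
    def extract_increment() -> list:
        # Placeholder: select rows changed since the last high-water mark in MySQL.
        # A real implementation would persist the watermark in a state table.
        return [{"id": 1, "updated_at": "2024-01-01T00:00:00"}]

    @task
    def load_to_iceberg(rows: list) -> None:
        # Placeholder: MERGE the increment into an Iceberg table through Trino,
        # e.g. with the trino Python client executing a MERGE statement.
        print(f"would merge {len(rows)} rows into iceberg.analytics.events")

    load_to_iceberg(extract_increment())


mysql_to_iceberg()
```

The retries/backoff settings in `default_args` are what drive the "retries/backoff, on-failure alerts" behavior mentioned in the highlights.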
Skills
- Languages
- Data Engineering
- Infrastructure / Cloud
- Data Quality / Governance
- Nice to have
Projects
ELT architecture
Focus: Airflow, dbt, S3 data lake
This ELT project loads source files from object storage, stages them in an analytical database, and transforms them into standardized, history-aware dimensional tables. Raw data is first copied into a dated backup area, then ingested into staging schemas with minimal structural changes. Next, transformation models build SCD2-style dimensions, applying data quality checks to ensure consistency and traceability. The entire pipeline (backup → load → transform → test → document) runs on a Kubernetes cluster and is orchestrated by Airflow.
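As a rough illustration of the stage ordering, here is a hypothetical Airflow DAG wiring backup → load → transform → test → document; the operator choices, module paths, and dbt commands are assumptions, not the deployed configuration:

```python
# Hypothetical wiring of the five pipeline stages (commands are placeholders).
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="elt_pipeline",
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    backup = BashOperator(task_id="backup", bash_command="python -m pipeline.backup")
    load = BashOperator(task_id="load", bash_command="python -m pipeline.load")
    transform = BashOperator(task_id="transform", bash_command="dbt run")
    test = BashOperator(task_id="test", bash_command="dbt test")
    document = BashOperator(task_id="document", bash_command="dbt docs generate")

    backup >> load >> transform >> test >> document
```

Keeping each stage as its own task makes failures granular: a failed `dbt test` can be retried or alerted on without re-running the backup and load steps.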
Data product
Focus: maintenance
Refactored a data product that consolidates raw regulatory data into a single governed fact layer, enriched with metadata, entity attributes, and submission status. Improved SQL readability and maintainability with well-structured CTEs.
Backup/load CLI
Focus: code logic walkthrough
Python CLI that orchestrates three data maintenance modes over S3 and Trino/Iceberg: backup (copy CSVs between S3 accounts into date-based folders), load (normalize columns and write into Trino *_staging tables), and cleanup (delete old backups by retention period). All behavior is configured via environment variables and CLI arguments.
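A skeleton of how such a CLI can be structured with argparse subcommands and environment variables as defaults; the env var names and the actual S3/Trino calls below are placeholders, not the real implementation:

```python
# Hypothetical skeleton of the three-mode maintenance CLI (names are illustrative).
import argparse
import os


def backup(args: argparse.Namespace) -> None:
    # Placeholder: copy CSVs between S3 accounts into a date-based folder via boto3.
    print(f"backing up {args.prefix} to {os.environ.get('BACKUP_BUCKET')}")


def load(args: argparse.Namespace) -> None:
    # Placeholder: normalize column names and INSERT into Trino *_staging tables.
    print(f"loading {args.prefix} into staging")


def cleanup(args: argparse.Namespace) -> None:
    # Placeholder: delete backup folders older than the retention period.
    print(f"deleting backups older than {args.retention_days} days")


def main() -> None:
    parser = argparse.ArgumentParser(description="S3/Trino data maintenance CLI")
    sub = parser.add_subparsers(dest="mode", required=True)

    for name, fn in (("backup", backup), ("load", load), ("cleanup", cleanup)):
        p = sub.add_parser(name)
        p.set_defaults(func=fn)
        p.add_argument("--prefix", default=os.environ.get("S3_PREFIX", ""))

    sub.choices["cleanup"].add_argument(
        "--retention-days", type=int,
        default=int(os.environ.get("RETENTION_DAYS", "30")),
    )

    args = parser.parse_args()
    args.func(args)  # dispatch to the selected mode


if __name__ == "__main__":
    main()
```

Environment variables provide the defaults, and CLI flags override them, which matches the "configured via environment variables and CLI arguments" behavior described above.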
Contact form: email delivery
Focus: serverless
A fully serverless contact form for my portfolio website. The frontend sends a POST request with user input (email and message). API Gateway acts as a secure HTTP interface, routing requests to the backend. AWS Lambda processes incoming requests, validates the payload, and triggers email delivery. Messages are sent via AWS SES: no servers, no long-running services.
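A minimal sketch of the Lambda side, assuming an API Gateway proxy integration; the field names and email addresses are illustrative, not the deployed values:

```python
# Hypothetical contact-form handler (addresses and field names are placeholders).
import json

import boto3

ses = boto3.client("ses")


def handler(event, context):
    # API Gateway proxy integration delivers the form payload as a JSON string.
    body = json.loads(event.get("body") or "{}")
    email, message = body.get("email"), body.get("message")

    # Reject incomplete submissions before touching SES.
    if not email or not message:
        return {"statusCode": 400,
                "body": json.dumps({"error": "email and message are required"})}

    ses.send_email(
        Source="noreply@example.com",  # must be an SES-verified identity
        Destination={"ToAddresses": ["me@example.com"]},
        Message={
            "Subject": {"Data": "Portfolio contact form"},
            "Body": {"Text": {"Data": f"From: {email}\n\n{message}"}},
        },
    )
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```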
Static portfolio website: S3 hosting
Focus: HTTPS, cost-efficiency
Simple website deployment automation with GitHub Actions and Terraform.
Weather Data System: Serverless ingestion
Focus: serverless simplicity and cost control (2024)
Automated weather data ingestion into Postgres with subsequent analysis.
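A hedged sketch of what a minimal ingestion job of this shape can look like; the endpoint, table schema, and connection string below are placeholders, not the actual system:

```python
# Hypothetical weather ingestion sketch (endpoint and schema are assumptions).
import os

import psycopg2
import requests


def ingest() -> None:
    # Fetch current conditions from a weather API (URL is a placeholder).
    resp = requests.get("https://api.example.com/weather?city=Vilnius", timeout=10)
    resp.raise_for_status()
    obs = resp.json()

    # Append the observation to Postgres for later analysis.
    with psycopg2.connect(os.environ["DATABASE_URL"]) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO weather_observations (city, temperature_c, observed_at) "
                "VALUES (%s, %s, %s)",
                (obs["city"], obs["temperature_c"], obs["observed_at"]),
            )


if __name__ == "__main__":
    ingest()
```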
Machine Learning Models: ingestion + price prediction
Focus: experimentation pipeline (2024)
Pipelines for ML experiments and price prediction.
Recommendation Engine: front-end & back-end
Focus: end-to-end demo (2024)
Prototype with data preparation and a web UI.
Data Processing Pipeline: Docker + Airflow
Focus: reproducibility (2024)
Containerized example with Airflow orchestration.
Certifications
Status: achieved
Status: in progress
Contact
Location: Vilnius, Lithuania
LinkedIn: linkedin.com/in/pliadis