Hi, I'm Vytautas.

Data engineer with hands-on experience building ELT pipelines and lakehouse-style analytics stacks in regulated environments. I like owning pipelines end-to-end, improving reliability, and delivering measurable outcomes.

Data Engineer · dbt · Airflow · AWS

Experience

Lakehouse architectures, Airflow orchestration and practical data quality.

Data Engineer · Bank of Lithuania
2025 — present
  • Built an S3-based lakehouse (Apache Iceberg + Trino) with a dbt layer for analytics and regulatory reporting.
  • Developed Airflow DAGs for multi-source ingestion (retries/backoff, on-failure alerts) and contributed to coding/security standards.
  • Implemented data quality checks and documentation (EIOPA/ESMA/DORA).
Data Engineer Intern · Wix.com
2024 — 2025
  • Automated incremental pipeline from MySQL to Apache Iceberg via Trino, orchestrated with Airflow.
  • Improved query latency and analytics workflows.
Architect · Architecture & Project Delivery
2008 — 2024
  • Led projects with strict compliance requirements; mentored peers and managed timelines and documentation.

Skills

Languages

Python · SQL

Data Engineering

Airflow · dbt · Apache Iceberg · Trino / Starburst · PostgreSQL · Spark (basics)

Infrastructure / Cloud

Terraform (basics) · Docker · Kubernetes / OpenShift (basics) · AWS · GitLab CI/CD · GitHub Actions · Linux

Data Quality / Governance

Great Expectations · dbt tests · DataHub

Nice to have

Streamlit · FastAPI · Postman · n8n

Projects

“Chat with Your Building” RAG Assistant

Focus: n8n orchestration, OpenAI embeddings/chat, ChromaDB retrieval
Illustration of the building RAG assistant concept and data sources.

Proof of concept of a RAG assistant for property administration workflows. The core idea is simple: a resident asks a question, and the assistant answers based on the building profile (a structured record) covering administration, technical maintenance, key parameters, and tariff summaries.

It ingests structured “building passport” data (Vilnius City Municipality Open Data Portal), BIM data (normalized into tabular records), and technical building documentation (PDFs), then indexes the content for grounded Q&A. At runtime, the agent identifies the building (preferably by a unique building ID, but address-based lookup is also possible), retrieves the most relevant context from the appropriate source (passport / BIM / docs), and answers only from retrieved sources.
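In the prototype this flow runs in n8n; the snippet below is a rough Python equivalent of the retrieval-and-answer step. The collection name, the building_id metadata field, and the model names are illustrative assumptions, not the actual configuration.

```python
# Rough Python equivalent of the n8n retrieval-and-answer step.
# Collection name, metadata field and model names are illustrative assumptions.
import os

import chromadb
from chromadb.utils import embedding_functions
from openai import OpenAI

openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-small",
)
chroma = chromadb.PersistentClient(path="./chroma")
collection = chroma.get_or_create_collection("building_passport", embedding_function=openai_ef)
llm = OpenAI()  # reads OPENAI_API_KEY from the environment


def answer(question: str, building_id: str) -> str:
    # Retrieve the most relevant chunks for this building only.
    hits = collection.query(
        query_texts=[question],
        n_results=5,
        where={"building_id": building_id},
    )
    context = "\n\n".join(hits["documents"][0])
    # Ask the model to answer strictly from the retrieved context.
    reply = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer only from the provided building context. "
                        "If the answer is not there, say so."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return reply.choices[0].message.content
```

Filtering retrieval by the building identifier is what keeps answers grounded in a single building's passport, BIM records, and documents.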

Example use cases

  • “What are the specified wall materials?”
  • “List all fire safety elements (fire doors, dampers, extinguishers) and their locations.”
  • “Who maintains the engineering systems?”

The prototype could also be connected to an incident/fault register and scheduled maintenance data, so the assistant could answer questions such as:

  • “What is planned for this month?”
  • “When was the last heat substation inspection?”

RAG Runtime Architecture

RAG runtime architecture diagram for the building assistant.
RAG · AI · BIM · n8n

ELT architecture

Focus: Airflow, dbt, S3 data lake

This ELT project loads source files from object storage, stages them in an analytical database, and then transforms them into standardized, history-aware dimensional tables. Raw data is first copied into a dated backup area, then ingested into staging schemas with minimal structural changes. Next, transformation models build SCD2-style dimensions, applying data quality checks to ensure consistency and traceability. The entire pipeline (backup → load → transform → test → document) runs on a Kubernetes cluster and is orchestrated by Airflow.
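A minimal Airflow sketch of that chain is shown below; task ids, bash commands, and the alert callback are illustrative placeholders, not the production code.

```python
# Minimal sketch of the backup → load → transform → test → document chain.
# Task ids, bash commands and the alert callback are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator


def notify_failure(context):
    # Placeholder for the on-failure alert (email/Slack/etc. in a real setup).
    print(f"Task {context['task_instance'].task_id} failed")


default_args = {
    "retries": 3,                           # retry transient failures
    "retry_delay": timedelta(minutes=5),
    "retry_exponential_backoff": True,      # back off between attempts
    "on_failure_callback": notify_failure,
}

with DAG(
    dag_id="elt_scd2_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    backup = BashOperator(task_id="backup_raw", bash_command="python cli.py --mode backup")
    load = BashOperator(task_id="load_staging", bash_command="python cli.py --mode load")
    transform = BashOperator(task_id="dbt_run", bash_command="dbt run")
    test = BashOperator(task_id="dbt_test", bash_command="dbt test")
    docs = BashOperator(task_id="dbt_docs", bash_command="dbt docs generate")

    backup >> load >> transform >> test >> docs
```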

ELT architecture with Apache Airflow, dbt and an S3 data lake.
ELT · dbt · Airflow

Data product

Focus: maintenance

Refactored a data product that consolidates raw regulatory data into a single governed fact layer, enriched with metadata, entity attributes, and submission status. Improved SQL readability and maintainability using well-structured CTEs.

data product · dbt · SQL

Backup/load CLI

Focus: explaining the code logic

Python CLI that orchestrates three data maintenance modes over S3 and Trino/Iceberg: backup (copy CSVs between S3 accounts into date-based folders), load (normalize columns and write into Trino *_staging tables), and cleanup (delete old backups by retention period). All behavior is configured via environment variables and CLI arguments.
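The sketch below shows how such a CLI can dispatch the three modes; the helper functions, bucket names, and environment variables are illustrative, and only the backup path is fleshed out.

```python
# Structural sketch of the CLI; env-var and helper names are illustrative.
import argparse
import os
from datetime import date

import boto3

s3 = boto3.client("s3")


def backup(source_bucket: str, backup_bucket: str, folder: str) -> None:
    """Copy every CSV from the source bucket into a dated backup prefix."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=source_bucket):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".csv"):
                s3.copy_object(
                    Bucket=backup_bucket,
                    Key=f"{folder}/{obj['Key']}",
                    CopySource={"Bucket": source_bucket, "Key": obj["Key"]},
                )


def load(backup_bucket: str, folder: str) -> None:
    """Normalize columns and write into Trino *_staging tables (omitted in this sketch)."""


def cleanup(backup_bucket: str, retention_days: int) -> None:
    """Delete dated backup prefixes older than the retention period (omitted in this sketch)."""


def main() -> None:
    parser = argparse.ArgumentParser(description="S3 / Trino data maintenance CLI")
    parser.add_argument("--mode", choices=["backup", "load", "cleanup"], required=True)
    parser.add_argument("--date", default=date.today().isoformat(),
                        help="backup folder date, e.g. 2025-01-31")
    args = parser.parse_args()

    src = os.environ["SOURCE_BUCKET"]                    # assumed env vars
    dst = os.environ["BACKUP_BUCKET"]
    retention = int(os.environ.get("RETENTION_DAYS", "30"))

    if args.mode == "backup":
        backup(src, dst, args.date)
    elif args.mode == "load":
        load(dst, args.date)
    else:
        cleanup(dst, retention)


if __name__ == "__main__":
    main()
```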

UML-like flowchart of a Python CLI that backs up CSVs between S3 buckets, loads them into Trino staging tables, and cleans up old backup folders.
Python · S3 · flowchart

Contact form: email delivery

Focus: serverless

A fully serverless contact form for my portfolio website. The frontend sends a POST request with user input (email and message). API Gateway acts as a secure HTTP interface, routing requests to the backend. AWS Lambda processes incoming requests, validates the payload, and triggers email delivery. Messages are sent via AWS SES — no servers, no long-running services.
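A minimal handler sketch, assuming an API Gateway proxy integration and placeholder sender/recipient addresses:

```python
# Minimal Lambda handler sketch; addresses and field names are placeholders.
import json

import boto3

ses = boto3.client("ses")


def handler(event, context):
    payload = json.loads(event.get("body") or "{}")
    email = payload.get("email", "").strip()
    message = payload.get("message", "").strip()

    # Basic payload validation before triggering delivery.
    if not email or not message:
        return {"statusCode": 400,
                "body": json.dumps({"error": "email and message are required"})}

    ses.send_email(
        Source="noreply@example.com",                     # verified SES identity (placeholder)
        Destination={"ToAddresses": ["me@example.com"]},  # placeholder recipient
        Message={
            "Subject": {"Data": "Portfolio contact form"},
            "Body": {"Text": {"Data": f"From: {email}\n\n{message}"}},
        },
    )
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```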

API Gateway · Lambda · SES

Static portfolio website: S3 hosting

Focus: HTTPS, cost-efficiency

Simple website deployment automation with GitHub Actions and Terraform.

Static portfolio website S3 hosting architecture
S3 · IAM · GitHub Actions · Terraform

Weather Data System: Serverless ingestion

Focus: serverless simplicity and cost control (2024)

Automated weather data ingestion to Postgres with subsequent analysis.

Serverless ingestion
Serverless · Lambda · Postgres

Machine Learning Models: ingestion + price prediction

Focus: experimentation pipeline (2024)

Pipelines for ML experiments and price prediction.

FastAPI · ML · Docker

Recommendation Engine: front-end & back-end

Focus: end-to-end demo (2024)

Prototype with data preparation and a web UI.

ETL · FastAPI · Streamlit

Data Processing Pipeline: Docker + Airflow

Focus: reproducibility (2024)

Containerised example with Airflow orchestration.

Docker · Airflow

Certifications

AWS Certified Cloud Practitioner
Status: achieved
AWS Certified Solutions Architect – Associate
Status: in progress

Contact

Location: Vilnius, Lithuania

LinkedIn: linkedin.com/in/pliadis