open to work

Gabriel Henrique


~/.gabriel

$ cat about.txt

Data Engineer building the infrastructure

that transforms raw data into revenue.

Architect of CRM pipelines ($100M+ in monthly revenue) and

Data Owner of the Data Catalog for 3,000+ users/month.

 

$ echo $STACK

→ Python · SQL · Airflow · Databricks · Azure · Spark

 

$ echo $LOCATION

São Paulo, SP · Brazil

 

15+ Projects delivered · 500K+ Lines of code · 10TB+ Data processed · 98% Pipeline uptime · 3+ Years of experience
01

What I do

01

Ownership & Pipelines

I don't just build pipelines — I own data products end-to-end. I prioritize reliability, performance and quality to ensure information drives real impact on business decision-making.

Spark · Databricks · Azure · Airflow
02

Automation & Impact

I design and deliver automation solutions and web crawlers that unlock real business value, not just technical wins, aligning technology with commercial strategy.

Python · Selenium · Scrapy · FastAPI
03

Performance & SQL

I optimize legacy queries and ELT/ETL pipelines through hands-on performance tuning, cutting run times by more than 80% and serving 3,000+ users at scale.

SQL · Python · SAS · PostgreSQL
02

Stack

Apache Spark 88%
Azure (Cloud) 88%
Apache Airflow 87%
SAS 85%
FastAPI / Flask / Node.js 82%
Web Scraping (Selenium, Scrapy, BS4) 85%
Git & GitHub 88%
03

SQL Playground

— query my data

📂 db_gabriel

▸ gabriel

name TEXT, age INT, city TEXT, state TEXT, email TEXT, github TEXT, available BOOL

▸ family

relation TEXT, support TEXT, knows_it TEXT, description TEXT

▸ hobbies

name TEXT, category TEXT, weekly_freq INT, dedication_level TEXT, mental_note TEXT

▸ education

institution TEXT, course TEXT, type TEXT, start_year INT, end_year INT, status TEXT

▸ certifications

title TEXT, issuer TEXT, year INT, impact TEXT

▸ skills

name TEXT, level INT, category TEXT, years_exp INT

▸ soft_skills

skill TEXT, context TEXT

▸ tools

name TEXT, category TEXT, level TEXT, daily_use BOOL

▸ experience

role TEXT, company TEXT, start_year INT, end_year TEXT, description TEXT

▸ achievements

milestone TEXT, year TEXT, business_impact TEXT
gabriel_db=#

-- 👹 Welcome to my personal database!

-- Try running a query. Examples:

--

-- SELECT * FROM gabriel;

-- SELECT name, level FROM skills WHERE level > 85;

-- SELECT * FROM education ORDER BY start_year DESC;

-- SELECT role, company FROM experience;

-- SHOW TABLES;

--

-- Tip: click on the examples below 👇

gabriel_db=#
gabriel · family · hobbies · education · certifications · skills · soft_skills · tools · experience · achievements · SHOW TABLES
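The example queries can also be tried offline. What follows is a minimal sketch, assuming an in-memory SQLite database as a stand-in for the playground's engine, which the page does not name. Only the `skills` table is created. Its rows reuse the percentages from the Stack section, and `category` and `years_exp` stay NULL because the page does not list them.

```python
import sqlite3

# Assumption: SQLite stands in for the playground's engine (not named on the page).
conn = sqlite3.connect(":memory:")

# Column names and types follow the `skills` entry in the schema tree.
conn.execute(
    "CREATE TABLE skills (name TEXT, level INT, category TEXT, years_exp INT)"
)

# Levels are copied from the Stack section; category and years_exp are not
# listed on the page, so they are left NULL here.
stack_levels = [
    ("Apache Spark", 88),
    ("Azure (Cloud)", 88),
    ("Apache Airflow", 87),
    ("SAS", 85),
    ("FastAPI / Flask / Node.js", 82),
    ("Web Scraping (Selenium, Scrapy, BS4)", 85),
    ("Git & GitHub", 88),
]
conn.executemany("INSERT INTO skills (name, level) VALUES (?, ?)", stack_levels)

# One of the example queries from the welcome banner, with an ORDER BY added
# so the result order is deterministic.
rows = conn.execute(
    "SELECT name, level FROM skills WHERE level > 85 ORDER BY level DESC, name"
).fetchall()

for name, level in rows:
    print(f"{name}: {level}")
```

The real playground may hold different rows; the point is only that the listed schema and queries are plain SQL that any engine can run.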
04

Projects

01

CVM-210 Data Pipeline

End-to-end analytics pipeline for CVM 210 investment funds. Serverless ingestion via AWS Lambda, S3 Data Lake storage and distributed processing in Databricks with Medallion Architecture (Bronze→Silver→Gold).

🎯 Highlight: Automated Medallion Architecture (AWS + Databricks) processing restricted CVM 210 data with Delta Lake (ACID).

AWS · Databricks · PySpark · S3
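The Bronze→Silver→Gold flow can be illustrated in a few lines. This is a sketch only, written in plain Python to stay dependency-free, whereas the project itself runs on PySpark and Delta Lake in Databricks. The field names (`fund_id`, `date`, `net_assets`) and the sample values are hypothetical and are not taken from the CVM 210 data.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as ingested (hypothetical fields and values).
bronze = [
    {"fund_id": "A", "date": "2024-01-02", "net_assets": "100.0"},
    {"fund_id": "A", "date": "2024-01-02", "net_assets": "100.0"},  # duplicate
    {"fund_id": "B", "date": "2024-01-02", "net_assets": ""},       # missing value
    {"fund_id": "B", "date": "2024-01-03", "net_assets": "250.5"},
]


def to_silver(records):
    """Silver: drop incomplete rows, cast types, deduplicate on (fund_id, date)."""
    seen, cleaned = set(), []
    for rec in records:
        if not rec["net_assets"]:
            continue  # skip rows with a missing value
        key = (rec["fund_id"], rec["date"])
        if key in seen:
            continue  # skip repeated (fund_id, date) pairs
        seen.add(key)
        cleaned.append({**rec, "net_assets": float(rec["net_assets"])})
    return cleaned


def to_gold(records):
    """Gold: business-level aggregate, here total net assets per fund."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["fund_id"]] += rec["net_assets"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer only reads from the one before it, so a bad Silver rule can be fixed and replayed from Bronze without ingesting the source again.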
02

PNAD COVID Data Engineering

Comprehensive AWS Data Lake with Medallion Architecture. Processed 1.1M records via Glue (PySpark), with analytical queries in Athena and Power BI dashboards on the impact of COVID-19 in Brazil.

🎯 Highlight: Processing of 340+ MB of raw data (1.1M+ records) via AWS Glue (PySpark) and Athena.

AWS Glue · PySpark · Athena · Power BI
03

Obesity Prediction

Predictive ML model for preventive health: classifies obesity risks by analyzing behavioral patterns (diet, physical activity, transportation) instead of traditional anthropometric metrics.

🎯 Metric: End-to-End Pipeline with 87% accuracy (Random Forest) and interactive web interface via Streamlit.

Python · Scikit-learn · Pandas · ML
04

Ibovespa Forecasting System

Machine Learning system to predict the daily direction of the Ibovespa. Complete pipeline with advanced feature engineering, automated hyperparameter optimization and rigorous time-series holdout validation.

🎯 Metric: 75.76% accuracy and automated optimization via Optuna with temporal holdout validation.

Python · ML · Feature Eng. · Forecasting
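Temporal holdout validation means the test set is always later in time than the training set, so the model never sees the future. A minimal sketch of that split follows, assuming daily rows keyed by an ISO date string. The `temporal_holdout` helper, the sample rows and the 20% test fraction are illustrative and are not the project's actual code or settings.

```python
def temporal_holdout(rows, test_fraction=0.2):
    """Split rows so that every test row is later than every training row."""
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    ordered = sorted(rows, key=lambda r: r["date"])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Hypothetical daily observations; `up` marks whether the index closed higher.
rows = [
    {"date": f"2024-01-{day:02d}", "up": day % 2 == 0} for day in range(1, 11)
]

train, test = temporal_holdout(rows, test_fraction=0.2)
print(len(train), len(test))
print(train[-1]["date"], "<", test[0]["date"])
```

A shuffled split would let later days leak into training, which tends to inflate accuracy on time series.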
05

Loan Default Prediction

Predictive model (Random Forest) for loan default propensity with interactive analytical dashboard in Dash/Plotly. Risk visualization by state/region, key default metrics and credit portfolio analysis.

🎯 Metric: 72% accuracy and interactive analytical dashboard for corporate decision-making.

Python · Random Forest · Dash · Plotly
05

Certifications

AZ

Azure Fundamentals (AZ-900)

Microsoft

2026

HBS

Aspire Leaders Program

Aspire Institute (Harvard Business School)

2025

AF

Airflow 3 DAG Authoring

Astronomer

2025

AF

Airflow 3 Fundamentals

Astronomer

2025

SC

Scrum Foundation

Certiprof

2023

PY

Python — Nano Course (80h)

FIAP

2022

06

Experience

2023 — present

Data Governance / Data Engineer

Bradesco

I architect and manage the core data pipelines of a CRM that supports $100M+ in monthly revenue. I act as Data Owner of the internal Data Catalog (3,000+ users/month). I have optimized SQL/ELT processes, cutting run times by more than 80%, and built automations with high business impact.

2024 — 2026

Data Engineer (Consulting)

Confidential (NDA)

Independently developed ETL jobs processing millions of records daily. Built pipelines and automations using Python, dbt and DuckDB. Created frameworks to accelerate development, and also coded in Node.js, Java and Scala.

2022 — 2023

Data Engineer / Software Engineer

Keyrus · Internship

Developed ETL/ELT pipelines and features for a corporate data catalog. Implemented a business glossary and MySQL integrations, and refactored legacy SQL for performance gains of over 50%.

07

Education

2026 — present

MBA People and Technology Management

FIA Business School

Technical leadership and corporate vision. Focused on connecting data engineering to business goals, managing the crucial intersection between teams, technology, and financial results.

2025 — 2026

Graduate Studies in Data Analytics

FIAP

Advanced analytical intelligence and modeling. The essential link to ensure the technical data infrastructure supports precise, value-driven decisions.

2022 — 2024

Systems Analysis & Development

FATEC-SP

The cornerstone of my vision as a Data Engineer. Rigorous foundations in system architecture, software engineering, and modeling to build robust and scalable infrastructures.

08

Let's work together.

Open to high-impact challenges in Data Engineering and partnerships in consulting. Shoot me a message!