Data Engineer

Webinterpret is all about taking e-commerce to the next level, empowering over 12,000 sellers to expand globally with ease. For more than 16 years, we've simplified cross-border selling with powerful tools that manage everything from AI-driven ad optimization and smart inventory syncing to handling currency fluctuations and localizing listings for different markets.

What do we do? Our tool automatically translates, localizes, and lists your products on international marketplaces like eBay and Amazon. We handle everything—from currency conversion to shipping options—so you can expand your reach, boost sales, and tap into new markets without the hassle. We take care of international shipping and returns, making global e-commerce as simple as selling locally.

What makes us unique? We’re not just helping sellers go global—we’re making it effortless. Our AI-driven tools, smart automation, and deep expertise turn complex challenges into simple, powerful solutions for global growth.

From massive scale listings to seamless inventory management, we handle the complexity so sellers can focus on success.

About the Role:

We are looking for a Data Engineer who can blend robust data engineering practices with strong analytical skills. In this role, you will architect and maintain data pipelines, optimize our AWS-based Data Lake and Data Warehouse, and collaborate with stakeholders to produce meaningful, self-serve analytics. As we move toward modern data governance and AI-driven automation, you’ll be a key player in building and scaling our data infrastructure for the next phase of growth.


Short job description:

  • Location: Warsaw office, hybrid, or fully remote - based on your preference

  • Experience: Mid-level

  • Work framework: Scrum / Kanban

  • Employment Type: B2B

  • Flexible working hours: Yes (start time between 8:00 AM and 10:00 AM CET/CEST)

What We Do:

  • We process 500M+ records daily in AWS (S3, EMR, Glue, Athena, Apache Hudi/Delta Lake).

  • We operate an AWS Redshift-based Data Warehouse.

  • We build and orchestrate data pipelines with Python-based tools.

  • We integrate and transform data from multiple sources (SQL/NoSQL DBs, APIs, queues, files) for internal reporting.

  • Our next major milestone focuses on self-service data solutions, data lineage, and best practices for securing sensitive data.
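
To give a rough, hypothetical flavour of working with this stack (the database and table names below are invented for illustration, not taken from our platform), a Python query against Athena over S3-backed data might look like this:

    # Illustrative sketch only: run a SQL query through Athena over S3-backed
    # tables and load the result into a pandas DataFrame. The "datalake"
    # database and "listing_events" table are hypothetical names.
    import awswrangler as wr  # AWS SDK for pandas

    df = wr.athena.read_sql_query(
        sql="""
            SELECT marketplace, COUNT(*) AS listings
            FROM listing_events
            WHERE event_date = DATE '2024-01-01'
            GROUP BY marketplace
        """,
        database="datalake",  # hypothetical Glue catalog database
    )
    print(df.head())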

Responsibilities:

  • Design, develop, and optimize ETL/ELT pipelines in Python to ensure timely, accurate, and reliable data delivery.

  • Leverage AWS S3, EMR, Athena, Redshift, and related services to build and maintain our Data Lake and Warehouse.

  • Work with business stakeholders to define data requirements and model datasets in a scalable, reusable way.

  • Monitor pipeline performance, optimize SQL queries in Redshift, and implement best practices for data validation and testing (e.g., Great Expectations - see the short sketch after this list).

  • Connect to various data sources (Salesforce, RabbitMQ, Kafka, REST APIs, SQL databases) and handle synchronization/ingestion at scale.

  • Leverage BI tools (e.g., Tableau, Superset, Count) to develop dashboards and reports that support data-driven decisions across the company.

  • Write complex SQL queries, clean and analyze data, and verify business hypotheses for ongoing and ad-hoc requests.

  • Support our transition toward self-service analytics and AI-driven automation, ensuring data is easily accessible and well-structured.

  • Collaborate with other teams (e.g., Software Engineers, DevOps) to maintain high data quality, completeness, and reliable data processing pipelines.

  • Use Git (or similar) for code and documentation to promote stable, scalable workflows and predictable collaboration.
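
As a taste of the data validation work mentioned above, here is a minimal, hand-rolled batch check in Python; the column names are purely illustrative, and a tool like Great Expectations expresses the same checks declaratively and at much larger scale:

    # Illustrative data-quality check for a freshly ingested batch. The
    # "order_id" and "currency" columns are invented for this example.
    import pandas as pd


    def validate_orders(df: pd.DataFrame) -> None:
        """Fail fast if a batch violates basic assumptions before loading."""
        assert not df.empty, "batch is empty"
        assert df["order_id"].notna().all(), "null order_id values"
        assert df["order_id"].is_unique, "duplicate order_id values"
        assert df["currency"].isin(["EUR", "USD", "GBP"]).all(), "unexpected currency"


    if __name__ == "__main__":
        sample = pd.DataFrame({"order_id": [1, 2, 3], "currency": ["EUR", "USD", "GBP"]})
        validate_orders(sample)
        print("sample batch passed validation")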

What You Bring:

  • 2+ years of experience designing and developing ETL pipelines in Python.

  • Comfortable writing and optimizing complex SQL queries in a data warehouse environment (e.g., Redshift).

  • Hands-on experience building and maintaining dashboards and reports using a BI platform (e.g., Tableau, Superset, Count).

  • Capable of translating business needs into technical solutions, and comfortable verifying hypotheses with data.

  • Hands-on experience with S3, Athena, and Redshift to manage large-scale data environments.

  • Able to discuss technical concepts with both technical and non-technical audiences, including senior leadership, in English.

Nice-to-Have:

  • Familiarity with Git, CI/CD, unit testing (Pytest or similar), and containerization (Docker, Kubernetes).

  • Basic knowledge of AI/ML techniques, supporting our journey toward automation.

  • Active profiles (e.g., GitHub, Tableau Public) or a data-related blog that showcase ongoing learning and community engagement - we love working with self-driven people who contribute to data communities and learn in public.

  • Understanding of performance frameworks, goal tracking, and metrics alignment (e.g., OKRs, KPIs) to ensure data solutions directly support key business objectives.

  • Previous experience in the e-commerce sector - it gives you better insight into our customers and the most impactful ways of solving their problems.

  • Enthusiasm for proposing and implementing new ideas to enhance our data solutions and workflows.

What can we offer you?

  • Take care of your and your loved ones' health with private medical insurance.

  • Multisport cards for you and your family to support your health and well-being.

  • No more morning rush - start your workday anytime between 8:00 AM - 10:00 AM.

  • We know how to balance work and fun in the spirit of #workhard #playharder - join our legendary company events!

  • Stay longer and earn more - enjoy extra paid days off through our Anniversary program.

  • Grow with us through our company revenue-sharing scheme (LTIP) - when we succeed, so do you.
