Senior Software Engineer-Data Engineering, Bangalore, Karnātaka, India / Chennai, Tamil Nādu, India

Location	Bangalore, Karnātaka / Chennai, Tamil Nādu, India
Date posted	Friday, June 19, 2026
Apply by	Thursday, June 25, 2026
Contract type	Full time
Job type	Regular
Requisition ID	R0000332856

Description

Career Area:

Technology, Digital and Data

Job Description:

Your Work Shapes the World at Caterpillar Inc.

When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other. We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.

Job Summary

We are looking for a highly motivated and experienced Data Engineer to join our data engineering team. The ideal candidate will have a strong background in building scalable data pipelines using the AWS cloud stack and extensive hands-on experience with Snowflake. Proficiency in Python and SQL, along with graph and vector database technologies, is essential. This role requires strong problem-solving abilities and a proactive mindset to deliver efficient, scalable, and reliable data solutions.

Key Responsibilities

Design, develop, and maintain scalable data pipelines on AWS using services such as S3, Glue, Lambda, Redshift, and EMR.
Build and optimize data warehousing solutions using Snowflake, including performance tuning and data modeling.
Write efficient and reusable code in Python and SQL for data transformation and processing.
Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data requirements.
Develop and optimize solutions using graph databases (e.g., Neo4j, Amazon Neptune), including query design and performance tuning.
Design, build, and operate vector database solutions (e.g., Milvus, Amazon OpenSearch) to support semantic search, recommendations, RAG, and AI-driven use cases.
Integrate vector databases with LLM-based applications and AI workflows.
Monitor, troubleshoot, and improve pipeline performance and reliability.
Ensure data quality, integrity, and security across all stages of the pipeline.
Participate in code reviews, architecture discussions, and continuous improvement initiatives.

Required Qualifications

8+ years of experience in data engineering or related roles.
Strong hands-on experience with AWS cloud services, including data and AI workloads.
Deep understanding of Snowflake architecture, performance tuning, and best practices.
Advanced proficiency in Python and SQL for data pipelines, transformations, and services.
Strong understanding of graph and vector data modeling concepts and their practical applications.
Hands-on experience with graph databases (e.g., Neo4j, Neptune) and vector databases (e.g., Milvus, Amazon OpenSearch).
Experience with version control systems (e.g., Git) and Git workflows.
Experience working with Azure DevOps (AzDO) boards for backlog management in Agile environments.
Excellent analytical and problem-solving skills.
Strong communication and collaboration abilities.
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

Nice-to-Have Skills

Knowledge of the NVIDIA ecosystem and its applications in data and AI.

Preferred Qualifications

Experience with orchestration tools such as AWS Step Functions.
Familiarity with data governance and compliance practices.
Exposure to real-time data processing frameworks (e.g., Kafka, Spark Streaming).

More Detail on Knowledge Base

Experience designing and deploying data ingestion pipelines for unstructured sources such as PDFs, Word documents, and HTML files, including text extraction, chunking strategies, and embedding generation at scale.
Hands-on expertise with vector databases, specifically Milvus, covering schema design, indexing, and optimizing write performance for large-scale embedding ingestion pipelines.
Proficiency in building knowledge graph ingestion pipelines using graph databases, including entity extraction, relationship modeling, and populating nodes and attributes.
Strong pipeline engineering skills in Python and frameworks for orchestrating multi-stage document processing workflows, with experience deploying and monitoring these pipelines in production environments.
Bonus: Exposure to RAPIDS libraries (cuDF, cuML, cuGraph) or CUDA-based tooling for GPU-accelerated data processing, enabling faster transformation and optimization during large-scale ingestion workflows.

Posting Dates:

June 19, 2026 - June 25, 2026

Caterpillar is an Equal Opportunity Employer. Qualified applicants of any age are encouraged to apply.


Analytics and Digital Team

Every day, Caterpillar's analytics and digital teams push boundaries, helping us grow, build communities, and drive the business. They help change the way we do business and continually devise new solutions to keep pace with an ever-evolving digital world. Our team connects customer fleets around the world, streamlines internal processes, and helps us seamlessly serve everyone we interact with. Because we highly value the specialized skills of our digital team members, we offer them the chance to see their ideas brought to life as real products and services that affect job sites around the world. In addition, we take a strengths-based approach to career development, which lets our digital team members build their skills and work on their future while helping us work on ours.
