Technology, Digital and Data

Senior Software Engineer-Data Engineering

Location: Bangalore, Karnātaka / Chennai, Tamil Nādu, India
Employment type: Full time
Job type: Regular
Requisition ID: R0000332856

Description

Career Area:

Technology, Digital and Data

Job Description:

Your Work Shapes the World at Caterpillar Inc.

When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other. We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.

Job Summary

We are looking for a highly motivated and experienced Data Engineer to join our data engineering team. The ideal candidate will have a strong background in building scalable data pipelines using the AWS cloud stack and extensive hands-on experience with Snowflake. Proficiency in Python and SQL, along with graph and vector database technologies, is essential. This role requires strong problem-solving abilities and a proactive mindset to deliver efficient, scalable, and reliable data solutions.

Key Responsibilities

  • Design, develop, and maintain scalable data pipelines on AWS using services such as S3, Glue, Lambda, Redshift, and EMR.
  • Build and optimize data warehousing solutions using Snowflake, including performance tuning and data modeling.
  • Write efficient and reusable code in Python and SQL for data transformation and processing.
  • Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data requirements.
  • Develop and optimize solutions using graph databases (e.g., Neo4j, Amazon Neptune), including query design and performance tuning.
  • Design, build, and operate vector database solutions (e.g., Milvus, Amazon OpenSearch) to support semantic search, recommendations, RAG, and AI-driven use cases.
  • Integrate vector databases with LLM-based applications and AI workflows.
  • Monitor, troubleshoot, and improve pipeline performance and reliability.
  • Ensure data quality, integrity, and security across all stages of the pipeline.
  • Participate in code reviews, architecture discussions, and continuous improvement initiatives.

Required Qualifications

  • 8+ years of experience in data engineering or related roles.
  • Strong hands-on experience with AWS cloud services, including data and AI workloads.
  • Deep understanding of Snowflake architecture, performance tuning, and best practices.
  • Advanced proficiency in Python and SQL for data pipelines, transformations, and services.
  • Strong understanding of graph and vector data modelling concepts and their practical applications.
  • Hands-on experience with graph databases (e.g., Neo4j, Neptune) and vector databases (e.g., Milvus, Amazon OpenSearch).
  • Experience with version control systems (e.g., Git) and Git workflows.
  • Experience working with Azure DevOps (AzDO) boards for backlog management in Agile environments.
  • Excellent analytical and problem-solving skills.
  • Strong communication and collaboration abilities.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

Nice-to-Have Skills

  • Knowledge of the NVIDIA ecosystem and its applications in data and AI.

Preferred Qualifications

  • Experience with orchestration tools such as AWS Step Functions.
  • Familiarity with data governance and compliance practices.
  • Exposure to real-time data processing frameworks (e.g., Kafka, Spark Streaming).

More Detail on Knowledge Base

  • Experience designing and deploying data ingestion pipelines for unstructured sources such as PDFs, Word documents, and HTML files, including text extraction, chunking strategies, and embedding generation at scale.
  • Hands-on expertise with vector databases, specifically Milvus, covering schema design, indexing, and optimizing write performance for large-scale embedding ingestion pipelines.
  • Proficiency in building knowledge graph ingestion pipelines using graph databases, including entity extraction, relationship modelling, and populating nodes and attributes.
  • Strong pipeline engineering skills in Python and frameworks for orchestrating multi-stage document processing workflows, with experience deploying and monitoring these pipelines in production environments.
  • Bonus: Exposure to RAPIDS libraries (cuDF, cuML, cuGraph) or CUDA-based tooling for GPU-accelerated data processing, enabling faster transformation and optimization during large-scale ingestion workflows.

Posting Dates:

June 19, 2026 - June 25, 2026

Caterpillar is an Equal Opportunity Employer. Qualified applicants of any age are encouraged to apply.

Not ready to apply? Join our Talent Community.
