IT EIM Data Engineering Hadoop - Data Services

Job Title: IT EIM Data Engineering Hadoop - Data Services

Job Location: Bangalore

Job Details:

The Data Engineering - Data Services role is responsible for developing data ingestion and modeling pipelines that leverage Enterprise Level ETL tools such as SAP Data Services or Informatica for the Org IT Enterprise Information Management (EIM). In this role, you will be part of a growing, global team of DevOps engineers, system admins and infrastructure technicians who collaborate to design, build, test and implement solutions across Life Sciences, Finance, Manufacturing and Healthcare. 

 

The EIM platform currently comprises multiple technology stacks, hosted either on Amazon Web Services (AWS) infrastructure or on-premises in the Org’s own data centers. These are:

  • Hortonworks Hadoop environment (development cluster and a regulated production cluster) 
  • ELK (Elasticsearch, Logstash, Kibana) stack 
  • R and Python Servers with connectivity to the Hadoop cluster. 
  • Docker and container technologies
  • Informatica and SAP Data Services Platforms

     

    This position is expected to analyze various data sources and coordinate with team members and business users to ingest data from enterprise systems using SAP Data Services and Informatica.  Note: this is not a Database Administration role. It is a role for ETL/ELT data architects and analysts who can code and build solutions with modern open Hadoop and ELK tools, so knowledge of Hadoop/Spark is essential. The individual must be capable of complex and creative problem solving and able to work in an agile development environment.

     

    Roles & Responsibilities: 

  • Understand and model data using traditional RDBMS methods and modern NoSQL solutions, and understand the relative strengths of both approaches
  • Work closely with business users, data scientists/analysts to design logical and physical data models
  • Utilize automation to create pipeline processes for data ingestion, transformation and access to data catalog solutions
  • Create SAP Data Services and Informatica interfaces to ingest data from ERP and enterprise systems into ‘staging’ databases or directly into HDFS/Hive
  • Document technical work in a professional and transparent way
  • Be a technical leader in data catalog, data modeling, metadata management and data governance design and processes

     

    Education 

  • B.Sc. (or higher) degree in Computer Science, Engineering, Mathematics, Physical Sciences or related fields 

     

    Professional Experience 

  • 5+ years of experience in system engineering or software development 
  • 3+ years of engineering experience in ETL-type work with databases and Hadoop platforms

     

    Skills

Hadoop General

Deep knowledge of distributed file system concepts, MapReduce principles and distributed computing.  Knowledge of Spark and the differences between Spark and MapReduce.  Familiarity with encryption and security in a Hadoop cluster.

HDFS

HDFS and Hadoop File System Commands

Hive

Creating and managing tables; experience of building partitioned tables; HQL.

Sqoop

Full knowledge of Sqoop, including creating and running Sqoop jobs in incremental and full-load modes

Oozie

Experience in creating Oozie workflows to control Java, Hive, Spark and Shell actions using

Spark

Experience in launching Spark jobs in client mode and cluster mode.  Familiarity with the property settings of Spark jobs and their implications for performance.

SCC/Git

Must be experienced in the use of source code control systems such as Git

ETL 

Experience developing ELT/ETL processes, including loading data from enterprise-sized RDBMS systems such as Oracle, DB2, MySQL, etc.

 

Must have experience in using Enterprise ETL tools such as SAP Data Services or Informatica.

Linux 

Must be experienced in the Enterprise Linux command line, preferably on SUSE Linux

Shell Scripting 

Ability to write parameterized shell scripts using functions, and familiarity with Unix tools such as sed, awk, etc.

Programming 

Must be at expert level in Python or in at least one other high-level language such as Java, C or Scala.

SQL 

Must be an expert in manipulating database tables using SQL.  Familiarity with views, functions, stored procedures and exception handling.

AWS 

General knowledge of AWS Stack (EC2, S3, EBS, …)

IT Process Compliance

SDLC experience and formalized change controls

Languages 

Fluent English skills

 

Specific information related to the position:

  • Physical presence in primary work location (Bangalore)
  • Flexibility to work in CEST and US EST time zones (according to team rotation plan)

     

Job Requisition ID:  184231
Location:  Bangalore SBS
Career Level:  D - Professional (4-9 years)
Working time model:  full-time

US Disclosure
The Company is an Equal Employment Opportunity employer. No employee or applicant for employment will be discriminated against on the basis of race, color, religion, age, sex, sexual orientation, national origin, ancestry, disability, military or veteran status, genetic information, gender identity, transgender status, marital status, or any other classification protected by applicable federal, state, or local law.  This policy of Equal Employment Opportunity applies to all policies and programs relating to recruitment and hiring, promotion, compensation, benefits, discipline, termination, and all other terms and conditions of employment. Any applicant or employee who believes they have been discriminated against by the Company or anyone acting on behalf of the Company must report any concerns to their Human Resources Business Partner, Legal, or Compliance immediately. The Company will not retaliate against any individual because they made a good faith report of discrimination.

North America Disclosure
The Company is committed to accessibility in its workplaces, including during the job application process. Applicants who may require accommodation during the application process should speak with our HR Services team at 855 444 5678 from 8:00am to 5:30pm ET Monday through Friday.


Job Segment: Database, ERP, Oracle, SAP, Java, Technology