AWS Data Engineer

iTitans - We Champion Innovation - Lahore

Job Description

We are looking for an excellent AWS Data Engineer who is passionate about data and the insights that large data sets can provide. The candidate should possess both a data engineering background and the business acumen to think strategically.
Responsibilities
  • Review the current-state Amazon Virtual Private Cloud (Amazon VPC), account, and Amazon Simple Storage Service (Amazon S3) bucket strategy, and capture the impact of introducing critical data.
  • Provide assistance in defining goals, success metrics, scope, requirements and exit criteria for the data lake.
  • Understand the Customer's compliance requirements for storing the defined data.
  • Assess current-state data sources, velocity, data dependencies, and usage patterns.
  • Assess current-state data source details such as formats, encoding, encryption, and refresh rates.
  • Provide assistance in capturing data source requirements.
  • Provide guidance on data classification requirements.
  • Provide assistance in capturing data usage requirements.

Expected Outcomes

A closure architecture document aligning requirements to the architectural design.

  • Recommendations on selecting the right tools for the job based on usage
  • A first-pass, high-level data flow architecture
  • An initial data architecture document
  • Assistance in building out a reusable framework for data collection, data storage, data cataloging, and data serving that increases the speed at which information is curated, added, and made securely accessible (the “AWS Data Lake Accelerator”).

Detailed Design and Architecture

In a non-production environment, you will assist the Customer with:

  • Implementing the Amazon VPC, account, and Amazon S3 bucket strategy, and capturing the impact of introducing critical data for a data lake. Note that this is not a full enterprise-wide Amazon VPC strategy, but one focused on data lake implementations.
  • Achieving a modern big data lambda architecture that provides an environment for consuming and utilizing data from both batch and streaming sources.
  • Defining a detailed data architecture covering raw, conformed, structured, enriched, and aggregated data layers (see the sketch after this list).
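For illustration, here is a minimal sketch of how these layers might map onto Amazon S3 locations. The bucket names, dataset name, and zone_path helper are hypothetical placeholders, not the engagement's actual design:

```python
# Illustrative zone-to-bucket mapping for the data layers named above.
# Every bucket name here is a hypothetical placeholder.
DATA_ZONES = {
    "raw":        "s3://example-datalake-raw",         # unchanged from source
    "conformed":  "s3://example-datalake-conformed",   # standardized formats and types
    "structured": "s3://example-datalake-structured",  # modeled, query-ready data
    "enriched":   "s3://example-datalake-enriched",    # business logic applied
    "aggregated": "s3://example-datalake-aggregated",  # rollups for serving
}

def zone_path(zone: str, dataset: str, dt: str) -> str:
    """Build a Hive-style partitioned path inside a zone."""
    return f"{DATA_ZONES[zone]}/{dataset}/dt={dt}/"

print(zone_path("raw", "orders", "2024-01-01"))
# -> s3://example-datalake-raw/orders/dt=2024-01-01/
```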

Ingestion

Based on the ingestion requirements provided, you will assist the Customer with:

  • Designing and building robust data ingestion solutions with monitoring, logging, and alerting services, including retry capabilities
  • Developing serverless approaches for ingestion, where applicable
  • Building automatic registration of data in a metadata catalog (sketched below)
  • Establishing general best practices for managing updates to data stored as objects on Amazon S3.
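One common serverless pattern for the auto-registration item above is an AWS Lambda function that fires on S3 object creation and records the new partition in the AWS Glue Data Catalog. This is a minimal sketch assuming JSON data landed under Hive-style dt= prefixes; the database name, table name, and key layout are hypothetical:

```python
import urllib.parse

import boto3

glue = boto3.client("glue")

# Hypothetical catalog names; adjust to the actual environment.
DATABASE = "datalake_raw"
TABLE = "events"

def lambda_handler(event, context):
    """Triggered by S3 ObjectCreated events; registers each object's
    Hive-style partition (e.g. dt=2024-01-01) in the Glue Data Catalog."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Expect keys like raw/events/dt=2024-01-01/part-0000.json
        partitions = dict(p.split("=", 1) for p in key.split("/") if "=" in p)
        dt = partitions.get("dt")
        if not dt:
            continue  # not a partitioned key; nothing to register
        try:
            glue.create_partition(
                DatabaseName=DATABASE,
                TableName=TABLE,
                PartitionInput={
                    "Values": [dt],
                    "StorageDescriptor": {
                        "Location": f"s3://{bucket}/raw/events/dt={dt}/",
                        "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                        "OutputFormat": "org.apache.hadoop.hive.ql.io."
                                        "HiveIgnoreKeyTextOutputFormat",
                        "SerdeInfo": {
                            "SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"
                        },
                    },
                },
            )
        except glue.exceptions.AlreadyExistsException:
            pass  # already registered; safe to ignore on Lambda retries
```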


Amazon S3 Data Storage Build

Based on the usage requirements provided, you will assist the Customer with:

  • Addressing the data landing zone, including data stored in Amazon S3 unchanged from the source, as compared with derivative options such as format transformations and business-logic data manipulation
  • Building data storage, including buckets, prefixes, encryption, file types, and partitioning
  • Building an automated data partitioning strategy (see the sketch after this list)
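As one example of an automated partitioning strategy, writers can derive year=/month=/day= keys from each record's event timestamp so that Athena and Glue can prune partitions at query time. A minimal sketch, assuming a hypothetical bucket, prefix, and KMS-based server-side encryption:

```python
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

# Hypothetical storage locations for illustration.
BUCKET = "example-datalake-raw"
PREFIX = "events"

def partitioned_key(event_time: datetime, filename: str) -> str:
    """Derive a Hive-style year=/month=/day= key from the event timestamp."""
    return (
        f"{PREFIX}/year={event_time:%Y}/month={event_time:%m}/"
        f"day={event_time:%d}/{filename}"
    )

def write_event(payload: dict, filename: str) -> str:
    """Write one event under its date partition, encrypted at rest."""
    key = partitioned_key(datetime.now(timezone.utc), filename)
    s3.put_object(
        Bucket=BUCKET,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
        ServerSideEncryption="aws:kms",  # use the account's default KMS key
    )
    return key
```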

Serving Layer

Based on the access requirements provided, you will assist the Customer with:

  • Building the serving layer over the data catalog and data lake and establishing reusable key patterns (see the sketch after this list).
  • Designing and implementing data transformation pipelines to be used by downstream applications.
  • Designing and implementing a push serving model for aggregated or derived datasets.
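As one reusable serving pattern, consumers can pull from the data lake through the Glue Data Catalog with Amazon Athena. A minimal sketch, assuming a hypothetical catalog database, table, and query-results location:

```python
import time

import boto3

athena = boto3.client("athena")

# Hypothetical names; adjust to the actual environment.
DATABASE = "datalake_conformed"
OUTPUT = "s3://example-datalake-query-results/"

def run_query(sql: str) -> list:
    """Submit a query to Athena and block until its results are ready."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": OUTPUT},
    )["QueryExecutionId"]
    state = "QUEUED"
    while state in ("QUEUED", "RUNNING"):
        time.sleep(1)
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]

rows = run_query("SELECT dt, count(*) FROM events GROUP BY dt LIMIT 10")
```
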
Requirements
  • Should have 5+ years of strong AWS experience, concentrated on work specific to data lake migrations.
  • Should have strong experience with all of the following: AWS EMR/Hadoop, AWS S3, AWS Glue, AWS Redshift, AWS IAM roles, and AWS security.
  • AWS Certified Solutions Architect (Associate or Professional).
  • If the candidate is not certified, then 7+ years of strong experience with the above is required.
  • Exceptional English communication skills.
Apply Now