Sr. Hadoop Developer Resume

SUMMARY:

  • 12+ years of experience in the Information Technology industry.
  • 7+ years in Data Warehousing and Big Data Environment.
  • Excellent Analytical Thinking and Problem-Solving skills.
  • Exceptional Communication and Interpersonal skills.
  • Big Data Programmer with experience across multiple components of the Hadoop Ecosystem, including Hive, Sqoop, Flume, Pig, and Spark with Scala.
  • Experience in building and maintaining multiple Hadoop clusters of different sizes and configurations.
  • Understanding of Information Architecture and usability design concepts, along with OOPs.
  • Experienced in multiple Hadoop distributions, including Cloudera, MapR, and Hortonworks.
  • Adept at performance tuning and throughput optimization techniques.
  • Well-versed in importing and exporting data with Hadoop tool suites.
  • Experienced in evaluating ETL and OLAP tools and recommending the most suitable solutions based on business needs.
  • Adept at documenting technical design and application software design, and at developing applications using Java, C, and C#.
  • Experience spanning the Health Care, Retail, Banking, and Mobile Telecommunication industries.

TECHNICAL SKILLS:

Operating System: Windows, Linux

Hadoop Ingestion Tools: Sqoop, Flume

Hadoop Data Processing Tools: HDFS, Spark, Scala, Hive, Pig, Shell Scripting

Hadoop Scheduling & Monitoring Tools: Zena, Ambari

Database: MySQL, Oracle, Netezza

Language: Scala, C, Java, Python

IDE: Eclipse, IntelliJ

Repository: Git

Other: Jira

Projects Summary

Confidential

Sr. Hadoop Developer

Roles and Responsibilities:

  • Established end-to-end automation processes for all files using ZENA.
  • Designed and developed applications in Spark using Scala.
  • Ingested data from fixed-width flat files in HDFS into the Raw layer using Spark with Scala.
  • Used Scala to apply the business transformation rules specified in the mapping document for 32 tables.
  • Involved in creating HDFS directories and partitions.
  • Involved in creating external Hive tables for all TMG government data.
  • Moved the history data to CDC and then merged it into the current tables.
  • Altered table partitions using Hive.
  • Created GCF format tables using denormalized data.
  • Involved in error handling tasks in the process.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Involved in validating record counts and file names.
  • Unit testing and validating the case outputs.
  • Performance tuning of Spark jobs for optimal utilization of cluster resources.

Environment: Hadoop, HDFS, Hive, Apache, Spark, Scala, and Shell Scripting.

Confidential

Sr. Hadoop Developer

Roles and Responsibilities:

  • Established end-to-end automation processes for all files using ZENA.
  • Imported DB2 data into HDFS using Sqoop.
  • Involved in creating HDFS directories and partitions.
  • Involved in creating Hive tables for all members’ Bluestar membership.
  • Created Hive tables, and loaded and analyzed data using Hive queries.
  • Performance tuning of Flume agents by configuring different properties.
  • Developed Spark scripts using Scala shell commands as per the requirements.
  • Migrated Hive scripts to Scala for ingestion.
  • Involved in error handling tasks in the process.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Involved in validating record counts and file names.
  • Unit testing and validating the case outputs.
  • Performance tuning of Spark jobs for optimal utilization of cluster resources.
  • Created Oozie workflows to schedule the Spark rule engine job.

Environment: Hadoop, HDFS, Hive, Pig, Sqoop, Apache, Spark, Scala, Shell Scripting.

Confidential

Sr. Hadoop Developer

Roles and Responsibilities:

  • Provided technical designs and architecture; supported automation, installation, configuration, and upgrade tasks; and planned system upgrades of the Hadoop cluster.
  • Ingested GCPS files into HDFS and created external Hive tables so that the data is available to the consumption team.
  • Maintained Hadoop clusters for dev/staging/production. Trained the development, administration, testing, and analysis teams on the Hadoop framework and Hadoop ecosystem.
  • Developed UNIX shell scripts for creating reports from Hive data.
  • Integrated Big Data technologies and analysis tools into the overall architecture.

Environment: Hadoop, HDFS, Hive, Apache, Shell Scripting.

Confidential

Sr. Hadoop Developer

Roles and Responsibilities:

  • Extracted data from Axway and loaded it into HDFS.
  • Wrote Pig scripts for data cleansing and transformation, then used Hive on top of the results for querying and deriving meaningful insights from the data.
  • Optimization and performance tuning of Hive queries.
  • Worked on automation of the end-to-end process of creating Hive tables from raw RDF/XML data.
  • Maintaining project documentation/knowledge base.
  • Involved in writing Pig scripts for splitting and loading the data into MCEF with header, trailer, and file name.
  • Finally transformed CET files to Blue Gateway.
  • Commissioned and decommissioned cluster nodes, configured views, and managed HDFS services through Hue.
  • Managed users, groups, and their respective quotas in HDFS.
  • Monitored clusters and maintained their respective configuration versions.
  • Deployed code from dev to other environments such as staging and production.

Environment: Hadoop, HDFS, Hive, Pig, Apache, Shell Scripting.

Confidential

Sr. Hadoop Developer

Roles and Responsibilities:

  • Involved in creating Hive tables, and loading and analyzing data using Hive queries.
  • Involved in loading data from the Linux file system to HDFS.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Involved in validating record counts and file names.
  • Developed simple to complex MapReduce jobs using Hive.
  • Responsible for managing data coming from different sources.

Environment: Hive, MapReduce, DataStage 8.7, Netezza, Shell script, Unix, FileZilla, Windows 7, Linux

Confidential

Hadoop Developer

Roles and Responsibilities:

  • Successfully loaded files into Hive and HDFS from Teradata using DataStage.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Created Hive tables, loaded them with data, and wrote Hive queries.
  • Analyzed the log files using Hive queries and wrote UDFs for Hive queries.
  • Created Hive table partitions and buckets.
  • Developed the Parallel Jobs (using the XML Input Stage) for XML source files.
  • Handled MQ Managers for loading Messages in Queue.
  • Created Design Document for the Model Classes.
  • Created the UTC Document for the Extracts.
  • Created the XML Analyzer Document for the Model Classes.
  • Followed the Agile methodology and participated in daily Scrum calls.

Environment: DataStage 8.7, Netezza, Shell script, Unix, Windows 7, Linux

Confidential

Developer

Roles and Responsibilities:

  • Worked as a Data Analyst within the Firm Wide Matter, identifying the reports that go to the government or that are used for executive decisions.
  • Worked with the BCBS business team to decompose the reports into their atomic Critical Data Elements (the raw elements that feed the calculations and metrics in the reports) and submit the intake request. Analyzed and mapped the attributes of rationalized reports in the Enterprise Data Warehouse (EDW) to the Integrated Customer Data Warehouse (ICDW), mapped risk reports to ICDW elements, and identified gaps.
  • Loaded missing attributes into the ICDW Staging, Integration, and Semantic layers using DataStage.
  • Performed quality reviews on submitted intake requests and assigned elements to categories. Ran the matching process (using CHARMS) to find duplicates and loaded the results into MetaTracker. Performed certification of the elements categorized by the CB DMC Council through the tasks of Naming and Definition, Lineage, Profiling, Remediation, and Monitoring. Worked closely with business partners and end users to name and define the Critical Data Elements with the help of Data Stewards.
  • Performed Lineage using DataStage and Teradata to understand the transformations of the data element as it moves from the System of Record (SOR) to the Authoritative Source (AS).
  • Performed Data Profiling for the Critical Data Elements to understand the basic attributes of the data and its quality using Informatica Data Quality.
  • Logged data issues found during profiling in JIRA, assessed the corresponding business impact, and led the effort to identify a remediation plan. Worked with the Monitoring team to set up a monitoring plan for the future integrity of certified data elements.

Environment: Teradata, DataStage, Informatica Data Quality, MetaTracker, CHARMS, Unix, Windows 7.

Confidential

Developer

Roles and Responsibilities:

  • Developed the Parallel Jobs and Sequence Jobs for the Pharmacy Claims Extract.
  • Developed UNIX Scripts for the Extract.
  • Created Zeke Sheets for running the jobs in Production.
  • Created Design Document for the Extracts.
  • Created the UTC Document for the Extracts.
  • Created the IT Document for the Extracts.
  • Followed the Agile methodology and participated in daily Scrum calls.

Environment: DataStage 8.7, Unix, Windows 7.

Confidential

Developer

Roles and Responsibilities:

  • Created Detailed Design Document.
  • Created Source Target Mapping Document.
  • Developed jobs for the one-time load as per the functional specs provided by the client.
  • Developed Jobs based on the Query and STM.
  • Developed Master Sequences for all Parallel Jobs.
  • Involved in unit testing of the jobs and created Unit Testing Documents.
  • Created the Code Review Document and followed the Cardinal standards in coding.
  • Created Migration Documents for migrating the code from Dev to QA.
  • Resolved issues at runtime.
  • Coached team members, solved work problems, participated in the work of the team, and interacted with clients.
  • Understood the client processes and worked efficiently to meet targets and deliver solutions in accordance with quality control standards, business practices, and procedures.

Environment: DataStage 8.5, Unix, Windows 7/XP.

Confidential

Developer

Roles and Responsibilities:

  • Developed New Feeds like WINFITS and OPICSPM.
  • Modified the jobs according to the Jira portal.
  • Coached team members, solved work problems, and participated in the work of the team.
  • Developed ETL jobs using DataStage 8.5 as per the functional and technical specs provided by the client.
  • Understood the client processes and worked efficiently to meet targets and deliver solutions in accordance with quality control standards, business practices, and procedures.
  • Extensively used DataStage client tools - DataStage Designer, DataStage Director, DataStage Manager, and DataStage Administrator.

Environment: DataStage 8, Unix, Windows 7/XP.

Confidential

Developer

Roles and Responsibilities:

  • Developed a process in C on the UNIX platform, as a contribution to the software project, that automated checking the status of the entire cluster in the network.
  • Wrote a C program that scans all the nodes in the network and checks whether a job/session is running.
  • Wrote a C program that checks the memory status and reports the usability of a particular node.
  • Wrote a C program that can close the scheduler on network systems and slave applications in the network.
  • Wrote a C program that checks which nodes are logged off or powered off.
  • Combined all the modules together.
  • Performed unit testing of programming projects.

Environment: C, Unix, Windows 7/XP.
