
Hadoop Developer/admin Resume

Charlotte, NC

SUMMARY

  • 10 years of experience in IT, working with banking, financial, and retail clients.
  • 3+ years of experience working on Big Data Hadoop Technologies.
  • Good understanding and knowledge of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
  • Ability to move data in and out of Hadoop from various RDBMS and mainframe systems using Sqoop and other traditional data movement technologies.
  • Expertise in Cloudera Hadoop (CDH 5.4.0) and Pivotal HD environments, and hands-on experience with NoSQL databases like HBase and Cassandra.
  • Experience with Hadoop Ecosystem components like HIVE, PIG, Sqoop, Oozie and Flume.
  • Able to assess business rules, collaborate with stakeholders, and perform source-to-target data mapping, design, and review.
  • Experience in writing Pig and Hive scripts and extending Hive and Pig core functionality by writing custom UDFs.
  • Worked on performance tuning of Hive using partitioning, bucketing, DISTRIBUTE BY, and map joins, and optimized queries using several MapReduce parameters (see the sketch after this summary).
  • Experience in using different file formats like Sequence, Avro, Parquet, ORC, and CSV, and different compression codecs like GZip and Snappy.
  • Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop Map/Reduce, Pig and Hive scripts.
  • Worked on HiveServer1 and HiveServer2.
  • Remediated Avro-formatted Hive data into Parquet-formatted Hive tables.
  • Hands-on experience with Spark using Scala and Python.
  • Expertise in programming languages such as COBOL, JCL, SQL, Easytrieve.
  • Expertise in different database file systems namely DB2, IMS DB, VSAM and Teradata.
  • Expertise in version control tools like Changeman, Endeavor.
  • Worked with database query tools like Toad, pgAdmin III, SPUFI, QMF, and DB2 File-Aid.
  • Used SFTP and NDM file transfer methods in day-to-day work.
  • Fair knowledge on Autosys and CA7 Scheduler.
  • Extensively worked in the Agile Scrum development methodology and the waterfall model of the software development life cycle (SDLC).
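
A minimal sketch of the Hive tuning approach referenced above, assuming hypothetical txn_raw/txn_part tables and example parameter values; actual partition keys, bucket counts, and MapReduce settings varied by workload.

    #!/bin/bash
    # Illustrative only: build a partitioned, bucketed target table with map joins enabled.
    hive -e "
      -- Enable dynamic partitions, bucketed inserts, and automatic map joins.
      SET hive.exec.dynamic.partition=true;
      SET hive.exec.dynamic.partition.mode=nonstrict;
      SET hive.enforce.bucketing=true;
      SET hive.auto.convert.join=true;
      -- Example MapReduce-level tuning knob.
      SET mapreduce.job.reduces=32;

      CREATE TABLE IF NOT EXISTS txn_part (
        acct_id STRING,
        txn_amt DECIMAL(18,2)
      )
      PARTITIONED BY (txn_date STRING)
      CLUSTERED BY (acct_id) INTO 32 BUCKETS
      STORED AS ORC;

      -- DISTRIBUTE BY spreads partition writes across reducers during the load.
      INSERT OVERWRITE TABLE txn_part PARTITION (txn_date)
      SELECT acct_id, txn_amt, txn_date
      FROM txn_raw
      DISTRIBUTE BY txn_date;
      "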

TECHNICAL SKILLS

Hadoop Ecosystem Development: HDFS, MapReduce, Sqoop, Pig, Hive, Oozie

Programming Languages: Python, Unix Shell scripting, Basic Java, Linux, COBOL, JCL, VSAM, SQL & Easytrieve

Databases: Teradata, PostgreSQL, DB2, IMS DB, VSAM and MySQL

OLTP: CICS, IMS DB, DB2 Stored Procedures

Tools: PGAdminIII, Toad, SuperPutty, Changeman, CA7, Endeavor, File-Aid, IBM File Manager, NDM, Quality Center (QC), SAR, IBM utilities.

Utilities: IBM Utilities, DFSORT, FASTLOAD

Operating Systems: Windows 95/98/XP/NT, Windows Server 2003/2008, IBM OS/390 with TSO & ISPF, Linux.

PROFESSIONAL EXPERIENCE

Confidential, Charlotte, NC

Hadoop Developer/Admin

Responsibilities:

  • Worked with the HaaS admin team to obtain the new platform standards.
  • Remediated all shell scripts, Hive, Pig, and MapReduce code, and databases from the old cluster to the new Hadoop cluster.
  • Wrote Python scripts to convert Autosys jobs and HDFS directory paths from the old standards to the new standards.
  • Wrote Python scripts to retrieve the YARN job list for performance metrics.
  • Created a Java JCEKS keystore for Sqoop password encryption, per the new platform standards (see the sketch after this list).
  • Remediated Oozie workflows that carried HiveServer1 properties on the old cluster to HiveServer2 standards on the new cluster.
  • Supported and guided the team in work prioritization.
  • Ensured data quality and accuracy by implementing business and technical reconciliations via scripts and data analysis.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Scheduled multiple Hive and Pig jobs with the Oozie workflow engine; performed unit, interface, system, and user acceptance testing of the workflows.
  • Transferred data/files between the local file system/SFTP and HDFS.
  • Used DistCp to move data from the old Hadoop cluster to the new cluster for data processing and testing.
  • Prepared Unit test cases and performed unit testing.
  • Created external and partitioned tables in Hive for querying.
  • Followed the Scrum implementation of the Scaled Agile methodology and managed offshore teams.
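
A minimal sketch of the JCEKS-based Sqoop password handling mentioned above, assuming a hypothetical keystore path, alias, JDBC URL, and table; actual names followed the new platform standards.

    #!/bin/bash
    # One-time setup: store the database password under an alias in a JCEKS keystore on HDFS
    # (the command prompts for the password value).
    hadoop credential create tdprod.password \
      -provider jceks://hdfs/user/etl_user/sqoop_pwd.jceks

    # The Sqoop job references the alias instead of a clear-text password.
    sqoop import \
      -Dhadoop.security.credential.provider.path=jceks://hdfs/user/etl_user/sqoop_pwd.jceks \
      --connect jdbc:teradata://tdprod.example.com/DATABASE=edw \
      --username etl_user \
      --password-alias tdprod.password \
      --table CUSTOMER_TXN \
      --target-dir /data/raw/customer_txn \
      --num-mappers 8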

Environment: Cloudera Hadoop Cluster, HDFS, HIVE, PIG, Sqoop, Oozie, UNIX shell scripting, Teradata SQL Assistant, Toad, SuperPutty, Eclipse, UNIX, Version One.

Confidential

Responsibilities:

  • Involved in requirements gathering and analysis with upstream and downstream applications and prepared the mapping document.
  • Transferred mainframe copybooks and source files to Hadoop edge nodes via NDM.
  • Created an interface to convert mainframe data (EBCDIC) into ASCII.
  • Wrote one-time Python scripts to generate Hive scripts and Avro schemas for the respective copybooks.
  • Created pipe-delimited Hive external Avro tables for source data and Hive Parquet tables for the publish layer (see the sketch after this list).
  • Worked with Sqoop import functionality to handle large data set transfers between the Teradata database and HDFS.
  • Created Java UDFs to handle derived fields while inserting into Hive tables, and Pig scripts to format the data.
  • Worked on joins to create Hive lookup tables.
  • Created Java/Python UDFs and registered them as temporary Hive functions.
  • Created Oozie workflows to schedule and manage list of Hadoop jobs.
  • Involved in Unit testing and Performance Testing. Worked with TQMS team and resolved the defects.
  • Transferred data/files between the local file system/SFTP and HDFS.
  • Created project design documents and performance metrics documents.
  • Worked with the Hadoop shared platform team on pre- and post-production install activities.
  • Managed test data coming from different sources; reviewed peers' Hive table creation, data loading, and queries.
  • Created Autosys JILs to track upstream and downstream dependencies, and also created the Autosys JILs for ETL processing.
  • Supported data analyst in running Pig and Hive queries.
  • Performed Data scrubbing and processing with Oozie.
  • Prepared documentation on the cluster configuration for future reference.
  • Mentored teams on functional knowledge and business processes.
  • Worked on data validation and was involved in pre- and post-deployment support.
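
A minimal sketch of the Avro source to Parquet publish pattern described above, assuming hypothetical database/table names and schema locations; the real Avro schemas were generated from the mainframe copybooks.

    #!/bin/bash
    # Source layer: external table over the NDM-landed data, tied to a copybook-derived Avro schema.
    # Publish layer: Parquet table populated from the Avro source.
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS src_db.acct_avro
      STORED AS AVRO
      LOCATION '/data/source/acct'
      TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/acct.avsc');

      CREATE TABLE IF NOT EXISTS pub_db.acct_parquet
      STORED AS PARQUET
      AS SELECT * FROM src_db.acct_avro;
      "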

Environment: Cloudera Hadoop Cluster, HDFS, HIVE, PIG, Sqoop, Oozie, UNIX shell scripting, Teradata SQL Assistant, Toad, SuperPutty, Eclipse, UNIX, Version One, SVN.

Confidential

Responsibilities:

  • For a particular source feed, traced data movement at the data element level through each control point, from source to target, and updated each data element's lineage in an Excel template.
  • Analyzed the Hive and Pig scripts and identified, for each element, whether the field was derived or a straight move from source to the publish layer.
  • Created Hive external tables and loaded the data lineage Excel template into them.
  • Created Sqoop export scripts to move the data back from HDFS to the RDBMS for the business analytics and reporting teams (see the sketch after this list).
  • Created Oozie workflows to schedule and manage list of Hadoop jobs.
  • Worked with the audit team and provided all the data lineage details required for the audit.
  • Generated the Data lineage graph/flow from ISDL tool by providing the data feed and data element.
  • Provided knowledge transfer to team members.
  • Actively participated in Agile scrum meetings.
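
A minimal sketch of the Sqoop export step mentioned above, assuming a hypothetical PostgreSQL target, table, and HDFS paths; the actual connection details belonged to the reporting database.

    #!/bin/bash
    # Push the curated lineage data from HDFS back to the reporting RDBMS.
    sqoop export \
      --connect jdbc:postgresql://reportdb.example.com:5432/lineage \
      --username report_user \
      --password-file /user/etl_user/.report_pwd \
      --table data_lineage \
      --export-dir /data/publish/data_lineage \
      --input-fields-terminated-by '|' \
      --num-mappers 4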

Environment: Cloudera Hadoop Cluster, HDFS, HIVE, Sqoop, Oozie, SVN, Teradata, Tectica, SuperPutty.

Confidential, Bentonville, AR

Hadoop Developer

Responsibilities:

  • Followed the Scrum implementation of the Scaled Agile methodology for the entire project.
  • Attended daily stand-up meetings.
  • Managed the offshore team for development and testing.
  • Prepared the technical specification document to capture the technical requirements.
  • Prepared the test plan and test script documents.
  • Prepared technical design documents and detailed design documents.
  • Worked with data scientists to gather requirements for various data mining projects.
  • Transferred and loaded datasets from different retail channels (Confidential, Confidential.com, Sam’s, and Sams.com).
  • Developed an Oozie scheduler to import data from Teradata to HDFS using Sqoop.
  • Created Hive queries and performed aggregations on Hive data.
  • Loaded Hive data into the Greenplum database using the gpload utility for real-time aggregation.
  • Processed customer scan/visit data (>40 TB) for Confidential/Sam’s B&M and the related .com channels using Hadoop/Hive, then pushed the processed data into Greenplum for online visualization and reporting to compare customer visits/sales for the current year against the past two years (by state/department/week/month).
  • Transferred data/files between the local file system/SFTP and HDFS.
  • Worked in the Greenplum database using the pgAdmin III tool.
  • Created Hive partitioned managed tables for each incremental load.
  • Extracted Informix weather forecast data using Sqoop and loaded it into Hive tables.
  • Developed a generic Sqoop export utility to export data from Hive to different RDBMS targets such as Teradata, DB2, PostgreSQL (Greenplum), and SAP HANA.
  • Developed a generic Sqoop import utility to load data from various RDBMS sources such as Teradata, DB2, and Greenplum; imported Confidential Teradata tables into Hadoop using TPT export.
  • Analyzed large data sets by running Hive query scripts.
  • Created Hive tables and loaded and analyzed data using Hive queries.
  • Developed Hive scripts passing dynamic parameters using hivevar (see the sketch after this list).
  • Worked with Oozie workflows and job scheduling.
  • Used SFTP to load data from external sources to the UNIX box and then loaded it into HDFS.
  • Created partitioned tables in Hive for best performance and faster querying.
  • Developed multiple MapReduce jobs in java for data cleaning and preprocessing.
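
A minimal sketch of the hivevar parameter passing mentioned above, assuming hypothetical database, table, and column names.

    #!/bin/bash
    # The wrapper resolves the load date at run time and hands it to Hive as a hivevar.
    LOAD_DATE=$(date +%Y-%m-%d)

    hive --hivevar load_date="${LOAD_DATE}" --hivevar src_db=retail_stage -e '
      INSERT OVERWRITE TABLE retail_pub.visit_agg
        PARTITION (visit_date="${hivevar:load_date}")
      SELECT store_nbr, COUNT(*) AS visits
      FROM ${hivevar:src_db}.visit_scan
      WHERE visit_date = "${hivevar:load_date}"
      GROUP BY store_nbr;
    '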

Environment: Pivotal Hadoop Cluster, HDFS, HIVE, PIG, MapReduce, Sqoop, Oozie, Spark, UNIX shell scripting, Greenplum Database, pgAdmin III, SAP HANA Studio, Teradata SQL Assistant, SuperPutty, Eclipse, UNIX, Version One.

Confidential, Charlotte, NC

Technical Delivery Lead

Responsibilities:

  • Transformed business requirements into technical specifications.
  • Created combined high-level and low-level design documents (HLD/LLD).
  • Traced all requirements through the software development life cycle.
  • Analyzed and reviewed design documents.
  • Implemented code for the proposed changes.
  • Prepared the design document, unit test plan, and respective review documents.
  • Analyzed and reviewed COBOL programs and gave suggestions on code performance and quality-related issues.
  • Tested various scenarios in the existing systems to determine the impact and presented the outcome/ challenges. Reviewed SIT scripts.
  • Ran SIT cycles for development and regression projects.
  • Corresponded with the client for clarifications and status reporting.
  • Performed pre- and post-validation of component moves to production.
  • Resolved the queries raised by offshore team and reviewed the work done by them.
  • Acted as quality coordinator to ensure all deliverables adhered to standards with proper documentation.
  • Assured the quality of project deliverables.
  • Developed several tools in JCL, REXX, COBOL and EASYTRIEVE to automate testing.
  • Involved in pre-install activities: audit, casting the Endeavor package/freezing the Changeman package, approvals, and preparing the PIW and Go/No-Go documents.

Environment: IBM Mainframe z/OS, COBOL, IMS DB and Application Programming, JCL, EASYTRIEVE, SPUFI, Endeavor, IBM File Manager, NDM, CA7, SORT.

Confidential

Technical Analyst/Lead

Responsibilities:

  • Analyzed the existing programs.
  • Coded program and JCL/PROC changes and performed unit testing.
  • Performed system integration and performance testing.
  • Analyzed and resolved tickets for statements and letters escalated by the business.
  • Performed system testing for the various releases as specified by the client.
  • Tested and implemented proposed changes.
  • Analyzed and coded JCL programs per requirements.
  • Migrated test data from production.
  • Formed SQL queries to extract valid data.
  • Involved in and suggested changes to batch scheduling.
  • Tracked and fixed abends.

Environment: JCL, VSAM, REXX, Endeavor, TSO/ISPF, Jobtrac and FILE-AID.
