Hadoop Developer Resume
Minneapolis
SUMMARY
- 8 years of experience in Information Technology, including 4 years in the Hadoop ecosystem.
- Strong working knowledge of Hadoop, HDFS, MapReduce, and ecosystem components such as Hive, Pig, Sqoop, Oozie, YARN, and NoSQL databases.
- In-depth understanding of Hadoop architecture and its components (HDFS, JobTracker, TaskTracker, NameNode, DataNode), with experience writing MapReduce programs on Apache Hadoop to analyze large data sets efficiently.
- Hands-on experience writing MapReduce programs in Java.
- Implemented workflows in Oozie using Sqoop, MapReduce, Hive and other Java and Shell actions.
- Expertise in developing solutions around SQL and NoSQL databases such as HBase.
- Experience in Data Analysis, Data Cleansing (Scrubbing), Data Validation and Verification, Data Conversion, Data Migrations and Data Mining.
- Captured error logs and test results during payload testing.
- Good experience working with compressed files and related formats.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Performed peer reviews and managed coding standards and code quality.
- Experience importing and exporting terabytes of data between HDFS and relational database systems using Sqoop.
- Experience in managing and reviewing Hadoop Log files.
- Experience in Core Java, Eclipse, Maven, MySQL, and SQL Server.
- Hands-on experience working with relational databases such as Oracle 10g and DB2.
- Experience in creating test reports, bug status, and coordinating with the teams on bug tracking.
- Experience in knowledge transfer and imparting training to the end users.
- Involved in iterative life cycles using Agile development (Scrum).
- Experience across the full SDLC: analysis, design, coding, unit testing, SIT, UAT, and implementation.
- Strong design skills, with hands-on experience on complex designs.
- Experience in handling various projects like Development, Maintenance and Production Support.
- Maintaining quality in every phase of the SDLC to exceed customer expectations.
- Documentation skills: preparing and reviewing project documents such as analysis and design specifications, unit test cases and results, and requirements traceability matrices.
- Good communication, presentation, and interpersonal skills with excellent problem-solving ability.
TECHNICAL SKILLS
Hadoop Ecosystem & Tools: Hadoop 2.6, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Oozie, ZooKeeper 3.4.8, YARN
Programming Languages: Core Java, Pig Latin, SQL, HiveQL, UNIX Shell Scripting, COBOL, JCL, CICS
Project Methodologies: Agile, Waterfall
APIs/Tools: Ubuntu, Cloudera, PuTTY, Hortonworks Sandbox, Eclipse IDE, Maven, CSS, HTML 4/5
Operating Systems: Linux, z/OS, Windows
Databases: Oracle 10g, MySQL, DB2
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential, Minneapolis
Responsibilities:
- Experienced with requirements gathering, project planning, architectural solution design, and the development process in an Agile environment.
- Worked on data migration from external servers to HDFS using Sqoop.
- Wrote Sqoop scripts to migrate data from MySQL into the HDFS cluster (see the Sqoop sketch after this role).
- Wrote shell scripts, run as cron jobs, to automate data migration from external servers and FTP sites.
- Created partitioned Hive tables and loaded the data into them (see the Hive sketch after this role).
- Optimized Hive queries to improve their performance.
- Developed Oozie workflows to automate the loading process (see the Oozie sketch after this role).
- Developed a Hive unit-testing framework.
- Wrote several test cases and performed Hive unit and integration testing against them.
- Responsible for building scalable distributed data solutions using the Hadoop ecosystem.
- Responsible for writing MapReduce jobs to handle multiple types of files (JSON, Text, XML)
- Wrote Pig UDFs to perform data cleansing and transformation for ETL activities.
- Wrote Hive UDFs for data analysis and Hive table loads.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest data into HDFS for analysis; extensively used combiners to improve the performance of MapReduce jobs.
- Created MapReduce jobs to parse raw weblog data into delimited records.
- Created partitioned tables in Hive for better performance and faster querying.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.
- Developed Pig scripts to pull data from HDFS.
- Performed extensive data analysis using Hive and Pig.
- Performed data scrubbing and processing with Oozie.
- Responsible for managing data coming from different sources.
- Used data serialization formats (Avro, JSON, CSV) to convert complex objects into sequences of bytes.
Environment: Hadoop Framework, CDH 5.2, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Java (JDK 1.6), UNIX Shell Scripting, Windows 7, Linux (CentOS 6.4), Eclipse
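A minimal sketch of the kind of cron-driven Sqoop import script referenced above; the connection string, credentials, table, and paths are illustrative placeholders, not the actual project values:

```sh
#!/bin/sh
# Nightly import of one MySQL table into a dated HDFS directory.
# Scheduled via cron, e.g.: 0 2 * * * /home/etl/import_orders.sh
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user \
  --password-file /user/etl/.mysql.pwd \
  --table orders \
  --target-dir /data/raw/orders/$(date +%F) \
  --num-mappers 4 \
  --fields-terminated-by '\t'
```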
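A sketch of the partitioned Hive tables mentioned above, assuming a hypothetical weblog schema and HDFS path; partitioning by date lets queries prune to the partitions they need instead of scanning the whole table:

```sh
# Create a date-partitioned Hive table and load one day of parsed weblogs.
hive -e "
CREATE TABLE IF NOT EXISTS weblogs (
  ip      STRING,
  request STRING,
  status  INT
)
PARTITIONED BY (log_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

LOAD DATA INPATH '/data/processed/weblogs/2015-01-01'
INTO TABLE weblogs PARTITION (log_date = '2015-01-01');
"
```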
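And a sketch of submitting an Oozie workflow like the loading workflow described above; the host names and application path are assumptions for illustration:

```sh
# Submit and run a workflow whose definition lives in HDFS.
#
# job.properties (illustrative):
#   nameNode=hdfs://namenode:8020
#   jobTracker=resourcemanager:8032
#   oozie.wf.application.path=${nameNode}/user/etl/workflows/load-weblogs
oozie job -oozie http://oozie-host:11000/oozie \
  -config job.properties -run
```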
Hadoop Developer
Confidential, Phoenix
Responsibilities:
- Imported and exported data into HDFS and Hive from relational data sources such as Oracle using Sqoop.
- Worked on debugging and performance tuning of Hive and Pig jobs.
- Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Loaded data from HDFS into HBase using bulk loading (see the bulk-load sketch after this role).
- Developed Pig scripts for data transformations, joins, and pre-aggregations on raw datasets before storing the data in Hive tables.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Developed MapReduce jobs in Java for data cleaning, pre-processing, and data joins.
- Implemented several generic UDFs to encapsulate business logic.
- Developed simple to complex MapReduce jobs.
- Implemented partitioning, dynamic partitions, and bucketing in Hive (see the Hive sketch after this role).
- Wrote HiveQL and Pig Latin statements for effective retrieval and storage of data between HDFS and the database.
- Ran Hadoop jobs to process millions of records of text data.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Analyzed the existing flow against the functional requirements and designed the data model.
- Collected and aggregated large amounts of log data using Apache Flume and staged it in HDFS for further analysis (see the Flume sketch after this role).
- Created HBase tables to store PII data arriving in various formats from different portfolios.
- Implemented test scripts to support test driven development and continuous integration
- Processed unstructured data using Pig and Hive.
- Gained experience managing and reviewing Hadoop log files.
Environment: Hadoop, HDFS, Pig, Hive, Java, Linux, Sqoop, Oozie, HBase, Shell Scripting, SQL Server, Ubuntu, Cloudera
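A sketch of the HBase bulk-loading step mentioned above, using the standard ImportTsv and LoadIncrementalHFiles tools; the table name, column family, and paths are hypothetical:

```sh
# Step 1: run ImportTsv to write HFiles instead of issuing puts directly.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:name,cf:account \
  -Dimporttsv.bulk.output=/tmp/pii_hfiles \
  pii_table /data/staged/pii

# Step 2: hand the finished HFiles to the table's region servers.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  /tmp/pii_hfiles pii_table
```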
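A sketch of Hive dynamic partitioning and bucketing as described above; the table and column names are made up for illustration. Dynamic partitioning routes each row to its partition from the last column of the SELECT, and bucketing by a key spreads rows across a fixed number of files for faster sampling and joins:

```sh
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.enforce.bucketing=true;

CREATE TABLE IF NOT EXISTS txns (
  txn_id STRING,
  amount DOUBLE
)
PARTITIONED BY (txn_date STRING)
CLUSTERED BY (txn_id) INTO 16 BUCKETS;

-- txn_date, the final SELECT column, picks each row's partition.
INSERT OVERWRITE TABLE txns PARTITION (txn_date)
SELECT txn_id, amount, txn_date FROM txns_staging;
"
```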
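And a sketch of a Flume agent of the kind used for the log collection above; the agent name, spool directory, and HDFS path are assumptions:

```sh
# agent.conf (illustrative): drain a spool directory of log files into HDFS.
# a1.sources  = r1
# a1.channels = c1
# a1.sinks    = k1
# a1.sources.r1.type     = spooldir
# a1.sources.r1.spoolDir = /var/log/app
# a1.sources.r1.channels = c1
# a1.channels.c1.type = memory
# a1.sinks.k1.type                   = hdfs
# a1.sinks.k1.hdfs.path              = /data/staged/logs/%Y-%m-%d
# a1.sinks.k1.hdfs.fileType          = DataStream
# a1.sinks.k1.hdfs.useLocalTimeStamp = true
# a1.sinks.k1.channel                = c1
flume-ng agent --name a1 --conf ./conf --conf-file agent.conf
```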
Mainframe Developer
Confidential
Responsibilities:
- Receiving the business requirements from the client/business team.
- Requirements Analysis.
- Technical design Document preparation and review.
- Build & Unit Testing.
- Preparing high-level estimates (HLE) based on the identified impacts, detailing the changes required to meet the requirements.
- Prepare the functional and Detailed Design specification.
- Design and development of programs and jobs using JCL, COBOL, DB2, VSAM, CICS, Stored Procedures, Jobtrack.
- Involved in preparing test plans for Independent Unit testing and Work package testing.
- Testing the Applications for the business requirement and responsible for delivery from offshore.
- Involved in System integration and User Acceptance testing.
- Responsible for correct versioning of code by creating and moving packages using ChangeMan.
- Preparing the detailed implementation plan to manage and monitor the implementation activities.
- Also involved in monitoring the batch cycles.
- Reviewing programs for QA.
- Coordinating with various teams across the globe on various projects and their phases.
- Coordinating with DBA teams as and when necessary.
- Implementation & Release Support.
- Task allocation for resources and monitoring.
- Mentoring the team.
Mainframe Developer
Confidential
Responsibilities:
- Coding and unit testing of the code.
- Peer Reviewing of programs.
- Analyzing the data in the live region.
- Fixing the data, sending the response to the user, and updating INFRA accordingly.
- Monitoring the batch.
- Providing On-call support.
- Resolving tickets as per SLA.