
Senior Hadoop Developer Resume


PA

SUMMARY:

  • Overall 8+ years of professional IT experience as a software developer, including 6+ years as a Senior Hadoop Developer, in software development and support, with experience in developing strategic methods for deploying big data technologies to efficiently solve Big Data processing requirements.
  • Solid expertise in the workings of Hadoop internals, architecture and supporting ecosystem components like Hive, Spark, Sqoop, Pig and Oozie.
  • Apart from developing on the Hadoop ecosystem, also have good experience in installing and configuring Cloudera's distribution (CDH 3, 4 and 5), the Hortonworks distribution (HDP 2.1 and 2.2) and IBM BigInsights (2.1.2 and 3.0.1).
  • Good experience setting up and configuring Hadoop clusters on Amazon Web Services (EC2) nodes running CentOS 5.4, 6.3 and RHEL.
  • Adept at HiveQL, with good experience using time-based partitioning, dynamic partitioning and bucketing to optimize Hive queries; also used Hive's MapJoin to speed up queries where possible.
  • Used Hive to create tables in both delimited text storage format and binary storage format.
  • Have excellent working experience with the two popular Hadoop binary storage formats, Avro data files and SequenceFiles.
  • Also have experience developing Hive UDAFs to apply custom aggregation logic.
  • Created Pig Latin scripts made up of a series of operations and transformations applied to the input data to produce the required output.
  • Experience with Cloudera Navigator and Unravel Data for auditing Hadoop access.
  • Hands on experience on Hortonworks and Cloudera Hadoop environments.
  • Good experience with the range of Pig function types, such as Eval, Filter, Load and Store functions.
  • Good working experience using Sqoop to import data into HDFS from RDBMSs and vice versa; also have good experience using Sqoop direct mode with external tables to perform very fast data loads.
  • Good experience on Linux shell scripting.
  • Involved in ingesting data into HDFS using Apache NiFi.
  • Experience in design and development of ETL processes using Apache NiFi.
  • Good knowledge of ETL tools like DataStage and Informatica.
  • Used the Oozie workflow engine to create workflow and coordinator jobs that schedule and execute various Hadoop jobs such as MapReduce, Hive, Pig and Sqoop operations.
  • Experienced in developing Java MapReduce programs on Apache Hadoop for analyzing data as per requirements (a minimal sketch appears after this list).
  • Solid experience writing complex SQL queries; also experienced in working with NoSQL databases like HBase.
  • Experienced in creative and effective front-end development using JSP, JavaScript, HTML5, DHTML, XHTML, Ajax and CSS.
  • Working knowledge of databases such as Oracle 8i/9i/10g.
  • Have extensive experience in building and deploying applications on web/application servers like WebLogic, WebSphere and Tomcat.
  • Experience in building, deploying and integrating with Ant and Maven.
  • Experience in processing Hive table data using Spark.
  • Good knowledge of analytical tools like Datameer, Tableau and R.
  • Experience in development of logging standards and mechanisms based on Log4j.
  • Strong work ethic with desire to succeed and make significant contributions to the organization
  • Complementing my technical skills are my solid communication skills.
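
A minimal sketch of the kind of Java MapReduce program referenced above (the class name, token-count logic and input/output paths are hypothetical; Hadoop 2.x MapReduce API assumed):

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical word-count style job: tokenizes input records and counts occurrences.
public class EventCount {

  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);           // emit (token, 1) for each token
      }
    }
  }

  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();                      // aggregate counts per token
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "event count");
    job.setJarByClass(EventCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(SumReducer.class);  // combiner reuses the reducer logic
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}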

PROFESSIONAL EXPERIENCE:

Confidential, PA

Senior Hadoop Developer

Responsibilities:

  • Served as Technical Lead, Business Analyst and Hadoop Developer.
  • Evaluated business requirements and prepared detailed specifications, following project guidelines, for the programs to be developed.
  • Involved in managing nodes on the Hadoop cluster and monitoring Hadoop cluster job performance using Cloudera Manager.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Developed simple to complex MapReduce jobs using Hive to cleanse and load downstream data.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data from Hive tables to SQL databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Hive for data cleansing.
  • Created partitioned tables in Hive; managed and reviewed Hadoop log files.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Wrote Spark programs in Python for data quality checks (an analogous sketch using the Spark Java API appears after this list).
  • Used Unix bash scripts to validate files moved from the Unix file system to HDFS.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data and managed data coming from different sources.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Python.
  • Worked on Informatica Power Center tools- Designer, Repository Manager, Workflow Manager and Workflow Monitor.
  • Parsed high-level design specification to simple ETL coding and mapping standards.
  • Designed and customized data models for a data warehouse supporting data from multiple sources in real time.
  • Involved in building the ETL architecture and Source to Target mapping to load data into Data warehouse.
  • Created mapping documents to outline data flow from sources to targets.
  • Involved in Dimensional modeling (Star Schema) of the Data warehouse and used Erwin to design the business process, dimensions and measured facts.
  • Extracted data from flat files and other RDBMS databases into the staging area and populated it into the data warehouse.
  • Maintained source definitions, transformation rules and target definitions using Informatica Repository Manager.
  • Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings in the Informatica Designer.
  • Developed mapping parameters and variables to support SQL override.
  • Created mapplets to use them in different mappings.
  • Developed mappings to load into staging tables and then to Dimensions and Facts.
  • Used existing ETL standards to develop these mappings.
  • Worked on different workflow tasks such as Session, Event-Raise, Event-Wait, Decision, E-mail, Command, Worklet, Assignment and Timer, as well as workflow scheduling.
  • Created sessions and configured workflows to extract data from various sources, transform the data and load it into the data warehouse.
  • Used Type 1 and Type 2 SCD mappings to update Slowly Changing Dimension tables.
  • Extensively used SQL*Loader to load data from flat files into database tables in Oracle.
  • Modified existing mappings for enhancements of new business requirements.
  • Used the Debugger to test the mappings and fix bugs.
  • Wrote UNIX shell scripts and pmcmd commands to FTP files from remote servers and back up the repository and folders.
  • Involved in Performance tuning at source, target, mappings, sessions, and system levels.
  • Prepared migration document to move the mappings from development to testing and then to production repositories.
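
A minimal sketch of the kind of data quality check described above; the project used Python, but an analogous check with the Spark Java Dataset API might look like this (the path, delimiter and column names are hypothetical):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

// Hypothetical data quality check: counts rows with missing keys before loading downstream.
public class CustomerFileCheck {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("customer-file-check")
        .getOrCreate();

    // Assumed input: a pipe-delimited extract already landed in HDFS.
    Dataset<Row> customers = spark.read()
        .option("header", "true")
        .option("delimiter", "|")
        .csv("hdfs:///staging/customers/");   // hypothetical path

    long total = customers.count();
    long missingKeys = customers.filter(col("customer_id").isNull()).count();

    System.out.println("rows=" + total + ", rows_missing_customer_id=" + missingKeys);

    // Fail the job if any record is missing its key, so the workflow can halt the load.
    if (missingKeys > 0) {
      spark.stop();
      System.exit(1);
    }
    spark.stop();
  }
}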

Confidential, Texas

Senior Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Wrote multiple MapReduce programs in Java for data analysis.
  • Wrote MapReduce jobs using Pig Latin and the Java API.
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
  • Developed Pig scripts for analyzing large data sets in Confidential.
  • Collected the logs from the physical machines and the OpenStack controller and integrated into Confidential using Flume.
  • Designed and presented a plan for Confidential on Impala.
  • Created Hive tables using Apache NiFi and loaded the data into the tables using Hive Query Language.
  • Clear understanding of Cloudera Manager Enterprise Edition.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Experienced in migrating Hive QL into Impala to minimize query response time.
  • Knowledge of handling Hive queries using Spark SQL integrated with the Spark environment.
  • Worked on generating Java and Groovy code to process XML, XSD, CSV and JSON data and incorporated it into NiFi processors to create Hive tables.
  • Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
  • Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
  • Worked on SequenceFiles, RCFiles, map-side joins, bucketing and partitioning for Hive performance enhancement and storage improvement.
  • Performed streaming of data into Apache Ignite by setting up caches for efficient data analysis.
  • Responsible for performing extensive data validation using Hive.
  • Created Sqoop jobs and Pig and Hive scripts for data ingestion from relational databases to compare with historical data.
  • Used Kafka to load data into Confidential and move data into NoSQL databases (Cassandra).
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Involved in submitting and tracking MapReduce jobs using the JobTracker.
  • Involved in creating Oozie workflow and Coordinator jobs to kick off the jobs on time for data availability.
  • Used Pig as an ETL tool to perform transformations, event joins, filtering and some pre-aggregations.
  • Responsible for cleansing the data from source systems using Ab Initio components such as Join, Dedup Sorted, Denormalize, Normalize, Reformat, Filter by Expression and Rollup.
  • Implemented business logic by writing Pig UDFs in Java (see the sketch after this list) and used various UDFs from Piggybank and other sources.
  • Implemented Hive generic UDFs to apply business logic.
  • Implemented test scripts to support test driven development and continuous integration.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
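
A minimal sketch of the kind of Pig UDF in Java mentioned above (the class name and the normalization rule are hypothetical):

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig EvalFunc: trims and upper-cases a string field so joins
// against reference data are not broken by inconsistent casing.
public class NormalizeCode extends EvalFunc<String> {
  @Override
  public String exec(Tuple input) throws IOException {
    if (input == null || input.size() == 0 || input.get(0) == null) {
      return null;                       // pass nulls through untouched
    }
    return input.get(0).toString().trim().toUpperCase();
  }
}

In a Pig script this would be registered from its jar with REGISTER 'udfs.jar'; and then invoked as NormalizeCode(field) inside a FOREACH ... GENERATE statement (the jar and field names are hypothetical).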

Confidential, IL

Hadoop Developer (Hadoop, MapReduce, Talend, HiveQL, Oracle, Cloudera, HDFS, Hive, HBase, Java, Tableau, Pig, Sqoop, UNIX, Spark, Scala, JSON, AWS)

Responsibilities:

  • Imported and exported data between HDFS and relational databases using Sqoop.
  • Created Data Lake as a Data Management Platform for Hadoop.
  • Used Amazon Web Services (AWS) for storage and processing of data in the cloud.
  • Used Talend and DMX-h to extract data from other sources into HDFS and transform the data.
  • Involved in creating workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Used Apache Kafka for streaming (a brief producer sketch appears after this list).
  • Involved in developing shell scripts and automated data management from end-to-end integration work.
  • Developed a predictive analytics product using Apache Spark and SQL/HiveQL.
  • Used Apache NiFi to check that the data arriving on the Hadoop cluster is good data without any nulls in it.
  • Moved data in and out of the Hadoop file system using Talend Big Data components.
  • Designed, implemented the data flow and data transformation in Cloudera Enterprise Data Lake.
  • Developed Map Reduce program for parsing and loading into HDFS.
  • Built reusable Hive UDF libraries for business requirements, which enabled users to use these UDFs in Hive querying.
  • Automated and scheduled the Sqoop jobs in a timely manner using Unix shell scripts.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive and Pig.
  • Worked with JSON and XML file formats.
  • Used HBase and NoSQL databases to store the majority of data, which needs to be divided based on region.
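
A brief sketch of the kind of Kafka producer used for the streaming work above (the broker address, topic name and payload are hypothetical):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical producer: publishes one JSON event to a Kafka topic
// that downstream consumers on the Hadoop cluster read from.
public class EventProducer {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092");                 // hypothetical broker
    props.put("acks", "all");
    props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");

    try (Producer<String, String> producer = new KafkaProducer<>(props)) {
      String event = "{\"orderId\": \"A-1001\", \"status\": \"SHIPPED\"}"; // sample payload
      producer.send(new ProducerRecord<>("order-events", "A-1001", event));
      producer.flush();
    }
  }
}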

Confidential

Java Application Developer

Responsibilities:

  • Analyzed and reviewed client requirements and design
  • Well-developed skills in testing, debugging and troubleshooting all types of technical issues.
  • Implemented MVC architecture using the Spring Framework; coding involved writing action classes, custom tag libraries and JSPs.
  • Good knowledge of OOP concepts, OOAD and UML.
  • Used JDBC for database connectivity and manipulation
  • Used Eclipse for the Development, Testing and Debugging of the application.
  • Used a DOM parser to parse the XML files (see the sketch after this list).
  • Used the Log4j framework for logging debug, info and error data.
  • Used WinSCP to transfer files from the local system to other systems.
  • Performed Test Driven Development (TDD) using JUnit.
  • Used a profiler for performance tuning.
  • Built the application using Maven and deployed it on WebSphere Application Server.
  • Gathered and collected information from various programs, analyzed time requirements and prepared documentation to change existing programs.
  • Used SOAP for exchanging XML based messages.
  • Used Microsoft VISIO for developing Use Case Diagrams, Sequence Diagrams and Class Diagrams in the design phase.
  • Developed Custom Tags to simplify the JSP code.
  • Designed UI screens using JSP and HTML.
  • Actively involved in designing and implementing Factory method, Singleton, MVC and Data Access Object design patterns.
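
A minimal sketch of the kind of DOM parsing mentioned above (the file name, element names and printed fields are hypothetical):

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical DOM parsing: reads an orders.xml file and prints one field per record.
public class OrderXmlReader {
  public static void main(String[] args) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(new File("orders.xml"));   // hypothetical file
    doc.getDocumentElement().normalize();

    NodeList orders = doc.getElementsByTagName("order");    // hypothetical element name
    for (int i = 0; i < orders.getLength(); i++) {
      Element order = (Element) orders.item(i);
      String id = order.getAttribute("id");
      String amount = order.getElementsByTagName("amount").item(0).getTextContent();
      System.out.println("order " + id + " amount " + amount);
    }
  }
}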

Confidential

Java Developer

Responsibilities:

  • Member of application development team at Vsoft.
  • Implemented the presentation layer with HTML, CSS and JavaScript
  • Developed web components using JSP, Servlets and JDBC
  • Implemented secured cookies using Servlets.
  • Wrote complex SQL queries and stored procedures.
  • Implemented the persistence layer using the Hibernate API.
  • Implemented transaction and session handling using Hibernate utilities.
  • Implemented search queries using the Hibernate Criteria interface (see the sketch after this list).
  • Provided support for loans reports for CB&T
  • Designed and developed Loans reports for Evans Bank using Jasper and iReport.
  • Involved in fixing bugs and unit testing with test cases using JUnit.
  • Resolved issues on outages for Loans reports.
  • Maintained Jasper server on client server and resolved issues.
  • Actively involved in system testing.
  • Fine-tuned SQL queries for maximum efficiency to improve performance.
  • Designed tables and indexes following normalization rules.
  • Involved in Unit testing, Integration testing and User Acceptance testing.
  • Utilized Java and SQL day to day to debug and fix issues with client processes.
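
A minimal sketch of the kind of Criteria query described above, using the classic Hibernate 3.x/4.x Criteria API; the Loan entity, its properties and the query conditions are hypothetical:

import java.util.List;
import org.hibernate.Criteria;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.criterion.Order;
import org.hibernate.criterion.Restrictions;

// Hypothetical Criteria query: fetches active loans above a threshold for a report.
public class LoanQueries {

  private final SessionFactory sessionFactory;

  public LoanQueries(SessionFactory sessionFactory) {
    this.sessionFactory = sessionFactory;
  }

  @SuppressWarnings("unchecked")
  public List<Loan> findActiveLoansAbove(double minAmount) {
    Session session = sessionFactory.openSession();
    try {
      Criteria criteria = session.createCriteria(Loan.class);   // Loan is a hypothetical mapped entity
      criteria.add(Restrictions.eq("status", "ACTIVE"));
      criteria.add(Restrictions.ge("amount", minAmount));
      criteria.addOrder(Order.desc("amount"));
      return criteria.list();
    } finally {
      session.close();
    }
  }
}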
