Hadoop Developer Resume
Fairfax, VA
SUMMARY
- 8 years of professional experience in IT software and services, covering the design, development and testing of applications in the telecom and insurance domains.
- Expertise in Hadoop ecosystem - HDFS, YARN, Pig, HBase, Spark and Hive for data analysis, Sqoop for data migration, Flume for data ingestion, Oozie for scheduling and Zookeeper for coordinating cluster resources.
- Experience in providing design architecture for Big Data solutions.
- Involved in designing Hive schemas using performance tuning techniques such as partitioning and bucketing.
- Optimized HiveQL/Pig scripts by using execution engines such as Tez and Spark.
- Migrated an EDW (Enterprise Data Warehouse) into Big Data and implemented a Star Schema in Big Data.
- Expertise in implementing HBase schemas with optimized row-key design to avoid hot-spotting.
- Good understanding of Spark SQL, Spark Transformation Engine and Spark Streaming.
- Experience working with Scala.
- Loaded streaming log data from various webservers into HDFS using Flume.
- Exposed HBase tables to web applications with REST web services.
- Used Apache Solr to create full-text searches over more than 9 million rows.
- Experience in integrating Hive server with tools like Tableau, QlikView and Informatica using the ODBC driver.
- Experience in building Pig scripts to extract, transform and load data in different file formats (JSON, TXT, XML) into HDFS, HBase, and Hive for data processing.
- Used SFTP to transfer files to the server.
- Analyzed data using Hive queries and Pig scripts to study customer behavior.
- Developed Pig UDFs to pre-process the data for analysis.
- Good understanding of Java object-oriented concepts and development of multi-tier enterprise web applications.
- Strong problem-solving, communication and interpersonal skills; a good team player.
- Motivated to take independent responsibility, with the ability to contribute as a productive team member.
- Experience with Operating Systems like Windows, Linux, and Macintosh.
- Excellent debugging and troubleshooting skills.
- Self-motivated, with a strong desire to learn.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, YARN, Spark, Pig, Hive, HBase, Sqoop, Flume, Solr, Elasticsearch, Scala, Oozie
Cluster: IBM BigInsights 4.1, Hortonworks, Cloudera CDH4
Languages: Java, C, JSP, Shell Script
Web Technologies: HTML5, CSS3, JavaScript, jQuery, XML, XHTML.
Servers: Putty, WebSphere, WebLogic, JBoss, Apache Tomcat.
Database: MySQL, Oracle
IDEs: Eclipse Mars.1
PROFESSIONAL EXPERIENCE
Confidential, Fairfax, VA
Hadoop Developer
Responsibilities:
- Configured Hadoop components including Hive, Pig, HBase, Spark, Sqoop, Oozie and Hue in the client environment.
- Stored Solr indexes in HDFS.
- Indexed documents in HDFS using Solr Hadoop connectors.
- Managed data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in HDFS.
- Worked on the backend using Scala and Spark to implement aggregation logic.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with HDFS reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantages by defining the job flow in Oozie to automate data loading into the Hadoop Distributed File System, with Pig to pre-process the data.
- Designed the HBase schema to avoid hot-spotting and exposed the data from HBase tables to the UI through a REST API.
- Developed Pig scripts to transform raw data from several data sources into baseline data and loaded the data into HBase tables.
- Involved in creating POCs to ingest and process streaming data using Spark and HDFS.
- Used Flume to collect, aggregate, and store the log data from different web servers.
- Developed Shell Scripts to automate the batch processing and processed the daily jobs through Maestro scheduler.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Coordinated with the offshore team and cross-functional teams to ensure that applications were properly tested, configured, and deployed.
- Used Tableau to visualize data and generate reports.
Environment: CDH 5.7.0 (CentOS): Apache Hadoop 2.7.1, MapReduce, HBase 1.1.2, Pig 0.15.0, Sqoop 1.4.6, Oozie 4.2.0, Java 8, Autosys, Hive 1.2.1, Impala, ZooKeeper 3.4.6, Oracle 11g, PL/SQL, SQL Developer 4.0, UNIX, REST API, REST Web Services, SQL, ANT, Shell Script
Confidential
Hadoop Developer
Responsibilities:
- Participated in brainstorming sessions on finalizing the data ingestion requirements and design.
- Worked on Sqoop to import data from various relational data sources.
- Worked with Flume to bring in clickstream data from front-facing application logs.
- Worked on strategizing Sqoop jobs to parallelize data loads from source systems.
- Participated in providing inputs for design of the ingestion patterns.
- Participated in strategizing loads without impacting front-facing applications.
- Worked on the design of the Hive data store to hold data from various data sources.
- Developed MapReduce jobs that operate on streaming data and tested them with MRUnit.
- Involved in providing inputs to analyst team for functional testing.
- Worked with source system load testing teams to perform loads while ingestion jobs were in progress.
- Worked on performing data standardization using Pig scripts.
- Worked on building analytical data stores for data science team’s model development.
- Worked on the design and development of Oozie workflows to orchestrate Pig and Hive jobs.
- Worked on performance tuning of Hive queries with partitioning and bucketing.
- Worked with the Ambari UI to configure alerts for Hadoop ecosystem components.
- Participated in tuning various components in the Hadoop ecosystem.
Environment: Hadoop 2.2 (Hortonworks) - Pig, MRUnit, MapReduce, Hive, Hive UD TEZ, HDFS, Apache Sqoop, Oozie, HBase, JUnit, ZooKeeper, Maven, Hadoop Data Lake with Linux-CentOS.
Confidential
Hadoop Developer
Responsibilities:
- Involved in data loading strategies, using Sqoop to import and export data between HDFS and relational database systems.
- Involved in designing and creating Hive partitioned tables and external tables.
- Involved in writing Pig scripts for data cleansing.
- Developed Pig Latin scripts using HCatalog to read data from HDFS and load it into Hive tables.
- Developed Hive queries for the data analysts.
- Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig and MR.
- Wrote extensive JUnit test cases for this system to test the application.
- Implemented a logging mechanism using Log4j with the help of the Spring AOP framework.
Environment: Hadoop Cloudera CDH3, MapReduce, Hive, Pig, Sqoop, Oozie, Java, Autosys, HBase, ZooKeeper, Maven, Oracle 11g, SQL Developer 4.1.3, Unix.
Confidential
Sr. Software Engineer
Responsibilities:
- Involved in the review and analysis of Functional Specifications, Requirements Clarification Defects, etc.
- Involved in the analysis and design of the initiatives.
- Involved in the development of the mediation platform using Java.
- Involved in the design and implementation of the migration of the legacy mediation platform to MediationZone.
- Implemented CDR batch processing for MSC, IN, ADSL, and GPRS streams using object-oriented programming.
- Involved in writing JUnit tests for unit testing.
- Involved in Regression Testing of SAT and Development Environments.
- Involved in writing the SQL queries and stored procedures.
- Participated in the test case reviews, and manual testing of the enhancements during Release 1.5.
- Involved in fixing the defects during integration testing.
- Built and deployed the application to dev and testing environments using Ant scripts.
- Participated in code reviews for various initiatives and performed static code analysis to follow best practices for performance and security.
Environment: MediationZone (Digital Route), Legacy Zone, Core Java 6
Confidential
Application Tester
Responsibilities:
- Interpreted product requirements into test requirements and wrote test plans and test cases.
- Modeled the conceptual design with Use Case, UML Class and Activity diagrams in Rational Rose.
- Wrote requirement specific SQL and PL/SQL scripts including Stored Procedures, functions, packages and triggers.
- Made enhancements to the application which presented me with the opportunity to go through the entire SDLC.
- Provided daily updates to the on-site team by call and made enhancements.
- Code coverage and Test case presentation.
- System Test Planning & Execution.
- Analyzed the HLD and Business Requirements.
- Prepared TPI in the test planning phase and the execution phase.
- Prepared the Test Cases and Test Execution Plan.
- Defect tracking, reporting and closure. Raised defects in QC.
- Reviewed the test plan document to ensure that no functionality or requirements were missing.
- Used web service technologies such as SOAP and WSDL to communicate over the internet.
- Automated the tests using the integration tool (SOAP-UI).
Environment: WebMethods, SOAP-UI, Siebel, Citrix, PuTTY, IBM Rational - ClearQuest.
