Sr. Hadoop Developer Resume
Bloomington, MN
SUMMARY
- Overall 8+ years of experience in the analysis, design, development, and implementation of web-based distributed applications.
- 4+ years' experience in Big Data technologies as a Hadoop Developer, with strong expertise in HDFS, Hive, Impala, Sqoop, Cassandra, ParAccel, Pig, MapReduce, HBase, Flume, Greenplum, and Bedrock Workflow, and hands-on experience building optimized solutions with these components.
- Proficient in Hadoop and its ecosystem as well as Java/J2EE technologies.
- Experience interfacing front-end applications written in Java, JSP, Struts, WebWork, Spring, JSF, Hibernate, web services, and EJB with WebSphere Application Server and JBoss.
- Involved in development and enhancement projects on Hortonworks HDP 1.3 and 2.1.4, Cloudera, and MapR distributions, covering Hadoop ecosystem components such as HDFS, MapReduce, Hive, Impala, Sqoop, and Flume, NoSQL databases (Cassandra, HBase), and the analytical database ParAccel; good knowledge of Pig.
- Experience in NoSQL databases such as MongoDB, HBase, and Cassandra.
- Experience with Hadoop and Spark clusters and stream processing using Spark Streaming.
- Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience in planning, designing, deploying, fine-tuning, and administering large-scale production Hadoop clusters.
- Good experience extracting and generating statistical analyses with the business intelligence tool Tableau for better analysis of data.
- Worked on implementing Spark for fast processing and creating reports in Tableau for campaign management.
- Experience creating complex SQL queries, tuning SQL, and writing PL/SQL blocks such as stored procedures, functions, cursors, indexes, triggers, and packages.
- Very good knowledge of, and hands-on experience with, Netezza and Spark (on YARN).
- Good knowledge of the distributed coordination service ZooKeeper and the search platform Solr.
- Exposure to Cloudera development environment and management using Cloudera Manager.
- Expertise in all major phases of the SDLC, including design, development, deployment, implementation, and support.
- Proactively suggested a tactical solution for the Quiet Time Alerts project utilizing a message-broker architecture.
- Strong experience designing message flows, writing complex ESQL scripts, and invoking web services through message flows.
- Designed and developed a batch framework similar to the Spring Batch framework.
- Working experience with Agile and Waterfall models.
- Worked with MySQL, SQL, PL/SQL, triggers, and stored procedures.
- Experience preparing test cases, documenting, and performing unit and integration testing.
- Expertise in cross-platform (PC/Mac, desktop, laptop, tablet) and cross-browser (IE, Chrome, Firefox, Safari) development.
- Skilled in problem solving and troubleshooting, strong organizational and interpersonal skills.
- Possesses a professional and cooperative attitude and an adaptable approach to problem analysis and solution definition.
TECHNICAL SKILLS
Languages: Core Java, Python
Programming Architecture: Map Reduce, PIG
Databases: Cassandra, HBase, MongoDB, Hive, Impala, Greenplum, M7, Oracle, SQL, DB2.
File Systems: HDFS
Tools & Utilities: Spark, Sqoop, Flume, Jira, PuTTY, WinSCP, SQuirreL, Talend.
Primary Skill category: Hadoop - HDFS, Hive, Impala, Pig, HBase, Sqoop, Cassandra, ParAccel, Flume, Bedrock, Greenplum, M7, Tableau, Spark
Sub Skills: Hadoop cluster setup, Hive, Sqoop, ParAccel, Cassandra, shell scripting, Scala.
Project Acquired skills: Hadoop, Hive, MR, ParAccel, Cassandra setup and development, Core Java, Python, IMS, DB2, Adabas, JCL, Teradata, Focus, Easytrieve
Reporting Tools: Crystal Reports, SQL Server Reporting Services and Data Reports, Business Intelligence and Reporting Tool (BIRT)
Languages: Java JDK 1.4/1.5/1.6 (JDK 5/JDK 6), C/C++, MATLAB, R, HTML, SQL, PL/SQL.
Operating Systems: UNIX, Mac, Linux, Windows 2000 / NT / XP / Vista, Android.
PROFESSIONAL EXPERIENCE
Confidential, Bloomington, MN
Sr. Hadoop Developer
Responsibilities:
- Responsible for requirements gathering and preparation of design documents.
- Involved in low-level design for MapReduce, Hive, Impala, and shell scripts to process data.
- Worked on ETL scripts to pull data from DB2/Oracle databases into HDFS.
- Experience in utilizing Spark machine learning techniques implemented in Scala.
- Involved in POC development and unit testing using Spark and Scala.
- Implemented partitioning, dynamic partitions, and buckets in Hive (a hedged HiveQL sketch follows this list).
- Installed and configured Hive, Sqoop, Flume, and Oozie on the Hadoop clusters.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive tables to load data from different sources.
- Tuned the Hadoop clusters and monitored memory usage while pushing data to the NoSQL store (HBase).
- Involved in database schema design.
- Participated in sprint planning and sprint retrospective meetings.
- Attended daily Scrum status meetings.
- Proposed an automated system using shell scripts to run the Sqoop jobs.
- Worked in an Agile development approach.
- Created the estimates and defined the sprint stages.
- Analyzed data using Hive, Pig, and custom MapReduce programs in Java.
- Imported data from mainframe datasets into HDFS using Sqoop; also handled importing data from various sources (Oracle, DB2, Cassandra, and MongoDB) into Hadoop and performed transformations using Hive and MapReduce.
- Used Sqoop and mongodump to move data between MongoDB and HDFS.
- Developed a strategy for full and incremental loads using Sqoop (see the example commands after this list).
- Worked mainly on Hive/Impala queries to categorize data for different claims.
- Excellent understanding and knowledge of NoSQL databases like HBase.
- Generated final reporting data in Tableau for testing by connecting to the corresponding Hive tables through the Hive ODBC connector.
- Monitored system health and logs and responded to any warning or failure conditions.
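The partitioning and bucketing work above follows the standard Hive pattern; a minimal sketch, assuming a hypothetical claims table (table, column, and partition names are illustrative, not from the actual project):

```sql
-- Enable dynamic partitioning and (on older Hive releases) bucketed inserts.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.enforce.bucketing = true;

-- Partition by claim date; bucket by member id for efficient joins and sampling.
CREATE TABLE claims (
  claim_id  BIGINT,
  member_id BIGINT,
  amount    DECIMAL(10,2)
)
PARTITIONED BY (claim_date STRING)
CLUSTERED BY (member_id) INTO 32 BUCKETS
STORED AS ORC;

-- Dynamic-partition insert: Hive routes each row to its claim_date partition.
INSERT OVERWRITE TABLE claims PARTITION (claim_date)
SELECT claim_id, member_id, amount, claim_date
FROM staging_claims;
```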
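The full/incremental load strategy above can be expressed with Sqoop roughly as follows; connection strings, credentials, table names, and directories are placeholders:

```bash
# One-time full load from Oracle into HDFS.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user --password-file /user/etl/.pw \
  --table CLAIMS --target-dir /data/raw/claims -m 4

# Saved incremental job: appends only rows whose key exceeds the stored last-value.
sqoop job --create claims_incr -- import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user --password-file /user/etl/.pw \
  --table CLAIMS --target-dir /data/raw/claims \
  --incremental append --check-column CLAIM_ID --last-value 0

# Each scheduled run picks up where the previous one left off.
sqoop job --exec claims_incr
```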
Environment: CDH5, HDFS, Hive, Impala, Java, Sqoop, Oozie Workflows, Shell Scripts, Spark, Scala, MongoDB, IntelliJ, Gradle, Core Java, JUnit.
Confidential, Santa Clara, CA
Sr. Hadoop Developer
Responsibilities:
- Responsible for requirements gathering and preparation of design documents.
- Involved in low-level design for MapReduce, Hive, and shell scripts to process data.
- Worked on ETL scripts to pull data from DB2/Oracle/MS-SQL databases into HDFS.
- Developed Hive tables to load data from different sources.
- Involved in database schema design.
- Participated in sprint planning and sprint retrospective meetings.
- Attended daily Scrum status meetings.
- Performed performance optimization on Spark/Scala; diagnosed and resolved performance issues.
- Gave sprint demos to the Product Owner for each sprint.
- Developed Bedrock workflows for the CDB, ECODS, BOSS, and SMART data sources.
- Set up a 64-node cluster and configured the entire Hadoop platform.
- Migrated the needed data from Oracle and MySQL into HDFS using Sqoop, and imported various formats of flat files into HDFS.
- Set up MongoDB to store the ever-growing application config entries in JSON format.
- Used MongoDB as a contingency database for the current Oracle clusters.
- Worked with the NoSQL database HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
- Proposed an automated system using shell scripts to run the Sqoop jobs.
- Worked on big data integration and analytics based on Hadoop, Solr, Spark, Kafka, Storm, and webMethods technologies.
- Worked in an Agile development approach.
- Created the estimates and defined the sprint stages.
- Developed a strategy for full and incremental loads using Sqoop.
- Worked mainly on Hive queries to categorize data for different claims.
- Integrated the Hive warehouse with HBase (a hedged storage-handler sketch follows this list).
- Wrote customized Hive UDFs in Java where the required functionality was too complex for built-in functions (see the UDF sketch after this list).
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Generated final reporting data in Tableau for testing by connecting to the corresponding Hive tables through the Hive ODBC connector.
- Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase, and Hive).
- Monitored system health and logs and responded to any warning or failure conditions.
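The Hive-HBase integration mentioned above is typically done with the HBase storage handler; a minimal sketch, with placeholder table and column-family names:

```sql
-- Expose an existing HBase table to Hive queries via the storage handler.
CREATE EXTERNAL TABLE hbase_claims (
  rowkey    STRING,
  member_id STRING,
  amount    STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- :key maps to the HBase row key; cf is the column family.
  'hbase.columns.mapping' = ':key,cf:member_id,cf:amount'
)
TBLPROPERTIES ('hbase.table.name' = 'claims');
```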
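A custom Hive UDF of the kind described above generally extends Hive's UDF base class; a minimal sketch, where the masking logic and class name are illustrative rather than the original code:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Masks all but the last four characters of an ID; Hive calls evaluate() per row.
public final class MaskId extends UDF {
    public Text evaluate(final Text id) {
        if (id == null) return null;  // null in, null out
        String s = id.toString();
        int keep = Math.min(4, s.length());
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length() - keep; i++) out.append('*');
        out.append(s.substring(s.length() - keep));
        return new Text(out.toString());
    }
}
```

Once packaged in a jar, such a function is registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION, after which it can be used in any query.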
Environment: Apache Hadoop, HDFS, Hive, Java, Scala, Spark, Sqoop, MapR, Greenplum, NoSQL, MySQL, Tableau, Bedrock Workflows, Shell Scripts.
Confidential, El Segundo, CA
Hadoop Developer
Responsibilities:
- Responsible for requirements gathering, analyzing data sources such as Omniture, iTunes, and Spotify, and preparing design documents.
- Wrote MapReduce jobs using the Java API.
- Imported and exported data into HDFS and Hive using Sqoop.
- Involved in loading data from the UNIX file system into HDFS.
- Implemented secondary sorting in MapReduce to control the sort order of reducer output.
- Implemented a data pipeline by chaining multiple mappers using ChainMapper (a hedged sketch follows this list).
- Experienced in handling different types of joins in Hive, such as map joins, bucket map joins, and sorted bucket map joins.
- Responsible for designing and creating Hive tables to load data.
- Developed shell scripts for data flow automation and for uploading data into the ParAccel server.
- Worked on Sqoop scripts to pull data from an Oracle database into HDFS.
- Wrote MapReduce programs and Hive UDFs in Java.
- Developed Hive queries for the analysts.
- Created an e-mail notification service that alerts the requesting team upon job completion.
- Strong understanding of the Hadoop ecosystem, including HDFS, MapReduce, HBase, ZooKeeper, Pig, Hadoop Streaming, Sqoop, Oozie, and Hive.
- Defined job workflows according to their dependencies in crontab (example entries follow this list).
- Played a key role in productionizing the application after testing by BI analysts.
- Maintained system integrity of all sub-components related to Hadoop.
- Involved in orchestrating delta generation for time-series data and developed ETL from the ParAccel database.
- Involved in Cassandra database schema design.
- Pushed data to Cassandra databases using a bulk-load utility.
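A ChainMapper pipeline of the kind mentioned above would look roughly like this; a self-contained sketch with hypothetical tokenize/filter mappers, not the project's actual code:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainedPipeline {

    // First stage: split each CSV line into a key/value pair.
    public static class TokenizeMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split(",", 2);
            if (parts.length == 2) ctx.write(new Text(parts[0]), new Text(parts[1]));
        }
    }

    // Second stage: drop records with empty payloads before output.
    public static class FilterMapper extends Mapper<Text, Text, Text, Text> {
        @Override
        protected void map(Text key, Text value, Context ctx)
                throws IOException, InterruptedException {
            if (!value.toString().trim().isEmpty()) ctx.write(key, value);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "chained-pipeline");
        job.setJarByClass(ChainedPipeline.class);

        // Chain the mappers: the first stage's output types must match
        // the second stage's input types.
        ChainMapper.addMapper(job, TokenizeMapper.class,
                LongWritable.class, Text.class, Text.class, Text.class,
                new Configuration(false));
        ChainMapper.addMapper(job, FilterMapper.class,
                Text.class, Text.class, Text.class, Text.class,
                new Configuration(false));

        job.setNumReduceTasks(0); // map-only pipeline for this sketch
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```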
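Crontab-based workflow scheduling as described above amounts to staggered entries whose times encode the dependencies; an illustrative example with placeholder script paths:

```bash
# Extract at 01:00; the dependent Hive categorization runs at 02:30,
# after the extract window has closed.
0 1 * * *  /home/etl/bin/sqoop_extract.sh   >> /var/log/etl/extract.log 2>&1
30 2 * * * /home/etl/bin/hive_categorize.sh >> /var/log/etl/categorize.log 2>&1
```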
Environment: Apache Hadoop, HDFS, Pig, Hive, Sqoop, ParAccel, Cassandra, Shell Scripts, MapReduce, Hortonworks, CRON Jobs, Oracle, MySQL.
Confidential, Cincinnati, OH
Software Programmer
Responsibilities:
- Involved in the analysis, design, and development of desktop and web-based applications.
- Used Rational Rose for use case diagrams, activity diagrams, class diagrams, sequence diagrams, and object diagrams in the design phase.
- Implemented J2EE standards and the MVC2 architecture using the Struts framework.
- Implemented Servlets, JSP, and Ajax to design the user interface.
- Used JSP, JavaScript, HTML5, and CSS for manipulating, validating, and customizing error messages in the user interface.
- Developed the application using the Spring (IoC/MVC/AOP) and Hibernate frameworks.
- Worked on the Spring MVC framework, creating custom JSPs and custom tag libraries to give web pages a rich UI look and feel.
- Developed the presentation layer using HTML, CSS, JavaScript, and jQuery.
- Developed stored procedures and triggers in PL/SQL.
- Used the Hibernate framework in the persistence layer to map the object-oriented domain model to a relational database (Oracle); a hedged DAO sketch follows this list.
- Involved in integrating the business layer with the DAO layer using custom frameworks that internally use Hibernate.
- Created SOAP web services for the SOA to retrieve data from mainframes and Content Manager.
- Involved in the development of SQL, PL/SQL packages, and stored procedures.
- Used XML, XSLT, and XPath to extract data from web service output XML.
- Extensively used JavaScript, jQuery, and Ajax for client-side validation.
- Used Ant scripts to fetch, build, and deploy the application to the development environment.
- Used Log4j for logging and tracing Java code.
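A minimal sketch of the Hibernate-backed DAO layer referenced above, assuming a hypothetical Customer entity and a Spring-injected SessionFactory (illustrative, not the project's code):

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class CustomerDao {
    private final SessionFactory sessionFactory;

    // SessionFactory supplied by the Spring IoC container.
    public CustomerDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // Persists the (hypothetical) Customer entity, which Hibernate maps to Oracle.
    public void save(Customer customer) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(customer);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback(); // undo the transaction on any failure
            throw e;
        } finally {
            session.close();
        }
    }
}
```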
Environment: J2EE, JSP, JDBC, Hibernate, Spring, HTML, JavaScript, jQuery, CSS, Oracle.
Confidential
Software Engineer
Responsibilities:
- Conducted requirements gathering sessions with the business users to collect business requirements (BRDs), data requirements, and user interface requirements.
- Responsible for the initiation, planning, execution, control, and completion of the project.
- Worked alongside the development team in solving critical issues during development.
- Responsible for developing management reporting using the Cognos reporting tool.
- Conducted user interviews and documented reconciliation workflows.
- Conducted detailed analysis of current processes and developed new process flow, data flow, and workflow models and use cases using Rational Rose and MS Visio.
- Prepared use cases, business process models, data flow diagrams, and user interface models.
- Gathered and analyzed requirements for EAuto and designed process flow diagrams.
- Defined business processes related to the project and provided technical direction to development workgroup.
- Analyzed the legacy system and the financial data warehouse.
- Participated in database design sessions and database normalization meetings.
- Managed change requests and defect tracking.
- Managed UAT and developed test strategies and test plans; reviewed QA test plans for appropriate test coverage.
- Coordinated with the build team to deploy the application from integration through functional and regression to production.
- Prepared a skill matrix for all team members.
Environment: Java, J2EE, JCL, DB2, CICS.