Sr. Hadoop/Big Data Developer, Scottdale, AZ
OBJECTIVE:
To make the most of my potential and explore new horizons in the field of Big Data development. To apply my blend of 8 years of hands-on experience and creativity, which enables me to perform at my best for many years to come.
SUMMARY:
- Around 8 years of professional IT experience in Analysis, Design, Development, Testing, Documentation, Deployment, Integration, and Maintenance of web-based and Client/Server applications using Java and Big Data technologies (Hadoop and Spark).
- Around 5 years of Big Data architecture experience developing and implementing Data Lakes.
- Progressive experience in Requirement gathering, Analysis, Development, Enhancement and Testing of applications.
- Experienced in Hadoop ecosystem components such as Hadoop MapReduce, Cloudera, Hortonworks, HBase, Oozie, Flume, Kafka, Hive, Scala, Spark SQL, DataFrames, Sqoop, MySQL, Unix commands, Cassandra, MongoDB, Tableau and related Big Data tools.
- Developed Apache Spark jobs using Scala in a test environment for faster data processing and used Spark SQL for querying (a representative sketch follows this list).
- Hands-on experience developing and debugging YARN (MR2) jobs to process large datasets.
- Experience in converting Map Reduce applications to Spark.
- Good working experience using Sqoop to import data into HDFS from RDBMS and vice versa.
- Worked on standards and proof of concept in support of CDH4 and CDH5 implementation using AWS cloud infrastructure.
- Experience with the Oozie Workflow Engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
- Experience in capturing requirements and developing functional designs/specifications with the help of acquired technical and business knowledge.
- Extended Hive and Pig core functionality using custom User Defined Functions (UDF), User Defined Table-Generating Functions (UDTF) and User Defined Aggregating Functions (UDAF).
- Good knowledge of executing Spark SQL queries against data in Hive by using the Hive context in Spark.
- Experience in support of IBM Mainframe applications - MVS, COBOL, JCL, PROCs, VSAM, File-AID, SQL and DB2.
- Strong conceptual and technical knowledge of entire SDLC - Requirement Gathering & Analysis, Planning, Design, Development, Testing and Implementation.
- Experience in troubleshooting, finding root causes, debugging and automating solutions for operational issues in the production environment.
- Mentoring and training project members to enable them to perform their activities effectively.
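Several of the points above mention Spark jobs in Scala and Spark SQL over Hive. The following is a minimal, illustrative sketch only, not code from any project described here: it uses the SparkSession API (the newer equivalent of the Hive context noted above), and the database, table and column names (sales.transactions, store_id, amount) are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: a Spark job in Scala that queries a Hive table with Spark SQL.
    // All table/column names are hypothetical placeholders.
    object SparkSqlOverHiveSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("SparkSqlOverHiveSketch")
          .enableHiveSupport() // lets Spark SQL see tables registered in the Hive metastore
          .getOrCreate()

        // Aggregate a (hypothetical) Hive table with a Spark SQL query
        val dailyTotals = spark.sql(
          """SELECT store_id, SUM(amount) AS total_amount
            |FROM sales.transactions
            |GROUP BY store_id""".stripMargin)

        dailyTotals.show(20)
        spark.stop()
      }
    }

Calling enableHiveSupport() is what allows the same session to run SQL against tables in the Hive metastore, which is the usual way Spark SQL is executed over Hive data.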
TECHNICAL SKILLS:
Big Data Technology: HDFS, MapReduce, HBase, Pig, Hive, SOLR, Sqoop, Flume, MongoDB, Cassandra, Puppet, Oozie, Zookeeper, Spark, Kafka, Talend
Hadoop Distribution: Cloudera, Hortonworks, MapR, Apache and IBM Big Insights.
Cloud Computing Service: AWS (Amazon Web Services)
Programming Languages: Java, C/C++, SQL, HTML, CSS, JavaScript, jQuery, Scala, Spark, UNIX shell script, JDBC, Python
Development Frameworks/IDE: Eclipse, NetBeans, IntelliJ, Spark Eclipse
Databases: JDBC, NoSQL, Oracle 11g/10g/9i/8i, DB2, MySQL
NoSQL: HBase, Cassandra, MongoDB
ETL Tools: Informatica, Talend
Operating System: Windows, Macintosh, Linux and Unix
PROFESSIONAL EXPERIENCE:
Sr. Hadoop/Big Data Developer
Confidential, Scottdale, AZ
Responsibilities:
- Involved in creating the Impact analysis document by analyzing all the components that would potentially need to be modified as part of this request.
- Involved in modification of COBOL programs and parm cards to change the logic of store number processing from 4 to 5 digits.
- Expertise in designing and deploying Hadoop clusters and different Big Data analytic tools including Pig, Hive, HBase, Oozie, Zookeeper, Sqoop, Flume, Kafka, Spark, Impala and Cassandra with Cloudera.
- Developed Spark code using Python and Spark SQL/Spark Streaming for faster testing and processing of data (an illustrative Scala sketch follows this list).
- Involved in migration from Hadoop System to Spark System.
- Created Visio diagrams depicting the business process flow.
- Implemented different machine learning techniques in Scala using Scala machine learning library.
- Used the Spark API over Hadoop YARN as the execution engine for data analytics using Hive.
- Developed Sqoop scripts to import and export data between RDBMS and HDFS/Hive, and handled incremental loading of customer and transaction data dynamically.
- Used Apache NiFi to copy data from the local file system to HDFS.
- Worked with HBase, creating HBase tables to load semi-structured data from different sources.
- Experience in writing complex Hive scripts.
- Involved in Peer review of the request and providing feedback to team members.
- Documented the changes in a spreadsheet.
- Involved in unit testing, integration testing and creating the test plan and test results, covering all the scenarios specified by the business requirements.
- Provided support during production deployment.
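As an illustration of the Kafka and Spark Streaming work noted in this list, here is a hedged sketch in Scala using Spark Structured Streaming (the project code itself was written in Python). It assumes the spark-sql-kafka connector is available, and the broker address, topic name and HDFS paths are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    // Illustrative sketch only: stream events from Kafka and land them on HDFS.
    // Broker, topic and paths are hypothetical placeholders.
    object KafkaToHdfsSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("KafkaToHdfsSketch")
          .getOrCreate()

        // Read a stream of messages from a Kafka topic
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "transactions")
          .load()
          .selectExpr("CAST(value AS STRING) AS payload")

        // Write the payloads to HDFS in micro-batches as Parquet files
        val query = events.writeStream
          .format("parquet")
          .option("path", "hdfs:///data/landing/transactions")
          .option("checkpointLocation", "hdfs:///checkpoints/transactions")
          .start()

        query.awaitTermination()
      }
    }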
Hadoop Developer
Confidential, San Ramon, CA
Responsibilities:
- Responsible for architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Installed and configured a fully distributed, multi-node Hadoop cluster with a large number of nodes.
- Worked on Spark for in-memory computations and compared DataFrames for optimizing performance.
- Worked on setting up high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
- Configured ZooKeeper to implement node coordination and clustering support.
- Worked on Big Data analytics to load data from the source systems all the way into the client's Modern Analytics Platform.
- Analyzed and ingested Policy, Claims, Billing and Agency data into the client's solution, which was done through multiple stages.
- Backed up data on a regular basis to a remote cluster using DistCp.
- Understood business requirements by interacting with clients and transformed them into functional design specifications.
- Designed, coded, unit tested, system tested and debugged code to deliver high-quality deliverables with the fewest possible defects.
- Worked on Easytrieve programs for report generation.
- Developed applications using programming languages such as Scala and Spark.
- Worked on DataFrames and Spark SQL for efficient data querying and analysis.
- Created numerous internal and external tables in Hive using partitioning/bucketing concepts based on the architectural design of the applications.
- Rewrote some Hive queries in Spark SQL to reduce the overall batch time (a representative sketch follows this list).
- Used Sqoop to migrate data from MySQL tables into HDFS and the Hive DB; implemented importing all tables into the Hive DB, incremental appends, last-modified updates, etc.
- Developed Tableau reports from Hive tables for Business team, for analysis and research purpose.
- Performed installation and release activities for products on the Hadoop cluster; debugged and troubleshot issues in the development and test environments.
- Identified the root cause of repeatedly occurring tickets and suggested logical solutions to clients to cut down the number of production tickets for the particular application.
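A minimal sketch, under assumed names, of rewriting a Hive aggregation as Spark SQL/DataFrame code and saving the result as a partitioned Hive table, in the spirit of the bullets above. The database, table and column names (claims_db.claims, claim_date, state, paid_amount) are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    // Illustrative sketch: a HiveQL-style aggregation expressed with the DataFrame API,
    // then persisted as a partitioned Hive table. Names are hypothetical placeholders.
    object HiveToSparkSqlSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("HiveToSparkSqlSketch")
          .enableHiveSupport()
          .getOrCreate()

        // Equivalent of a Hive GROUP BY query, written with DataFrame operations
        val claims = spark.table("claims_db.claims")
        val paidByState = claims
          .filter(col("claim_date") >= lit("2017-01-01"))
          .groupBy(col("state"))
          .agg(sum(col("paid_amount")).as("total_paid"))

        // Persist the result as a Hive table partitioned by state
        paidByState.write
          .mode("overwrite")
          .partitionBy("state")
          .saveAsTable("claims_db.paid_by_state")

        spark.stop()
      }
    }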
Hadoop Developer
Confidential, Grand Rapids, MI
Responsibilities:
- Understood the business requirements and functional design and created the technical design for the same.
- Involved in coding of data transformation load programs using batch COBOL programs.
- Involved in creating VSAM files and loading them with key information, which was used as reference files in programs throughout the process, to find out if an item is a Scan Based Trading (SBT) item or not.
- Involved in creating extensive test cases which covered all the aspects of the business requirement and capturing the screen shots of the same.
- Created extensive Application Knowledge documents as part of the project and saved them in Visual SourceSafe (VSS).
- Worked on DB2 load and unload utilities.
- Worked on the WMQFTE utility for transferring files onto the server, to be fetched and processed by other module members.
- Involved in unit testing and system testing and documentation of test cases and test results to be shared with the onsite clients.
- Provided estimates for the requests, set detailed schedules and expectations, delegated tasks among team members, monitored overall progress, and provided status reports to upper management.
Java/Hadoop Developer with ETL
Confidential
Responsibilities:
- Collected and aggregated large amounts of web log data from different sources such as web servers, mobile and network devices using Apache Flume, and stored the data in HDFS for analysis.
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, the HBase database and Sqoop.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Developed Pig UDFs for manipulating data according to business requirements and also worked on developing custom Pig Loaders.
- Installed and configured Hadoop, Map Reduce, HDFS (Hadoop Distributed File System).
- Developed multiple Map Reduce jobs in Java for data cleaning.
- Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased product on the website (a representative query sketch follows this list).
- Created data flow diagrams and Source-to-Stage and Stage-to-Target data mapping documents indicating the source tables, columns, data types, transformations required and business rules to be applied.
- Validated ETL mappings and tuned them for better performance, and implemented various performance tuning techniques.
- Used JBoss Application server as the JMS provider to manage the sessions and queues.
- Data integrity/quality testing. Custom table creation and population, custom and package index analysis and maintenance in relation to process performance.
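A representative sketch of the web-log analysis described in this list. The original work used HiveQL directly; here an equivalent query is issued through Spark SQL in Scala to keep all sketches in one language, and the table and column names (weblogs.page_views, visitor_id, view_date) are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    // Illustrative sketch: unique visitors per day from a (hypothetical) web-log table.
    object WebLogAnalysisSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("WebLogAnalysisSketch")
          .enableHiveSupport()
          .getOrCreate()

        // HiveQL-style query for unique visitors per day
        val uniqueVisitorsPerDay = spark.sql(
          """SELECT view_date, COUNT(DISTINCT visitor_id) AS unique_visitors
            |FROM weblogs.page_views
            |GROUP BY view_date
            |ORDER BY view_date""".stripMargin)

        uniqueVisitorsPerDay.show()
        spark.stop()
      }
    }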
Java Developer
Confidential
Responsibilities:
- Designed and developed UIs using JSP following the MVC architecture.
- Extensively used XML, wherein process details are stored in the database, and used the stored XML whenever needed.
- Designed the control which includes Class Diagrams and Sequence Diagrams using VISIO.
- Implemented modules like Client Management, Vendor Management.
- Implemented Access Control Mechanism to provide various access levels to the user.
- Implemented Home Interface, Remote Interface, and Bean Implementation class.
- Designed and developed Unit and integration test cases using JUnit.
- Developed JavaScript for client-side validations in JSP.
- Developed JSPs with Struts taglibs for the presentation layer.
- Wrote PL/SQL queries to access data from Oracle database.
- Prepared test plans and wrote test cases.
Java Designer and Developer
Confidential
Responsibilities:
- Involved in designing the business layer and data management components using MVC frameworks such as Struts and Java/J2EE.
- Created and configured domains in the production, development and testing environments using the configuration wizard.
- Deployed and tested the application using Tomcat web server.
- Wrote JUnit test cases for all the components in the product.
- Ability to understand Functional Requirements and Design Documents.
- Developed Use Case Diagrams, Class Diagrams, Sequence Diagram, Data Flow Diagram.
- Web related development with JSP, AJAX, HTML, XML, XSLT, and CSS.
- Created and enhanced stored procedures, PL/SQL and SQL for the Oracle 9i RDBMS.
- Extensively used UNIX/FTP for shell scripting and pulling logs from the server.
- Provided further maintenance and support, which involved working with the client and solving their problems, including major bug fixing.