
Hadoop Lead Resume


SUMMARY

  • Hadoop Developer with 4 years of experience designing and implementing complete end-to-end Hadoop infrastructure using MapReduce, Pig, Hive, Sqoop, Oozie, and Flume.
  • Java programmer with 6+ years of extensive experience developing web-based applications and client-server technologies.
  • Expert in MongoDB and Cassandra; implemented and set up clusters for both.
  • Created utility classes to abstract MongoDB and Cassandra database access.
  • Full understanding of the Java EE technology stack.
  • Expert hands-on experience installing, configuring, and testing Hadoop ecosystem components.
  • Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
  • Experience writing MapReduce programs on Hadoop to work with Big Data.
  • Experience analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Experience importing and exporting data between relational database systems and HDFS using Sqoop.
  • Collected and aggregated large amounts of log data using Apache Flume, storing it in HDFS for further analysis.
  • Experience with job/workflow scheduling and monitoring tools like Oozie.
  • Experience designing both time-driven and data-driven automated workflows using Oozie.
  • Worked across the complete Software Development Life Cycle (analysis, design, development, testing, implementation, and support) using Agile methodologies.
  • Transformed existing programs into a lambda architecture.
  • Experience automating Hadoop installation and configuration and maintaining clusters using tools like Puppet.
  • Experience setting up monitoring infrastructure for Hadoop clusters using Nagios and Ganglia.
  • Experience with Hadoop clusters on the major Hadoop distributions: Cloudera (CDH4, CDH5) and Hortonworks (HDP).
  • Experience across the layers of the Hadoop framework: storage (HDFS), analysis (Pig and Hive), and engineering (jobs and workflows).
  • Experienced with development environments and editors like Eclipse, NetBeans, Kate, and gEdit.
  • Migrated data from different databases (e.g., Oracle, DB2, Cassandra, MongoDB) to Hadoop.
  • Migrated RDBMS databases into different NoSQL databases.
  • Prior experience as a software developer with Java frameworks such as Spring, ORM frameworks (Hibernate, JDO, OpenJPA), the ATG e-commerce platform, ESB frameworks (Apache Camel, Mule ESB), OSGi containers (Apache Karaf), open-source messaging (Apache Kafka), and open-source stream-processing frameworks (Apache Spark, Apache Storm).
  • Experience designing and coding web applications using Core Java and web technologies: JSP, Servlets, and JDBC.
  • Excellent knowledge of Java and SQL in application development and deployment.
  • Hands-on experience creating database objects such as tables, views, functions, and triggers using SQL.
  • Excellent technical, communication, analytical, problem-solving, and troubleshooting skills, with the ability to work well with people from cross-cultural backgrounds.
  • Familiar with data warehousing fact and dimension tables and star schemas, combined with Google Fusion Tables for visualization.
  • Familiar with Scala, including closures, higher-order functions, and monads.
  • Expert knowledge of version control tools like SVN, CVS, and Git.

TECHNICAL SKILLS

Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Sqoop, Spark, Storm, Oozie, MongoDB, Cassandra

Operating Systems: Windows XP, Windows 7/8, Linux distributions (Ubuntu, Mint, Fedora)

Languages: Java, Python, R, C#, Haskell

Java Technologies: JDBC 4.1, Servlets 2.4, JSP 2.0

Web Technologies: HTML, JavaScript, jQuery, AJAX

Scripting Languages: UNIX Shell Script, Korn Shell

Frameworks: Spring 4.0, Hibernate 5.0

RDBMS DB: Oracle, MySQL, PostgreSQL, IBM DB2

NoSQL Technologies: Cassandra, MongoDB, Neo4j, HBase

Servers: Tomcat, JBoss, WebLogic

Tools & Utilities: Eclipse, NetBeans, MyEclipse, SVN, Git, Maven, SOAP UI, JMX Explorer, XML Spy

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Lead

Responsibilities:

  • Devised and led the implementation of the next-generation architecture for more efficient data ingestion and processing.
  • Mentored and on-boarded engineers new to Hadoop, getting them up to speed quickly.
  • Served as technical lead for a team of engineers.
  • Applied modern natural language processing and general machine learning techniques and approaches.
  • Worked extensively with Hadoop and HBase, including multiple public presentations about these technologies.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Provided design recommendations and thought leadership to sponsors and stakeholders, improving review processes, resolving technical problems, and suggesting solutions based on a lambda architecture.
  • Used the Splunk App for HadoopOps to manage, monitor, and review live operations and activity across the whole infrastructure.
  • Managed MapReduce jobs to rapidly sort, filter, and report on performance metrics, time, status, and user or resource usage.
  • Identified concurrent job workloads that could impact or be impacted by failures or bottlenecks.
  • Created a definitive record of user activity across the cluster, with role-based access to the corresponding Splunk searches.
  • Developed utility helper classes to get data from HBase tables (see the HBase helper sketch after this list).
  • Worked as an agile team member, pair programming, supporting work, reviewing code, and optimizing the performance of existing MapReduce programs with customized partitioners, combiners, input reader classes, and the like (see the partitioner sketch after this list).
  • Attended daily scrum status calls to keep each user story on track within its timeline.
  • Participated in triage calls to handle defects reported by the test and QA teams.
  • Coordinated with EM to resolve any configuration-related issues.
  • Implemented Cassandra and MongoDB clusters as part of a POC to address HBase limitations.
  • Worked on the implementation of a toolkit that abstracted Solr and Elasticsearch.
  • Maintained the authentication module to support Kerberos.
  • Monitored various aspects of the cluster using Cloudera Manager.
  • Introduced data ingestion for page views using Spark Streaming: the job reads an event stream over HTTP, groups it into 1-second intervals, transforms the stream into a DStream of (URL, 1) pairs called ones, and performs a running count of these using the runningReduce operation (a sketch using the released Java API follows this list).
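By way of illustration, a minimal sketch of that streaming job, assuming the Spark 1.x Java API; the socket source, checkpoint path, and host/port are placeholders standing in for the HTTP event stream, and the released API exposes the running count as updateStateByKey rather than the runningReduce of the original Spark Streaming paper:

    import java.util.List;
    import com.google.common.base.Optional;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.function.Function2;
    import org.apache.spark.api.java.function.PairFunction;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.*;
    import scala.Tuple2;

    public final class PageViewCounter {
      public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("PageViewCounter");
        // 1-second batch interval, matching the grouping described above.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(1000));
        jssc.checkpoint("/tmp/pageview-checkpoint"); // required for stateful operators

        // Placeholder source; production read an event stream over HTTP.
        JavaReceiverInputDStream<String> events = jssc.socketTextStream("localhost", 9999);

        // Transform each event (assumed to carry a URL) into a (URL, 1) pair.
        JavaPairDStream<String, Integer> ones = events.mapToPair(
            new PairFunction<String, String, Integer>() {
              public Tuple2<String, Integer> call(String url) {
                return new Tuple2<String, Integer>(url, 1);
              }
            });

        // Running count per URL across batches.
        JavaPairDStream<String, Integer> counts = ones.updateStateByKey(
            new Function2<List<Integer>, Optional<Integer>, Optional<Integer>>() {
              public Optional<Integer> call(List<Integer> batch, Optional<Integer> state) {
                int sum = state.or(0);
                for (Integer one : batch) { sum += one; }
                return Optional.of(sum);
              }
            });

        counts.print();
        jssc.start();
        jssc.awaitTermination();
      }
    }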
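A minimal sketch of the kind of HBase helper class mentioned above, assuming the classic 0.9x-era client API; table, family, and qualifier names are supplied by the caller:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    /** Thin wrapper that hides HTable plumbing from calling code. */
    public final class HBaseLookup {
      private final HTable table;

      public HBaseLookup(String tableName) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        this.table = new HTable(conf, tableName);
      }

      /** Fetch one cell as a string, or null when the row or cell is absent. */
      public String get(String rowKey, String family, String qualifier) throws IOException {
        Get get = new Get(Bytes.toBytes(rowKey));
        get.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier));
        Result result = table.get(get);
        byte[] value = result.getValue(Bytes.toBytes(family), Bytes.toBytes(qualifier));
        return value == null ? null : Bytes.toString(value);
      }

      public void close() throws IOException {
        table.close();
      }
    }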
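Likewise, a small sketch of the kind of customized partitioner used when optimizing existing MapReduce jobs; the region-prefix keying is a hypothetical example:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    /**
     * Routes keys by a hypothetical region prefix ("EU-", "US-", ...) so
     * that each reducer receives a single region's records.
     */
    public final class RegionPartitioner extends Partitioner<Text, IntWritable> {
      @Override
      public int getPartition(Text key, IntWritable value, int numPartitions) {
        String prefix = key.toString().split("-", 2)[0];
        // Stable, non-negative bucket for the prefix.
        return (prefix.hashCode() & Integer.MAX_VALUE) % numPartitions;
      }
    }

Such a class is wired into a job with job.setPartitionerClass(RegionPartitioner.class).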

Environment: Hadoop, Linux, CDH4, MapReduce, HDFS, Hive, Pig, Shell Scripting, Sqoop, Java 7, NoSQL, Eclipse, Oracle 11g, Maven, Log4j, Mockito, Git, ATG e-commerce, Spring, Apache Kafka, Apache Spark, Logstash, Elasticsearch, Solr.

Confidential

Hadoop Developer

Responsibilities:

  • Loaded data from different sources (Teradata, DB2, Oracle, and flat files) into HDFS using Sqoop, and from there into partitioned Hive tables.
  • Created Pig scripts and wrapped them as shell commands to provide aliases for common operations in the project's business flow.
  • Implemented various Hive queries for analysis and called them from a Java client engine to run on different nodes.
  • Created Hive UDFs to hide or abstract complex, repetitive rules (a UDF sketch follows this list).
  • Developed Oozie workflows for daily incremental loads that pull data from Teradata and import it into Hive tables.
  • Developed bash scripts to fetch log files from the FTP server and process them into Hive tables.
  • Scheduled all bash scripts using the Resource Manager scheduler.
  • Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
  • Developed MapReduce programs to apply business rules to the data.
  • Implemented Apache Kafka as a replacement for a more traditional message broker (JMS Solace) to reduce licensing costs, decouple processing from data producers, and buffer unprocessed messages (a producer sketch follows this list).
  • Implemented a receiver-based Spark Streaming approach, linking with the StreamingContext through the Java API and handling proper shutdown and await stages.
  • Implemented rack topology scripts for the Hadoop cluster.
  • Resolved issues related to the old Hazelcast EntryProcessor API.
  • Participated with the admin team in designing and carrying out the migration from CDH 3 to HDP.
  • Developed helper classes to abstract Cassandra cluster connections, acting as a core toolkit.
  • Enhanced an existing module written in Python.
  • Used dashboard tools like Tableau.
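As an illustration of the Hive UDFs mentioned above, a minimal sketch using the classic UDF base class; the status-normalization rule is a hypothetical stand-in for the project's actual business rules:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    /** Collapses free-form status codes into one canonical value. */
    public final class NormalizeStatus extends UDF {
      public Text evaluate(Text raw) {
        if (raw == null) {
          return null;
        }
        String s = raw.toString().trim().toUpperCase();
        if (s.startsWith("OK") || s.equals("SUCCESS")) {
          return new Text("SUCCESS");
        }
        if (s.startsWith("ERR") || s.equals("FAILURE")) {
          return new Text("FAILED");
        }
        return new Text("UNKNOWN");
      }
    }

Once packaged into a JAR, such a function is registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION, then used like any built-in.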
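And a minimal sketch of a Kafka publisher of the kind that replaced the JMS broker, assuming the 0.8-era producer API; the broker list, topic, key, and payload are placeholders:

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public final class EventPublisher {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1"); // wait for the leader's ack

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        try {
          // Each record that previously went to the JMS queue is published
          // to a Kafka topic instead; consumers pull at their own pace.
          producer.send(new KeyedMessage<String, String>("events", "key-1", "payload"));
        } finally {
          producer.close();
        }
      }
    }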

Environment: Hadoop HDP, Linux, MapReduce, HDFS, Hive, Pig, Tableau, NoSQL, Shell Scripting, Sqoop, Java 6, Eclipse, Oracle 10g, Maven, open-source technologies (Apache Kafka, Apache Spark, Hazelcast), Git, Mockito, Python.

Confidential, NY

Associate Hadoop Consultant

Responsibilities:

  • Understood the exact reporting requirements from the business groups and users.
  • Interacted frequently with business partners.
  • Imported trading and derivatives data into the Hadoop Distributed File System and ecosystem (MapReduce, Pig, Hive, Sqoop).
  • Was part of the effort to set up the Hadoop ecosystem for Confidential dev and QA environments.
  • Managed and reviewed Hadoop log files.
  • Responsible for writing Pig scripts and Hive queries for data processing.
  • Ran Sqoop to import data from Oracle and other databases.
  • Created shell scripts to collect raw logs from different machines.
  • Created both static and dynamic partitions in Hive.
  • Implemented Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH, GENERATE, GROUP, COGROUP, ORDER, LIMIT, and UNION.
  • Defined Pig UDFs for financial functions such as swaps, hedging, speculation, and arbitrage (a sketch follows this list).
  • Coded many MapReduce programs to process unstructured log files.
  • Imported and exported data between HDFS and Hive using Sqoop.
  • Used parameterized Pig scripts and optimized them using ILLUSTRATE and EXPLAIN.
  • Involved in configuring HA, handling Kerberos security issues, and restoring the NameNode after failures from time to time as part of maintaining zero downtime.
  • Implemented the Fair Scheduler as well.
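As an illustration of those financial Pig UDFs, a minimal sketch using the standard EvalFunc base class; the simplified swap-leg payoff shown here is a hypothetical stand-in for the actual pricing rules:

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    /** Payoff of a plain swap leg: (fixedRate - floatingRate) * notional. */
    public final class SwapLegPayoff extends EvalFunc<Double> {
      @Override
      public Double exec(Tuple input) throws IOException {
        if (input == null || input.size() < 3 || input.get(0) == null
            || input.get(1) == null || input.get(2) == null) {
          return null; // Pig treats a null return as a null field
        }
        double notional = (Double) input.get(0);
        double fixedRate = (Double) input.get(1);
        double floatingRate = (Double) input.get(2);
        return (fixedRate - floatingRate) * notional;
      }
    }

In a script the function would be made available with REGISTER and DEFINE before being applied in a FOREACH ... GENERATE.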

Environment: Hadoop, Linux, MapReduce, HDFS, Hive, Pig, Shell Scripting, Sqoop, Java 6, Eclipse, Ant, Log4j and JUnit.

Confidential, NY

Associate Java Consultant with Hadoop

Responsibilities:

  • Wrote design documents based on requirements from the MMSEA user guide.
  • Designed and developed a Medicare-Medicaid system using model-driven architecture on a customized framework built on Spring.
  • Performed requirement gathering, design, coding, testing, implementation, and deployment.
  • Worked on modeling dialog processes and business processes, and on coding business objects, QueryMapper, and JUnit files.
  • Created the business object methods in Java and integrated the activity diagrams.
  • Worked on web services using SOAP and WSDL.
  • Wrote query mappers and JUnit test cases.
  • Developed the UI using XSL and JavaScript.
  • Managed software configuration using ClearCase and SVN.
  • Designed, developed, and tested features and enhancements.
  • Performed error-rate analysis for production issues and technical errors.
  • Provided production support and fixed production defects.
  • Analyzed user requirement documents and developed test plans covering test objectives, strategies, environment, and priorities.
  • Performed functional, performance, integration, regression, and smoke testing, as well as User Acceptance Testing (UAT).
  • Converted complex SQL queries running on Confidential mainframes into Pig and Hive as part of the migration from mainframes to the Hadoop cluster.

Environment: Shell Scripting, Java 6, JEE, Spring, Hibernate, Eclipse, Oracle 10g, JavaScript, Servlets, Node.js, JMS, Ant, Log4j, JUnit, Hadoop (Pig & Hive).

Confidential

Software Engineer

Responsibilities:

  • Coordinate with the Technical Director on current programming tasks.
  • Collaborate with other programmers to design and implement features.
  • Quickly produce well-organized, optimized, and documented source code.
  • Create and document software tools required by artists or other developers.
  • Debug existing source code and polish feature sets.
  • Contribute to technical design documentation.
  • Work independently when required.
  • Continuously learn and improve skills.

Environment: Windows, Java 6, Java Card API, Java Communication API, IVR, Eclipse, Ant, Log4j and JUnit.

Confidential

Java Developer

Responsibilities:

  • Implemented Processor classes to provide opportunity/risk assessment, special ad-hoc analysis and what-if scenarios to enable financial decisions and inventory action plans.
  • Enhanced the classes that help identify opportunities in product flow (timing/quantity/distribution), brand assortment tiering to store volume groups, price strategy, program mix, locational demographic trends, and pre-season investment strategy.
  • Supported the development of modules that handle item rationalization with item performance information to recommend assortment edits for under-performing colors/styles based on relative GMROI/unit turn.
  • Converted strategies into code to address liabilities and liquidate them in a timely manner through collaboration with merchants, inventory managers, and the Inventory Director.
  • Provided analysis and strategy recommendations to reduce non-productive inventory, increase sales-unit turn, and reach targeted Weeks of Supply or seasonal sell-through % for key programs.
  • Directly partnered with VP/DMM to develop strategic merchandise business plans including long term financial strategies, divisional inventory, and turn and GMROI targets.
  • Collaborated with inventory management peers and Inventory Director to manage system, developing tools and driving consistent business practices between all businesses.
  • Identified trends, opportunities, and risks to current forecasts and the next period's plan.

Environment: Windows & Linux, Java 6, Eclipse, UNIX Shell Scripting, Oracle 10g, JavaScript, Servlets, Node.js, JMS, Maven, Log4j and JUnit.

Confidential

Software Engineer

Responsibilities:

  • Working in collaboration with clients to create cutting-edge AI solutions.
  • Analyzing the usage data to optimize the performance of the platform.
  • Participating in experimental product development.

Environment: Windows, Java 6, Eclipse, Encog, Weka, Ant, Log4j and JUnit.

Confidential

Trainee Developer

Responsibilities:

  • May assist in requirements capture under supervision.
  • Produces detailed low-level designs from high-level design specifications for components of low complexity.
  • Develops, builds, and unit tests components of low complexity from detailed low-level designs.
  • Carries out unit testing on own developed code, developing test harnesses if necessary.
  • Completes the incident management cycle under supervision.
  • Applies all relevant standards and procedures to own work.
  • Develops technical knowledge and awareness of the technical areas in which requested to code.
  • Accurately records own time and accurately reports progress on own work.

Environment: Windows, Java 5, Spring, Hibernate, JSP, Servlet, Ant, Log4j and JUnit.
