Sr. Hadoop Developer Resume
New York, NY
SUMMARY:
- 8+ years of progressive experience in the IT industry with proven expertise in analysis, design, development, implementation and testing of software applications using Big Data (Hadoop) and Java-based technologies.
- 4+ years of hands-on experience with Hadoop core and ecosystem components including Spark, Scala, HDFS, MapReduce, Hive, Pig, Storm, Kafka, YARN, HBase, Oozie, ZooKeeper, Flume, Sqoop and Cassandra.
- Experience working with the Hortonworks and Cloudera Hadoop distributions.
- Assisted in cluster maintenance, monitoring and troubleshooting, and in managing and reviewing data backups and log files.
- Developed multiple Spark jobs in Scala/Python for data cleaning, pre-processing and aggregation.
- Expertise in working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
- Optimized low-latency streaming of log files using Flume and managed the downstream data flow into the Hadoop ecosystem and its analysis segments.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Automated end-to-end jobs, from extracting data from sources such as MySQL to pushing the result sets to the Hadoop Distributed File System (HDFS).
- Experience in importing data from MySQL into HDFS using Sqoop.
- Hands-on experience with NoSQL databases such as MongoDB, HBase and Cassandra.
- Hands-on experience in setting up workflows using the Apache Oozie workflow engine for managing and scheduling Hadoop jobs.
- Developed Pig Latin scripts for data cleansing and transformation.
- Good knowledge of scripting languages such as Linux/Unix shell and Python.
- Hands-on experience importing unstructured data into HDFS using Flume.
- Experience working with build tools such as Maven and Ant.
- Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS.
- Experience in working with databases such as Oracle, MySQL, IBM DB2 and Teradata.
- Experience in Core Java and J2EE technologies such as Spring, Struts, Hibernate, JDBC, EJB, Servlets, JSP and JavaScript.
- Experience in database design using PL/SQL to write stored procedures, functions and triggers, and strong experience in writing complex queries for Oracle.
- Experienced and skilled Agile Developer with a strong record of excellent teamwork and successful coding.
- Strong problem-solving and analytical skills, with the ability to make balanced, independent decisions.
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop, HDFS, MapReduce, Hive, Pig, Spark Streaming, Scala, Kafka, Storm, ZooKeeper, HBase, YARN, Spark, Sqoop, Flume, Mahout.
Programming Languages: C++, Java, Python, Scala
Hadoop Distributions: Apache Hadoop, Cloudera Hadoop Distribution (CDH3, CDH4, CDH5) and Hortonworks Data Platform (HDP)
NoSQL Databases: HBase, Cassandra, MongoDB
Query Languages: HiveQL, SQL, PL/SQL, Pig Latin
Web Technologies: Java, J2EE, Struts, Spring, JSP, Servlet, JDBC, EJB, JavaScript
IDEs: Eclipse, NetBeans
Frameworks: MVC, Struts, Spring, Hibernate
Build Tools: Ant, Maven
Databases: Oracle, MySQL, MS Access, DB2, Teradata
Operating Systems: Windows, Linux (Red Hat, CentOS), Unix
Scripting Languages: Shell scripting
Version Control Systems: SVN, Git, CVS
PROFESSIONAL EXPERIENCE:
Confidential, New York, NY
Sr. Hadoop Developer
Responsibilities:
- Migrated complex MapReduce programs to Spark RDD transformations and actions.
- Implemented Kafka high-level consumers to read data from Kafka partitions and move it into HDFS.
- Worked on analyzing the Hadoop cluster and various big data analytics tools, including MapReduce, Hive and Spark.
- Implemented custom Kafka encoders for a custom input format to load data into Kafka partitions.
- Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
- Evaluated the performance of Apache Spark in analyzing genomic data.
- Implemented complex Hive UDFs to execute business logic within Hive queries.
- Implemented Impala for data analysis.
- Prepared Linux shell scripts for automating the process.
- Implemented Spark RDD transformations to express business analysis logic and applied actions on top of those transformations.
- Automated all jobs, from pulling data from sources such as MySQL and pushing the result datasets to the Hadoop Distributed File System to running MapReduce, Pig and Hive jobs, using Kettle and Oozie (workflow management).
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Loaded and transformed large sets of structured, semi-structured and unstructured data with MapReduce, Hive and Pig.
- Loaded data from the Linux file system, servers and Java web services using Kafka producers and partitions.
- Evaluated Oozie for workflow orchestration.
- Worked with NoSQL databases such as HBase, creating tables to load large sets of semi-structured data coming from various sources.
- Created partitioned tables in Hive and mentored the analyst and test teams in writing Hive queries.
- Involved in cluster setup, monitoring and benchmark testing.
- Participated in agile practices, including daily scrum meetings and sprint planning.
Environment: Hadoop, Spark, HDFS, Pig, Hive, Flume, Sqoop, Kafka, Oozie, HBase, ZooKeeper, MySQL, Shell scripting, Linux Red Hat, Core Java 7, Eclipse.
Confidential, Hartford, CT
Sr. Hadoop Developer
Responsibilities:
- Good understanding of and related experience with Hadoop stack internals, Hive, Pig and MapReduce.
- Deep understanding of schedulers, workload management, availability, scalability and distributed data platforms.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Involved in loading data from the UNIX file system to HDFS.
- Wrote MapReduce jobs to discover trends in data usage by users.
- Used JUnit to unit test MapReduce jobs.
- Involved in managing and reviewing Hadoop log files.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Wrote Pig UDFs.
- Developed Hive queries for the analysts.
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Exported the result set from Hive to MySQL using shell scripts.
- Worked with Spark for quick analytics on object relationships.
- Used ZooKeeper for various types of centralized configurations.
- Maintained various Unix shell scripts.
- Configured the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Automated all jobs, from pulling data from sources such as MySQL to pushing the result sets to the Hadoop Distributed File System, using Sqoop.
- Used SVN for version control.
- Worked on defect resolution in production environment.
- Worked on production issues and logged them in Quality Center.
- Helped the team expand the cluster from 25 nodes to 40 nodes.
- Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase and Flume).
- Monitored system health and logs and responded to any warning or failure conditions.
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Spark, YARN, Sqoop, Java 1.6, UNIX Shell Scripting.
Confidential, Indianapolis, IN
Hadoop Developer
Responsibilities:
- Configured, managed, supported and monitored a Hadoop cluster running the Cloudera distribution.
- Worked in an Agile Scrum development model, analyzing the Hadoop cluster and various big data analytics tools including MapReduce, Pig, Hive, Flume, Oozie and Sqoop.
- Configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Developed custom MapReduce programs to analyze data and used Pig Latin to clean out unwanted data.
- Created Hive tables, loaded them with data and wrote Hive queries that run internally as MapReduce jobs.
- Used Pig as an ETL tool for transformations, joins and some pre-aggregations before storing the data in HDFS.
- Implemented partitioning, dynamic partitions and buckets in Hive to improve performance and organize data logically.
- Loaded and transformed large data sets in different formats, including structured and semi-structured data.
- Responsible for managing data coming from different sources.
- Involved in creating Hive tables, loading data and writing Hive queries.
- Scheduled jobs to run automatically using the Oozie workflow engine.
- Used a NoSQL database (HBase) for storing and processing data in different formats.
- Involved in testing and coordinated with the business on user testing.
- Performed unit testing and delivered unit test plans and results documents.
Environment: Apache Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Oozie, HBase, UNIX shell scripting, ZooKeeper, Java, Eclipse.
Confidential
Java/J2EE Developer
Responsibilities:
- Worked with Java, J2EE, Struts, web services and Hibernate in a fast-paced development environment.
- Followed agile methodology, interacted directly with the client on features, implemented optimal solutions and tailored the application to customer needs.
- Involved in design and implementation of web tier using Servlets and JSP.
- Used Apache POI to read Excel files.
- Developed the user interface using JSP and JavaScript to view all online trading transactions.
- Designed and developed Data Access Objects (DAO) to access the database.
- Used DAO Factory and Value Object design patterns to organize and integrate the Java objects.
- Coded JavaServer Pages for dynamic front-end content that uses Servlets and EJBs.
- Coded HTML pages using CSS for static content generation with JavaScript for validations.
- Used JDBC API to connect to the database and carry out database operations.
- Used JSP and JSTL Tag Libraries for developing User Interface components.
- Performed code reviews.
- Performed unit testing, system testing and integration testing.
- Involved in building and deploying the application in a Linux environment.
Environment: Java, J2EE, JDBC, Struts, Servlets, JSP, JavaScript, HTML, SQL, Hibernate, Eclipse, Apache POI, CSS.
Confidential
Java Developer
Responsibilities:
- Involved in creating use case, class, sequence and package dependency diagrams using UML.
- Also involved in analysis and requirements gathering phase.
- Developed server-side code using Servlets and JSPs running on Apache Tomcat 3.0 and Enterprise Beans running on IBM WebSphere Application Server.
- Developed web pages using HTML, JSP, DHTML and CSS.
- Used JavaScript for certain form validations, submissions and other client-side operations.
- Created Stateless Session Beans to communicate with the client.
- Created the database tables in Oracle 7i; created the required SQL queries and used JDBC to perform database operations.
Environment: Java, HTML, JSP, CSS, Servlets, JavaScript, JDBC, Oracle 7i, EJB 1.1, Apache Tomcat 3.0, IBM WebSphere.
