Hadoop Developer Resume
Boston, MA
SUMMARY
- Around 8 years of experience across the full software development life cycle, from concept through delivery of applications and customizable solutions, with emphasis on object-oriented programming, Java/J2EE, SQL and Hadoop/Big Data technologies.
- Excellent understanding of Big Data and Hadoop Ecosystems.
- Hands-on experience in setting up and configuring Apache Hadoop and Cloudera CDH clusters on Ubuntu and Red Hat Linux distributions.
- Experience in importing and exporting data between relational databases and HDFS using Sqoop.
- Knowledge of professional software engineering practices and SDLC best practices, including coding standards, code reviews, source control management, build processes, testing and operations.
- Hands-on experience with Hadoop ecosystem components (HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, Oozie, Spark (Python), Storm, MongoDB, Cassandra).
- Involved in installation and configuration of Hadoop ecosystem components alongside the Hadoop administrator.
- Hands-on experience in performing analytics on structured data in Hive using HiveQL queries, views, partitioning, bucketing and UDFs.
- Tuned Hive queries and Java MapReduce programs to reduce execution time and achieve higher scalability.
- Extensive experience in configuring Flume to stream data into HDFS.
- Hands-on experience with Teradata migration to the Hadoop platform.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Worked efficiently with both structured and unstructured data.
- Experience in managing and reviewing Hadoop log files.
- Hands-on experience in Python scripting, Jenkins deployments and Linux shell scripting.
- Automated Spark SQL scripts using Unix shell scripts and applied optimization techniques to Hive queries for better performance.
- Performed Data Ingestion to Hadoop file system from different data sources.
- Analyzed different file formats and large data sets by running Hive queries and Pig scripts.
- Collected streaming data (log data) and stored it in HDFS using Apache Flume.
- Extensively used Apache Sqoop for efficiently transferring bulk data between Apache Hadoop and relational databases (Teradata).
- Created, modified and deleted Kafka topics as required by the business team.
- Developed test cases and POCs to benchmark and verify data flow through the Kafka clusters.
- Automated Sqoop, Hive and Pig jobs using Oozie scheduling.
- Extensive knowledge of NoSQL databases such as HBase, MongoDB and Cassandra.
- Good knowledge of writing and using user-defined functions in Hive, Pig and MapReduce.
- Configured, deployed and maintained multi-node Dev and Test Kafka clusters.
- Able to translate complex user business requirements into technical specifications.
- Familiarity with version control tools such as Git.
- Good working experience on different operating systems, including UNIX/Linux, Mac OS X and Windows.
- Strong documentation skills, including design documents with UML diagrams.
- Experienced in working with senior level managers, business people and developers across multiple disciplines.
- Held weekly meetings with technical collaborators and actively participated in data warehousing and data analysis per customer needs.
- Strong written and oral communication.
- Ability to quickly learn and adapt to new working environments and emerging technologies.
TECHNICAL SKILLS
Big Data/Hadoop: Hadoop Ecosystem (Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, MR, HBase), Kafka, Storm and Spark (Spark SQL, Scala)
Scripting Languages: Unix Shell Scripting, Python
Methodologies: Agile, Waterfall model
Technologies: Hadoop, Spark, Scala, DB2, Core Java, JDBC, JavaScript, SQL
Database: Teradata, SQL, MySQL, DB2, HBase, Cassandra
Servers: Tomcat
IDE: Eclipse, NetBeans
PROFESSIONAL EXPERIENCE
Confidential, Boston, MA
Hadoop Developer
Responsibilities:
- Involved in the complete SDLC of a big data project, including requirement analysis, design, coding, testing and production.
- Used Sqoop extensively to import and export data between RDBMS and Hive tables, including incremental imports, and created Sqoop jobs that track the last saved value.
- Created Talend workflows to configure jobs using Big Data connectors such as Hive and HBase, providing a clean data abstraction between producers and consumers.
- Created MapReduce programs to analyze data and used Pig Latin to transform data.
- Installed and configured Hive and wrote Hive generic UDFs to implement business requirements.
- Involved in creating Hive tables, loading data into tables and writing Hive queries.
- Used compression codecs such as LZO and Snappy in Hive tables to reduce disk usage and optimize data transfer over the network.
- Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
- Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala.
- Developed scripts and AutoSys jobs to schedule an Oozie bundle (a group of coordinators) consisting of various Hadoop programs.
- Accessed HBase data and performed CRUD operations against it using Talend.
Environment: Hadoop, Spark, HDFS, MapReduce, HBase, Talend, Hive, Flume, Sqoop, Pig, CDH
Confidential, Los Angeles, CA
Hadoop Developer
Responsibilities:
- Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Designed a data warehouse using Hive.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Used Pig extensively for data cleansing.
- Created partitioned tables in Hive.
- Worked with business teams and created Hive queries for ad hoc access.
- Evaluated the use of Oozie for workflow orchestration.
- Mentored the analyst and test teams in writing Hive queries.
- Gained strong business knowledge of health insurance, claims processing, fraud suspect identification and the appeals process.
Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Hortonworks Hadoop distribution, Oozie, Oracle 11g/10g.
Confidential
Java Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC).
- Implemented multithreading concepts in Java classes to avoid deadlocks.
- Involved in High Level Design and prepared Logical view of the application.
- Involved in object-oriented design and development using UML, created use case, class and sequence diagrams, and took part in the complete development, testing and maintenance process of the application.
- Created core Java interfaces and abstract classes for different functionalities.
- Responsible for analysis, design, development and integration of UI components with the backend using J2EE technologies such as Servlets, JSP and JDBC.
Confidential
Application Developer
Responsibilities:
- Worked on DB2 and SQL for FSDB.
- Involved in source analysis and Inventory Phase of the Project.
- Involved in coding new modules, bug fixing, testing of jobs and ABEND handling.
- Involved in preparation of the project report and conducted knowledge transfer (KT) sessions for new members of the team.
- Involved in Unit testing, System Testing, UAT, Integration Testing, Regression Testing and Deployments.
- Involved in distribution & management of project work with other vendors like Accenture.
