Sr. Hadoop Developer Resume
Chicago, IL
SUMMARY
- 8+ years of experience in IT, including analysis, design, development, implementation, and maintenance of Big Data projects using Apache Hadoop/Spark ecosystems, as well as design and development of web applications using Java.
- 4+ years of experience in analysis, design, development, and integration using Big Data Hadoop ecosystem components such as Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, HBase, Zookeeper, and Flume with the Cloudera distribution, working with file formats like Avro and Parquet and compression techniques like Snappy, LZO, and GZip.
- In-depth understanding of Hadoop architecture and its components, such as JobTracker, TaskTracker, NameNode, and DataNode, as well as MapReduce concepts and the HDFS framework.
- Hands-on delivery experience on popular Hadoop distribution platforms such as Cloudera and Hortonworks.
- Expertise in implementing customized Partitioners and Combiners for effective data distribution (see the sketch following this summary).
- Developed multiple real-time use cases using Kafka, Flume, and Spark Streaming.
- Experience in importing and exporting multi-terabyte volumes of data using Sqoop between HDFS/Hive and Relational Database Systems (RDBMS).
- Good working knowledge of HDFS design, daemons, and HDFS High Availability.
- Expertise in using built-in Hive SerDes and developing custom SerDes.
- Developed multiple internal and external Hive tables using dynamic partitioning and bucketing.
- Wrote multiple customized MapReduce Programs for various Input file formats.
- Strong knowledge in developing NoSQL applications using HBase and Cassandra.
- Tuned multiple Hadoop applications for better performance.
- Developed multiple Hive views for accessing HBase table data.
- Wrote complex Spark SQL programs for efficient joins and displayed the results on Kibana dashboards.
- Hands-on experience developing many Pig UDFs and Hive UDFs for data cleansing and transformation.
- Expertise in using various formats like Text and Parquet while creating Hive tables.
- Experience in analyzing large-scale data to identify new analytics, insights, trends, and relationships, with a strong focus on data clustering.
- Expertise in collecting data from various source systems such as social media and databases.
- Extensively used Oozie for automating multiple Pig scripts and Hive queries.
- Expertise in tuning Impala queries to handle highly concurrent jobs and avoid out-of-memory errors.
- Rigorously applied transformations in Spark programs.
- Set up an AWS cloud-based CDH 5.13 cluster and developed merchant campaigns using Hadoop.
- Set up multiple role-based modules for Apache Sentry in Hive.
- End-to-end hands-on experience in the ETL process, including setting up automation to load terabytes of data into HDFS.
- Good experience in developing applications using core Java, Collections, Threads, JDBC, Servlets, JSP, Struts, Hibernate, and XML components using IDEs such as Eclipse 6.0 and MyEclipse.
- Experience in SQL programming, writing queries using joins, stored procedures, triggers, and functions, and performing query optimization with Oracle, SQL Server, and MySQL.
- Excellent team worker with good interpersonal skills and leadership qualities.
- Excellent organizational and communication skills.
- Excellent understanding of Agile and Scrum methodologies.
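A minimal sketch of the kind of customized Partitioner referenced above, assuming hypothetical keys of the form "REGION|customerId"; the class name and key layout are illustrative only:

    // Hypothetical example: route records to reducers by region prefix
    // so related records land on the same reducer with even distribution.
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class RegionPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            // Keys are assumed to look like "REGION|customerId".
            String region = key.toString().split("\\|", 2)[0];
            // Mask the sign bit, then spread regions across reducers.
            return (region.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

Such a class would be attached to a job with job.setPartitionerClass(RegionPartitioner.class).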
TECHNICAL SKILLS
Programming Languages: C, C++, Java, Scala, Python, UNIX shell scripting
Distributed Computing: Apache Hadoop, HDFS, MapReduce, Pig, Hive, Oozie, Hue, Kerberos, Sentry, Zookeeper, Kafka, Flume, Impala, HBase and Sqoop
Web Development: HTML, JSP, XML, Java Script and AJAX
Web Application Server: Tomcat 6.0, JBoss 4.2 and Web Logic 8.1
Operating Systems: Windows, Unix, iOS, Ubuntu and RedHat Linux
Tools: Eclipse, NetBeans, Visual Studio, Agitator, Bugzilla, ArcStyler (MDA), Rational Rose, Enterprise Architect and Rational Software Architect
Source Control Tools: VSS, Rational Clear Case, Subversion
Application Framework: Struts 1.3, Spring 2.5, Hibernate 3.3, Jasper Reports, JUnit and JAXB
RDBMS: Oracle and SQL Server 2016
NOSQL: MongoDB, Cassandra and HBase
PROFESSIONAL EXPERIENCE
Confidential, Chicago, IL
Sr. Hadoop Developer
Responsibilities:
- Handled the importing of data from various data sources (media, MySQL) and performed transformations using Hive and MapReduce.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this section).
- Ran Pig scripts in local, pseudo-distributed, and fully distributed modes in various stages of testing.
- Configured a Hadoop cluster with a NameNode and slave nodes, and formatted HDFS.
- Imported and exported data between SQL Server and HDFS/Hive using Sqoop.
- End-to-end involvement in data ingestion, cleansing, and transformation in Hadoop.
- Handled logical implementation of and interaction with HBase.
- Implemented Apache Pig scripts to load data from and store data into Hive.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Used Impala to read, write, and query Hadoop data in HDFS from Cassandra, and configured Kafka to read and write messages from external programs.
- Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Created a complete processing engine based on the Cloudera distribution, which enhanced performance.
- Developed a data pipeline using Flume, Spark, and Hive to ingest, transform, and analyze data.
- Wrote Flume configuration files for importing streaming log data into MongoDB.
- Involved in production Hadoop cluster set up, administration, maintenance, monitoring and support.
- Designed and modified database tables and used HBase queries to insert and fetch data from tables.
- Developed and supported MapReduce programs running on the cluster.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume 1.7.0.
- Implemented the file validation framework, UDFs, UDTFs and DAOs.
- Preparation of technical architecture and low-level design documents.
- Tested raw data and executed performance scripts.
Environment: Linux SUSE 12, Eclipse Photon (64-bit), JDK 1.8.0, Hadoop 2.9.0, Flume 1.7.0, HDFS, MapReduce, Pig 0.16.0, Spark, Hive 2.0, Apache Maven 3.0.3
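A minimal sketch of the kind of Java data-cleaning MapReduce job listed in the responsibilities above; the pipe-delimited input and the five-field validity rule are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Map-only cleaning job: rejects malformed rows and trims fields.
    public class CleanRecordsJob {
        public static class CleanMapper extends Mapper<Object, Text, Text, NullWritable> {
            @Override
            protected void map(Object key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|", -1);
                if (fields.length != 5) return;       // hypothetical schema: 5 fields
                StringBuilder out = new StringBuilder();
                for (int i = 0; i < fields.length; i++) {
                    String f = fields[i].trim();
                    if (f.isEmpty()) return;          // reject incomplete rows
                    if (i > 0) out.append('|');
                    out.append(f);
                }
                ctx.write(new Text(out.toString()), NullWritable.get());
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "clean-records");
            job.setJarByClass(CleanRecordsJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0);                 // map-only: no shuffle needed
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }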
Confidential, Peachtree City, GA
Hadoop Developer
Responsibilities:
- Performed various source data ingestions, cleansing, and transformation in Hadoop.
- Designed and developed many MapReduce programs running on the cluster.
- Developed Pig Scripts to perform ETL procedures on the data in HDFS.
- Analyzed the partitioned and bucketed data and computed various metrics for reporting.
- Created HBase tables to store data in various formats coming from different systems.
- Worked on improving the performance of existing Pig and Hive Queries.
- Developed Oozie workflows to automate Hive and Pig jobs.
- Involved in developing Hive UDFs and reused them for other requirements (see the sketch after this section); worked on performing join operations.
- Developed fingerprinting rules in Hive that help uniquely identify a driver profile.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Exported the result set from Hive to MySQL using Sqoop after processing the data.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Used Hive to partition and bucket data.
- Advanced knowledge in performance troubleshooting and tuning of Cassandra clusters.
- Analyzed the source data to assess data quality using Talend Data Quality.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Developed REST APIs using Java, Play framework.
- Modeled and created the consolidated Cassandra, FiloDB, and Spark tables based on data profiling.
- Used Oozie 1.2.1 operational services for batch processing and scheduling workflows dynamically, and created UDFs to store specialized data structures in HBase and Cassandra.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Used Impala to read, write, and query Hadoop data in HDFS from Cassandra, and configured Kafka to read and write messages from external programs.
- Preparation of technical architecture and low-level design documents.
- Tested raw data and executed performance scripts.
Environment: Eclipse, JDK 1.8.0, Hadoop 2.8, HDFS, MapReduce, Pig 0.15.0, Hive 2.0, HBase, Apache Maven 3.0.3
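A minimal sketch of the kind of Hive UDF developed for cleansing, using Hive's classic UDF API; the function name and normalization rule are hypothetical:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical cleansing UDF: lower-cases a field and strips
    // non-alphanumerics, as when standardizing profile attributes.
    public class NormalizeField extends UDF {
        public Text evaluate(Text input) {
            if (input == null) return null;
            String cleaned = input.toString().toLowerCase().replaceAll("[^a-z0-9]", "");
            return new Text(cleaned);
        }
    }

After packaging, it would be registered in a Hive session with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_field AS 'NormalizeField';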
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Involved in Design and Development of technical specifications.
- Developed multiple Spark jobs in Scala for data cleaning and preprocessing.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Developed simple/complex MapReduce jobs using Hive and Pig.
- Processed HDFS data and created external tables using Hive and developed scripts to ingest and repair tables that can be reused across the project.
- Developed Pig scripts, Pig UDFs, Hive scripts, and Hive UDFs to load data files; developed UNIX shell scripts for creating reports from Hive data.
- Analyzed the requirements to set up a cluster.
- Created the script files for processing data and loading it into HDFS.
- Created Apache Pig scripts to process the HDFS data.
- Created Hive tables to store the processed results in a tabular format.
- Developed Sqoop scripts to enable interaction between Pig and the Cassandra database.
- Created CLI commands using HDFS.
- Created two different users (hduser for performing HDFS operations and a mapred user for performing MapReduce operations only).
- Involved in generating crawl-data flat files from various retailers and moving them to HDFS for further processing.
- Set up Hive with MySQL as a remote metastore.
- Moved log/text files generated by various products into HDFS locations.
- Created MapReduce code that takes log files as input, parses the logs, and structures them in tabular format to facilitate effective querying of the log data (see the sketch after this section).
- Created external Hive tables on top of the parsed data.
Environment: Cloudera Distribution, Hadoop, MapReduce, HDFS, Python, Hive, HBase, HiveQL, Sqoop, Java, UNIX, Maven.
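A minimal sketch of the log-parsing mapper described above; the space-delimited layout (timestamp, level, message) is an assumption, and the tab-separated output is what an external Hive table could be created over:

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Parses lines like "2017-03-01T10:15:00 ERROR payment timed out"
    // into tab-separated columns for querying via an external Hive table.
    public class LogParseMapper extends Mapper<Object, Text, Text, NullWritable> {
        @Override
        protected void map(Object key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split(" ", 3); // assumed layout
            if (parts.length < 3) return;                    // skip malformed lines
            String row = parts[0] + "\t" + parts[1] + "\t" + parts[2];
            ctx.write(new Text(row), NullWritable.get());
        }
    }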
Confidential
Java Developer
Responsibilities:
- Worked on one of the most critical modules of the project, right from the beginning phase, which included requirement gathering, analysis, design, review, and development.
- Received roughly two weeks of knowledge transfer from the module lead, who was located at another site and was later absorbed by the client.
- Took the initiative in building a new team of more than six members, with proper knowledge transfer sessions, and assigned and managed tasks with JIRA.
- Learned Backbone JS and worked with UI team on UI enhancements.
- Actively participated in the daily Scrums and in understanding new user stories.
- Implemented new requirements after discussion with Scrum Masters.
- Worked with BA and QA teams to identify and fix bugs and to raise new features and enhancements.
- Greatly appreciated by the client, receiving an appreciation certificate and client bonuses of 10k and 50k respectively.
- Analyzed the generated JUnit tests, added proper asserts to make them more code-specific, and increased code coverage (see the sketch after this section), which improved code quality to a commendable level and strengthened both product knowledge and JUnit writing skills.
- Interacted with business and reporting teams for requirement gathering, configuration walkthroughs, and UAT.
- Addressed issues related to application integration and compatibility.
- Performed enhancements to User Group Tree using DOJO and JSON to meet business needs.
- Provided technical solutions to create various database connections to Oracle, Teradata, DB2 in Server and BO Designer for Dev, QA and Prod Environments.
Environment: Linux, Eclipse, Java, Servlets, JSP, EJB, XML, JSON, Dojo, SQL Server, Spring, JUnit, SQL, UNIX, UML
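A minimal sketch of the assert-tightening applied to generated JUnit tests, as described above; DiscountCalculator and its expected values are hypothetical:

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertNotNull;
    import org.junit.Test;

    public class DiscountCalculatorTest {
        // DiscountCalculator.apply(price, percent) is a hypothetical class under test.
        @Test
        public void appliesPercentageDiscount() {
            DiscountCalculator calc = new DiscountCalculator();
            Double result = calc.apply(200.0, 10.0);
            assertNotNull(result);
            // Code-specific assert replacing a generated "runs without error" check.
            assertEquals(180.0, result, 0.001);
        }
    }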
Confidential
Java Developer
Responsibilities:
- Developed the Restful Web Services for various XSD schemas.
- Used Servlets to implement Business components.
- Designed and developed required Manager Classes for database operations.
- Developed various Servlets for monitoring the application (see the sketch at the end of this section).
- Designed and developed the front-end using HTML and JSP.
- Involved in eliciting the requirement, use case modeling, design, leading the development team and tracking the project status.
- Understanding Functional and Technical Requirement Specification of the customer’s need for the product.
- Impact analysis of the application under development.
- Preparing Design Specification, User Test Plans, and Approach Notes for the application under development.
- Creating UML Diagrams like Class Diagram, Sequence Diagram and Activity Diagram.
- Developing application as per given specifications.
- Creation of User based modules implementing different levels of security.
- Involved in designing of Front-end, Implementing Functionality with Business Logic.
- Mentoring and grooming juniors technically as well as professionally on agile practices and Java/J2EE development issues.
- Providing access to all customers and billing data online.
- Allowed department access to accounts receivable information, subject to security and authorization constraints.
- Involved in design and code reviews.
- Assisted in troubleshooting architectural problems.
- Designed and developed report generation using the Velocity framework.
- Interacted with business and reporting teams for requirement gathering, configuration walkthroughs, and UAT.
- Generating monthly Sales report, Collection Report and Receivable-aging reports.
- Prepared use cases and designed and developed object models and class diagrams.
- Developed SQL statements to improve back-end communications.
- Incorporated custom logging mechanism for tracing errors, resolving all issues and bugs before deploying the application in the WebSphere Server.
- Received praise from users, shareholders, and analysts for developing a highly interactive and intuitive UI using JSP, AJAX, JSF, and jQuery techniques.
- Preparation of technical architecture and low-level design documents.
- Tested raw data and executed performance scripts.
Environment: Java, Servlets, Struts, JSP, DOJO, JSON, Hibernate, XML, JUnit, Eclipse, Oracle, Web Services
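A minimal sketch of the kind of monitoring Servlet mentioned above; the /monitor mapping and the reported JVM statistics are hypothetical:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical health-check servlet, mapped to /monitor in web.xml,
    // reporting basic JVM statistics for application monitoring.
    public class MonitorServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            Runtime rt = Runtime.getRuntime();
            resp.setContentType("text/plain");
            resp.getWriter().println("status=UP");
            resp.getWriter().println("freeMemory=" + rt.freeMemory());
            resp.getWriter().println("totalMemory=" + rt.totalMemory());
        }
    }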