Sr Hadoop Developer Resume
Dallas, Texas
SUMMARY:
- 8+ years of experience in the IT industry, with extensive experience in Java, J2EE and Big Data technologies.
- 3+ years of experience working exclusively on Big Data technologies and the Hadoop stack.
- Strong experience working with HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Kafka, Oozie and HBase.
- Good understanding of distributed systems, HDFS architecture, and the internal workings of the MapReduce and Spark processing frameworks.
- More than one year of hands-on experience using the Spark framework with Scala.
- Good exposure to performance tuning of Hive queries, MapReduce jobs and Spark jobs.
- Worked with various file formats, including delimited text files, clickstream log files, Apache log files, Avro files, JSON files and XML files.
- Good understanding of various compression techniques used in Hadoop processing, such as Gzip, Snappy and LZO.
- Expertise in inbound and outbound data transfer (importing/exporting) from/to traditional RDBMS using Apache Sqoop.
- Tuned Pig and Hive scripts by analyzing their joins, grouping and aggregation.
- Worked extensively on HiveQL, join operations and writing custom UDFs, with good experience in optimizing Hive queries.
- Implemented and used various Hadoop distributions (Cloudera, Hortonworks and AWS).
- Proficient in using different columnar file formats such as RCFile, ORC and Parquet.
- Experience in collecting, aggregating and moving data from various sources using Apache Flume and Kafka.
- Hands-on experience in installing, configuring and deploying Hadoop distributions in cloud environments (Amazon Web Services).
- Good experience in optimizing MapReduce algorithms by using Combiners and custom Partitioners.
- Hands-on experience in NoSQL databases such as HBase, Cassandra and MongoDB.
- Expertise in back-end/server-side Java technologies such as Web services, Java Persistence API (JPA), Java Message Service (JMS) and Java Database Connectivity (JDBC).
- Experience includes application development in Java (client/server), JSP, Servlet programming, Enterprise JavaBeans, Struts, JSF, JDBC, Spring, Spring Integration and Hibernate.
- Very good understanding of the Agile Scrum process.
- Experience in using version control tools such as Bitbucket, Git and SVN.
- Good knowledge of Oracle 8i, 9i and 10g databases, and excellent at writing SQL queries.
- Performed performance tuning and productivity improvement activities.
- Extensive use of use case diagrams, use case models and sequence diagrams with Rational Rose.
- Proactive in time management and problem solving; self-motivated, with good analytical skills.
- Strong organizational skills, with the ability to multitask and meet deadlines.
- Excellent interpersonal skills in areas such as teamwork, communication and presentation to business users or management teams.
TECHNICAL SKILLS:
Big Data Ecosystems: Hadoop, MapReduce, Spark, HDFS, HBase, Pig, Hive, Sqoop, Oozie, Storm, Kafka and Flume.
Streaming Technologies: Spark Streaming, Storm
Scripting Languages: Python, Bash, JavaScript, HTML5, CSS3
Programming Languages: Java, Scala, SQL, PL/SQL
Databases: RDBMS, NoSQL, Oracle.
Java/J2EE Technologies: Servlets, JSP (EL, JSTL, Custom Tags), JSF, Apache Struts, JUnit, Hibernate 3.x, Log4j, JavaBeans, EJB 2.0/3.0, JDBC, RMI, JMS, JNDI.
Tools: Eclipse, Maven, Ant, MS Visual Studio, NetBeans
Methodologies: Agile, Waterfall
PROFESSIONAL EXPERIENCE:
Confidential, Dallas, Texas
Sr Hadoop Developer
Responsibilities:
- Developed simple to complex MapReduce jobs in Java for processing and validating the data.
- Developed a data pipeline using Sqoop, Spark, MapReduce and Hive to ingest, transform and analyze operational data.
- Developed MapReduce and Spark jobs to summarize and transform raw data.
- Implemented Spark jobs in Scala, using DataFrames and the Spark SQL API for faster data processing.
- Used Spark for interactive queries, processing of streaming data and integration with a popular NoSQL database for huge volumes of data.
- Streamed data in real time using Spark with Kafka.
- Imported data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the results into HDFS.
- Exported the analyzed data to relational databases using Sqoop, for further visualization and report generation by the BI team.
- Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
- Analyzed the data by running Hive queries (HiveQL) and Pig scripts (Pig Latin) to study customer behavior.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
- Created HBase tables and column families to store the user event data.
- Scheduled and executed workflows in Oozie to run Hive and Pig jobs.
- Used Impala to read, write and query the Hadoop data in Hive.
Environment: Hadoop, HDFS, HBase, Pig, Hive, MapReduce, Sqoop, Flume, ETL, REST, Java, Python, PL/SQL, Oracle 11g, Unix/Linux.
Confidential, NYC, NY
Sr. Hadoop Developer
Responsibilities:
- Integrated Kafka with Spark Streaming for real-time data processing.
- Stored the processed data by using low-level Java APIs to ingest data directly into HBase and HDFS.
- Wrote Spark applications for data validation, cleansing, transformations and custom aggregations.
- Imported data from different sources into Spark RDDs for processing.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning and slots configuration.
- Developed Spark applications in Scala for all batch processing.
- Used Spark DataFrames and the Spark SQL API extensively for all processing.
- Managed and reviewed Hadoop log files.
- Used Hive partitioning and bucketing, performed joins on Hive tables, and used Hive SerDes such as Regex, JSON and Avro.
- Exported the analyzed data to relational databases using Sqoop to generate reports for the BI team.
- Performed cluster upgrades on the staging platform before applying them to the production cluster.
- Performed maintenance, monitoring, deployments and upgrades across the infrastructure that supports all Hadoop clusters.
- Installed and configured various components of the Hadoop ecosystem.
- Optimized Hive analytics SQL queries, created tables and views, and wrote custom UDFs and Hive-based exception processing.
- Involved in transferring relational database and legacy tables to HDFS and HBase tables using Sqoop, and vice versa.
- Replaced Hive's default Derby metadata store with MySQL.
- Supported setting up the QA environment and updating configurations for implementing Pig scripts.
- Configured the Fair Scheduler to share resources fairly among all applications across the cluster.
Environment: Cloudera 5.4, Cloudera Manager, Hue, Spark, Kafka, HBase, HDFS, Hive, Pig, Sqoop, MapReduce, DataStax, IBM DataStage 8.1 (Designer, Director, Administrator), Flat files, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting
Confidential, Chicago, IL
Hadoop Developer
Responsibilities:
- Analyzed business needs and functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
- Created Hive tables and loaded transactional data from Teradata using Sqoop.
- Developed MapReduce jobs for cleaning, accessing and validating the data.
- Implemented Hive Generic UDFs to incorporate business logic into Hive queries.
- Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most visited page on the website.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Wrote Pig scripts to transform raw data from several data sources.
- Monitored workload and job performance and carried out capacity planning using Cloudera Manager.
- Built applications using Maven and integrated them with continuous integration servers such as Jenkins to build jobs.
- Involved in end-to-end implementation of ETL logic.
- Migrated data from legacy RDBMS databases to HDFS using Sqoop.
- Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data-based analytical solutions drawing on disparate sources.
Environment: Hadoop 1.x, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Oozie, Maven, Shell Scripting, CDH3, Cloudera Manager
Confidential
Java Developer
Responsibilities:
- Involved in the design and implementation of the architecture for the project using OOAD, UML and design patterns.
- Involved in design and development of the server-side layer using XML, JSP, JDBC, JNDI, EJB and DAO patterns in the Eclipse IDE.
- Made extensive use of HTML, CSS, JavaScript and Ajax for client-side development and validation.
- Used parsers for the conversion of XML files to Java objects and vice versa.
- Developed screens using XML documents and XSL.
- Developed client programs using Apache Axis to consume the Web services published by the Country Defaults Department, which keeps track of information such as life span, inflation rates and retirement age.
- Developed JavaBeans and JSPs using Spring and JSTL tag libraries for supplements.
- Developed EJBs, Servlets and JSP files to implement business rules and security options using IBM WebSphere.
- Involved in creating tables and stored procedures in SQL for data manipulation and retrieval using SQL Server 2000, Oracle 10g and DB2.
- Trained end users on developed application.
Environment: Java, JSF Framework, Eclipse IDE, Ajax, Apache Axis, OOAD and UML, WebLogic, JavaScript, HTML, XML, CSS, SQL Server, Oracle, Web services, Spring, Windows.
Confidential
Java Developer
Responsibilities:
- Participated in requirement gathering and converting the requirements into technical specifications.
- Developed UI using HTML, JavaScript, and JSP, and developed Business Logic and Interfacing components using Business Objects, XML, and JDBC.
- Designed the user interface and implemented validation checks using JavaScript.
- Involved in design of JSPs and Servlets for navigation among the modules.
- Developed various EJBs for handling business logic and data manipulation in the database.
- Managed connectivity using JDBC for querying, inserting and data management, including triggers and stored procedures.
- Developed SQL queries and stored procedures using PL/SQL to retrieve data from and insert data into multiple database schemas.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Wrote test cases in JUnit for unit testing of classes.
- Provided technical support for production environments: resolved issues, analyzed defects, and provided and implemented solutions.
- Built and deployed Java applications into multiple UNIX-based environments and produced both unit and functional test results along with release notes.
- Developed the browser presentation layer using CSS and HTML from Bootstrap.
Environment: Java, Spring, JSP, Hibernate, XML, HTML, JavaScript, JDBC, CSS, SOAP Web services.
