Sr. Big Data Lead Developer Resume
Brenham, TX
PROFESSIONAL SUMMARY:
- A Hadoop professional with over 8 years of IT experience, including 3+ years of experience in Big Data and Hadoop ecosystem technologies, with domain experience in Financial, Banking, Insurance, Retail and Non-profit organizations in software development and support of applications.
- Excellent understanding and knowledge of the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Spark, Kafka, YARN, HBase, Oozie, ZooKeeper, Flume and Sqoop based Big Data platforms.
- Expertise in design and implementation of Big Data solutions in Banking, Retail and E-commerce domains.
- Experienced with NoSQL databases like HBase and Cassandra.
- Comprehensive experience in building Web-based applications using J2EE frameworks like EJB, Struts and JMS.
- Excellent ability to use analytical tools to mine data and evaluate the underlying patterns.
- Assisted in Cluster maintenance, Cluster Monitoring, Managing and Reviewing data backups and log files.
- Hands-on experience in developing MapReduce programs using Apache Hadoop for analyzing Big Data.
- Expertise in optimizing traffic across the network using Combiners, joining multiple schema datasets using Joins and organizing data using Partitioners and Buckets.
- Experienced in writing complex MapReduce programs that work with different file formats like Text, Sequence, XML and Avro.
- Expertise in composing MapReduce pipelines with many user-defined functions using Apache Pig.
- Implemented business logic by writing Pig Latin UDFs in Java and used various UDFs from Piggybank and other sources.
- Expertise in Hive Query Language (HiveQL), Hive Security and debugging Hive issues.
- Responsible for performing extensive data validation using Hive Dynamic Partitioning and Bucketing.
- Experience in developing custom UDFs for Pig and Hive to incorporate methods and functionality of Java into Pig Latin and HQL (Hive QL).
- Worked on different types of tables, such as External Tables and Managed Tables.
- Experience in scheduling MapReduce/Hive jobs using Oozie.
- Experience in ingesting large volumes of data into Hadoop using Sqoop.
- Expertise in creating databases, users, tables, triggers, macros, views, stored procedures, functions, packages and joins in Oracle database.
- Experience in writing real-time queries using Cloudera Impala.
- Acted as SME and Module Lead for the major projects undertaken.
- Expert database engineer with NoSQL and relational data modeling experience.
- Responsible for building scalable distributed data solutions using HBase and Cassandra.
- Performed various CRUD operations in Cassandra database by loading time series data into Cassandra database.
- Worked with Apache Spark for quick analytics on object relationships.
- Hands-on experience in writing Scala code to perform quick analytics in Spark.
- Hands-on knowledge of Spark RDDs and DataFrames, and of performing transformations and actions on RDDs.
- Experience in building clusters on AWS using Amazon EC2 services and Cloudera Manager.
- Experience in Big Data platforms like Hortonworks, Cloudera, Amazon AWS and Apache.
- Complete domain and development life cycle knowledge of Data Warehousing and Client Server concepts, and knowledge of basic data modeling.
- Good knowledge of Agile methodology and the Scrum process.
- Fast learner with strong analytical thinking, decision-making and problem-solving skills.
TECHNICAL SKILLS:
Web Technologies: JSP, REST API, HTML5, CSS, JavaScript
JEE Technologies: Servlets, Web Services, SOAP, WebLogic, Apache Jakarta-Tomcat
Languages and Hadoop Components: Java, Hadoop, COBOL, CICS, C, C++, SQL, PL/SQL, Sqoop, Flume, Hive, Pig, MapReduce, Scala, YARN, Oozie, Spark, Impala, Hue
SQL and NoSQL Databases: Cassandra, HBase, Oracle, DB2, MySQL, SQLite, MS SQL Server 2008/2012, MS Access.
Operating Systems: Windows 98/NT/XP/Vista/7, Windows CE, Linux, UNIX, iOS, Mac.
Methodologies: Agile, Rapid Application Development, Waterfall Model, Iterative Model
Big Data Platforms: Hortonworks, Cloudera, Amazon AWS, Apache
Frameworks: Spring, Hibernate, EJB, Struts
PROFESSIONAL EXPERIENCE:
Confidential, Brenham, TX
Sr. Big Data Lead Developer
RESPONSIBILITIES:
- Acted as an SME and Module Lead for two major projects undertaken with a team size of 4 people.
- Involved in deploying Hadoop clusters on AWS using Amazon EC2 services and Cloudera Manager.
- Worked on the Cloudera Big Data platform. Used Kafka as a messaging system to get data from different sources.
- Created various Hive and Pig Latin scripts for performing ETL transformations on the transactional and application-specific data sources.
- Wrote and executed Pig scripts using the Grunt shell.
- Performed Big Data analysis using Pig and user-defined functions (UDFs).
- Performed joins, group by and other operations in Hive and Pig.
- Processed and formatted the output from Pig and Hive before sending it to the Hadoop output file.
- Used Hive table definitions to map the output file to tables.
- Used Oozie to schedule MapReduce and Hive jobs to generate weekly and monthly reports.
- Reviewed the HDFS usage and system design for future scalability and fault tolerance.
- Imported data from relational data stores to Hadoop using Sqoop.
- Performed incremental data movement using Sqoop and Oozie jobs.
- Used Impala for real-time query processing in Cloudera.
- Worked with Apache Spark for quick analytics on object relationships by writing code in Scala.
- Participated in Spark Streaming architecture and node setup meetings for real-time processing of data ingested from Kafka.
- Developed scalable distributed solutions using Cassandra database and performed various CRUD operations.
- Created UDFs to encrypt sensitive customer data, stored it in HDFS and performed analysis using Pig.
- Worked effectively with the team in performing Big Data tasks and delivering projects on time.
- Involved in cluster setup meetings with the administration team.
Environment: Apache Hadoop 2.2.0, Cloudera, MapReduce, Hive, HBase, HDFS, Cassandra, Pig, Sqoop, Impala, Oozie, Kafka, Java 1.7, Python, UNIX, Shell Scripting, XML.
Confidential, Atlanta, GA
Sr. Hadoop Developer
RESPONSIBILITIES:
- Worked on the Hortonworks platform. Developed a data pipeline using Flume and Sqoop to ingest customer behavioral data and financial histories from traditional databases into HDFS for analysis.
- Ingested large volumes of data from Teradata to Hadoop using Sqoop.
- Involved in writing MapReduce jobs.
- Used Sqoop and HDFS put or copyFromLocal to ingest data.
- Used Pig to do transformations, event joins, bot traffic filtering and some pre-aggregations before storing the data onto HDFS.
- Involved in developing Pig UDFs for the needed functionality that is not available from Apache Pig.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in developing Hive DDLs to create, alter and drop Hive tables.
- Involved in developing Hive UDFs for the needed functionality that is not available from Apache Hive.
- Used Java MapReduce to compute various metrics that define user experience, revenue, etc.
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store it in HDFS.
- Performed various CRUD operations on Cassandra Clusters.
- Used Sqoop for importing data into and exporting data from HDFS.
- Involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
- Involved in developing Pig Scripts for change data capture and delta record processing between newly arrived data and already existing data in HDFS.
- Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop, HDFS get or copyToLocal.
- Involved in developing Shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move the data files within and outside of HDFS.
Environment: Hadoop 2.2.0, MapReduce, Cassandra, Kafka, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core Java, Hortonworks, HDFS, Eclipse.
Confidential, Jacksonville, FL
Hadoop Developer
RESPONSIBILITIES:
- Part of the team that developed and wrote Pig scripts.
- Loaded the data from the RDBMS server to Hive using Sqoop.
- Created Hive tables to store the processed results in a tabular format.
- Developed Sqoop scripts to move data between Hive and the MySQL database.
- Developed Java Mapper and Reducer programs for complex business requirements.
- Developed custom record readers, partitioners and serialization techniques in Java.
- Used different data formats (Text format and Avro format) while loading the data into HDFS.
- Created Managed tables and External tables in Hive and loaded data from HDFS.
- Ran complex HiveQL queries on Hive tables.
- Optimized the Hive tables using techniques like partitioning and bucketing to provide better performance with HiveQL queries.
- Created partitioned tables and loaded data using both static and dynamic partitioning.
- Created custom user defined functions in Hive.
- Performed Sqoop imports from Oracle to load the data into HDFS and directly into Hive tables.
- Performed incremental data movement to Hadoop using Sqoop.
- Developed Pig Scripts to store unstructured data in HDFS.
- Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
- Analyzed the Hadoop logs using Pig scripts to track the errors caused by the team.
- Experience in gathering requirements from the client, giving estimates for developing projects and delivering the projects on time.
Environment: HDFS, MapReduce, Hive, Sqoop, Pig, Flume, HBase, Oozie Scheduler, Java, Oracle, Shell Scripts.
Confidential, Phoenix, AZ
Java and Hadoop Developer
RESPONSIBILITIES:
- Extensively implemented various QA methodologies, testing strategies, and test plans in all stages of the SDLC by following the Agile Scrum methodology.
- Developed Pig scripts for validating and cleansing the data.
- Developed MapReduce programs to parse the raw data, and stored the refined data in Cognition DB.
- Created Hive queries for moving data from Cornerstone (Data Lake) to HDFS locations.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Managed and reviewed Hadoop log files.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Involved in loading, transforming and analyzing transaction data from various providers into Hadoop on an ongoing basis.
- Extensively worked on Pig scripts.
- Wrote Teradata macros and used various Teradata analytic functions.
- Involved in migration projects to migrate data from data warehouses on Oracle/DB2 to Teradata.
- Performance tuned and optimized various complex SQL queries.
- Performed SQL query and back-end testing, Tableau report testing, and deployment into UAT and Production.
- Participated in and conducted weekly Issue Log status meetings, Report status meetings and Project status meetings to discuss issues and workarounds.
- Communicated with developers throughout all phases of testing to eliminate roadblocks.
- Generated daily progress reports and presented them in daily Agile Scrum meetings.
Environment: Hadoop, HDFS, Hive, MapReduce, Core Java, Teradata, Oracle, UNIX, Tableau.
Confidential
Software Developer
RESPONSIBILITIES:
- Designed and developed UI screens with JSF to provide interactive screens to display data for the Provider module.
- Developed and implemented client-side and server-side validations.
- Developed the business layer logic and implemented EJB session beans.
- Wrote the test plans and test cases for the developed screens.
- Worked on bug fixing and enhancements on change requests.
- Designed and developed UI screens with Struts to provide interactive screens to display data.
- Performed bug verification, release testing and provided support for Oracle-based applications.
- Designed and developed presentation layers as well as the business layer for the entire application.
- Accessed the database and stored procedures using JDBC.
- Extensively involved in creating PL/SQL objects, i.e. Procedures, Functions, and Packages.
- Extensively involved in debugging the existing PL/SQL objects.
- Involved in performance tuning the existing objects.
- Executed test cases and fixed the bugs through unit testing.
Environment: Java/J2EE, Servlets, JSP, Apache Tomcat, WebSphere Application Server 6.0.1, EJB, Struts, Oracle, XML, HTML, MySQL, MS SQL Server
