Sr. Big Data/Hadoop Developer Resume
Houston, TX
SUMMARY
- 8+ years of professional IT experience, including 4+ years in Big Data ecosystem technologies. Expertise in Big Data technologies as a consultant, with proven capability in project-based teamwork as well as individual development, and good communication skills.
- Excellent understanding/knowledge of Hadoop architecture and its various components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience working with Hadoop clusters using AWS EMR, Cloudera, Pivotal, and Hortonworks distributions.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce (MR), HDFS, HBase, Oozie, Hive, Sqoop, Pig, and Flume.
- Hands-on development and implementation experience with a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, and other Hadoop-related ecosystem components (Kafka, Spark, etc.) as data storage and retrieval systems.
- Worked with the search relevancy team to improve the relevance and ranking of search results using Solr.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in managing and reviewing Hadoop log files.
- Experience analyzing and troubleshooting production logs to diagnose performance under load.
- Experience analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Experience developing, debugging, and testing C# scripts, used for activities such as performing REST/SOA calls.
- Experience with Azure Data Factory activities.
- Experience in PowerShell scripting.
- Experience parsing JSON, XML and semi-structured data files.
- Extending Hive and Pig core functionality by writing UDFs.
- Able to analyze a MapReduce job and determine how input and output data paths are handled in CDH.
- Good experience installing, configuring, and testing Hadoop ecosystem components.
- Well experienced with the Mapper, Reducer, Combiner, Partitioner, and shuffle-and-sort process, along with custom partitioning for efficient bucketing (see the sketch after this list).
- Good experience writing Pig and Hive UDFs according to requirements.
- Experience with MapR to develop and build Hadoop-based applications.
- Experience designing both time-driven and data-driven automated workflows using Oozie.
- Good experience with Hortonworks 2.x.
- Conducted an on-site POC and pilot of the IBM Optim product suite for data discovery, sub-setting, and data masking.
- Experienced in implementing POCs to migrate iterative MapReduce programs into Spark transformations using Scala.
- Experience composing, orchestrating, and monitoring Data Factory activities and pipelines.
- Hands on experience in Agile and Scrum methodologies.
- Extensive experience working with customers to gather the information required to analyze technical problems, provide data or code fixes, and deliver technical solution documents to users.
- Hands-on experience in application development using Java, RDBMSs (Oracle, SQL Server), and Linux shell scripting.
- Worked on multiple stages of the Software Development Life Cycle, including development, component integration, performance testing, deployment, and support/maintenance.
- Quick to adapt to new software applications and products; a self-starter with excellent communication skills and a good understanding of business workflows.
- Expertise in object-oriented analysis and design (OOAD), UML, and the use of various design patterns.
- Working knowledge of SQL, PL/SQL, stored procedures, functions, packages, DB triggers, and indexes.
- Experience with HANA DB and ETL.
- Good experience designing jobs and transformations and loading data sequentially and in parallel for initial and incremental loads.
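As an illustration of the custom partitioning and bucketing mentioned above, here is a minimal sketch of a Hadoop custom Partitioner; the class name and key format are hypothetical, not taken from any specific project on this resume:

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Hypothetical partitioner: routes records by a region prefix so each
// reducer receives one region's keys, giving efficient bucketing.
public class RegionPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        // Keys are assumed to look like "region|customerId".
        String region = key.toString().split("\\|", 2)[0];
        // Stable, non-negative hash spread over the configured reducer count.
        return (region.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

A driver would wire this in with job.setPartitionerClass(RegionPartitioner.class) alongside the usual Mapper, Combiner, and Reducer settings.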
TECHNICAL SKILLS
Big Data: Hadoop, MapReduce, HDFS, Hive with Tez, Pig, Sqoop, Oozie, ZooKeeper, Flume, Apache Mahout, AWS, YARN, Storm, Kafka, Spark, Impala, Python, Solr, HBase, MongoDB, and Cassandra
Languages: C, C++, Java, SQL, PL/SQL, UML
Databases: Oracle 8i/9i/10g/11g, SQL Server 7.0/2000, DB2, MS Access
Technologies: Java 5, Java 6, AJAX, Log4j, JavaHelp, Java API, JDBC 2.0, and JavaBeans
Methodologies: CMMI, Agile software development, Six Sigma, Quantitative Project Management, UML, Design Patterns
Frameworks: AJAX, Struts 2.0, JUnit, Log4j 1.2, mock objects, Hibernate
Application Servers: Apache Tomcat 5.x/6.0, JBoss 4.0
Tools: HTML, JavaScript, XML
IDEs: NetBeans, Eclipse, WSAD, RAD
Operating Systems: UNIX, Mac OS X, Windows, Hyper-V
Control tools: CVS, Tortoise SVN
Others: MS Office
PROFESSIONAL EXPERIENCE
Confidential, HOUSTON, TX
SR. BIG DATA/ HADOOP DEVELOPER
Responsibilities:
- Experience with professional software engineering practices and best practices for the full software development life cycle, including coding standards, code reviews, source control management, and build processes.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Responsible for building scalable distributed data solutions using Hadoop.
- Worked closely with individuals at various levels to coordinate and prioritize multiple projects; estimated scope and schedule and tracked projects throughout the SDLC.
- Worked on the BI team on Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
- Worked with Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing.
- Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Experienced in managing and reviewing Hadoop log files.
- Experienced in running Hadoop streaming jobs to process terabytes of data.
- Designed a data warehouse using Hive.
- Handled structured, semi-structured, and unstructured data.
- Worked extensively with Sqoop to import and export data between HDFS and relational database systems.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Extensively used Pig for data cleansing.
- Created partitioned tables in Hive.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Responsible for managing data coming from different sources.
- Developed Pig UDFs to pre-process the data for analysis (see the sketch after this list).
- Developed Hive queries for the analysts.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
- Mentored the analyst and test teams in writing Hive queries.
- Involved in database migrations to transfer data from one database to another, and in the complete virtualization of many client applications.
- Supported and assisted QA engineers in understanding, testing, and troubleshooting.
- Wrote build scripts using Ant/Maven and participated in the deployment of one or more production systems.
- Provided production rollout support, including monitoring the solution post-go-live and resolving any issues discovered by the client and client services teams.
- Designed and documented operational problems, following standards and procedures, using the software reporting tool JIRA.
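As context for the Pig UDF work listed above, here is a minimal sketch of a pre-processing UDF of the kind described; the class name and field semantics are hypothetical:

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF: trims and lower-cases a string field while
// pre-processing raw records, passing nulls through untouched.
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // propagate nulls instead of failing the job
        }
        return input.get(0).toString().trim().toLowerCase();
    }
}
```

Such a UDF is packaged into a jar, registered in the Pig script with REGISTER, and invoked inside a FOREACH ... GENERATE clause.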
Technologies: Hadoop, MapReduce, HDFS, Hive, HBase, Sqoop, Pig, Flume, Java (JDK 1.6), Oracle 11g/10g, DB2, Teradata, MySQL, Eclipse, PL/SQL, Linux, Shell Scripting, SQL Developer, Toad, PuTTY, XML/HTML, JIRA
Confidential, Baltimore, MD
BIG DATA/ HADOOP DEVELOPER
Responsibilities:
- Involved in the design and development phases of the Software Development Life Cycle (SDLC) using the Scrum methodology.
- Experience with Big Data analytics implementations using Hadoop and MapReduce on the Cloudera and Hortonworks distributions.
- Developed a data pipeline using Flume, Sqoop, Pig, and MapReduce to ingest customer behavioral data and purchase histories into HDFS for analysis.
- Performed cluster coordination and assisted with data capacity planning and node forecasting using ZooKeeper.
- Extracted data from Oracle, SQL Server, and MySQL databases into HDFS using Sqoop.
- Worked on writing and optimizing MapReduce jobs.
- Experience writing Pig scripts to transform raw data from several data sources into baseline data.
- Created Hive tables to store the processed results in tabular format and wrote Hive scripts to transform and aggregate the disparate data.
- Automated the extraction of data from warehouses and weblogs into Hive tables by developing workflows and coordinator jobs in Oozie.
- Transferred data from Hive tables to HBase via stage tables using Pig and used Impala for interactive querying of HBase tables.
- Experience using Avro, Parquet, RCFile, and JSON file formats; developed UDFs for Hive and Pig.
- Responsible for cluster maintenance, rebalancing blocks, commissioning and decommissioning of nodes, monitoring and troubleshooting, manage and review data backups and log files.
- Exported the aggregated data to an RDBMS using Sqoop for building dashboards in Tableau, and developed trend analyses using statistical features.
- Responsible for building scalable distributed data solutions on a cluster using Cloudera Distribution.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Developed Sqoop scripts to import and export data from MySQL and handled incremental and updated changes into the HDFS layer.
- Developed workflows in Oozie to orchestrate a series of Pig scripts that cleanse data, such as removing unnecessary information or merging many small files into large compressed files, using Pig pipelines in the data preparation stage.
- Created Hive tables and loaded the data into them for querying with HiveQL.
- Implemented partitioning and bucketing in Hive tables and executed the scripts in parallel to improve performance.
- Created HBase tables to store various data formats as input coming from different sources.
- Developed different kinds of custom filters, and handled pre-defined filters, on HBase data using the Java API (see the sketch after this list).
- Utilized the Agile Scrum methodology to manage and organize the team, with regular code review sessions.
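To illustrate the HBase filter work above, here is a minimal sketch using the HBase 1.x Java client API; the table, column family, and values are hypothetical:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical scan: apply a pre-defined column-value filter to an HBase table.
public class FilteredScanExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("customer_events"))) {
            Scan scan = new Scan();
            // Keep only rows whose info:status column equals "ACTIVE".
            scan.setFilter(new SingleColumnValueFilter(
                    Bytes.toBytes("info"), Bytes.toBytes("status"),
                    CompareOp.EQUAL, Bytes.toBytes("ACTIVE")));
            try (ResultScanner results = table.getScanner(scan)) {
                for (Result row : results) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }
}
```

A custom filter follows the same pattern but extends FilterBase and must be deployed to the region servers before use.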
Technologies: Hadoop framework, MapReduce, HDFS, Hive, HBase, Pig, Sqoop, Flume, Oozie, Java (J2EE), XML, DB2, Log4j, Linux, UNIX Shell Scripting, Oracle 11g/12c, Windows NT, IBM DataStage 8.1, TOAD 9.6
Confidential, Houston, TX
BIG DATA/ HADOOP DEVELOPER
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Designed and developed Oozie workflows for automating jobs.
- Mainly worked on Big Data analytics and the Hadoop/MapReduce infrastructure.
- Gained good experience with NoSQL databases.
- Ran MapReduce programs on the cluster.
- Installed and configured Hive and wrote Hive UDFs.
- Created HBase tables to store data in variable formats coming from different portfolios.
- Implemented best income logic using Pig scripts.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the Business Intelligence (BI) team.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Wrote Hadoop MR programs to collect logs and feed them into Cassandra for analytics.
- Moved data between Oracle and HDFS using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Worked with different file formats and compression techniques to determine standards.
- Developed Hive queries and UDFs to analyze and transform the data in HDFS.
- Developed Hive scripts for implementing control-table logic in HDFS.
- Designed and implemented partitioning (static and dynamic) and bucketing in Hive (see the sketch after this list).
- Developed Pig scripts and UDFs per the business logic.
- Imported log files using Flume into HDFS and loaded them into Hive tables for querying.
- Developed Pig scripts to convert data from Avro to text file format.
- Developed Sqoop commands to pull the data from Teradata.
- Analyzed and transformed data with Hive and Pig.
- Developed Oozie workflows that are scheduled through a scheduler on a monthly basis.
- Designed and developed read lock capability in HDFS.
- Involved in End-to-End implementation of ETL logic.
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support activities, test data creation, and unit testing.
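To make the Hive partitioning and bucketing bullet concrete, here is a minimal sketch over HiveServer2 JDBC; the host, credentials, and table names are assumptions for illustration only:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical example: create a partitioned, bucketed Hive table and fill it
// using dynamic partitioning through HiveServer2 JDBC.
public class HivePartitionExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/default", "etl_user", "");
             Statement stmt = conn.createStatement()) {
            // Allow fully dynamic partition values in the INSERT below.
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute("CREATE TABLE IF NOT EXISTS sales_part ("
                    + " id BIGINT, amount DOUBLE)"
                    + " PARTITIONED BY (sale_date STRING)"
                    + " CLUSTERED BY (id) INTO 16 BUCKETS"
                    + " STORED AS ORC");
            // Hive routes each row to its sale_date partition at write time.
            stmt.execute("INSERT INTO TABLE sales_part PARTITION (sale_date)"
                    + " SELECT id, amount, sale_date FROM sales_raw");
        }
    }
}
```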
Technologies: JDK, Red Hat Linux, HDFS, Mahout, MapReduce, Apache Crunch, Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, DB2, and HBase.
Confidential
JAVA/J2EE DEVELOPER
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC) of the application, including requirement gathering, design, analysis, and code development.
- Prepared use cases, sequence diagrams, class diagrams, and deployment diagrams based on UML to enforce the Rational Unified Process, using Rational Rose.
- Worked extensively on the user interface for a few modules using HTML, JSPs, and JavaScript.
- Generated business logic using servlets and session beans and deployed them on WebLogic Server.
- Created complex SQL queries and stored procedures.
- Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
- Wrote test cases in JUnit for unit testing of classes.
- Provided technical support for production environments: resolving issues, analyzing defects, and providing and implementing solutions.
- Built and deployed Java applications into multiple Unix-based environments and produced both unit and functional test results along with release notes.
- Analyzed the banking and existing system requirements and validated them to suit the J2EE architecture.
- Designed the process flow between front-end and server-side components.
- Developed and implemented the MVC architectural pattern using the Struts framework, including JSPs, servlets, EJBs, form beans, and Action classes.
- Developed the web-based presentation using JSP and AJAX with servlet technologies, implemented using the Struts framework.
- Designed and developed backend Java components residing on different machines to exchange information and data using JMS.
- Involved in creating the Hibernate POJO objects and mapping them using Hibernate annotations (see the sketch after this list).
- Used JavaScript for client-side validation and Struts Validator Framework for form validations.
- Implemented Java/J2EE Design patterns like Business Delegate and Data Transfer Object (DTO), Data Access Object.
- Wrote JUnit test cases for unit testing.
- Integrated Spring DAO for data access using Hibernate, used HQL and SQL for querying databases.
- Worked wif QA team for testing and resolving defects.
- Used Ant automated build scripts to compile and package the application.
- Used JIRA for bug tracking and project management.
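As an illustration of the Hibernate annotation mapping noted above, here is a minimal sketch of an annotated POJO; the entity and column names are hypothetical:

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical Hibernate POJO mapped with JPA annotations.
@Entity
@Table(name = "ACCOUNTS")
public class Account {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "ACCOUNT_ID")
    private Long id;

    @Column(name = "HOLDER_NAME", nullable = false)
    private String holderName;

    public Long getId() { return id; }
    public String getHolderName() { return holderName; }
    public void setHolderName(String holderName) { this.holderName = holderName; }
}
```

Sessions obtained through Spring's Hibernate integration can then persist and query Account instances without hand-written SQL.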
Technologies: J2EE, JSP, JDBC, Spring Core, Struts, Hibernate, Design Patterns, XML, WebLogic, Apache Axis, ANT, ClearCase, JUnit, JavaScript, Web Services, SOAP, XSLT, JIRA, Oracle, PL/SQL Developer, and Windows