Hadoop Developer Resume Profile
San Antonio, Texas
Professional Summary:
- About 7 years of experience in IT, including analysis, design, development, testing, ETL, database programming and user training of software applications using Big Data analytics, Java/J2EE, Informatica, MySQL and Oracle.
- 3 years of work experience in Hadoop ecosystem development and administration, including MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie and HDFS administration.
- Developed MapReduce jobs in Java for data cleansing, preprocessing and analysis. Implemented multiple mappers to handle data from multiple sources. Utilized the distributed cache for in-memory lookup of files.
- Professional experience in installing, configuring and using Apache Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig and Flume.
- Developed custom Input Formats to process different data sources containing URLs and multiple patterns and developed custom Writable Data Types to handle different formats of timestamps.
- Clear understanding of Hadoop MRv1 architectural components (HDFS, JobTracker, TaskTracker, NameNode, DataNode and Secondary NameNode) and YARN architectural components (ResourceManager, NodeManager and ApplicationMaster).
- Good experience in Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as Regex, JSON and Avro.
- Experience in developing custom Hive UDFs and UDAFs in Java, JDBC connectivity with Hive, and development and execution of Pig scripts and Pig UDFs (a minimal UDF sketch appears after this summary).
- Experience in validating and cleansing data using Pig statements and hands-on experience in developing Pig macros.
- Experience in using Sqoop to migrate data to and from HDFS and MySQL or Oracle, and deployed Hive and HBase integration to perform OLAP operations on HBase data.
- Experience in Vertica, Impala, Solr and NoSQL databases like MongoDB and Cassandra; also performed benchmarking on BigSQL, Impala and Hive.
- Responsible for integrating Hadoop and Informatica successfully, and migrated several projects to Hadoop.
- Extensively used Informatica PowerCenter for end-to-end data warehousing ETL routines, including writing custom scripts, data mining and data quality processes.
- Strong experience in Dimensional Modeling using Star and Snowflake schemas to build Fact and Dimensional tables using Erwin Data Modeling tool.
- Worked on reusable code known as tie-outs to maintain data consistency.
- Hands-on experience with databases: Teradata, Netezza, Oracle, MS SQL Server, MySQL and DB2; also developed stored procedures and queries using PL/SQL.
- Extensive experience in middle-tier development using J2EE technologies like JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, EJB.
- Experienced in web technologies like XML, XSD, XSLT, JSP, JavaScript, WSDL, REST and SOAP.
- Experience with web-based UI development using jQuery, CSS, HTML5, XHTML.
- Worked with Agile methodology and SOA for many of the applications; also good knowledge of Log4j for error logging.
- Experienced in writing UNIX shell and Ant scripts for builds and deployments to different environments, and in working with Network File System, FTP services and mail services.
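A minimal sketch of the kind of custom Hive UDF mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF reflection API; the class name TrimAndUpper and its normalization rule are hypothetical and shown only to illustrate the pattern.

    package com.example.hive.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: normalizes free-text codes (trim + upper-case) before
    // they are joined against reference tables; returns NULL for NULL input.
    public final class TrimAndUpper extends UDF {

        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Such a UDF would typically be packaged into a JAR, registered with ADD JAR and CREATE TEMPORARY FUNCTION, and then called from HiveQL like any built-in function.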
Technical Skills
Hadoop/Big Data | Hadoop (Cloudera, Hortonworks), HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, MongoDB, Cassandra, Oozie, ZooKeeper, Impala, Solr |
Java J2EE Technologies | Java (JDK 1.4/1.5/1.6), HTML, Servlets, JSP, JDBC, JNDI, JavaBeans |
IDEs | Eclipse, NetBeans |
Frameworks | MVC, Struts, Hibernate, Spring |
Programming languages | C, C++, Python, Perl, Ant scripts, Linux shell scripts, SQL, PL/SQL |
Databases | Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server |
Web Servers | WebLogic, WebSphere, Apache CXF/XFire, Apache Axis, SOAP, REST |
Web Technologies | HTML, XML, JavaScript, AJAX, SOAP, WSDL |
Network Protocols | TCP/IP, UDP, HTTP, DNS, DHCP |
ETL Tools | Informatica, Pentaho |
Development Tools | TOAD, Maven, Visio, Rational Rose, Endur 8.x/10.x/11.x |
Operating Systems | Mac OSX, Unix, Windows, Linux |
Professional Experience:
Confidential
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop and migrating legacy retail ETL applications to Hadoop.
- Set up Hadoop clusters and benchmarked them for internal use.
- Designed and developed ETL applications and automated them using Oozie workflows and shell scripts with error handling and mail notifications.
- Developed simple to complex Map Reduce jobs and used Hive and Pig Scripts for analyzing the data.
- Hands-on experience moving data from databases and the data warehouse to HDFS using Sqoop, and used various compression techniques to optimize data storage.
- Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, SQL and Lookup (file and database) to develop robust mappings in the Informatica Designer.
- Used Avro SerDes to handle Avro-format data in Hive and Impala.
- Developed a bankers-rounding UDF for Hive/Pig, i.e., implemented Teradata-style rounding in Hive/Pig (see the sketch after this list).
- Developed MapReduce jobs to validate and implement business logic.
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
- Implemented Agile methodology to collect requirements and develop solutions in Hadoop eco-system.
- Experience in designing ETL solutions using Informatica PowerCenter tools such as Designer, Repository Manager, Workflow Manager and Workflow Monitor.
- Created Shell scripts to fine tune the ETL flow of the Informatica workflows.
- Implemented performance-tuning techniques along various stages of the ETL process.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
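A hedged sketch of what the bankers-rounding UDF referenced above could look like; the package and class names are assumptions, and the reflection-based Hive UDF API is used for brevity.

    package com.example.hive.udf;

    import java.math.BigDecimal;
    import java.math.RoundingMode;

    import org.apache.hadoop.hive.ql.exec.UDF;

    // Hypothetical UDF: rounds a value to the requested scale with HALF_EVEN
    // (bankers rounding), mirroring the Teradata-style rounding described above.
    public final class BankersRound extends UDF {

        public Double evaluate(Double value, Integer scale) {
            if (value == null || scale == null) {
                return null;
            }
            return BigDecimal.valueOf(value)
                             .setScale(scale, RoundingMode.HALF_EVEN)
                             .doubleValue();
        }
    }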
Environment: Cloudera, Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.7), Pig, Linux, XML, HBase, ZooKeeper, Sqoop.
Confidential
Hadoop Developer
Responsibilities:
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume
- Wrote MapReduce code to convert unstructured data into structured data and to insert data into HBase from HDFS.
- Created Impala tables for faster access to the data. Designed ETL jobs to identify and remove duplicate records using the Sort and Remove Duplicates stages, and generated keys for the unique records using the Surrogate Key Generator stage.
- Created integration between Hive and HBase for effective usage and performed MRUnit testing of the MapReduce jobs (see the test sketch after this list).
- Involved in transforming data from Mainframe tables to HDFS, and HBase tables using Sqoop and Pentaho kettle.
- Implemented business logic by writing Pig and Hive UDFs for some aggregative operations and to get the results from them.
- Prepared Avro schema files for generating Hive tables and shell scripts for executing Hadoop commands for single execution.
- Working knowledge of NoSQL databases like HBase, MongoDB and Cassandra; also experienced in syncing Solr with HBase to compute indexed views for data exploration.
- Hands on experience in exporting the results into relational databases using Sqoop for visualization and to generate reports for the BI team.
- Automated all the jobs for pulling data from FTP server to load data into Hive tables using Oozie workflows.
- Worked closely with the business analysts to convert business requirements into technical requirements and to make sure that the correct source table attributes were identified per dimensional data modeling (fact table attributes and dimension table attributes).
- Involved in loading data from UNIX file system to HDFS.
- Responsible for complete SDLC management using methodologies such as Agile and Waterfall.
- Installed and configured Hadoop MapReduce, HDFS, Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Used Cloudera Manager to monitor and manage the Hadoop cluster.
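An illustrative MRUnit test of the style mentioned above; the LogLineMapper, its pipe-delimited record layout and the field positions are hypothetical, included only to show how such a job could be unit tested.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    public class LogLineMapperTest {

        // Hypothetical mapper: turns a raw pipe-delimited log line into (userId, eventType).
        public static class LogLineMapper extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|");
                if (fields.length >= 3) {               // skip malformed records
                    context.write(new Text(fields[0]), new Text(fields[2]));
                }
            }
        }

        private MapDriver<LongWritable, Text, Text, Text> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new LogLineMapper());
        }

        @Test
        public void emitsUserAndEventForWellFormedLine() throws IOException {
            mapDriver.withInput(new LongWritable(0), new Text("u42|2014-06-01|click"))
                     .withOutput(new Text("u42"), new Text("click"))
                     .runTest();
        }
    }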
Environment: Hadoop, MapReduce, Cloudera Manager, HDFS, Hive, Pig, HBase, Solr, Impala, Sqoop, Flume, Oozie, UNIX shell scripting, SQL, Java (JDK 1.6), Eclipse.
Confidential
Software Engineer
Responsibilities:
- Prepared the documentation for the high-level design, low-level design and process flow of control for the entire application.
- System study: understood the legacy system and the requirements for developing the consolidated application.
- Designed the web application implementing the Struts framework for the Model-View-Controller (MVC) pattern to make it extensible and flexible.
- Implemented the architecture with struts-config, ActionForm classes and Action classes.
- Implemented the consolidated application's front-end pages using JSPs, JSTL and Struts tag libraries.
- Used the Spring Framework for dependency injection and integrated it with the Struts framework and Hibernate.
- Developed the helper classes used by most of the components in this application.
- Configured connection caches for JDBC connections.
- Used extensive JavaScript to create global templates that can be used across the JSP pages.
- Developed code for generating the XML requests required for calling the web services.
- Developed code to process the web service response, received as an XML string after calling the web services, using a SAX parser (see the sketch after this list).
- Configured loggers, appenders and layouts using Log4j; used Hibernate for object-relational mapping and Ant 1.6.5 for building JARs and WARs.
- Rational ClearCase was used for source and version control, rebasing and delivering the code.
- Prepared unit test cases and performed unit testing and integration testing.
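A simplified sketch of SAX-based parsing of a web service response string, as described above; the accountId element name is a hypothetical placeholder for the actual response schema.

    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;

    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;

    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.helpers.DefaultHandler;

    public class ResponseParser {

        // Collects the text content of every <accountId> element in the response.
        static class AccountHandler extends DefaultHandler {
            final List<String> accountIds = new ArrayList<String>();
            private StringBuilder current;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("accountId".equals(qName)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("accountId".equals(qName)) {
                    accountIds.add(current.toString().trim());
                    current = null;
                }
            }
        }

        // Parses the XML response string and returns the extracted account IDs.
        public static List<String> parseAccountIds(String xmlResponse) throws Exception {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            AccountHandler handler = new AccountHandler();
            parser.parse(new InputSource(new StringReader(xmlResponse)), handler);
            return handler.accountIds;
        }
    }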
Environment: Servlets, JSP, EJB, Struts, Hibernate, LDAP, JNDI, HTML, XML, DOM, SAX, Ant, WebLogic Server 8.1, Oracle 9i.
Confidential
Java Programmer
Responsibilities:
- Responsible for understanding the business requirements.
- Worked with business analysts and helped represent the business domain details.
- Actively involved in setting coding standards and writing related documentation.
- Prepared the high-level and low-level design documents.
- Created the web service using a top-down approach and tested it using the SOAP UI tool.
- Used Hibernate 3.3.1 to interact with the database (see the DAO sketch after this list).
- Developed JSPs and Servlets to dynamically generate HTML and display data on the client side.
- Created an admin tool using the Struts MVC design pattern to add preferred vehicles to the database.
- Provided basic authentication for the preferred-vehicle web service.
- Designed Web Applications using MVC design pattern.
- Developed shell scripts to retrieve the vendor files dynamically and used crontab to execute these scripts periodically.
- Designed the batch process for processing vendor data files using IBM WebSphere Application Server's Task Manager framework.
- Performed unit testing using the JUnit testing framework and used Log4j to monitor the error log.
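An illustrative Hibernate 3-style DAO save of the sort referenced above; the GenericDao class and its SessionFactory wiring are assumptions, not project code.

    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;

    // Hypothetical DAO: persists any mapped entity inside a transaction,
    // rolling back on failure and always closing the session.
    public class GenericDao {

        private final SessionFactory sessionFactory;

        public GenericDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        public void save(Object entity) {
            Session session = sessionFactory.openSession();
            Transaction tx = session.beginTransaction();
            try {
                session.save(entity);
                tx.commit();
            } catch (RuntimeException e) {
                tx.rollback();
                throw e;
            } finally {
                session.close();
            }
        }
    }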
Environment: IBM RAD, IBM WebSphere Application Server 7.0, Java/J2EE, Spring 3.0, JDK 1.5, web services, SOAP, Servlets, JSP, Ant 1.6.x, Ajax, Hibernate 3.3.1, custom tags.
Confidential
Programmer Analyst
Responsibilities:
- Implemented J2EE design patterns such as Session Facade, Business Delegate, Value Object and Data Access Object in the project.
- Designed and developed the presentation layer using JSP/Servlets and the MVC Type 2 Struts framework.
- Utilized Struts 1.2 features such as Tiles, DynaForms and BeanUtils, and wrote Action classes.
- Developed session beans for the search application.
- Wrote JDBC code to persist data in the Oracle database (see the sketch after this list).
- Wrote test cases to test the application.
- Deployed the beans on the WebLogic Application Server.
- Implemented Log4j for debug and error logging.
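A minimal JDBC persistence sketch consistent with the bullet above; the SEARCH_AUDIT table, its columns and the connection details are hypothetical, shown only to illustrate the pattern (pre-Java 5 style, matching the environment below).

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class SearchResultDao {

        // Inserts one search-audit row; the caller supplies Oracle connection details.
        public void insertAuditRow(String jdbcUrl, String user, String password,
                                   String searchTerm, int resultCount) throws Exception {
            String sql = "INSERT INTO SEARCH_AUDIT (SEARCH_TERM, RESULT_COUNT) VALUES (?, ?)";
            Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
            try {
                PreparedStatement ps = conn.prepareStatement(sql);
                try {
                    ps.setString(1, searchTerm);
                    ps.setInt(2, resultCount);
                    ps.executeUpdate();
                } finally {
                    ps.close();
                }
            } finally {
                conn.close();
            }
        }
    }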
Environment: Java 1.4, J2EE, JSP 2.0, HTML, JavaScript, JFC Swing, JDBC, SQL, PL/SQL procedures, WebLogic Application Server 8.1, Oracle 8i, Struts framework 1.2, Ant, JUnit, Log4j and Windows NT.