Hadoop Developer Resume
Omaha, NE
SUMMARY
- 7+ years of extensive experience in IT, including 2+ years with Big Data ecosystem technologies.
- Experience in installation, configuration, management and deployment of Big Data solutions and the underlying infrastructure of Hadoop Cluster.
- Good understanding/knowledge of Hadoop Architecture.
- Hands-on experience in installing, configuring and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Flume, Sqoop, Pig and Hive with CDH 3, 4 and 5 clusters.
- Experience in managing Hadoop clusters using Cloudera Manager.
- Set up standards and processes for Hadoop based application design and implementation.
- Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java.
- Extended Hive and Pig core functionality by writing custom UDFs, including UDAFs and UDTFs.
- Good experience in analysis using Pig and Hive, and understanding of Sqoop.
- Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
- Experience in Designing, developing and implementing connectivity products that allow efficient exchange of data between our core database engine and Hadoop ecosystem.
- Worked on NoSQL databases including HBase.
- Experience in database development using SQL and PL/SQL and experience working on databases like Oracle 9i/10g and SQL Server.
- Performed data analysis using MySQL, SQL Server Management Studio, and Oracle.
- Strong skills as a DataStage Administrator in UNIX and Linux environments; report creation using OLAP data sources and knowledge of OLAP universes.
- Profound knowledge of data warehousing principles using fact tables, dimension tables, star schema modeling and snowflake schema modeling.
- Excellent experience in ETL analysis, designing, developing, testing and implementing ETL processes including performance tuning and query optimizing of databases.
- Excellent experience in extracting source data from Sequential files, XML files, Excel files, transforming and loading it into the Confidential data warehouse.
- Experience in importing and exporting the data using Sqoop from HDFS to Relational Database systems/mainframe and vice-versa.
- Experience in using Apache Flume for collecting, aggregating and moving large amounts of data from application servers.
- Experience in using Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
- Experience in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Diverse experience utilizing Java tools in business, web and client-server environments, including Java Platform, J2EE, EJB, JSP, Java Servlets, Struts, and Java Database Connectivity (JDBC) technologies.
- Solid background in Object-Oriented Analysis and Design (OOAD); very good with various design patterns, UML and Enterprise Application Integration (EAI).
- Major strengths include familiarity with multiple software systems and the ability to learn new technologies quickly and adapt to new environments; self-motivated, focused team player and quick learner with excellent interpersonal, technical and communication skills.
- Good communication skills, a strong work ethic and the ability to work efficiently in a team, with good leadership skills.
TECHNICAL SKILLS
Big data/Hadoop: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, ZooKeeper.
Java Technologies: Core Java, I18N, JFC, Swing, Beans, Log4j, Reflection.
J2EE Technologies: Servlets, JSP, JDBC, JNDI, Java Beans.
Methodologies: Agile, UML, Design Patterns (Core Java and J2EE).
Monitoring and Reporting: Ganglia, Nagios, Custom Shell scripts.
Frameworks: MVC, Struts, Hibernate, Spring.
Programming Languages: C, C++, Java, Python, Ant scripts, Linux shell scripts.
Database: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server.
Web Servers: WebLogic, WebSphere, Apache Tomcat.
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL.
Network Protocols: SSH, TCP/IP, UDP, HTTP, DNS, DHCP.
PROFESSIONAL EXPERIENCE:
Confidential, Omaha, NE
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Provide mentorship and guidance to other architects to help them become independent.
- Provide review and feedback for existing physical architecture, data architecture and individual code.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Involved in Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting running jobs or data.
- Wrote MapReduce jobs to discover trends in data usage by users (illustrated in the sketch after this list).
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Helped the team to increase the Cluster size from 22 to 30 Nodes.
- Managed jobs using the Fair Scheduler.
- Developed a core framework based on Hadoop to migrate the existing RDBMS-based ETL solution.
- Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
- Applied a deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
- Responsible for smooth, error-free configuration of the DWH ETL solution and its integration with Hadoop.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Involved in creating Hive tables, and loading and analyzing data using hive queries.
- Responsible for managing data from multiple sources.
- Designed, developed and maintained data integration programs in a Hadoop and RDBMS environment, with both traditional and non-traditional source systems as well as RDBMS and NoSQL data stores, for data access and analysis.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Expert knowledge of developing and debugging in Java/J2EE.
- Wrote Hive queries and UDFs.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Extracted feeds from social media sites such as Facebook and Twitter.
- Created Pig Latin scripts to sort, group, join and filter enterprise-wide data.
- Implemented partitioning, dynamic partitions and bucketing in Hive.
- Gained experience in managing and reviewing Hadoop log files.
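The following is a minimal, illustrative sketch of the kind of usage-trend MapReduce job referenced in this list, not the production code; it assumes hypothetical tab-delimited log lines with the user id in the first column.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class UsageTrendJob {

        // Emits (userId, 1) for every log line; assumes the user id is the first tab-delimited column.
        public static class UsageMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text userId = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t", -1);
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    userId.set(fields[0]);
                    context.write(userId, ONE);
                }
            }
        }

        // Sums the per-user counts emitted by the mapper (also usable as a combiner).
        public static class UsageReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long total = 0;
                for (LongWritable v : values) {
                    total += v.get();
                }
                context.write(key, new LongWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "usage-trend");
            job.setJarByClass(UsageTrendJob.class);
            job.setMapperClass(UsageMapper.class);
            job.setCombinerClass(UsageReducer.class);
            job.setReducerClass(UsageReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

A job of this shape is packaged into a jar and submitted with the hadoop jar command, with input and output HDFS paths as arguments.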
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Java, Oracle 10g, MySQL, Ubuntu.
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Developed shell scripts to automate the cluster installation.
- Played a major role in choosing the right configurations for Hadoop.
- Involved in the end-to-end process of Hadoop cluster installation, configuration and monitoring.
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Setup and benchmarked Hadoop/HBase clusters for internal use.
- Developed Pig Latin scripts to extract and filter relevant data from the web server output files to load into HDFS.
- Responsible for building scalable distributed data solutions using Hadoop.
- Upgraded the Hadoop cluster to CDH2, set up a high-availability cluster, and integrated Hive with existing applications.
- Applied a deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Familiar with ETL standards and processes; developed ETL logic per those standards through the stages Source to Flat File, Flat File to Stage, Stage to Work, Work to Work Interim tables, and Work Interim tables to Confidential tables.
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and loaded data into HDFS.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Created HBase tables to store data arriving in variable formats from different portfolios.
- Involved in transforming data from mainframe tables to HDFS and HBase tables using Sqoop.
- Implemented test scripts to support test driven development and continuous integration.
- Specified the cluster size, allocated resource pools, and defined the Hadoop distribution by writing specification files in JSON format.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Extracted the data from MySQL into HDFS using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behaviour.
- Implemented business logic in Hadoop by writing UDFs in Java and used various UDFs from Piggybank and other sources (a minimal example appears in the sketch after this list).
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
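The following is a minimal, illustrative sketch of a Hive UDF in Java of the kind referred to above; the class name and the masking rule are hypothetical, not the actual business logic.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical example UDF: masks all but the last four characters of an identifier.
    public final class MaskId extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String value = input.toString();
            if (value.length() <= 4) {
                return input;
            }
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < value.length() - 4; i++) {
                masked.append('*');
            }
            masked.append(value.substring(value.length() - 4));
            return new Text(masked.toString());
        }
    }

In Hive, a UDF like this is registered with ADD JAR and CREATE TEMPORARY FUNCTION before it can be called from HiveQL.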
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, SQL, Cloudera Manager, Sqoop, Flume, Oozie, Java (JDK 1.6), Eclipse.
Confidential, Bolingbrook, IL
Java/Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
- Analyzed Hadoop clusters and other big data analytical tools such as Hive and Pig, and databases such as HBase.
- Used Hadoop to build scalable distributed data solutions.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Devised procedures that solve complex business problems with due considerations for hardware/software capacity and limitations, operating times and desired results.
- Worked hands-on with the ETL process.
- Extracted feeds from social media sites such as Facebook and Twitter using Flume.
- Involved in loading data from LINUX file system to HDFS.
- Used Sqoop extensively to ingest data from various source systems into HDFS.
- Wrote Hive queries for data analysis to meet the business requirements.
- Created Hive tables and worked on them using HiveQL.
- Installed the cluster and worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Assisted in managing and reviewing Hadoop log files.
- Assisted in loading large sets of data (structured, semi-structured and unstructured).
- Implemented Hadoop cluster on Ubuntu Linux.
- Provided cluster coordination services through ZooKeeper.
- Installed and configured Flume, Sqoop, Pig, Hive, HBase on Hadoop clusters.
- Managed Hadoop clusters, including commissioning and decommissioning cluster nodes for maintenance and capacity needs.
- Wrote test cases in JUnit for unit testing of classes (a minimal example appears in the sketch after this list).
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Involved in developing templates and screens in HTML and JavaScript.
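The following is a minimal, illustrative sketch of a JUnit 4 test of the kind mentioned above; the field-extraction helper under test is a hypothetical stand-in, not code from the project.

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertNull;
    import org.junit.Test;

    public class LogFieldExtractorTest {

        // Hypothetical helper under test: pulls one column out of a tab-delimited log line.
        static String field(String line, int index) {
            String[] parts = line.split("\t");
            return index < parts.length ? parts[index] : null;
        }

        @Test
        public void returnsRequestedColumn() {
            assertEquals("login", field("u123\t2013-05-01\tlogin", 2));
        }

        @Test
        public void returnsNullWhenColumnIsMissing() {
            assertNull(field("u123", 2));
        }
    }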
Environment: Eclipse IDE, Linux, Hadoop MapReduce, Pig Latin, Sqoop, Java, Hive, HBase, Unix Shell Scripting.
Confidential, Minneapolis, MN
Sr. Java / J2EE Developer
Responsibilities:
- Requirement analysis of the business specifications, development of programs Specification, System Testing, Internal code reviews for quality, Client Interaction.
- Actively involved in development and provided support for implementation.
- Worked in part as a technical lead in the design and development of the enterprise Service Platform.
- Used JMS messages to log audit messages (success and failure) to GAL (General Audit Logging); a minimal example appears in the sketch after this list.
- Used Hibernate for persisting the customer and billing information, and EHCache for second-level caching.
- Used MDBs for nightly auto-billing and payment service runs.
- Used WS-Security for authenticating the SOAP messages along with encryption and decryption.
- Actively involved in resolving the design bottlenecks and optimized queries depending on the service calls with respect to cost and time spent.
- Performance tuning to identify and resolve possible bottlenecks in the application.
- Ensured code quality using tools such as FindBugs and Hudson.
- Parameterized different JVM settings to obtain optimal values for the application.
- Automated the deployment to each environment.
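The following is a minimal, illustrative sketch of how the JMS-based audit logging described above could look; the JNDI names and the message format are hypothetical.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class AuditLogger {

        // Sends a success/failure audit event to the audit queue.
        public void logAudit(String auditEvent) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/AuditConnectionFactory"); // hypothetical JNDI name
            Queue queue = (Queue) ctx.lookup("jms/AuditQueue");                                       // hypothetical JNDI name
            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                TextMessage message = session.createTextMessage(auditEvent);
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }

Centralizing the send in one helper keeps success and failure audit logging consistent across service calls.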
Environment: Eclipse, IBM RAD, JAX-WS, XML, XSD, Java, J2EE, Struts, Spring, Hibernate, Ajax, CVS, log4j, JUnit, Oracle, Linux, Weblogic, and Load Runner.
Confidential
Java/JEE Developer
Responsibilities:
- Created a Java-based 24x7 web application backed by an Oracle database (database access illustrated in the sketch after this list).
- Designing of the logical and physical data model, generation of DDL scripts, and writing DML scripts for Oracle 9i database.
- Tuning of SQL statements to improve performance and meet the SLAs.
- Gathering business requirements and writing functional specifications and detailed design documents.
- Building and deployment of Java applications into multiple Unix based environments and producing both unit and functional test results along with release notes.
- Documentation of the process involved in the software development.
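The following is a minimal, illustrative sketch of plain JDBC access from the Java application to the Oracle database, using a parameterized query; the connection details and the table and column names are hypothetical.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class OrderLookup {
        public static void main(String[] args) throws Exception {
            Class.forName("oracle.jdbc.driver.OracleDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@localhost:1521:ORCL", "app_user", "app_password"); // hypothetical connection details
            try {
                PreparedStatement stmt = conn.prepareStatement(
                        "SELECT order_id, status FROM orders WHERE customer_id = ?");     // hypothetical table and columns
                stmt.setLong(1, 42L);
                ResultSet rs = stmt.executeQuery();
                while (rs.next()) {
                    System.out.println(rs.getLong("order_id") + " " + rs.getString("status"));
                }
            } finally {
                conn.close();
            }
        }
    }

Using a PreparedStatement keeps the inputs parameterized and lets the database reuse the statement's execution plan.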
Environment: Java 1.5, XML, XSL, XHTML, Oracle 9i, PL/SQL.