Hadoop Developer Resume
Tampa, FL
SUMMARY:
- 7+ years of comprehensive experience as an Apache Hadoop Developer.
- Expertise in writing Hadoop jobs for analyzing data using Hive, Pig, and Oozie.
- Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Experience in developing MapReduce programs on Hadoop for processing Big Data.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
- Working experience in designing and implementing complete end-to-end Hadoop infrastructure, including Pig, Hive, Sqoop, Oozie, Spark, Flume, and ZooKeeper.
- Experience in supporting data analysts in running Pig and Hive queries.
- Developed MapReduce programs to perform data analysis.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in designing both time-driven and data-driven automated workflows using Oozie (see the coordinator sketch at the end of this summary).
- Good experience using Apache Spark, Storm, and Kafka.
- Experience in performance-tuning Hadoop clusters based on analysis of the existing infrastructure.
- Experience in automating Hadoop installation, configuration, and cluster maintenance using tools such as Puppet.
- Experience in working with Flume to load log data from multiple sources directly into HDFS.
- Strong debugging and problem-solving skills with excellent understanding of system development methodologies, techniques, and tools.
- Good knowledge of NoSQL databases: HBase, Cassandra.
- Worked through the complete Software Development Life Cycle (analysis, design, development, testing, implementation, and support) in application domains involving technologies ranging from object-oriented programming to Internet programming on Windows NT, Linux, and UNIX/Solaris platforms, following RUP methodologies.
- Familiar with RDBMS concepts; worked on Oracle 8i/9i, SQL Server 7.0, and DB2 7.x/8.x.
- Wrote shell scripts and Ant scripts on UNIX for application deployments to the production region.
- Exceptional ability to quickly master new concepts; capable of working in a group as well as independently, with excellent communication skills.
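A minimal sketch of the time-driven scheduling pattern referenced above: an Oozie coordinator submitted through the Oozie CLI. The host names, HDFS path, and schedule window here are illustrative assumptions, not actual project values.

    # --- job.properties (illustrative values; hosts and paths are assumptions) ---
    nameNode=hdfs://namenode-host:8020
    jobTracker=jobtracker-host:8032
    oozie.coord.application.path=${nameNode}/user/etl/coordinators/daily-load
    start=2014-01-01T00:00Z
    end=2014-12-31T00:00Z

    # --- submit the coordinator and list coordinator jobs from the shell ---
    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
    oozie jobs -oozie http://oozie-host:11000/oozie -jobtype coordinator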
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, Hive, Spark, Pig, Sqoop, Flume, Hortonworks, Oozie, and ZooKeeper.
NoSQL Databases: HBase, Cassandra, MongoDB
Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, UNIX shell scripts
Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux, and Windows XP/Vista/7/8
Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata
Tools and IDEs: Eclipse, Toad, JDeveloper, DbVisualizer
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
PROFESSIONAL EXPERIENCE:
Hadoop Developer
Confidential, Tampa, FL
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups and Hadoop log files
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager
- Upgraded the Hadoop cluster from CDH4 to CDH5, set up a high-availability cluster, and integrated Hive with existing applications
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop (see the Sqoop sketch after this list)
- Worked extensively with Sqoop for importing metadata from Oracle
- Configured Sqoop and developed scripts to extract data from MySQL into HDFS
- Hands-on experience productionizing Hadoop applications: administration, configuration management, monitoring, debugging, and performance tuning
- Created HBase tables to store PII data in various formats coming from different portfolios (see the HBase sketch after this list)
- Provided cluster coordination services through ZooKeeper
- Helped with the sizing and performance tuning of the Cassandra cluster
- Involved in the process of Cassandra data modeling and building efficient data structures.
- Trained and mentored analysts and test teams on the Hadoop framework, HDFS, MapReduce concepts, and the Hadoop ecosystem
- Responsible for architecting Hadoop clusters
- Assisted with the addition of Hadoop processing to the IT infrastructure
- Performed data analysis using Hive and Pig
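A minimal sketch of the Sqoop extractions described above; the connect strings, credentials, tables, and target directories are hypothetical placeholders, not actual project values.

    # Pull a MySQL table into HDFS (all names are assumptions).
    sqoop import \
      --connect jdbc:mysql://db-host:3306/sales \
      --username etl_user -P \
      --table transactions \
      --target-dir /user/etl/staging/transactions \
      --num-mappers 4

    # The Teradata extracts follow the same pattern with the Teradata JDBC driver.
    sqoop import \
      --connect jdbc:teradata://td-host/DATABASE=sales \
      --driver com.teradata.jdbc.TeraDriver \
      --username etl_user -P \
      --table transactions \
      --target-dir /user/etl/staging/td_transactions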
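And a minimal sketch of the HBase table creation mentioned above, driven through the HBase shell; the table name, column families, and sample row are hypothetical.

    # Create a PII table with separate column families (all names are assumptions).
    echo "create 'pii_records', {NAME => 'identity', VERSIONS => 3}, {NAME => 'portfolio'}" \
      | hbase shell
    # Quick smoke test: write one cell and scan it back.
    echo "put 'pii_records', 'cust#0001', 'identity:ssn_hash', '9f86d081'" | hbase shell
    echo "scan 'pii_records', {LIMIT => 5}" | hbase shell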
Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Spark, Cloudera Manager, Storm, Cassandra, Pig, Sqoop, Oozie, ZooKeeper, Teradata, PL/SQL, MySQL, Windows, Hortonworks, HBase
Hadoop Developer
Confidential, NJ
Responsibilities:
- Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
- Used the Spark API over Hadoop YARN to perform analytics on data in Hive (see the Spark sketch after this list).
- Responsible for Installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Developed workflows using Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig
- Implemented scripts to transmit sysprin information from Oracle to HBase using Sqoop.
- Partitioned Hive tables and ran the scripts in parallel to reduce their run time (see the partitioning sketch after this list).
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Analyzed data by performing Hive queries and running Pig scripts.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
- Worked with application teams to install the operating system, Hadoop updates, patches, version upgrades as required.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Implemented testing scripts to support test-driven development and continuous integration.
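A minimal sketch of the Spark-on-YARN analytics referenced above, using the spark-sql shell against a Hive table; the table name, query, and resource settings are illustrative assumptions.

    # Aggregate a Hive table on the YARN cluster through Spark SQL.
    spark-sql --master yarn \
      --num-executors 8 \
      --executor-memory 4g \
      -e "SELECT action, COUNT(*) AS cnt
          FROM events
          GROUP BY action
          ORDER BY cnt DESC
          LIMIT 20;"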
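And a minimal sketch of the partitioning approach referenced above: the table is partitioned by load date, so independent dates can be loaded by parallel hive invocations. Table and column names are hypothetical.

    # Create a date-partitioned table (names are assumptions).
    hive -e "CREATE TABLE IF NOT EXISTS events (user_id STRING, action STRING, ts BIGINT)
             PARTITIONED BY (load_date STRING);"

    # Load several dates in parallel, one hive session per partition.
    for d in 2015-06-01 2015-06-02 2015-06-03; do
      hive -e "INSERT OVERWRITE TABLE events PARTITION (load_date='$d')
               SELECT user_id, action, ts FROM raw_events WHERE event_date='$d';" &
    done
    wait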
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Ganglia, Sqoop, Flume, Oozie, Maven, Eclipse.
Hadoop Developer
Confidential, Portland, OR
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleansing and pre-processing
- Imported and exported data into HDFS and Hive using Sqoop
- Proactively monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers, mobile, and network devices, and pushed it to HDFS (see the Flume sketch after this list)
- Developed Puppet scripts to install Hive, Sqoop, etc. on the nodes
- Loaded and transformed large sets of structured, semi-structured, and unstructured data
- Supported MapReduce programs running on the cluster
- Loaded log data into HDFS using Flume and Kafka
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (see the health-check sketch after this list)
- Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs
- Wrote automation scripts to monitor HDFS and HBase through cron jobs
- Developed a high-performance cache, making the site stable and improving its performance
- Created a complete processing engine based on Cloudera's distribution, tuned for performance
- Provided administrative support for parallel computation research on a 24-node Fedora Linux cluster
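A minimal sketch of the Flume setup described above: one agent tailing a web-server access log into HDFS. The agent name, log path, and sink settings are illustrative assumptions.

    # --- weblog-agent.conf (illustrative values only) ---
    agent.sources  = tail-src
    agent.channels = mem-ch
    agent.sinks    = hdfs-sink

    agent.sources.tail-src.type = exec
    agent.sources.tail-src.command = tail -F /var/log/httpd/access_log
    agent.sources.tail-src.channels = mem-ch

    agent.channels.mem-ch.type = memory
    agent.channels.mem-ch.capacity = 10000

    agent.sinks.hdfs-sink.type = hdfs
    agent.sinks.hdfs-sink.hdfs.path = /flume/weblogs/%Y-%m-%d
    agent.sinks.hdfs-sink.hdfs.fileType = DataStream
    agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    agent.sinks.hdfs-sink.channel = mem-ch

    # --- start the agent from the shell ---
    flume-ng agent --name agent --conf-file weblog-agent.conf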
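And a minimal sketch of the daemon health-check script mentioned above, as run from cron; the daemon list and alert address are assumptions.

    #!/bin/bash
    # Alert if an expected Hadoop daemon is missing from the JVM process list.
    for daemon in NameNode DataNode JobTracker TaskTracker; do
      if ! jps | grep -q "$daemon"; then
        echo "$(date): $daemon is not running on $(hostname)" \
          | mail -s "Hadoop daemon alert: $daemon" ops-team@example.com
      fi
    done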
Environment: Hadoop, MapReduce, HDFS, Hive, CouchDB, Flume, Oracle 11g, Java, J2EE, Struts, Servlets, JDBC, JNDI, HTML, XML, SQL, JUnit, Maven, Tomcat 6, Eclipse.
Java Developer
Confidential
Responsibilities:
- Created and updated database objects such as complex stored procedures, joins, tables, and user-defined functions (UDFs) for business logic.
- Created views to facilitate easy user interface implementation, and triggers on them to facilitate consistent data entry into the database.
- Created and updated clustered and non-clustered indexes to maintain SQL Server performance.
- Transformed data using SSIS transformations such as Aggregate, Split, Join, Merge, Derived Column, Multicast, Term Extraction, and Data Conversion in SQL Server Integration Services (SSIS).
- Created logging for ETL loads at the package and task level to record the number of records processed by each package and task, using SSIS.
- Worked with control flow and data flow tasks such as Containers, Precedence Constraints, and the Execute SQL Task in SSIS.
- Used Script Tasks for data flow and error handling in SSIS.
- Responsible for ongoing maintenance of and change management for existing reports, and optimized report performance in SSRS 2008.
- Built reporting objects such as tabular, matrix, sub-reports, and parameterized reports using SSRS 2008.
- Designed and developed monthly, bi-weekly, and tax reports for our clients.
- Extensively worked on migrating OfficeWriter reports to SSRS 2008.
- Interacted with business administrators to gather and analyze requirements.
- Excellent analytical, communication and interpersonal skills.
Environment: Microsoft SQL Server 2008 R2, T-SQL, SSIS, SSRS, Visual Source Safe, MS Excel, Performance Point Server 2007.
Java/Web Developer
Confidential
Responsibilities:
- Developed the GUI of the system using JSP; client-side validations were performed using JavaScript.
- Built and accessed the database using JDBC for Oracle.
- Wrote stored procedures in PL/SQL on the database side.
- Developed Web pages and login form of insurance company using HTML/DHTML/CSS.
- Involved in coding JSPs for new-employee registration, login, and Single Sign-On.
- Worked on Servlets to handle client requests and carry out server-side processing.
- Deployed the application on Tomcat Server.
- Used MVC architecture to build the web application.
Environment: Java, Servlets, JSP, Tomcat, Oracle, RAD7.x, Applets, Apache ANT, XML, JavaScript, HTML.