Senior Hadoop Developer Resume
Cincinnati Area, OH
PROFESSIONAL SUMMARY:
- 7+ years of professional IT experience, including 3+ years of experience with Big Data ecosystem technologies.
- Excellent understanding of Hadoop architecture and its daemons and components, including HDFS, YARN, ResourceManager, NodeManager, NameNode, DataNode and the MapReduce programming paradigm.
- Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
- Hands-on experience with the Cloudera and Hortonworks big data distributions.
- Responsible for writing MapReduce programs in Hadoop, scripts in Pig and Hive, and custom UDFs.
- Experience in importing data from relational database systems into HDFS and exporting it back using Sqoop.
- Experience in performance tuning using partitioning, bucketing and indexing in Hive.
- Hands-on experience in extending the core functionality of Hive with UDFs, UDAFs and UDTFs (a minimal UDF sketch follows this list).
- Developed MapReduce jobs to automate data transfer from HBase.
- Experience with Spark and a good understanding of Spark concepts.
- Handled different file formats including Parquet, Apache Avro, SequenceFile, JSON, XML and flat files.
- Extensive experience working with Oracle, DB2, Apache Cassandra, SQL Server and MySQL databases.
- Experience in administering, installing, configuring, troubleshooting, securing, backing up, performance monitoring and fine-tuning Red Hat Linux.
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and core Java design patterns.
- Hands-on experience in application development using Java, RDBMS and Linux shell scripting.
- Hands-on experience with Java collections and Java performance tuning.
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Knowledge of job/workflow scheduling and coordination tools such as Oozie, UC4 and ZooKeeper.
- Working experience on 200-node clusters holding 9 petabytes of data.
- Responsibilities include interfacing with users, identifying functional and technical gaps, estimating effort, designing custom solutions, development, leading developers, producing documentation, and production support.
- Experience working with offshore teams and communicating daily status on issues and roadblocks.
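Illustrative sketch of the Hive UDF work referenced above (the class name and masking rule are hypothetical, not taken from a specific project):

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical example: masks all but the last four characters of a value.
    public class MaskValueUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String value = input.toString();
            int visible = Math.min(4, value.length());
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < value.length() - visible; i++) {
                masked.append('*');
            }
            masked.append(value.substring(value.length() - visible));
            return new Text(masked.toString());
        }
    }

Once packaged as a JAR, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before use; UDAFs and UDTFs follow the same packaging pattern with different base classes.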
TECHNICAL SKILLS:
Hadoop Core Services: HDFS, MapReduce, YARN
Hadoop Data Services: Pig, Hive, HBase, ZooKeeper, Sqoop, Flume, Oozie, Spark, Kafka
Security: Kerberos
Programming languages: C, C++, Java, Python, Scala, Linux shell scripts
Databases: Oracle 12c/11g/10g/9i, MySQL, DB2, MS-SQL Server
Operating Systems: Red Hat Linux, HP-UX 10.x, Sun Solaris 9/10, Windows 95/98/2000/XP/Vista/7/8/10
Web Technologies: HTML, XML, JavaScript
ETL Tools: Informatica, Pentaho
Environment Tools: SQL Developer, WinSCP, PuTTY
PROFESSIONAL EXPERIENCE:
Confidential, Cincinnati Area, OH
Senior Hadoop Developer
Responsibilities:
- Applied appropriate technologies to solve big data problems and to develop innovative big data solutions.
- Used Spark Streaming APIs to perform transformations and actions on the fly for building a common learner data model, which ingests data from Kafka in near real time and persists it to Cassandra (a minimal sketch follows this list).
- Worked with Spark and Scala for data analytics; maintained an ETL framework in Spark for writing data from HDFS to Hive.
- Worked on Talend with Hadoop; migrated Informatica jobs to Talend.
- Loaded data from different data sources (Teradata and DB2) into HDFS using Sqoop and loaded it into partitioned Hive tables.
- Worked with AWS cloud services such as EC2, S3 and EBS.
- Worked with different data sources such as Avro data files, XML files, JSON files, SQL Server and Oracle to load data into Hive tables.
- Analyzed the Hadoop stack and different big data tools, including Pig, Hive, the HBase database and Sqoop.
- Specified the cluster size, resource pool allocation and Hadoop distribution by writing specifications in JSON format.
- Developed a big data ingestion framework to process multi-terabyte data sets, including data quality checks.
- Designed, developed and maintained big data streaming and batch applications using Storm.
- Managed real-time data processing and ingestion into MongoDB and Hive using Storm.
- Created Oozie workflow and coordinator jobs to kick off jobs on schedule as data became available.
- Installed and configured proof-of-concept (POC) environments for MapReduce, Hive, Oozie, Flume, HBase and other major components of the Hadoop ecosystem.
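Illustrative sketch of the Kafka-to-Cassandra streaming flow described above, assuming Spark's Kafka 0.10 direct-stream integration and the DataStax Spark-Cassandra connector; the topic, keyspace, table and LearnerEvent model are hypothetical placeholders:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

    public class LearnerEventStream {

        // Hypothetical learner event model; the (hypothetical) learner_ks.learner_events
        // table is assumed to have columns matching these bean properties.
        public static class LearnerEvent implements java.io.Serializable {
            private String learnerId;
            private String payload;
            public LearnerEvent() { }
            public LearnerEvent(String learnerId, String payload) {
                this.learnerId = learnerId;
                this.payload = payload;
            }
            public String getLearnerId() { return learnerId; }
            public void setLearnerId(String learnerId) { this.learnerId = learnerId; }
            public String getPayload() { return payload; }
            public void setPayload(String payload) { this.payload = payload; }
        }

        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("LearnerEventStream");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");   // placeholder broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "learner-stream");

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Arrays.asList("learner-events"), kafkaParams));

            // Map each Kafka message (assumed "learnerId|payload") to the model
            // and persist every micro-batch to Cassandra.
            stream.foreachRDD(rdd -> {
                JavaRDD<LearnerEvent> events = rdd.map(record -> {
                    String[] parts = record.value().split("\\|", 2);
                    return new LearnerEvent(parts[0], parts.length > 1 ? parts[1] : "");
                });
                javaFunctions(events)
                    .writerBuilder("learner_ks", "learner_events", mapToRow(LearnerEvent.class))
                    .saveToCassandra();
            });

            jssc.start();
            jssc.awaitTermination();
        }
    }

The 10-second batch interval is an arbitrary choice for the sketch; in practice it is tuned against the Kafka ingest rate and Cassandra write latency.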
Environment: Spark, HDFS, Kafka, MapReduce (MR1), Pig, Hive, Sqoop, Cassandra, AWS, Talend, Java, Linux Shell Scripting
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Analyzed large data sets by running Hive queries and Pig scripts.
- Worked extensively with Sqoop for importing and exporting data between HDFS and the Oracle relational database system.
- Involved in creating Hive tables and loading and analyzing data using Hive queries.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Involved in running Hadoop jobs to process millions of records of text data.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Helped business processes by developing and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a mapper sketch follows this list).
- Developed workflows to process Flume log data using Apache Spark in Scala.
- Used Maven for compiling, testing and documenting the Scala code used for Apache Spark.
- Assisted with data capacity planning and node forecasting.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Handled structured and unstructured data and applied ETL processes.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Worked on a POC that involved building a DataStax Cassandra cluster and wrote Java programs to store data to and retrieve data from it.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Provided production rollout support and resolved issues discovered by the client and client services teams.
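Illustrative sketch of the kind of data-cleaning mapper referenced above; the pipe-delimited record layout and the rule for dropping malformed records are assumptions made for the example:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical cleaning mapper: trims whitespace, drops records that do not
    // have the expected number of pipe-delimited fields, and normalizes case.
    public class CleanRecordsMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

        private static final int EXPECTED_FIELDS = 5;

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().trim().split("\\|");
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("cleaning", "malformed").increment(1);
                return; // skip malformed record
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    cleaned.append('|');
                }
                cleaned.append(fields[i].trim().toLowerCase());
            }
            context.write(new Text(cleaned.toString()), NullWritable.get());
        }
    }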
Environment: Hadoop, MapReduce, HDFS, Hive, Sqoop, Oozie, HBase, ZooKeeper, Java (JDK 1.6), Spark, Scala, PL/SQL, SQL, Toad 9.6, UNIX Shell Scripting.
Confidential, Dallas
Hadoop Developer
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from web logs and store it in HDFS.
- Used Oozie workflow engine and UC4 scheduling to run multiple Hive and Pig Jobs.
- Experienced in managing and reviewing Hadoop log files.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Created Hive tables and involved in data loading and writing Hive UDFs.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Worked with systems that accept events from a Kafka producer and emit them into a database.
- Involved in developing Hadoop MapReduce jobs for merging and appending the repository data.
- Experience in optimizing MapReduce jobs using combiners and custom partitioners to deliver the best results, and worked on application performance optimization for an HDFS cluster (a driver sketch follows this list).
- Hands-on experience in setting up an HBase column-oriented storage repository for archival and historical data.
- Working experience with NoSQL databases.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Experience working with offshore teams and communicating daily status on issues and roadblocks.
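Illustrative sketch of a MapReduce driver wired with a combiner and a custom partitioner, as referenced above; the event-count job, record layout and partitioning rule are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Partitioner;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class EventCountJob {

        public static class EventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Hypothetical record layout: eventType|payload
                String[] parts = value.toString().split("\\|", 2);
                context.write(new Text(parts[0]), ONE);
            }
        }

        // Reused as the combiner for map-side (local) aggregation.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        // Custom partitioner: keeps keys with the same first character on one reducer.
        public static class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
            @Override
            public int getPartition(Text key, IntWritable value, int numPartitions) {
                if (key.getLength() == 0) {
                    return 0;
                }
                return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "event-count");
            job.setJarByClass(EventCountJob.class);
            job.setMapperClass(EventMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setPartitionerClass(FirstCharPartitioner.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Reusing the reducer as the combiner is valid here only because the sum aggregation is associative and commutative; non-commutative aggregations need a dedicated combiner class.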
Environment: Hadoop, MapReduce, HDFS, Hive, PIG, Sqoop, Oozie, UC4, Kafka, Cloudera, Flume, HBase, ZooKeeper, Oracle, NoSQL and Unix/Linux, Java (JDK 1.6), Eclipse
Confidential, Dallas, Texas
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and different data analytics tools, including Pig, Hive and Sqoop.
- Worked with HDFS (Hadoop Distributed File System) and created MapReduce jobs in Java.
- Implemented NameNode backup using NFS for high availability.
- Involved in loading data from the UNIX file system to HDFS (a sketch follows this list).
- Used Pig as an ETL tool to do transformations, event joins and some pre-aggregations before storing the data in HDFS.
- Responsible for developing a data pipeline using Sqoop and Pig to extract data from web logs and store it in HDFS.
- Implemented test scripts to support test driven development and continuous integration.
- Responsible for managing data coming from different sources.
- Exported the analyzed data to the relational database MySQL using Sqoop for visualization and to generate reports.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Experience in managing and reviewing Hadoop log files.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
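Illustrative sketch of loading files from the local UNIX file system into HDFS through the Hadoop FileSystem Java API, as referenced above; the paths are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LoadToHdfs {
        public static void main(String[] args) throws Exception {
            // Picks up fs.defaultFS from core-site.xml on the classpath.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            Path localDir = new Path("/data/staging/weblogs");   // placeholder local path
            Path hdfsDir = new Path("/user/etl/raw/weblogs");    // placeholder HDFS path

            if (!fs.exists(hdfsDir)) {
                fs.mkdirs(hdfsDir);
            }
            // copyFromLocalFile(delSrc, overwrite, src, dst)
            fs.copyFromLocalFile(false, true, localDir, hdfsDir);
            fs.close();
        }
    }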
Environment: Hadoop, HDFS, MapReduce, Hive, HBase, Sqoop, PIG, JAVA, Eclipse, MySQL and Ubuntu.
Confidential
Java Developer
Responsibilities:
- Involved in the analysis, design, implementation, and testing of the project
- Implemented the presentation layer with HTML, XHTML and JavaScript
- Developed web components using JSP, Servlets and JDBC (a servlet sketch follows this list)
- Designed tables and indexes
- Wrote complex SQL queries and stored procedures
- Involved in fixing bugs and unit testing with test cases using JUnit
- Actively involved in the system testing
- Involved in implementing service layer using Spring IOC module
- Prepared the Installation, Customer guide and Configuration document which were delivered to the customer along with the product
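Illustrative sketch of the servlet-plus-JDBC pattern referenced above; the customers table, connection details and query are hypothetical placeholders:

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical web component: looks up a customer name by id and renders it.
    public class CustomerLookupServlet extends HttpServlet {

        private static final String JDBC_URL = "jdbc:mysql://localhost:3306/appdb"; // placeholder

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String id = request.getParameter("id");
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            try (Connection con = DriverManager.getConnection(JDBC_URL, "appuser", "secret");
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT name FROM customers WHERE customer_id = ?")) {
                ps.setString(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        out.println("<p>Customer: " + rs.getString("name") + "</p>");
                    } else {
                        out.println("<p>No customer found.</p>");
                    }
                }
            } catch (SQLException e) {
                throw new ServletException("Lookup failed", e);
            }
        }
    }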
Environment: Java, JSP, Servlets, JDBC, JavaScript, MySQL, JUnit, Eclipse IDE