Hadoop Developer Resume
Pennsylvania
SUMMARY
- 7+ years of professional IT experience, including 5+ years of Hadoop/Big Data experience processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
- Expert in importing and exporting data between HDFS and relational database systems using Sqoop.
- Expert-level scripting with Pig scripts and Hive queries for processing and analyzing large volumes of data.
- Experience with the Oozie workflow engine for designing workflows and scheduling jobs with actions that run Hadoop MapReduce and Pig jobs.
- Good experience developing and implementing big data solutions and data mining applications on Hadoop using Hive, Pig, HBase, Hue, and Oozie workflows, and designing and implementing Java MapReduce programs.
- Good knowledge of Hadoop cluster administration, and of monitoring and managing Hadoop clusters using Cloudera Manager.
- Knowledge of installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Flume, Zookeeper, Kafka, and Spark.
- Experience in managing and reviewing Hadoop log files.
- Good experience with the Flume tool for data ingestion from various data producers (web servers) into Hadoop.
- Expert knowledge of the Cassandra NoSQL database.
- Good knowledge of the NoSQL databases HBase and MongoDB.
- Sound grasp of relational database concepts; extensively worked with Oracle, MySQL, and SQL Server.
- Good experience with databases, writing complex queries and stored procedures using SQL and PL/SQL.
- Hands-on experience developing UDFs, DataFrames, and SQL queries in Spark SQL.
- Experience using SequenceFile, RCFile, and Avro file formats.
- Good understanding of classic Hadoop and YARN architecture, along with the various Hadoop daemons: JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, ResourceManager, NodeManager, ApplicationMaster, and containers.
- Very good experience with both MapReduce 1 (JobTracker) and MapReduce 2 (YARN) setups.
- Expert in Java MapReduce jobs and user-defined functions for Pig and Hive.
- Experience in handling messaging services using Apache Kafka.
- Basic knowledge of Kerberos security for Hadoop.
- Good knowledge of developing and maintaining ETL mappings to extract data from multiple sources and load it into databases.
- Experience with the business intelligence tool Tableau for visually analyzing data.
- Experience building and maintaining multiple Hadoop clusters of different sizes and configurations, and setting up rack topology for large clusters.
- Good knowledge of the Hibernate ORM framework with the Spring framework.
- Experience developing and implementing web applications using Java, JSP, CSS, HTML, HTML5, XHTML, JavaScript, JSON, XML, and JDBC.
- Good knowledge of the Scala and Python scripting languages.
- Knowledge of SOAP and REST web services.
- Involvement in all phases of the SDLC, from project proposal, planning, and analysis through development, testing, deployment, and support.
- Experience working in 24x7 support, accustomed to meeting deadlines and adaptable to ever-changing priorities.
- Proven ability to work with senior technical managers and staff to provide expert-level support for the installation, maintenance, upgrading, and administration of full-featured database management systems.
- Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with strong problem-solving ability and the ability to work well with people and maintain good relations within the organization.
TECHNICAL SKILLS
Big Data and Hadoop: Hadoop, Apache/Cloudera HDFS 1.x/2.x, MapReduce, YARN, Sqoop, Flume, Hive, Pig, Oozie, Zookeeper, Kafka, Hue, Ambari, Cloudera Manager, Spark
Databases: Oracle, MySQL, Microsoft SQL Server
NoSQL Databases: Cassandra, HBase, MongoDB
Hadoop Distributions: Hortonworks, Apache Hadoop, Cloudera
Programming Languages: C, C++, Java, Scala, SQL and PL/SQL, Unix Shell Scripting, Python
Packages: MS Office, MS Visio
Operating Systems: MS Windows, Linux, Ubuntu, CentOS
IDE Tools: Eclipse, NetBeans
ETL Tools: Informatica
Methodologies: SDLC, Agile, Scrum Framework
Other skills: Tableau, HTML, JavaScript, SOAP and REST web services, Spring, Hibernate, JDBC, JUnit testing, Jenkins, Apache Tomcat, Maven, GitHub, Kerberos
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential - Pennsylvania
Responsibilities:
- Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, Oozie, Zookeeper, HBase, Flume, and Sqoop.
- Implemented multiple MapReduce jobs in Java for data cleaning and pre-processing (see the mapper sketch at the end of this section).
- Worked in a team with a 40-node cluster and grew the cluster by adding nodes; configuration of the additional data nodes was done through the commissioning process in Hadoop.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Responsible for managing data coming from different sources.
- Managed and scheduled jobs on a Hadoop cluster.
- Implemented a script to transfer data from Oracle to HBase using Sqoop.
- Involved in defining job flows, managing and reviewing log files.
- Installed the Oozie workflow engine to run multiple MapReduce, HiveQL, and Pig jobs.
- Participated in requirements gathering from the experts and business partners, and converted the requirements into technical specifications.
- Developed Spark applications to migrate data into HBase tables from traditional databases like MySQL and Oracle (see the Spark SQL sketch at the end of this section).
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Performed ETL jobs to integrate the data to HDFS using Informatica.
- Implemented a script to transmit information from web servers to Hadoop using Flume.
- Created Hive tables to store the processed results in a tabular format.
- Worked with various compression codecs and file formats such as Snappy, Gzip, Avro, SequenceFile, and plain text.
- Wrote complex Hive queries and UDFs.
- Used the Hibernate framework with Spring to persist and retrieve data from the database.
- Involved in forecasting based on present results and insights derived from data analysis.
- Involved in collecting data and identifying data patterns to build a trained model using machine learning.
- Prepared developer (unit) test cases and executed developer testing.
- Implemented test scripts to support test driven development and continuous integration.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Worked with the visualization tool Tableau to visually analyze the data.
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Spark, Sqoop, Flume, Oozie, Big Data, Java, JUnit testing, Oracle, MySQL, Tableau, Linux, Windows.
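The data-cleaning MapReduce jobs above were written against the standard Hadoop Mapper API. A minimal sketch of that pattern follows; the comma delimiter, five-column record width, and class name are illustrative assumptions rather than the actual production job.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class RecordCleaningMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 5; // assumed schema width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Drop blank lines and records with the wrong number of columns,
        // trimming stray whitespace before emitting the record unchanged.
        String line = value.toString().trim();
        if (line.isEmpty()) {
            return;
        }
        String[] fields = line.split(",", -1);
        if (fields.length != EXPECTED_FIELDS) {
            // Track how many records were rejected, visible in job counters.
            context.getCounter("cleaning", "malformed").increment(1);
            return;
        }
        context.write(NullWritable.get(), new Text(line));
    }
}
```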
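The Spark migration work can be sketched with the Spark SQL Java API as below. The JDBC URL, credentials, table name, and query are assumptions, and the final write into HBase (typically done via an HBase connector or bulk Puts) is elided.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CustomerMigration {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("mysql-to-hbase-migration")
                .getOrCreate();

        // Pull the source table from MySQL over JDBC (host/db assumed).
        Dataset<Row> customers = spark.read()
                .format("jdbc")
                .option("url", "jdbc:mysql://db-host:3306/sales")
                .option("dbtable", "customers")
                .option("user", "etl_user")
                .option("password", "****")
                .load();

        // Use Spark SQL to reshape the data before it is written to HBase.
        customers.createOrReplaceTempView("customers");
        Dataset<Row> cleaned = spark.sql(
                "SELECT id, UPPER(name) AS name, city FROM customers WHERE id IS NOT NULL");

        cleaned.show(10); // sanity check; the real job writes to HBase here
        spark.stop();
    }
}
```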
Hadoop Developer
Confidential - Irving, TX
Responsibilities:
- Worked on data migration from existing data sources to Hadoop file system.
- Interacted with the business analysts to gather requirements and formulate them into system use cases.
- Understood customer business use cases and translated them into analytical data applications and models to implement a solution.
- Created flow charts, sequence diagrams, schemas, data model of underlying system, pseudocode and class diagrams using Microsoft Visio.
- Developed and maintained ETL (Extract, Transform, and Load) mappings to extract data from multiple source systems such as Oracle, SQL Server, and flat files, and loaded it into Oracle.
- Developed Informatica Workflows and sessions associated with the mappings using Workflow Manager.
- Worked on ingesting real-time streaming data using Spark Streaming and analyzed the data with Spark SQL using Scala.
- Created a custom database encryption and decryption UDF that could be plugged in while ingesting data into external Hive tables, to maintain security at the table or column level.
- Worked with the Cassandra wide-column NoSQL database (see the Java driver sketch at the end of this section).
- Worked with the NoSQL Cassandra database to process very large semi-structured and structured tables by defining column families.
- Developed MapReduce programs for different patterns of data on the Hadoop cluster.
- Developed Java MapReduce programs using core concepts like OOP, multithreading, collections, and I/O; compiled and built the application using Maven and used SVN as the version control system.
- Created data ingestion plans for loading data from external sources using Sqoop.
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Spark, Sqoop, Oozie, Informatica, Cassandra, Java, Scala, Maven, Oracle 11g/10g, MySQL, MS Visio, Linux, Windows.
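A minimal sketch of the Cassandra column-family modeling described above, using the DataStax Java driver (3.x-style API). The keyspace, table, and contact point are illustrative assumptions; the partition key (user_id) keeps a user's events together in one wide row, with event_time as the clustering column.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CassandraExample {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // assumed contact point
                .build();
             Session session = cluster.connect()) {

            session.execute("CREATE KEYSPACE IF NOT EXISTS events "
                    + "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}");

            // One wide row per user: the partition key groups related events.
            session.execute("CREATE TABLE IF NOT EXISTS events.user_events ("
                    + "user_id text, event_time timestamp, payload text, "
                    + "PRIMARY KEY (user_id, event_time))");

            session.execute("INSERT INTO events.user_events (user_id, event_time, payload) "
                    + "VALUES ('u42', toTimestamp(now()), '{\"action\":\"login\"}')");

            // Queries against the partition key are served from a single node.
            ResultSet rs = session.execute(
                    "SELECT * FROM events.user_events WHERE user_id = 'u42'");
            for (Row row : rs) {
                System.out.println(row.getString("payload"));
            }
        }
    }
}
```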
Hadoop Developer
Confidential
Responsibilities:
- Analyzed large data sets by running Hive queries and Pig scripts.
- Involved in creating Hive tables and loading and analyzing data using Hive queries.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Involved in running Hadoop jobs to process millions of records of text data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Implemented partitioning, dynamic partitions, and buckets in Hive (see the DDL sketch at the end of this section).
- Monitored system health and logs and responded accordingly to any warning or failure conditions.
- Implemented workflows using the Apache Oozie framework to automate tasks.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Developed unit test cases for Hadoop MapReduce jobs with JUnit.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Involved in loading data from the Linux file system to HDFS.
- Worked with the Tableau visualization tool to create dashboards and worksheets.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Supported MapReduce programs running on the cluster.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Oozie, Big Data, Java, Oracle 11g/10g, MySQL, Tableau, Linux, Windows.
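The Hive partitioning and bucketing work above can be sketched as DDL driven through the Hive JDBC driver. The host, table names, and schema are illustrative assumptions, not the actual warehouse layout.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HivePartitioning {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/default", "hive", ""); // assumed host
             Statement stmt = conn.createStatement()) {

            // Partition by date and bucket by user for faster joins and sampling.
            stmt.execute("CREATE TABLE IF NOT EXISTS events_part ("
                    + "user_id STRING, action STRING) "
                    + "PARTITIONED BY (event_date STRING) "
                    + "CLUSTERED BY (user_id) INTO 16 BUCKETS "
                    + "STORED AS ORC");

            // Enable dynamic partitioning so each event_date lands in its own partition.
            stmt.execute("SET hive.exec.dynamic.partition = true");
            stmt.execute("SET hive.exec.dynamic.partition.mode = nonstrict");
            // Needed on older Hive releases; later versions always enforce bucketing.
            stmt.execute("SET hive.enforce.bucketing = true");

            stmt.execute("INSERT INTO TABLE events_part PARTITION (event_date) "
                    + "SELECT user_id, action, event_date FROM events_staging");
        }
    }
}
```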
Hadoop Developer
Confidential
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase NoSQL database, and Sqoop.
- Imported and exported data in HDFS and Hive using Sqoop.
- Experience with NoSQL databases.
- Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
- Wrote Hive UDFs to extract data from staging tables (see the UDF sketch at the end of this section).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Familiar with job scheduling using the Fair Scheduler so that CPU time is well distributed among all jobs.
- Involved in regular Hadoop cluster maintenance, such as patching security holes and updating system packages.
- Managed Hadoop log files.
- Analyzed the web log data using HiveQL.
Environment: Java 6, Eclipse, Hadoop, Hive, HBase, MongoDB, Linux, MapReduce, HDFS, shell scripting, MySQL.
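A minimal sketch of the kind of Hive UDF used to extract fields from the staging tables, here pulling the request path out of raw web-log lines. The log format (Apache-style request strings) and the class name are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class ExtractRequestPath extends UDF {
    // Matches "GET /index.html HTTP/1.1" and captures the request path.
    private static final Pattern REQUEST =
            Pattern.compile("\"[A-Z]+ (\\S+) HTTP/[0-9.]+\"");

    public Text evaluate(Text logLine) {
        // Return NULL for missing or unparseable rows rather than failing the query.
        if (logLine == null) {
            return null;
        }
        Matcher m = REQUEST.matcher(logLine.toString());
        return m.find() ? new Text(m.group(1)) : null;
    }
}
```

Once the jar is added to the Hive session, such a function would be registered with CREATE TEMPORARY FUNCTION (for example, extract_path AS 'ExtractRequestPath') and applied in a SELECT over the staging table; the names here are hypothetical.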
Java Developer
Confidential
Responsibilities:
- Involved in requirements analysis, design, and development, as well as testing of the product and metric calculation.
- Responsible for monitoring system development and understanding the framework used.
- Held technical discussions with the team for feasibility studies.
- Built and customized the application; wrote and performed unit/integration test cases.
- Established JDBC connections using a database connection pool (see the pooling sketch at the end of this section).
- Involved in debugging the applications.
Environment: Java, J2EE, JavaScript, HTML, MySQL
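Pooled JDBC access of the kind described above can be sketched with Apache Commons DBCP, one common pool implementation; the URL, credentials, pool sizes, and table are illustrative assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import org.apache.commons.dbcp2.BasicDataSource;

public class PooledDao {
    private static final BasicDataSource POOL = new BasicDataSource();

    static {
        POOL.setUrl("jdbc:mysql://localhost:3306/app"); // assumed database
        POOL.setUsername("app_user");
        POOL.setPassword("****");
        POOL.setInitialSize(5);  // connections opened up front
        POOL.setMaxTotal(20);    // upper bound on concurrent connections
    }

    public String findName(int id) throws Exception {
        // Borrow a connection from the pool; close() returns it rather than closing it.
        try (Connection conn = POOL.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT name FROM users WHERE id = ?")) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```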