
Senior Hadoop Developer Resume

San Diego, CA

SUMMARY:

  • 7+ years of overall IT experience in a variety of industries, which includes hands on experience in Big Data technologies.
  • 4+ years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, Flume, HBase, Spark, Oozie, Kafka, and ZooKeeper).
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, and NodeManager.
  • Knowledge of testing with Big Data technologies such as Hadoop, MapReduce, Hive, Pig, HBase, Kafka, and Spark.
  • Hands-on experience in installing, configuring, and testing ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, HDP, Cassandra, Sqoop, Pig, and Flume.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Good experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
  • Experience in preparing test plans and executing the test cases.
  • Good experience in testing processes for Hadoop-based application design and implementation.
  • Good knowledge of Java for MapReduce testing.
  • Experience in developing Pig Latin scripts and using Hive Query Language.
  • Experience in scripting for automation and monitoring using Python.
  • Good knowledge of programming Spark in Scala; experienced with Spark SQL, Spark Streaming, and complex analytics using Spark on Cloudera Hadoop YARN.
  • Experience in developing, trouble shooting and customizing Manual as well as Automation scripts using Quick Test Professional.
  • Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
  • Sound knowledge of workflow scheduling and coordination tools such as Oozie and ZooKeeper, and of messaging with Kafka.
  • Experienced in using Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
  • Worked extensively with Dimensional modeling, Data migration, Data cleansing, Data profiling, and ETL Processes features for data warehouses.
  • Expertise in writing MapReduce jobs in Java to process large structured, semi-structured, and unstructured data sets and store them in HDFS.
  • Experience working with NoSQL databases including Cassandra, MongoDB, and HBase.
  • Strong knowledge of open-source software and network controllers.
  • Ability to work effectively with associates at all levels within the organization.
  • Strong background in mathematics and have very good analytical and problem-solving skills.
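The MapReduce pattern referenced throughout the summary above can be sketched in a few lines. This is an illustrative, pure-Python simulation of the map/shuffle/reduce stages (the production jobs described in this resume were written in Java against Hadoop); the word-count example and all names here are hypothetical:

```python
from collections import defaultdict

def map_phase(lines):
    # Emit (word, 1) pairs, as a Hadoop Mapper would.
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle(pairs):
    # Group values by key, as the MapReduce shuffle/sort stage does.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Sum the counts per word, as a Hadoop Reducer would.
    return {word: sum(counts) for word, counts in grouped.items()}

# Illustrative input, standing in for lines read from HDFS.
lines = ["big data big cluster", "data pipeline"]
counts = reduce_phase(shuffle(map_phase(lines)))
```

On a real cluster the shuffle is performed by the framework between distributed map and reduce tasks; this local version only mirrors the data flow.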

TECHNICAL SKILLS:

Hadoop Technologies: HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Flume, Oozie, Cassandra, YARN, Apache Spark, Impala, Kafka.

Hadoop Distributions: Cloudera CDH, Hortonworks HDP, MapR.

Programming Languages: Core Java, Python, SQL, C, HTML.

Database Systems: Oracle, MySQL, HBase, Cassandra

IDE Tools: Eclipse, NetBeans, IntelliJ

Monitoring Tools: Ambari, Cloudera Manager

Operating Systems: Windows, Linux, UNIX

PROFESSIONAL EXPERIENCE:

Senior Hadoop Developer

Confidential, San Diego, CA

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Experienced in working with different Hadoop ecosystem components such as HDFS, MapReduce, HBase, Spark, YARN, Kafka, ZooKeeper, Pig, Hive, Sqoop, Storm, Oozie, Impala, and Flume.
  • Importing and exporting data into HDFS from Relational databases and vice versa using Sqoop.
  • Created partitions and buckets by state in Hive to handle structured data.
  • Implemented dashboards that run HiveQL queries internally, including aggregation functions, basic Hive operations, and different kinds of join operations.
  • Used Pig across three distinct workloads: pipelines, iterative processing, and research.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Kafka, Flume.
  • Extensively used Pig to communicate with Hive using HCatalog and with HBase using storage handlers.
  • Implemented MapReduce jobs to write data into Avro format.
  • Created Hive tables to store the processed results in a tabular format.
  • Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Implemented various MapReduce jobs in custom environments and loaded their results into HBase tables through generated Hive queries.
  • Performed Sqoop operations to transfer files through HBase tables and process the data into MongoDB.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala and have a good experience in using Spark-Shell and Spark Streaming.
  • Evaluated Oozie for workflow orchestration in the automation of MapReduce jobs, Pig and Hive jobs.
  • Created tables, secondary indexes, join indexes in Teradata development Environment for testing.
  • Extracted files from MongoDB through Sqoop and placed in HDFS and processed.
  • Captured the data logs from web server into HDFS using Flume & Splunk for analysis.
  • Experienced in writing Pig scripts and Pig UDFs to pre-process the data for analysis.
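The Sqoop transfers between MySQL and HDFS described above boil down to an import command with a JDBC URL, a table, and a target directory. A minimal Python sketch that assembles such a command line; the hostname, database, table, and path are illustrative placeholders, not values from the original projects:

```python
def build_sqoop_import(jdbc_url, table, target_dir, mappers=4):
    # Assemble a `sqoop import` command line with the standard
    # connect/table/target-dir/parallelism arguments.
    return " ".join([
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ])

# Hypothetical source database and HDFS landing path.
cmd = build_sqoop_import("jdbc:mysql://dbhost/sales", "orders", "/data/raw/orders")
```

In practice the command would be run on an edge node with the MySQL JDBC driver on Sqoop's classpath; `--num-mappers` controls how many parallel map tasks split the import.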

Environment: HDFS, Hive, Pig, MapReduce, CDH, Spark, Avro, Sqoop, Oozie, Flume, Teradata, Kafka, Storm, Scala, HBase, SQL, MongoDB, Talend, Java, Splunk, Unix.

Senior Hadoop Developer

Confidential, McLean, VA

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Worked in the BI team in the area of Big Data cluster implementation and data integration in developing large-scale system software.
  • Worked with Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing.
  • Worked extensively in creating MapReduce jobs to power data for search and aggregation.
  • Designed a data warehouse using Hive.
  • Handling structured, semi structured and unstructured data.
  • Worked extensively with Sqoop for importing and exporting the data from HDFS to Relational Database systems and vice-versa.
  • Developed Simple to complex MapReduce Jobs using Hive and Pig.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Pig for data cleansing and created partitioned tables in Hive.
  • Managed and reviewed Hadoop log files.
  • Involved in creating Hive tables, loading with data and writing hive queries that will run internally in MapReduce way.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Integrated Hive and HBase to perform queries using Impala.
  • Responsible to manage data coming from different sources.
  • Developed the Pig UDF'S to pre-process the data for analysis.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Mentored analysts and the test team in writing Hive queries.
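The Pig data-cleansing step mentioned above is essentially a FILTER over raw delimited records before they are loaded into partitioned Hive tables. A small Python analogue of that filter; the three-field CSV layout and sample rows are hypothetical:

```python
def clean_records(rows):
    # Drop malformed rows (wrong field count or empty key), mirroring
    # the kind of FILTER step a Pig cleansing script performs before
    # the data lands in Hive.
    cleaned = []
    for row in rows:
        fields = row.split(",")
        if len(fields) == 3 and fields[0].strip():
            cleaned.append(tuple(f.strip() for f in fields))
    return cleaned

# Illustrative raw input: one good row, one with a missing key,
# one with the wrong field count.
raw = ["101, widget, 9.99", ", gadget, 5.00", "bad row"]
rows = clean_records(raw)
```

The equivalent Pig would be a `LOAD ... USING PigStorage(',')` followed by a `FILTER` on the key field; the point of cleansing up front is that Hive partitions stay free of records that would break downstream aggregations.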

Environment: Hadoop, MapReduce, HDFS, Hive, HBase, Sqoop, Impala, Java (jdk1.6), Pig, Flume, Oracle 11/10g, MySQL, Eclipse, PL/SQL, Java, Shell Scripting, SQL Developer, Putty, XML/HTML.

Hadoop Developer

Confidential, NY

Responsibilities:

  • Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and pre-processing.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Used Multithreading, synchronization, caching and memory management.
  • Used JAVA application development skills with Object Oriented Analysis and extensively involved throughout Software Development Life Cycle (SDLC).
  • Proactively monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Built big data clusters using the Apache Spark architecture for analytics.
  • Developed Pig Latin scripts for the analysis of semi-structured data, and developed industry-specific UDFs (user-defined functions).
  • Used Hive and created Hive tables and involved in data loading and writing Hive UDFs.
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Extracted files from MongoDB through Sqoop and placed in HDFS and processed.
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs.
  • Utilized Java and MySQL from day to day to debug and fix issues with client processes.
  • Managed and reviewed log files.
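The daemon health checks described above (shell scripts watching Hadoop services and reacting to failures) typically parse the output of `jps`, which prints one `pid ClassName` line per running JVM. A Python sketch of that check; the daemon set and the sample output are illustrative:

```python
# Daemons this hypothetical check expects on a worker/master node.
REQUIRED_DAEMONS = {"NameNode", "DataNode", "ResourceManager", "NodeManager"}

def missing_daemons(jps_output):
    # Parse `jps` output (one "pid ClassName" line per JVM) and
    # report which required Hadoop daemons are not running.
    running = {line.split()[1] for line in jps_output.splitlines() if line.strip()}
    return sorted(REQUIRED_DAEMONS - running)

# Illustrative `jps` output from a node where YARN daemons are down.
sample = "2101 NameNode\n2310 DataNode\n2544 Jps"
down = missing_daemons(sample)
```

A cron-driven wrapper would call this with the real `jps` output and alert or restart services when the returned list is non-empty.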

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Spark, MongoDB, Flume, HTML, XML, SQL, MySQL, Core Java, Eclipse, Shell scripting, UNIX.

Hadoop Developer

Confidential, CA

Responsibilities:

  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from LINUX file system to HDFS.
  • Worked on installing cluster, commissioning & decommissioning of data nodes, name node recovery, capacity planning, and slots configuration.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Responsible to manage data coming from various sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Experience in managing and reviewing Hadoop log files.
  • Managed jobs using the Fair Scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
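Reviewing Hadoop log files and deciding how to aggregate large data sets, as described above, often starts with a simple roll-up by log severity. A Python sketch of that roll-up; the log format (timestamp, time, level, message) and sample lines are hypothetical:

```python
from collections import Counter

def count_by_level(log_lines):
    # Tally log lines by severity (third whitespace-separated field
    # in this assumed "date time LEVEL message" layout).
    levels = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 3:
            levels[parts[2]] += 1
    return dict(levels)

# Illustrative daemon log lines.
logs = [
    "2015-06-01 10:00:01 INFO org.apache.hadoop.hdfs: block report",
    "2015-06-01 10:00:02 WARN org.apache.hadoop.hdfs: slow datanode",
    "2015-06-01 10:00:03 INFO org.apache.hadoop.yarn: container start",
]
summary = count_by_level(logs)
```

On a real cluster the same shape of aggregation would run as a Pig script or MapReduce job over log files landed in HDFS by Flume; this local version only shows the grouping logic.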

Environment: Hadoop, HDFS, Pig, ZooKeeper, Sqoop, HBase, Shell Scripting, Ubuntu, Red Hat Linux.

Java Developer

Confidential

Responsibilities:

  • Involved in design and development phases of Software Development Life Cycle (SDLC).
  • Involved in designing UML Use case diagrams, Class diagrams, and Sequence diagrams using Rational Rose.
  • Followed agile methodology and SCRUM meetings to track, optimize, and tailor features to customer needs.
  • Developed user interface using JSP, JSP Tag libraries, and Java Script to simplify the complexities of the application.
  • Developed a Dojo based front end including forms and controls and programmed event handling.
  • Created Action Classes which route submittals to appropriate EJB components and render retrieved information.
  • Used Core java and object-oriented concepts.
  • Used JDBC to connect to backend databases, Oracle and SQL Server 2005.
  • Proficient in writing SQL queries, stored procedures for multiple databases, Oracle and SQL Server 2005.
  • Wrote Stored Procedures using PL/SQL. Performed query optimization to achieve faster indexing and making the system more scalable.
  • Deployed the application on Windows using IBM WebSphere Application Server.
  • Used Java Messaging Services (JMS) for reliable and asynchronous exchange of important information such as payment status report.
  • Used Web Services - WSDL and REST for getting credit card information from third party.
  • Used ANT scripts to build the application and deployed it on WebSphere Application Server.

Environment: Core Java, J2EE, Oracle, SQL Server, JSP, JDK, JavaScript, HTML, CSS, Web Services, Windows.

Java Developer

Confidential

Responsibilities:

  • Interacted with customers, identified System Requirements and developed Software Requirement Specifications.
  • Developed the application using Core Java, J2EE and JSP with DB-Derby as backend.
  • Developed Use Cases, High Level Design and Detailed Design documents.
  • Implementing Multi-threading concepts.
  • Involved in initial project setup and guidelines.
  • Implementing Java design patterns wherever required.
  • Front-end development using JSP.
  • Installation and deploying in Tomcat server.
  • Responsible for development, maintenance, implementation and support of the System.
  • Responsible for change management & enhancements (major/minor).
  • Carried out different types of testing, including unit, system, and integration testing, during the testing phase.
  • Generating reports to the user in different formats like PDF, Excel, CSV.
  • Developed guidelines/checklists and maintained version control to ensure the project stayed at CMM Level 5.

Environment: Java, J2EE, JSP, JDBC, JUnit, XML, HTML, Apache Tomcat, PDF, Excel, CSV.
