
Senior Hadoop Developer Resume

San Diego, CA

SUMMARY:

  • 7+ years of overall IT experience in a variety of industries, which includes hands on experience in Big Data technologies.
  • 4+ years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, Flume, HBase, Spark, Oozie, Kafka, and ZooKeeper).
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, and NodeManager.
  • Knowledge of testing with Big Data technologies such as Hadoop, MapReduce, Hive, Pig, HBase, Kafka, and Spark.
  • Hands-on experience in installing, configuring, and testing ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, HDP, Cassandra, Sqoop, Pig, and Flume.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Good experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
  • Experience in preparing test plans and executing the test cases.
  • Good experience in testing processes for Hadoop-based application design and implementation.
  • Good knowledge of Java for MapReduce testing.
  • Experience in developing Pig Latin scripts and using Hive Query Language.
  • Experience in scripting for automation and monitoring using Python.
  • Good knowledge of programming Spark in Scala; experienced with Spark SQL, Spark Streaming, and complex analytics using Spark on Cloudera Hadoop YARN.
  • Experience in developing, trouble shooting and customizing Manual as well as Automation scripts using Quick Test Professional.
  • Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
  • Sound knowledge of workflow scheduling and coordination tools such as Oozie and ZooKeeper, and of messaging with Kafka.
  • Experienced in using Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
  • Worked extensively with Dimensional modeling, Data migration, Data cleansing, Data profiling, and ETL Processes features for data warehouses.
  • Expertise in writing MapReduce jobs in Java to process large structured, semi-structured, and unstructured data sets and store them in HDFS.
  • Experience working with NoSQL databases including Cassandra, MongoDB, and HBase.
  • Strong knowledge of open-source software and network controllers.
  • Ability to work effectively with associates at all levels within the organization.
  • Strong background in mathematics and have very good analytical and problem-solving skills.
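The MapReduce pattern referenced throughout the summary above can be sketched in a few lines. This is an illustrative, pure-Python simulation of the map/shuffle/reduce stages (the production jobs described in this resume were written in Java against Hadoop); the word-count example and all names here are hypothetical:

```python
from collections import defaultdict

def map_phase(lines):
    # Emit (word, 1) pairs, as a Hadoop Mapper would.
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle(pairs):
    # Group values by key, as the MapReduce shuffle/sort stage does.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Sum the counts per word, as a Hadoop Reducer would.
    return {word: sum(counts) for word, counts in grouped.items()}

# Illustrative input, standing in for lines read from HDFS.
lines = ["big data big cluster", "data pipeline"]
counts = reduce_phase(shuffle(map_phase(lines)))
```

On a real cluster the shuffle is performed by the framework between distributed map and reduce tasks; this local version only mirrors the data flow.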

TECHNICAL SKILLS:

Hadoop Technologies: HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Flume, Oozie, Cassandra, YARN, Apache Spark, Impala, Kafka.

Hadoop Distributions: Cloudera CDH, Hortonworks HDP, MapR.

Programming Languages: Core Java, Python, SQL, C, HTML.

Database Systems: Oracle, MySQL, HBase, Cassandra

IDE Tools: Eclipse, NetBeans, IntelliJ

Monitoring Tools: Ambari, Cloudera Manager

Operating Systems: Windows, Linux, UNIX

PROFESSIONAL EXPERIENCE:

Senior Hadoop Developer

Confidential, San Diego, CA

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Experienced in working with different Hadoop ecosystem components such as HDFS, MapReduce, HBase, Spark, YARN, Kafka, ZooKeeper, Pig, Hive, Sqoop, Storm, Oozie, Impala, and Flume.
  • Importing and exporting data into HDFS from Relational databases and vice versa using Sqoop.
  • Created partitions and buckets by state in Hive to handle structured data.
  • Implemented dashboards that run HiveQL queries internally, including aggregation functions, basic Hive operations, and different kinds of join operations.
  • Used Pig across three distinct workloads: pipelines, iterative processing, and research.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Kafka, Flume.
  • Extensively used Pig to communicate with Hive using HCatalog and with HBase using storage handlers.
  • Implemented MapReduce jobs to write data into Avro format.
  • Created Hive tables to store the processed results in a tabular format.
  • Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Implemented various MapReduce jobs in custom environments and loaded their results into HBase tables through generated Hive queries.
  • Performed Sqoop operations to transfer files through HBase tables and process the data into MongoDB.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala and have a good experience in using Spark-Shell and Spark Streaming.
  • Evaluated Oozie for workflow orchestration in the automation of MapReduce jobs, Pig and Hive jobs.
  • Created tables, secondary indexes, join indexes in Teradata development Environment for testing.
  • Extracted files from MongoDB through Sqoop and placed in HDFS and processed.
  • Captured the data logs from web server into HDFS using Flume & Splunk for analysis.
  • Experienced in writing Pig scripts and Pig UDFs to pre-process the data for analysis.
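The Sqoop transfers between MySQL and HDFS described above boil down to an import command with a JDBC URL, a table, and a target directory. A minimal Python sketch that assembles such a command line; the hostname, database, table, and path are illustrative placeholders, not values from the original projects:

```python
def build_sqoop_import(jdbc_url, table, target_dir, mappers=4):
    # Assemble a `sqoop import` command line with the standard
    # connect/table/target-dir/parallelism arguments.
    return " ".join([
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ])

# Hypothetical source database and HDFS landing path.
cmd = build_sqoop_import("jdbc:mysql://dbhost/sales", "orders", "/data/raw/orders")
```

In practice the command would be run on an edge node with the MySQL JDBC driver on Sqoop's classpath; `--num-mappers` controls how many parallel map tasks split the import.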

Environment: HDFS, Hive, Pig, MapReduce, CDH, Spark, Avro, Sqoop, Oozie, Flume, Teradata, Kafka, Storm, Scala, HBase, SQL, MongoDB, Talend, Java, Splunk, Unix.

Senior Hadoop Developer

Confidential, McLean, VA

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Worked in the BI team in the area of Big Data cluster implementation and data integration in developing large-scale system software.
  • Worked with Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing.
  • Worked extensively in creating MapReduce jobs to power data for search and aggregation.
  • Designed a data warehouse using Hive.
  • Handling structured, semi structured and unstructured data.
  • Worked extensively with Sqoop for importing and exporting the data from HDFS to Relational Database systems and vice-versa.
  • Developed Simple to complex MapReduce Jobs using Hive and Pig.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Pig for data cleansing and created partitioned tables in Hive.
  • Managed and reviewed Hadoop log files.
  • Involved in creating Hive tables, loading with data and writing hive queries that will run internally in MapReduce way.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Integrated Hive and HBase to perform queries using Impala.
  • Responsible to manage data coming from different sources.
  • Developed the Pig UDF'S to pre-process the data for analysis.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Mentored analysts and the test team in writing Hive queries.
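The Pig data-cleansing step mentioned above is essentially a FILTER over raw delimited records before they are loaded into partitioned Hive tables. A small Python analogue of that filter; the three-field CSV layout and sample rows are hypothetical:

```python
def clean_records(rows):
    # Drop malformed rows (wrong field count or empty key), mirroring
    # the kind of FILTER step a Pig cleansing script performs before
    # the data lands in Hive.
    cleaned = []
    for row in rows:
        fields = row.split(",")
        if len(fields) == 3 and fields[0].strip():
            cleaned.append(tuple(f.strip() for f in fields))
    return cleaned

# Illustrative raw input: one good row, one with a missing key,
# one with the wrong field count.
raw = ["101, widget, 9.99", ", gadget, 5.00", "bad row"]
rows = clean_records(raw)
```

The equivalent Pig would be a `LOAD ... USING PigStorage(',')` followed by a `FILTER` on the key field; the point of cleansing up front is that Hive partitions stay free of records that would break downstream aggregations.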

Environment: Hadoop, MapReduce, HDFS, Hive, HBase, Sqoop, Impala, Java (jdk1.6), Pig, Flume, Oracle 11/10g, MySQL, Eclipse, PL/SQL, Java, Shell Scripting, SQL Developer, Putty, XML/HTML.

Hadoop Developer

Confidential, NY

Responsibilities:

  • Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and pre-processing.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Used Multithreading, synchronization, caching and memory management.
  • Used JAVA application development skills with Object Oriented Analysis and extensively involved throughout Software Development Life Cycle (SDLC).
  • Proactively monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Built big data clusters using the Apache Spark architecture for analytics.
  • Developed Pig Latin scripts for the analysis of semi-structured data, and developed industry-specific UDFs (user-defined functions).
  • Used Hive and created Hive tables and involved in data loading and writing Hive UDFs.
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Extracted files from MongoDB through Sqoop and placed in HDFS and processed.
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs.
  • Utilized Java and MySQL from day to day to debug and fix issues with client processes.
  • Managed and reviewed log files.
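The daemon health checks described above (shell scripts watching Hadoop services and reacting to failures) typically parse the output of `jps`, which prints one `pid ClassName` line per running JVM. A Python sketch of that check; the daemon set and the sample output are illustrative:

```python
# Daemons this hypothetical check expects on a worker/master node.
REQUIRED_DAEMONS = {"NameNode", "DataNode", "ResourceManager", "NodeManager"}

def missing_daemons(jps_output):
    # Parse `jps` output (one "pid ClassName" line per JVM) and
    # report which required Hadoop daemons are not running.
    running = {line.split()[1] for line in jps_output.splitlines() if line.strip()}
    return sorted(REQUIRED_DAEMONS - running)

# Illustrative `jps` output from a node where YARN daemons are down.
sample = "2101 NameNode\n2310 DataNode\n2544 Jps"
down = missing_daemons(sample)
```

A cron-driven wrapper would call this with the real `jps` output and alert or restart services when the returned list is non-empty.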

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Spark, MongoDB, Flume, HTML, XML, SQL, MySQL, Core Java, Eclipse, Shell scripting, UNIX.

Hadoop Developer

Confidential, CA

Responsibilities:

  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from LINUX file system to HDFS.
  • Worked on installing cluster, commissioning & decommissioning of data nodes, name node recovery, capacity planning, and slots configuration.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Responsible to manage data coming from various sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Experience in managing and reviewing Hadoop log files.
  • Managed jobs using the Fair Scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
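Reviewing Hadoop log files and deciding how to aggregate large data sets, as described above, often starts with a simple roll-up by log severity. A Python sketch of that roll-up; the log format (timestamp, time, level, message) and sample lines are hypothetical:

```python
from collections import Counter

def count_by_level(log_lines):
    # Tally log lines by severity (third whitespace-separated field
    # in this assumed "date time LEVEL message" layout).
    levels = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 3:
            levels[parts[2]] += 1
    return dict(levels)

# Illustrative daemon log lines.
logs = [
    "2015-06-01 10:00:01 INFO org.apache.hadoop.hdfs: block report",
    "2015-06-01 10:00:02 WARN org.apache.hadoop.hdfs: slow datanode",
    "2015-06-01 10:00:03 INFO org.apache.hadoop.yarn: container start",
]
summary = count_by_level(logs)
```

On a real cluster the same shape of aggregation would run as a Pig script or MapReduce job over log files landed in HDFS by Flume; this local version only shows the grouping logic.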

Environment: Hadoop, HDFS, Pig, ZooKeeper, Sqoop, HBase, Shell Scripting, Ubuntu, Red Hat Linux.

Java Developer

Confidential

Responsibilities:

  • Involved in design and development phases of Software Development Life Cycle (SDLC).
  • Involved in designing UML Use case diagrams, Class diagrams, and Sequence diagrams using Rational Rose.
  • Followed agile methodology and SCRUM meetings to track, optimize, and tailor features to customer needs.
  • Developed user interface using JSP, JSP Tag libraries, and Java Script to simplify the complexities of the application.
  • Developed a Dojo based front end including forms and controls and programmed event handling.
  • Created Action Classes which route submittals to appropriate EJB components and render retrieved information.
  • Used Core java and object-oriented concepts.
  • Used JDBC to connect to backend databases, Oracle and SQL Server 2005.
  • Proficient in writing SQL queries, stored procedures for multiple databases, Oracle and SQL Server 2005.
  • Wrote Stored Procedures using PL/SQL. Performed query optimization to achieve faster indexing and making the system more scalable.
  • Deployed the application on Windows using IBM WebSphere Application Server.
  • Used Java Messaging Services (JMS) for reliable and asynchronous exchange of important information such as payment status report.
  • Used Web Services - WSDL and REST for getting credit card information from third party.
  • Used ANT scripts to build the application and deployed it on WebSphere Application Server.

Environment: Core Java, J2EE, Oracle, SQL Server, JSP, JDK, JavaScript, HTML, CSS, Web Services, Windows.

Java Developer

Confidential

Responsibilities:

  • Interacted with customers, identified System Requirements and developed Software Requirement Specifications.
  • Developed the application using Core Java, J2EE and JSP with DB-Derby as backend.
  • Developed Use Cases, High Level Design and Detailed Design documents.
  • Implementing Multi-threading concepts.
  • Involved in initial project setup and guidelines.
  • Implementing Java design patterns wherever required.
  • Front-end development using JSP.
  • Installation and deploying in Tomcat server.
  • Responsible for development, maintenance, implementation and support of the System.
  • Responsible for change management & enhancements (major/minor).
  • Carried out different types of testing, including unit, system, and integration testing, during the testing phase.
  • Generating reports to the user in different formats like PDF, Excel, CSV.
  • Developed guidelines/checklists and maintained version control to ensure the project stayed at CMM Level 5.

Environment: Java, J2EE, JSP, JDBC, JUnit, XML, HTML, Apache Tomcat, PDF, Excel, CSV.
