Hadoop Developer Resume
Baltimore, MD
SUMMARY
- 5 years of experience in design and development, with knowledge of Hadoop administration activities such as installation, configuration, and maintenance of clusters.
- Hands-on experience with Hadoop and its ecosystem: MapReduce, Hive, Pig, Sqoop, Oozie, Flume, Spark, ZooKeeper, and Impala.
- Proficient in ETL development with Informatica PowerCenter (Admin, Designer, Workflow Manager, Workflow Monitor, Repository Manager, Metadata Manager) for extracting, cleaning, transforming, and loading data.
- Expertise in big data architectures and communication systems within the Hadoop framework.
- Strong programming experience with SQL and PL/SQL on relational databases including Oracle, Teradata, and MS SQL Server.
- Experienced with NoSQL databases: HBase, MongoDB, and Cassandra.
- Experience in data warehousing ETL (extraction, transformation, and loading).
- Experience writing MapReduce programs in Java and Python for data processing and analysis.
- Scripting and automation on Linux using shell scripting (Bash, KSH): startup and shutdown scripts, crontabs, file system maintenance, and backups.
- Configured Spark Streaming to receive real-time data from Kafka and persist the stream to HDFS (see the sketch at the end of this summary).
- Good knowledge of writing Hadoop jobs to analyze data using Hive and Pig.
- Hands-on experience importing and exporting data with Sqoop.
- Good knowledge of streaming data into HDFS using Flume.
- Hands-on experience with Apache Hadoop administration and Linux administration.
- Experienced in big data storage and file system design.
- Experience in the design, installation, configuration, support, and management of Hadoop clusters.
- Expertise in developing MapReduce applications on Apache Hadoop for big data workloads.
- Experience writing Hadoop scripts and MapReduce programs.
- Loaded log files from multiple sources directly into HDFS using Flume.
- Strong knowledge of Hadoop and other distributed data processing platforms.
- Experienced in the Software Development Life Cycle (SDLC), application design, functional and technical specifications, and use-case development using UML.
- Experience with web services using XML, SOAP, and HTML.
- Experience designing and maintaining system tools for scripts and automation processes, and monitoring capacity planning.
- Excellent interpersonal and communication skills; a strong team player willing to take on new and varied projects, with the ability to handle changing priorities and deadlines.
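For illustration, a minimal sketch of the Kafka-to-HDFS Spark Streaming pattern referenced above. It assumes the Spark 1.x direct-stream API; the broker address, topic name, output path, and 30-second batch interval are placeholders, not details from this resume.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class KafkaToHdfs {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        // Micro-batch interval; 30 seconds is an arbitrary illustrative choice.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        // Broker list and topic name are hypothetical placeholders.
        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list", "broker1:9092");
        Set<String> topics = new HashSet<>(Arrays.asList("events"));

        // Receiver-less direct stream: one RDD partition per Kafka partition.
        JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class, kafkaParams, topics);

        // Keep only the message payload and write each non-empty batch to HDFS.
        JavaDStream<String> lines = stream.map(record -> record._2());
        lines.foreachRDD((rdd, time) -> {
            if (!rdd.isEmpty()) {
                rdd.saveAsTextFile("hdfs:///data/events/batch-" + time.milliseconds());
            }
        });

        jssc.start();
        jssc.awaitTermination();
    }
}
```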
TECHNICAL SKILLS
Hadoop Tools: HDFS, MapReduce, Pig, Hive, Flume, Oozie, ZooKeeper, HBase, Ambari, Sqoop
Languages: Java, J2EE, SQL
Operating Systems: Windows 8, Linux, Unix
Development Tools: Eclipse, MySQL
Web Technologies: VMware, JSP, Servlets, JDBC, JavaBeans
Databases: Oracle 9i/11g, MySQL, DB2, MS SQL Server 2000/2005/2008
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential, Baltimore, MD
Responsibilities:
- Imported bulk data into HBase using MapReduce programs.
- Wrote multiple Java programs to pull data from HBase.
- Analyzed business functionality and its requirements.
- Modified Hive scripts to handle encrypted data produced by tokenization.
- Created Hive scripts for table creation, data ingestion, and processing of HDFS data.
- Created HiveQL tables, loaded data, and wrote HiveQL queries that invoked and ran MapReduce jobs in the backend.
- Handled complex data types such as tuples and maps in Pig.
- Created Oozie workflows and coordinators to schedule recurring jobs and automate loading data into HDFS.
- Pulled data from various sources and processed data at rest using ecosystem tools such as MapReduce, HBase, Hive, Oozie, Flume, and Sqoop.
- Used SOAP and REST web services to exchange information.
- Wrote several Hive jobs to parse logs and structure them in tabular format for effective querying.
- Developed a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Designed and implemented metrics to statistically measure the success of experiments.
- Tracked job status while tasks and jobs were running.
- Performed analytics on time series data stored in HBase using the HBase API (see the scan sketch following this list).
- Wrote database objects (stored procedures, triggers, SQL and PL/SQL packages, and cursors) for Oracle, and created custom UDFs in Java for Oracle and Hive (a minimal Hive UDF sketch also follows this list).
- Optimized MapReduce jobs using combiners and partitioners to deliver the best results, and worked on application performance optimization for an HDFS cluster.
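A minimal sketch of the HBase time-series scan mentioned above, using the HBase 1.x Java client. The table name ("metrics"), column family ("d"), qualifier ("value"), and the "sensorId#epochMillis" row-key layout are assumptions for illustration, not details from this resume.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TimeSeriesScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("metrics"))) {

            // Row keys are assumed to be "sensorId#epochMillis", so a start/stop
            // pair bounds the scan to one sensor's time window.
            Scan scan = new Scan();
            scan.setStartRow(Bytes.toBytes("sensor42#1500000000000"));
            scan.setStopRow(Bytes.toBytes("sensor42#1500086400000"));
            scan.setCaching(500); // fetch rows in batches to cut RPC round trips

            double sum = 0;
            long count = 0;
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    byte[] value = row.getValue(Bytes.toBytes("d"), Bytes.toBytes("value"));
                    if (value != null) {
                        sum += Bytes.toDouble(value);
                        count++;
                    }
                }
            }
            System.out.printf("rows=%d avg=%.3f%n", count, count == 0 ? 0.0 : sum / count);
        }
    }
}
```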
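And a minimal Hive UDF sketch in Java, using the classic org.apache.hadoop.hive.ql.exec.UDF API. The masking rule echoes the tokenization bullet above but is purely illustrative.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Masks all but the last four characters of a string, e.g. for tokenized
// account numbers. The masking logic is a hypothetical example.
public class MaskUDF extends UDF {
    private final Text result = new Text(); // reused per Hive object-reuse idiom

    public Text evaluate(Text input) {
        if (input == null) {
            return null; // pass NULLs through
        }
        String s = input.toString();
        int keep = Math.min(4, s.length());
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < s.length() - keep; i++) {
            masked.append('*');
        }
        masked.append(s.substring(s.length() - keep));
        result.set(masked.toString());
        return result;
    }
}
```

Such a UDF would be registered from the Hive CLI with ADD JAR followed by CREATE TEMPORARY FUNCTION mask AS 'MaskUDF'.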
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, HBase, Apache Spark, Oozie Scheduler, Java, Unix Shell Scripts, Git, Maven, PL/SQL, Python, Scala, Cloudera.
Hadoop Developer
Confidential, Plano, TX
Responsibilities:
- Strong knowledge of Hadoop architecture and components such as HDFS, NameNode, DataNode, ResourceManager, NodeManager, and the YARN/MapReduce programming paradigm.
- Monitored the Hadoop cluster through Cloudera Manager and implemented alerts based on error messages; provided cluster usage reports to management and charged back customers on their usage.
- Used ETL tools to import data from multiple sources and perform transformations.
- Tested raw data and executed scripts, distributing processing across Hadoop, Pig, and Hive.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Tuned the number of mappers and reducers assigned to MapReduce jobs.
- Wrote shell scripts to monitor the health of Hadoop daemon services and quickly fix error messages and failure conditions.
- Monitored multiple Hadoop cluster environments using Ganglia.
- Used Flume to collect and aggregate web log data from sources such as web servers and mobile and network devices, and pushed it to HDFS.
- Exported analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
- Wrote MapReduce programs to transform log data into a structured form and derive user location, age group, and time spent (see the sketch after this list).
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
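A minimal sketch of a log-aggregation MapReduce job in the style described above, counting records per user location. The tab-separated log format with location in the fourth field is an assumption; the reducer doubles as a combiner, matching the combiner optimization claimed in the first role.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LocationCount {

    // Assumes tab-separated log lines with the user location in the fourth
    // field (index 3); malformed lines are skipped.
    public static class ParseMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text location = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 3) {
                location.set(fields[3]);
                context.write(location, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "location-count");
        job.setJarByClass(LocationCount.class);
        job.setMapperClass(ParseMapper.class);
        // Running the reducer as a combiner shrinks shuffle traffic.
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```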
Environment: Big Data/Hadoop, Spark, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Impala, Oozie, Ganglia, Java, and DB2
SQL Server Developer
Confidential
Responsibilities:
- Interacted with the team on analysis and design, and developed the database using ER diagrams, normalization, and relational database concepts.
- Involved in the design, development, and testing of the system.
- Developed SQL Server stored procedures and tuned SQL queries using indexes and execution plans.
- Developed user-defined functions and created views.
- Created triggers to maintain referential integrity.
- Implemented exception handling.
- Worked on client requirements and wrote complex queries to generate Crystal Reports.
- Created and automated regular jobs.
- Tuned and optimized SQL queries using execution plans and Profiler.
- Rebuilt indexes and tables as part of a performance tuning exercise.
- Performed database backup and recovery.
Environment: SQL Server 7.0/2000, SQL, T-SQL, Visual Basic 6.0/5.0, Crystal Reports 7/4.5