Senior Hadoop Engineer Resume
Florida
SUMMARY
- Overall 8+ years of IT experience, including 4+ years on the Hadoop ecosystem.
- Experience with the complete software development life cycle (SDLC), including requirements gathering, analysis, design, implementation, testing, support and maintenance
- Strong in Developing MapReduce Applications, Configuring the Development Environment, Tuning Jobs and Creating MapReduce Workflows
- Experience in performing data enrichment, cleansing, analytics, aggregations using Hive and Pig
- Experience in importing and exporting data between relational databases such as MySQL, Netezza and Oracle and HDFS/Hive using Sqoop
- Extensive experience handling and converting various data interchange formats (XML, JSON, Avro, Parquet) in distributed frameworks.
- Worked on reading multiple data formats on HDFS using Scala.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
- Analyzed the SQL scripts and designed the solution to implement using Scala.
- Implemented solutions on the Hadoop stack and various big data analytics tools, including migrations from SQL Server 2008 R2, Oracle and MySQL databases to Hadoop.
- Strong in designing specifications, functional and technical requirements, and process flows
- Experience in dealing with distributed systems, large-scale non-relational data stores, data modeling and multi-terabyte data warehouses
- Hands-on experience in application development and database management using Java, RDBMS, Linux/Unix shell scripting and Linux internals
- Experience in deploying applications on heterogeneous application servers: Tomcat, WebLogic, IBM WebSphere and Oracle Application Server.
- Experience in designing, installing, configuring and administering Hadoop clusters across the major distributions - Cloudera, Hortonworks and Apache Hadoop
- Experience in designing, building and implementing complete Hadoop ecosystem comprising of MapReduce, HDFS, Hive, Impala, Pig, Sqoop, Oozie, HBase, Spark
- Worked in multi-cluster environments, setting up production and QA Hadoop clusters and benchmarking them on Amazon AWS/EC2 and Rackspace cloud environments
- Good hands-on experience in writing shell scripts in Linux/Unix
- Experience in developing use cases, activity diagrams, sequence diagrams and class diagrams using UML with Rational Rose and MS Visio
- Extensively implemented various types of ETL/EDW migration projects using MapReduce, Pig, Hive, Sqoop
- Experience with Apache Storm and streaming real-time CEP solutions.
- Understanding of Cloud architectures and computationally intensive development environments.
- Experience with Docker and Zookeeper, HDFS/Hadoop and cluster management.
- Expertise with Git, Maven and the Agile development process.
- Experience in using the Oozie, Control-M and Autosys workflow engines for managing and scheduling Hadoop jobs
- Worked with big data teams to offload ETL jobs from Teradata and Netezza to Hadoop.
- Experience in importing streaming logs and aggregating the data to HDFS using Flume
- Built ingestion framework using Kafka for streaming logs and aggregating the data into HDFS using Camus
- Knowledge in stream processing technologies like Apache Spark and Storm
- Experience in setting up monitoring tools like Ganglia and Nagios for Hadoop and HBase
- Involved in Analysis, Design, Coding and Development of Java custom Interfaces
- Hands-on experience with the SDLC in an agile environment
- Exceptional ability to quickly master new concepts and technologies.
- Proficient in communicating with people at all levels of the organization, very good at post-implementation support, and a team player with strong analytical and problem-solving skills
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Cassandra, HBase, Zookeeper, Spark, Impala, Kafka, Storm
Languages/Simulators: C, C++, Java, Python, SQL, UNIX Shell Scripting, Scala
Operating Systems: Windows Variants, Mac, UNIX, LINUX
Database: MySQL, Oracle
IDE Tools: Eclipse, NetBeans, SQL Developer, MS Visual Studio
Version Control: Git, SVN
Software Tools: MS Office Suite (Word, Excel, Project), MS Visio
Web Technologies: HTML, CSS, XML, PHP
Monitoring Tools: Ganglia, Nagios, Cloudera Manager
NoSQL Databases: Cassandra, HBase
PROFESSIONAL EXPERIENCE
Senior Hadoop Engineer
Confidential, Florida
Responsibilities:
- Developed simple to complex MapReduce jobs using Hive and Pig to perform analytics on data.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Involved in Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting jobs running on the data.
- Wrote MapReduce jobs to discover trends in data usage by users.
- Involved in running Hadoop Streaming jobs to process terabytes of text data.
- Introduced Oozie workflows for job-processing scripts.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Customized the parser/loader application for data migration to HBase.
- Loaded cached data into HBase using Sqoop to support various queries on the data.
- Created numerous external Hive tables pointing to HBase tables.
- Analyzed HBase data in Hive by creating external partitioned and bucketed tables to maintain query efficiency.
- Worked on reading multiple data formats on HDFS using Scala.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the sketch at the end of this section).
- Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
- Analyzed the SQL scripts and designed the solution to implement using Scala.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Involved in creating Hive tables, and loading and analyzing data using Hive queries.
- Performed all Linux operating system, disk management and patch management configurations, on Linux instances in AWS.
- Provided low-latency computations by caching the working dataset in memory and performing computations at memory speed using Spark.
- Implemented loading and transformation of large data sets across structured, semi-structured and unstructured formats.
- Used Spark to easily combine batch, interactive and streaming jobs in the same application.
- Wrote Hive queries and UDFs.
- Developed Hive queries to process the data and generate data cubes for visualization.
Environment: Hadoop, HDFS, MapReduce, Hive, Scala, Pig, Oozie, HBase, Spark, Kafka, Shell Scripting, MySQL, DB2, Oracle
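Illustrative sketch (not from the original codebase): a minimal example of converting a Hive aggregation into Spark RDD transformations in Scala, as referenced above. The table and column names (usage_logs, user_id, bytes_read) are hypothetical placeholders, and the actual jobs may have used earlier Spark APIs (HiveContext) rather than SparkSession.

```scala
import org.apache.spark.sql.SparkSession

object HiveToSparkSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-rdd-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hive query being converted (hypothetical table/columns):
    //   SELECT user_id, SUM(bytes_read) FROM usage_logs GROUP BY user_id
    val rows = spark.table("usage_logs")
      .select("user_id", "bytes_read")
      .rdd

    // The same aggregation expressed as Spark RDD transformations
    val bytesPerUser = rows
      .map(r => (r.getString(0), r.getLong(1)))
      .reduceByKey(_ + _)

    bytesPerUser.take(20).foreach { case (user, bytes) =>
      println(s"$user\t$bytes")
    }

    spark.stop()
  }
}
```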
Confidential, San Rafael, CA
Senior Hadoop Consultant
Responsibilities:
- Developed custom MapReduce programs in Java to perform daily transformation of JSON data to text format and store it in HDFS according to business requirements
- Designed and implemented Hive tables, partitioning strategy, HCatalog usage and performance tuning of Hive queries
- Developed Pig scripts and UDFs for data cleansing and denormalization of multiple datasets
- Designed and implemented ETL workflows covering data ingestion from different databases/data warehouses into HDFS using Sqoop, transformation and analysis in Hive/Pig, and preprocessing of the raw data using MapReduce
- Wrote Sqoop pipelines to efficiently transfer data from MySQL, DB2, Oracle Exadata and Netezza to the Hadoop environment.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios
- Used the HDFS API to pull data from HDFS and the Solr API to push data into Solr for indexing
- Assisted the big data infrastructure team in building Hadoop CDH clusters for development and production environments
- Strong working knowledge of Spark Streaming, RDDs, Spark SQL and Scala
- Processed data through Spark using Scala and Spark SQL
- Developed Spark code using Scala and Spark SQL for faster testing and processing of data (see the sketch at the end of this section)
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis
- Developed a POC for the Kafka REST API to collect events from the front end
- Worked with the Hadoop business team on use case discovery, technical specifications and documentation
- Worked with different file formats and compression techniques in Hadoop, such as Avro, SequenceFile, LZO and Snappy
- Worked on historical data to eliminate duplicate data in HDFS using Hive
- Worked with big data analysts and the data science team in troubleshooting MapReduce job failures and issues with Hive and Pig
- Copied data from one cluster to another using DistCp and automated the procedure with shell scripts
- Automated shell scripts, MapReduce programs and Hive jobs, and created workflows using the Oozie scheduler.
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Oozie, HBase, Spark, Kafka, Shell Scripting, Java, MySQL, DB2, Oracle, Scala
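Illustrative sketch (not from the original codebase): a minimal Scala/Spark SQL job in the spirit of the daily JSON processing and Spark SQL work described above. The HDFS paths, field names and date argument are hypothetical, and the production daily transform was implemented as a Java MapReduce program rather than this Spark version.

```scala
import org.apache.spark.sql.SparkSession

object DailyJsonTransform {
  def main(args: Array[String]): Unit = {
    // Hypothetical run date argument, e.g. "2016-03-01"
    val runDate = if (args.nonEmpty) args(0) else "2016-03-01"

    val spark = SparkSession.builder()
      .appName(s"daily-json-transform-$runDate")
      .getOrCreate()

    // Read the day's raw JSON events from HDFS (hypothetical path)
    val events = spark.read.json(s"hdfs:///data/raw/events/dt=$runDate")

    // Keep only the fields downstream jobs need, via Spark SQL
    events.createOrReplaceTempView("events")
    val cleaned = spark.sql(
      """SELECT event_id, account_id, event_type, event_ts
        |FROM events
        |WHERE event_id IS NOT NULL""".stripMargin)

    // Write tab-delimited text back to HDFS for legacy consumers
    cleaned.write
      .option("sep", "\t")
      .mode("overwrite")
      .csv(s"hdfs:///data/processed/events/dt=$runDate")

    spark.stop()
  }
}
```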
Confidential, Austin, Texas
Big Data Consultant
Responsibilities:
- Developed MapReduce jobs in Java for data cleansing and preprocessing
- Moved data between MS SQL Server/Oracle and HDFS in both directions using Sqoop
- Worked on collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis
- Worked with different file formats and compression techniques in Hadoop to determine standards
- Developed data pipelines using Pig and Hive from EDW sources; these pipelines included custom UDFs to extend the ETL functionality
- Developed Hive queries and UDFs to analyze and transform the data in HDFS
- Developed Hive scripts for implementing control table logic in HDFS
- Designed and implemented static and dynamic partitioning and bucketing in Hive (see the sketch at the end of this section)
- Developed Pig scripts and UDFs per the business logic
- Analyzed and transformed data in HDFS with Hive and Pig
- Developed Oozie workflows and scheduled through a scheduler on a monthly basis.
- Involved with Big data team in End to End implementation of ETL logic.
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support activities such as test data creation and unit testing.
Environment: Hadoop, MapReduce MRv1, Hive, Oozie, Pig, Sqoop, Java, Eclipse IDE, Shell Scripting, MS SQL Server, Oracle
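Illustrative sketch (not from the original codebase): a minimal example of the static and dynamic Hive partitioning mentioned above. The database, table and column names are hypothetical, and the statements are plain HiveQL that run the same from the Hive CLI; they are wrapped in Scala/Spark here only to keep all sketches in one language, since this role itself used Hive directly rather than Spark.

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-partitioning-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hive settings required for dynamic-partition inserts
    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    // Hypothetical target table partitioned by order date
    spark.sql(
      """CREATE TABLE IF NOT EXISTS orders_part (
        |  order_id    BIGINT,
        |  customer_id BIGINT,
        |  amount      DOUBLE)
        |PARTITIONED BY (order_date STRING)
        |STORED AS PARQUET""".stripMargin)

    // Static partition insert: the partition value is fixed in the statement
    spark.sql(
      """INSERT OVERWRITE TABLE orders_part PARTITION (order_date = '2015-01-01')
        |SELECT order_id, customer_id, amount
        |FROM orders_staging
        |WHERE order_date = '2015-01-01'""".stripMargin)

    // Dynamic partition insert: partitions derive from the trailing column
    spark.sql(
      """INSERT OVERWRITE TABLE orders_part PARTITION (order_date)
        |SELECT order_id, customer_id, amount, order_date
        |FROM orders_staging""".stripMargin)

    spark.stop()
  }
}
```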
Confidential, San Mateo, CA
Hadoop Consultant
Responsibilities:
- Managed and analyzed Hadoop Log Files
- Managed jobs using Fair Scheduler
- Configured the Hive Metastore to use a MySQL database to support multiple user connections to Hive tables
- Imported data into HDFS using Sqoop
- Retrieved data from databases such as MySQL and Oracle into HDFS using Sqoop and ingested it into HBase
- Developed Hive Queries to analyze the data in HDFS to identify issues and behavioral patterns
- Worked on shell scripting to automate jobs.
- Used Pig Latin to analyze datasets and perform transformations according to business requirements
- Configured Nagios for receiving alerts on critical failures in the cluster by integrating with custom Shell Scripts
- Configured the Ganglia monitoring tool to monitor both Hadoop and system specific metrics
- Implemented Flume to import streaming log data and aggregate it into HDFS.
- Implemented MapReduce programs to perform joins using secondary sorting and distributed cache
- Generated daily and weekly Status Reports to the team manager and participated in weekly status meeting with Team members, Business analysts and Development team
Environment: Apache Hadoop 0.20.203, MapReduce, Hive, Apache Maven, Java, Eclipse IDE, Sqoop, Ganglia, Nagios, Shell Scripting, Pig, Flume.
Confidential
System Analyst
Responsibilities:
- Key responsibilities included requirements gathering and designing and developing Java applications
- Implemented design patterns and object-oriented Java design concepts to build the code
- Participated in planning and development of UML diagrams like Use Case Diagrams, Object Diagrams, Class Diagrams and Sequence Diagrams to represent the detail design phase
- Identified and fixed transactional issues caused by incorrect exception handling and concurrency issues caused by unsynchronized blocks of code
- Created a Java application module to provide user authentication and to synchronize handsets with the Exchange server
- Performed unit testing, system testing and user acceptance test
- Involved in Analysis, Design, Coding and Development of custom Interfaces
- Gathered requirements from the client for designing the Web Pages
- Gathered specifications for the Library site from different departments and users of the services
- Assisted in proposing suitable UML class diagrams for the project
- Wrote SQL scripts to create and maintain the database, roles, users, tables, views, procedures and triggers
- Designed and implemented the UI using HTML and Java
- Strong knowledge of the MVC design pattern
- Worked on the database interaction layer for insert, update and retrieval operations on data
- Implemented Multi-threading functionality using Java Threading API
Environment: Java, JDBC, HTML, SQL, Oracle, IBM Rational Rose, Eclipse IDE, LDAP