Sr. Hadoop Developer Resume
Daytona Beach, FL
PROFESSIONAL SUMMARY
- Around 9 years of programming experience spanning all phases of the Software Development Life Cycle (SDLC).
- Over 5 years of Big Data experience building highly scalable data analytics applications.
- Strong experience working with Hadoop ecosystem components such as HDFS, MapReduce, Spark, HBase, Oozie, Hive, Sqoop, Pig, Flume and Kafka.
- Good hands-on experience working with various Hadoop distributions, mainly Cloudera (CDH), Hortonworks (HDP) and Amazon EMR.
- Good understanding of Distributed Systems architecture and design principles behind Parallel Computing.
- Expertise in developing production-ready Spark applications using the Spark Core, DataFrame, Spark SQL, Spark MLlib and Spark Streaming APIs, along with scikit-learn and TensorFlow (see the sketch following this summary).
- Strong experience troubleshooting Spark application failures and fine-tuning Spark applications and Hive queries for better performance.
- Worked extensively on Hive for building complex data analytical applications.
- Strong experience writing complex MapReduce jobs, including development of custom InputFormats and RecordReaders.
- Sound knowledge of map-side joins, reduce-side joins, shuffle & sort, distributed cache, compression techniques, and multiple Hadoop input and output formats.
- Worked with Apache NiFi to automate and manage the flow of data between systems.
- Good experience working with AWS Cloud services such as S3, EMR, Redshift and Athena.
- Deep understanding of performance tuning and partitioning for optimizing Spark applications.
- Worked on building real-time data workflows using Kafka, Spark Streaming and HBase.
- Extensive knowledge of NoSQL databases like HBase, Cassandra and MongoDB.
- Solid experience working with CSV, text, SequenceFile, Avro, Parquet, ORC and JSON data formats.
- Extensive experience performing ETL on structured and semi-structured data using Pig Latin scripts.
- Designed and implemented Hive and Pig UDFs in Java for evaluating, filtering, loading and storing data.
- Experience using Hadoop ecosystem tools and visualizing processed data with Tableau.
- Good knowledge of core programming concepts such as algorithms, data structures and collections.
- Developed core modules in large cross-platform applications using Java, JSP, Servlets, Hibernate, RESTful services, JDBC, JavaScript, XML, and HTML.
- Extensive experience developing and deploying applications on WebLogic, Apache Tomcat and JBoss.
- Development experience with RDBMS, including writing SQL queries, views, stored procedures, triggers, etc.
- Strong understanding of Software Development Lifecycle (SDLC) and various methodologies (Waterfall, Agile).
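For illustration, a minimal sketch of the kind of Spark DataFrame and Spark SQL transformation work summarized above, assuming a Spark build with Hive support; all table, column and path names here are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ClickSummary {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-click-summary")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical input: raw events landed in HDFS as Parquet
    val events = spark.read.parquet("/data/raw/events")

    // Basic cleansing and validation before summarization
    val cleaned = events
      .filter(col("user_id").isNotNull && col("event_ts").isNotNull)
      .withColumn("event_date", to_date(col("event_ts")))

    // Expose the cleansed data to Spark SQL for aggregation
    cleaned.createOrReplaceTempView("clicks")
    val summary = spark.sql(
      """SELECT event_date, event_type, COUNT(*) AS event_count
        |FROM clicks
        |GROUP BY event_date, event_type""".stripMargin)

    // Persist as a date-partitioned table for downstream reporting
    summary.write.mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("analytics.daily_event_counts")

    spark.stop()
  }
}
```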
TECHNICAL SKILLS
Programming Languages: Java/J2EE, JSP, Servlets, AJAX, EJB, Struts, Spring, JDBC, JavaScript, PHP and Python.
Databases: MySQL, SQL, DB2 and Teradata
Web Services: REST, AWS, SOAP, WSDL
Servers: Apache Tomcat, WebSphere, JBoss
Operating Systems: Unix, Linux, Windows, Solaris
IDE Tools: MyEclipse, Eclipse, NetBeans
QA Tools: Crashlytics (Fabric)
Web UI: HTML, JavaScript, XML, SOAP, WSDL
PROFESSIONAL EXPERIENCE
Confidential, Daytona Beach, FL
Sr. Hadoop Developer
Responsibilities:
- Developed Spark applications using Scala, utilizing DataFrames and the Spark SQL API for faster data processing.
- Developed highly optimized Spark applications to perform data cleansing, validation, transformation and summarization activities according to the requirements.
- Built a data pipeline consisting of Spark, Hive, Sqoop and custom-built input adapters to ingest, transform and analyze operational data.
- Developed Spark jobs and Hive Jobs to summarize and transform data.
- Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL databases for huge volumes of data.
- Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames and Scala.
- Used various data integration tools to move data between different databases and Hadoop.
- Analyzed the SQL scripts and designed the solution to implement using Scala.
- Built real-time data pipelines by developing Kafka producers and Spark Streaming applications for consuming the data (see the sketch following this list).
- Ingested syslog messages, parsed them and streamed the data to Kafka.
- Handled importing data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the transformed data back into HDFS.
- Exported the analyzed data to relational databases using Sqoop for the BI team to visualize and generate reports.
- Collected and aggregated large amounts of log data using Flume and staged it in HDFS for further analysis.
- Analyzed the data by writing Hive queries (HiveQL) to study customer behavior.
- Helped DevOps engineers deploy code and debug issues.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed Hive scripts in Hive QL to de-normalize and aggregate the data.
- Scheduled and executed workflows in Oozie to run various jobs.
- Processed data on Amazon AWS using Hadoop ecosystem tools.
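As referenced in the bullets above, a minimal sketch of a Spark streaming consumer (written here with Structured Streaming) for syslog-style data arriving on Kafka. The broker address, topic name, payload schema and output paths are hypothetical, the job assumes the Spark Kafka connector is on the classpath, and the matching Kafka producers and HBase sink are omitted.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object SyslogStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("syslog-stream")
      .getOrCreate()

    // Hypothetical schema for the parsed syslog payload
    val schema = new StructType()
      .add("host", StringType)
      .add("severity", StringType)
      .add("message", StringType)

    // Consume a hypothetical "syslog" topic from Kafka
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "syslog")
      .load()

    // Parse the JSON value and keep only well-formed records
    val parsed = raw
      .select(from_json(col("value").cast("string"), schema).as("event"))
      .select("event.*")
      .filter(col("host").isNotNull)

    // Land micro-batches in HDFS; a real pipeline might write to HBase instead
    val query = parsed.writeStream
      .format("parquet")
      .option("path", "/data/streaming/syslog")
      .option("checkpointLocation", "/checkpoints/syslog")
      .start()

    query.awaitTermination()
  }
}
```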
Environment: Hadoop, HDFS, HBase, Spark, Scala, Hive, MapReduce, Sqoop, ETL, Java, PL/SQL, Oracle 11g, Unix/Linux.
Confidential, Union, New Jersey
Sr. Hadoop Developer
Responsibilities:
- Developed multi-threaded Java-based input adapters for ingesting clickstream data from external sources like FTP servers and S3 buckets on a daily basis.
- Created Spark applications using Scala to enrich this clickstream data by combining it with enterprise data about the users.
- Implemented batch processing of jobs using the Spark Scala API.
- Worked with Apache NiFi to automate and manage the flow of data between systems.
- Developed Sqoop scripts to import/export data from Oracle to HDFS and into Hive tables.
- Stored the data in columnar formats using Hive.
- Involved in building and managing NoSQL database models using HBase.
- Used Spark to read data from Hive and write it to HBase (see the sketch following this list).
- Optimized Hive tables using techniques like partitioning and bucketing to improve HiveQL query performance.
- Worked with multiple file formats like Avro, SequenceFile, Parquet and ORC.
- Converted existing MapReduce programs to Spark Applications for handling semi structured data like JSON files, Apache Log files, and other custom log data.
- Loaded the final processed data into HBase tables to allow the downstream application team to build rich, data-driven applications.
- Processed data on Amazon AWS using Hadoop ecosystem tools.
- Worked with a team to improve the performance and optimization of existing algorithms in Hadoop using Spark, Spark SQL and DataFrames.
- Worked with Apache Ranger for enabling data security across the Hadoop ecosystem.
- Implemented business logic in Hive and wrote UDFs to process the data for analysis.
- Used Oozie to define a workflow to coordinate the execution of Spark, Hive and Sqoop jobs.
- Addressed issues arising from huge volumes of data and transactions.
- Documented and tracked operational problems in JIRA following standards and procedures.
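As referenced in the bullets above, a minimal sketch of reading a Hive table in Spark (Scala) and writing rows to HBase through the standard HBase Java client. The Hive table, HBase table, column family and column types are assumptions for illustration, not the actual project schema.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.{Row, SparkSession}

object HiveToHBase {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-hbase")
      .enableHiveSupport()
      .getOrCreate()

    // Read the enriched data from a hypothetical Hive table
    val enriched = spark.table("analytics.enriched_clicks")
      .select("user_id", "event_type", "event_count")

    // Write each partition to HBase through the standard Java client
    enriched.rdd.foreachPartition { rows: Iterator[Row] =>
      val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = conn.getTable(TableName.valueOf("user_activity"))
      try {
        rows.foreach { row =>
          // Row key: user id; column family "m" holds the metric columns (assumed layout)
          val put = new Put(Bytes.toBytes(row.getAs[String]("user_id")))
          put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("event_type"),
            Bytes.toBytes(row.getAs[String]("event_type")))
          put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("event_count"),
            Bytes.toBytes(row.getAs[Long]("event_count")))
          table.put(put)
        }
      } finally {
        table.close()
        conn.close()
      }
    }

    spark.stop()
  }
}
```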
Environment: Java 6, MongoDB, Apache Web server, HTML, JDBC, NoSQL, meteor.js, Eclipse, UNIX, CSS3, XML, JQuery, Oracle.
Confidential, Reston, VA
Hadoop Developer
Responsibilities:
- Involved in requirement analysis, design, coding and implementation phases of the project.
- Used Sqoop to load structured data from relational databases into HDFS.
- Loaded transactional data from Teradata using Sqoop and created Hive Tables.
- Worked on automation of delta feeds from Teradata using Sqoop and from FTP Servers to Hive.
- Performed Transformations like De-normalizing, cleansing of data sets, Date Transformations, parsing some complex columns.
- Worked with different compression codecs like GZIP, SNAPPY and BZIP2 in MapReduce, Pig and Hive for better performance.
- Worked with Apache NiFi to automate and manage the flow of data between systems.
- Used Ansible to automate framework setup and configuration.
- Handled Avro, JSON and Apache Log data in Hive using custom Hive SerDes.
- Worked on batch processing and scheduled workflows using Oozie.
- Implemented installation and configuration of multi-node cluster on the cloud using Amazon Web Services (AWS) on EC2.
- Worked in an Agile sprint environment.
- Used the Knox Gateway to secure Hadoop access between users and operators.
- Used cloud computing on a multi-node cluster, deployed Hadoop applications with data stored in S3, and used Elastic MapReduce (EMR) to run MapReduce jobs.
- Used HiveQL to create partitioned RC and ORC tables and applied compression techniques for optimized data processing and faster retrieval.
- Implemented Partitioning, Dynamic Partitioning and Buckets in Hive for efficient data access.
Environment: Apache Hadoop, HDFS, Cloudera Manager, Java, MapReduce, Eclipse Indigo, Hive, HBASE, PIG, Sqoop, Oozie, SQL, Spring.
Confidential, Dallas, TX
Hadoop Developer
Responsibilities:
- Communicated with business customers effectively to gather the required information for the project.
- Worked extensively on the Cloudera Distribution.
- Involved in loading data into HDFS from Teradata using Sqoop.
- Experienced in moving huge amounts of log file data from different servers.
- Worked on implementing complex data transformations using MapReduce framework.
- Generated structured data through MapReduce jobs, stored it in Hive tables and analyzed the results with Hive queries based on the requirements.
- Improved performance by implementing dynamic partitioning and bucketing in Hive and by designing managed and external tables (see the sketch following this list).
- Worked on migrating data from relational databases to Big Data technologies like Cassandra.
- Developed Pig Latin scripts and used ETL tools such as Informatica for some pre-aggregations.
- Worked on MapReduce programs to cleanse and pre-process data from various sources.
- Worked with Sequence files and Avro files in MapReduce programs.
- Created Hive generic UDFs to implement business logic and worked on incremental imports into Hive tables.
- Handled importing data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the transformed data back into HDFS.
- Exported the analyzed data to relational databases using Sqoop for the BI team to visualize and generate reports.
- Collected and aggregated large amounts of log data using Flume and staged it in HDFS for further analysis.
- Worked with Apache NiFi to automate and manage the flow of data between systems.
- Worked with Talend to integrate data from different data systems into Hadoop.
- Used Kerberos authentication to provide authenticated access to distributed systems.
- Analyzed the data by performing Hive queries (Hive QL) to study customer behavior.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed Hive scripts in Hive QL to de-normalize and aggregate the data.
- Worked in an Agile sprint environment.
- Loaded processed data into HBase tables using HBase Java API calls.
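As referenced in the bullets above, a minimal sketch of the dynamic-partitioning side of that Hive work, issued here through Spark SQL with Hive support (this role's environment lists Spark and Scala). Database, table, column and path names are hypothetical, and the bucketing DDL is left out.

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-partition-load")
      .enableHiveSupport()
      .getOrCreate()

    // Allow dynamic partitions so partition values are derived from the data itself
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Hypothetical external table partitioned by date and stored as ORC
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS sales.transactions (
        |  txn_id STRING,
        |  customer_id STRING,
        |  amount DOUBLE
        |)
        |PARTITIONED BY (txn_date STRING)
        |STORED AS ORC
        |LOCATION '/data/warehouse/sales/transactions'""".stripMargin)

    // Dynamic-partition insert from a hypothetical staging table;
    // the partition column must come last in the SELECT list
    spark.sql(
      """INSERT OVERWRITE TABLE sales.transactions PARTITION (txn_date)
        |SELECT txn_id, customer_id, amount, txn_date
        |FROM sales.transactions_staging""".stripMargin)

    spark.stop()
  }
}
```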
Environment: Hadoop, HDFS, HBase, Spark, Scala, Hive, MapReduce, Sqoop, ETL, Java, PL/SQL, Oracle 11g, Unix/Linux.
Confidential
Manager
Responsibilities:
- Maintaining sites and workforce.
- Liaising with clients and reporting on progress to staff and the public.
- Supervising construction workers and hiring subcontractors.
- Buying materials for each phase of the project.
- Monitoring build costs and project progress.
- Checking and preparing site reports, designs and drawings.
- Maintaining quality control checks.
- Day to day problem solving and dealing with any issues that arise.
- Working on-site at clients’ businesses or in a site office
Confidential
Java Developer
Responsibilities:
- Implemented the presentation layer with HTML, CSS and JavaScript
- Developed web components using JSP, Servlets and JDBC
- Implemented secured cookies using Servlets.
- Wrote complex SQL queries and stored procedures.
- Implemented Persistent layer using Hibernate API
- Implemented Search queries using Hibernate Criteria interface.
- Used CSS to provide a good user interface.
- Provided support for loan reports for CB&T.
- Designed and developed loan reports for Evans Bank using Jasper and iReport.
- Involved in fixing bugs and unit testing with test cases using JUnit.
- Performed Object-Oriented Analysis and Design using UML, including development of class, sequence and state diagrams, and implemented these diagrams in Microsoft Visio.
- Maintained Jasper server on client server and resolved issues
- Actively involved in system testing.
- Fine-tuned SQL queries for maximum efficiency to improve performance.
- Designed tables and indexes following normalization principles.
- Involved in unit testing, integration testing and user acceptance testing.
- Utilized Java and SQL day-to-day to debug and fix issues with client processes.
Environment: Java, Servlets, HTML, JavaScript, JSP, Hibernate, JUnit, Oracle DB, SQL, Jasper Reports, iReport, Maven, Jenkins.