Senior Hadoop Developer Resume
Iowa City, IA
SUMMARY:
- Accomplished Software Engineer with 8 years of IT experience developing applications using Big Data, AWS, Java, SQL and Spark.
- Extensive experience with Big Data tools such as MapReduce, YARN, HDFS, HBase, Impala, Hive, Pig, Oozie, AWS and Apache Spark for ingestion, storage, querying, processing and analysis of data.
- Performance tuning in Hive and Impala using methods including dynamic partitioning, bucketing, indexing and file compression.
- Hands-on experience with data ingestion tools Kafka and Flume and workflow management tools Oozie and Zena.
- Hands-on experience handling different file formats like JSON, Avro, ORC and Parquet and compression techniques like Snappy, zlib and LZO.
- Hands-on experience with Hadoop ecosystem components such as HDFS, YARN, Tez, MapReduce, Spark, Scala, Hive, Pig, Sqoop, Flume, Oozie, Kafka, NiFi, Storm and HBase.
- Experience analyzing data in NoSQL databases like HBase and Cassandra and integrating them with a Hadoop cluster.
- Hands-on experience with Spark Core, Spark SQL and the DataFrame/Dataset/RDD APIs.
- Experience using Kafka brokers with Spark Streaming to process live streaming data as RDDs.
- Developed Java applications using IDEs such as Spring Tool Suite and Eclipse.
- Good knowledge of using Hibernate for mapping Java classes to database tables and of Hibernate Query Language (HQL).
- Worked on Java/J2EE systems with different databases, including Oracle, MySQL and DB2.
- Knowledge of implementing Big Data workloads on Amazon Elastic MapReduce (Amazon EMR), which runs the Hadoop framework on dynamically scalable Amazon EC2 instances.
- Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
- Extensive development experience in Spark applications for data transformations and loading into HDFS using RDDs, DataFrames and Datasets.
- Extensive knowledge of performance tuning of Spark applications and converting Hive/SQL queries into Spark transformations (see the sketch after this summary).
- Hands-on experience with AWS (Amazon Web Services), using Elastic MapReduce (EMR), creating and storing data in S3 buckets and creating Elastic Load Balancers (ELB) for Hadoop front-end web UIs.
- Extensive knowledge of creating Hadoop clusters on multiple EC2 instances in AWS, configuring them through Ambari and using IAM (Identity and Access Management) to create groups and users and assign permissions.
- Extensive programming experience in core Java concepts like OOP, multithreading, collections and I/O.
- Experience using Jira for ticketing issues and Jenkins for continuous integration.
- Extensive experience with UNIX commands, shell scripting and setting up cron jobs.
- Experience in software configuration management using Git.
- Good experience using the relational databases Oracle and MySQL.
- Able to assess business rules, collaborate with stakeholders and perform source-to-target data mapping and design.
- Work successfully in fast-paced settings, both independently and in collaborative team environments.
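A minimal sketch of the Hive-to-Spark conversion and HDFS loading described above, assuming a hypothetical Hive table sales with region and amount columns, an illustrative HDFS output path, and a Spark 2.x SparkSession with Hive support:

```scala
import org.apache.spark.sql.SparkSession

object HiveToSparkSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-spark-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // DataFrame equivalent of: SELECT region, SUM(amount) FROM sales GROUP BY region
    val aggregated = spark.table("sales")        // hypothetical Hive table
      .groupBy("region")
      .sum("amount")
      .withColumnRenamed("sum(amount)", "total_amount")

    // Load the result into HDFS as Snappy-compressed Parquet
    aggregated.write
      .option("compression", "snappy")
      .parquet("hdfs:///data/out/sales_by_region") // illustrative path

    spark.stop()
  }
}
```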
TECHNICAL SKILLS:
Operating Systems: Windows 95/98/2000/XP, UNIX, Linux
Languages: SQL, HTML, CSS, JavaScript, Java, R
Databases: Oracle, DB2, SQL Server, MySQL, PostgreSQL, MS Access, Hive, Spartan
Utilities: MS Word, Excel, Macros, Access, PowerPoint
Hadoop Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, HBase
PROFESSIONAL EXPERIENCE:
Senior Hadoop Developer
Confidential - Iowa City, IA
Responsibilities:
- Strong understanding and practical experience in developing Spark applications with Scala.
- Developed Spark scripts using Spark shell commands as per requirements.
- Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs in Spark for data aggregation.
- Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames and pair RDDs.
- Experience developing Spark SQL applications using both SQL and the DataFrame DSL.
- Extensively worked with the Parquet file format and gained practical knowledge writing Spark and Hive applications to meet Parquet requirements.
- Experience using various compression techniques with the Parquet file format.
- Experience managing datasets and good experience creating test datasets for development purposes.
- Experience building dimension and fact tables using Spark Scala applications.
- Practical knowledge of writing applications in Scala to interact with Hive through Spark.
- Extensively used Hive partitioned tables, map joins and bucketing, and gained a good understanding of dynamic partitioning (see the sketch after this list).
- Performed a POC on writing Spark applications in Scala, Python and R.
- Good hands-on experience with Hive to perform data queries and analysis as part of QA.
- Practical experience using Pig to perform QA by calculating statistics on the final output.
- Experience designing both time-driven and data-driven automated workflows using Oozie.
- Experience writing Sqoop scripts to import data from Exadata to HDFS.
- Good exposure to MongoDB, its functionality and its use cases.
- Gained good exposure to the Hue interface for monitoring job status, managing HDFS files, tracking scheduled jobs and managing Oozie workflows.
- Performed optimizations and performance tuning in Spark and Hive.
- Developed UNIX scripts to automate data loads into HDFS.
- Strong knowledge of HDFS commands to manage files and good understanding of managing the file system through Spark Scala applications.
- Extensive use of aliases for Oozie and HDFS commands.
- Experienced in managing and reviewing Hadoop log files.
- Experience controlling logging for Spark applications, with extensive use of Log4j to log the respective phases of the application.
- Good knowledge of Git commands, version tagging and pull requests.
- Performed unit testing and integration testing after development and participated in code reviews.
- Experience writing JUnit test cases for testing Spark and Spark SQL applications.
- Practical experience developing applications with IntelliJ and Maven.
- Good exposure to Agile environments. Participated in daily standups, Big Room Planning, sprint meetings and team retrospectives.
- Interacted with business analysts to understand business requirements and translate them into technical requirements.
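A minimal sketch of the Scala UDF, aggregation and dynamic-partitioning work listed above; the table names (raw_events, analytics.daily_totals) and the banding logic are hypothetical, and a Spark 2.x SparkSession stands in for the Spark 1.6 HiveContext in use at the time:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum, udf}

object PartitionedAggregationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("udf-and-dynamic-partitions")
      .enableHiveSupport()
      .getOrCreate()

    // Allow Hive-style dynamic partitioning on inserts
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // A simple UDF that buckets amounts into bands (illustrative logic)
    val amountBand = udf((amount: Double) => if (amount >= 1000.0) "high" else "low")

    val totals = spark.table("raw_events")   // hypothetical source table
      .withColumn("band", amountBand(col("amount")))
      .groupBy(col("event_date"), col("band"))
      .agg(sum("amount").as("total"))

    // Write a Hive table partitioned by event_date
    totals.write
      .mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("analytics.daily_totals")  // hypothetical target table

    spark.stop()
  }
}
```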
Environment: Hadoop 2.6.0-cdh5.7.0, Java 1.8.0_92, Spark 1.6.0, Spark SQL, R, Python, Scala 2.10.5, MongoDB, Apache Pig 0.12.0, Apache Hive 1.1.0, HDFS, Sqoop, Oozie, Maven, IntelliJ, Git, UNIX shell scripting, Oracle 11g/10g, Log4j, Linux, Agile development
Hadoop/Spark Developer
Confidential - Atlanta, GA
Responsibilities:
- Developed MapReduce jobs to process documents.
- Responsible for the Solr implementation and for setting up collections in SolrCloud.
- Involved in Hadoop cluster setup and configuring Hadoop ecosystem components.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote code to parse external documents before copying them to HDFS.
- Developed Spark scripts using Scala as per requirements.
- Developed and tuned HBase ingestion for documents.
- Developed a web application to interact with Solr for searching documents, ingesting via the SolrJ API.
- Developed Spark jobs using Scala to process locomotive events.
- Responsible for interacting with business partners, gathering requirements and preparing technical design documents.
- Developed the service-oriented architecture (SOA) based design of the application.
- Responsible for writing detailed design documents, class diagrams and sequence diagrams.
- Developed composite components using JSF 2.0.
- Coordinated with the onsite team and clients.
- Prepared and executed unit test cases.
- Involved in integration testing and user acceptance support.
- Involved in production support.
- Collaborate with product/business users, data scientists and other engineers to define requirements to design, build and tune complex solutions.
- Involved in business requirement gathering, analysis and preparation of design documents.
- Involved in preparing Solr collections and creating schemas.
- Developed a data pipeline using Kafka, Spark and Hive to ingest, transform and analyze data.
- Involved in debugging and fine-tuning the Solr cluster and queries.
- Involved in importing document data from external systems to HDFS.
- Developed Spark Streaming applications to process real-time events and ingest emails and instant messages into HBase and Elasticsearch (see the sketch after this list).
- Managed and allocated tasks for onsite and offshore resources.
- Involved in setting up Kerberos and authenticating from the web application.
- Involved in refactoring the existing application to improve its performance.
- Interacted with the client to map legacy data to SCOPE-specific data.
- Developed Java service classes to interface between the application and external systems.
- Wrote SQL queries to create the batch table.
- Involved in the build process and ran the deployment procedure in the UNIX environment on a regular basis.
- Monitored log files on a regular basis in the UNIX environment.
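A minimal sketch of the Kafka-to-Spark-Streaming ingestion described above, using the spark-streaming-kafka-0-10 direct-stream API; the broker address, topic and group id are illustrative, and the println is a placeholder for the actual HBase/Elasticsearch writes:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaEventStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-event-stream-sketch")
    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092",          // illustrative broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "event-ingest",                   // illustrative group id
      "auto.offset.reset" -> "latest"
    )

    // Direct stream over an illustrative "events" topic
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Hand each event off to a downstream sink, partition by partition
    stream.foreachRDD { rdd =>
      rdd.map(_.value()).foreachPartition { events =>
        events.foreach(event => println(event)) // placeholder for HBase/Elasticsearch writes
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```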
Environment: Hortonworks Data Platform (HDP 2.3), Hadoop, HDFS, Spark, Kafka, Hive, Solr 5.2.1, HBase, Sqoop, Sun Solaris, Elasticsearch 2.0.0, RSA, PrimeFaces, JSF, RAD 8/8.5, AngularJS, WebSphere Application Server 8/8.5, Java 1.7, Subversion, EJB 3.0, Oracle 11g.
Hadoop Developer
Confidential - Arlington, VA
Responsibilities:
- Installed Hadoop, MapReduce and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data processing.
- Used Impala to read, write and query Hadoop data in HDFS, and configured Kafka to read and write messages from external programs (see the sketch after this list).
- Used Pig as an ETL tool to perform transformations, joins and pre-aggregations before storing data in HDFS.
- Created stored procedures to transform data and worked extensively with SQL for the various transformations needed while loading data.
- Developed various Python scripts to find vulnerabilities in SQL queries through SQL injection tests, permission checks and performance analysis.
- Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames.
- Expertise in implementing Spark Scala applications using higher-order functions for both batch and interactive analysis requirements.
- Responsible for loading bulk data into HBase using MapReduce by directly creating HFiles and loading them.
- Implemented Spark applications in Scala, utilizing DataFrames and the Spark SQL API for faster data processing.
- Handled importing data from different data sources into HDFS using Sqoop, performing transformations using Hive and MapReduce before loading the data into HDFS.
- Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
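A minimal sketch of writing messages to Kafka from an external program, as described above, using the standard kafka-clients producer API; the broker address, topic name and payload are illustrative:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ExternalEventPublisherSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // illustrative broker
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Publish a message that a downstream Spark/Impala pipeline can consume
      producer.send(new ProducerRecord[String, String](
        "ingest-topic", "key-1", """{"id":1,"status":"ok"}"""))
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```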
Environment: Cloudera, Hadoop, HDFS, Hive, Impala, Spark SQL, Python, Sqoop, Oozie, Storm, Spark, Scala, MySQL, Shell scripting
Hadoop Developer
Confidential, CA
Responsibilities:
- Created the project using Hive, Big SQL and Pig.
- Involved in data modeling in Hadoop.
- Created Hive tables and worked on them using HiveQL (see the sketch after this list).
- Wrote Apache Pig scripts to process HDFS data.
- Automated tasks using UNIX shell scripts.
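A minimal sketch of the HiveQL table work noted above, issued here through a Spark SparkSession so the example is self-contained; the table definition and HDFS location are illustrative:

```scala
import org.apache.spark.sql.SparkSession

object HiveTableSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hiveql-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // HiveQL DDL: an external table over data already landed in HDFS
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS clicks (
        |  user_id STRING,
        |  url     STRING,
        |  ts      TIMESTAMP)
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
        |LOCATION 'hdfs:///data/raw/clicks'""".stripMargin) // illustrative path

    // HiveQL query over the new table
    spark.sql("SELECT user_id, COUNT(*) AS hits FROM clicks GROUP BY user_id")
      .show(10)

    spark.stop()
  }
}
```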
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Scala, Python, HBase, Oozie, YARN, Spark, Core Java, Oracle, SQL, Ubuntu/UNIX, Eclipse, Maven, JDBC drivers, Mainframe, MySQL, Linux, AWS, XML, CRM, SVN, PDSH, PuTTY, BigInsights