Sr. Hadoop Developer Resume
Philadelphia, PA
SUMMARY
- IT professional with 8+ years of experience with distributed storage systems such as HDFS and HBase in Big Data environments.
- Excellent understanding of the complexities associated with Big Data, with expertise in developing modules and code in MapReduce, Hive, Pig, Sqoop, Apache Flume, and Apache Spark to address them.
- Highly skilled in analysis, design, and development of Big Data solutions in Scala, Spark, Hadoop, Pig, and HDFS environments; experienced in Java and J2EE.
- Experience using HCatalog with Hive, Pig, and HBase; experienced with NoSQL databases such as HBase and Cassandra.
- Good experience installing, configuring, and administering Hadoop clusters on the major Hadoop distributions, Hortonworks and Cloudera.
- Strong work experience with Kafka streaming to ingest data in real time or near real time.
- Expert in data processing tasks such as collecting, aggregating, and moving data from various sources using Kafka.
- Good experience developing solutions to analyze large datasets efficiently.
- Experience setting up clusters on Amazon EC2 and S3, including automating cluster provisioning and scaling in the AWS cloud.
- Familiar with various relational databases, including MS SQL Server and Teradata.
- Experience with the Oozie workflow engine, running workflows with Impala, Hadoop MapReduce, and Pig actions.
- Hands-on experience importing and exporting data using the Hadoop data-management tool Sqoop.
- Good experience with EC2 (Elastic Compute Cloud) cluster instances, setting up data buckets on S3 (Simple Storage Service), and setting up EMR (Elastic MapReduce).
- Comprehensive knowledge of debugging, optimizing, and performance tuning of DB2, Oracle, and MySQL databases.
TECHNICAL SKILLS
Languages & Hadoop Components: HDFS, Sqoop, Flume, Hive, Pig, MapReduce, YARN, Oozie, Kafka, Spark, Impala, Storm, Hue, Zookeeper, Java, SQL.
BigData Platforms: Hortonworks, Cloudera, Amazon
Databases & NoSQL Databases: Oracle, MySQL, Microsoft SQL Server, HBase and Cassandra
Operating Systems: Linux, UNIX, Windows
Development Methodologies: Agile/Scrum, Waterfall
IDEs & Development Tools: Eclipse, NetBeans, IntelliJ, GitHub, Jenkins, Maven, Ambari
Programming Languages: C, C++, Java SE, XML, JSP/Servlets, Spring, HTML, JavaScript, jQuery, Web Services, Python, Scala, PL/SQL & Shell Scripting
PROFESSIONAL EXPERIENCE
Confidential - Philadelphia, PA
Sr. Hadoop Developer
Responsibilities:
- Applied transformations to data loaded into Spark DataFrames and performed in-memory computation to generate the output response.
- Developed multiple POCs using Spark with Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL.
- Migrated MapReduce programs to Spark transformations using Spark and Scala; initial versions had been implemented in Python (PySpark).
- Developed Scala scripts using both DataFrames/Spark SQL and RDDs in Spark for data aggregation and queries, writing data back into the OLTP system through Sqoop (see the sketch after this list).
- Used Hive to analyze the partitioned data and compute various metrics for reporting.
- Imported data from different sources such as HDFS into Spark DataFrames.
- Scheduled and executed workflows in Oozie to run Hive and Pig jobs
- Worked extensively with SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Reduced the latency of Spark jobs by tuning Spark configurations and applying other performance and optimization techniques.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Pig.
- Used Hive and Spark SQL connections to generate Tableau BI reports.
- Created partitions and buckets based on state to support further processing with bucket-based Hive joins.
- Created Hive generic UDFs to process business logic that varies by policy.
- Developed various data connections from data sources to SSIS and Tableau Server for report and dashboard development.
- Developed solutions utilizing Hadoop ecosystem components such as Hadoop, Spark, Hive, HBase, Pig, Sqoop, Oozie, Ambari, and ZooKeeper.
- Wrote MapReduce programs with the Java API to cleanse structured and unstructured data.
- Loaded data from MySQL and Teradata into HBase where necessary using Sqoop.
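A minimal sketch of the DataFrame-based aggregation and OLTP write-back described above, in Scala. The table, column, and connection names (sales.transactions, state, amount, the JDBC URL) are hypothetical, and Spark's JDBC writer stands in here for the Sqoop export step:

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

object StateMetrics {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("state-metrics")
      .enableHiveSupport()              // read the partitioned Hive table
      .getOrCreate()

    // Hypothetical partitioned/bucketed Hive table
    val txns = spark.table("sales.transactions")

    // In-memory aggregation: per-state metrics
    val metrics = txns
      .groupBy("state")
      .agg(
        F.count(F.lit(1)).as("txn_count"),
        F.sum("amount").as("total_amount")
      )

    // Placeholder JDBC target standing in for the Sqoop/OLTP export
    metrics.write
      .format("jdbc")
      .option("url", "jdbc:mysql://oltp-host:3306/reporting")   // hypothetical
      .option("dbtable", "state_metrics")
      .option("user", sys.env.getOrElse("DB_USER", ""))
      .option("password", sys.env.getOrElse("DB_PASS", ""))
      .mode("overwrite")
      .save()

    spark.stop()
  }
}
```

Partitioning and bucketing the source table on state, as noted in the bullets above, keeps an aggregation like this reading fewer files and enables bucket-based joins downstream.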
Environment: Scala, Spark, Kafka, Hive, Hortonworks, Oozie, Play Framework, Akka, Git, Elasticsearch, Logstash, Kibana, Kerberos
Confidential
Sr. Hadoop Developer/Admin
Responsibilities:
- Installed, configured, upgraded, and applied patches and bug fixes for Prod, Lab and Dev Servers.
- Installed, configured, and administered HDFS, Hive, Ranger, Pig, HBase, Oozie, Sqoop, Spark, and YARN.
- Involved in upgrading Cloudera Manager from version 5.5 to 5.6.
- Involved in capacity planning, load balancing and design of Hadoop clusters.
- Involved in setting up alerts in Cloudera Manager to monitor the health and performance of Hadoop clusters.
- Involved in installing and configuring authentication using Kerberos.
- Created and dropped users and granted and revoked permissions through policies as required using Ranger.
- Commissioned and decommissioned DataNodes in the cluster.
- Wrote and modified UNIX shell scripts to manage HDP environments.
- Involved in installing and configuring Apache Flume, Hive, Sqoop, and Oozie on the Hadoop cluster.
- Created directories and set up appropriate permissions for different applications and users (see the sketch after this list).
- Backed up HBase tables to HDFS using the export utility.
- Involved in creating users and user groups, assigning roles, and creating home directories for users.
- Installation, configuration, and administration of HDP on Red Hat Enterprise Linux 6.6.
- Used Sqoop to import data into HDFS from Oracle database.
- Detailed analysis of system and application architecture components per functional requirements.
- Reviewed and monitored system and instance resources (database storage, memory, CPU, network usage, and I/O contention) to ensure continuous operations.
- Provided 24x7 on-call support for production job failures and resolved issues in a timely manner.
- Developed UNIX scripts for scheduling the delta loads and master loads using the AutoSys scheduler.
- Troubleshot problems with databases, applications, and development tools.
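A minimal sketch of the directory-provisioning task referenced above, using the Hadoop FileSystem API from Scala. The path, owner, and group names are hypothetical, and setOwner requires HDFS superuser privileges:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.fs.permission.{FsAction, FsPermission}

object ProvisionAppDir {
  def main(args: Array[String]): Unit = {
    // Picks up core-site.xml / hdfs-site.xml from the classpath
    val fs = FileSystem.get(new Configuration())

    // Hypothetical application landing directory
    val appDir = new Path("/data/appx/landing")

    fs.mkdirs(appDir)

    // rwxr-x---: owner full access, group read/execute, others none
    fs.setPermission(appDir, new FsPermission(FsAction.ALL, FsAction.READ_EXECUTE, FsAction.NONE))

    // Assign ownership to the application's service account (needs superuser rights)
    fs.setOwner(appDir, "appx_user", "appx_group")

    fs.close()
  }
}
```

The equivalent hdfs dfs -mkdir / -chmod / -chown commands in a shell script accomplish the same thing; the API form is shown here only to keep the examples in one language.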
Environment: Hadoop, HDFS, Hive, Cloudera Manager, Sqoop, Flume, Oozie, CDH5, MongoDB, Cassandra, HBase, Hue, Kerberos and Unix/Linux
Confidential - Dallas, TX
Java/Hadoop Developer
Responsibilities:
- Imported all fact and dimension tables from SQL Server into Hadoop using Sqoop.
- Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
- Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
- Involved in extracting customers' Big Data from various data sources into Hadoop HDFS, including data from mainframes, databases, and server logs.
- Developed Tableau visualizations and dashboards using Tableau Desktop.
- Developed Tableau workbooks from multiple data sources using Data Blending.
- Involved in managing and reviewing Hadoop log files.
- Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
- Created Hive tables as per requirements, as managed or external tables with appropriate static and dynamic partitions, intended for efficiency.
- Implemented Partitioning, Bucketing in Hive for better organization of the data.
- Developed Python UDFs for Pig and Hive.
- Used Apache Kafka to gather log data and feed it into HDFS (see the sketch after this list).
- Performed data ingestion using Sqoop from various sources such as Informatica and Oracle.
- Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive and Sqoop as well as system specific jobs.
- Installed and configured various components of Hadoop ecosystem and maintained their integrity.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
- Developed Java MapReduce programs to encapsulate transformations.
- Participated in performance tuning at the database, transformation, and job levels.
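A minimal sketch of the log-gathering consumer referenced above, using the Kafka Java client from Scala (2.13 collection converters). The broker address, topic, and consumer group are hypothetical, and println stands in for the HDFS sink:

```scala
import java.time.Duration
import java.util.{Collections, Properties}
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.serialization.StringDeserializer

object LogConsumer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")                 // hypothetical broker
    props.put("group.id", "log-ingest")                            // hypothetical consumer group
    props.put("key.deserializer", classOf[StringDeserializer].getName)
    props.put("value.deserializer", classOf[StringDeserializer].getName)

    val consumer = new KafkaConsumer[String, String](props)
    consumer.subscribe(Collections.singletonList("server-logs"))   // hypothetical topic

    try {
      while (true) {
        val records = consumer.poll(Duration.ofSeconds(1))
        for (record <- records.asScala) {
          // In the actual pipeline these records were landed in HDFS;
          // printing stands in for that sink here.
          println(s"${record.partition}/${record.offset}: ${record.value}")
        }
      }
    } finally {
      consumer.close()
    }
  }
}
```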
Environment: Hadoop, HDFS, MapReduce, Sqoop, Hive, Pig, Oozie, HBase, CDH4, Cloudera Manager, MySQL, Eclipse
Confidential - Roseville, CA
Hadoop Developer
Responsibilities:
- Responsible for building data solutions in Hadoop using the Cascading framework.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
- Worked hands-on with the ETL process.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
- Upgraded the Hadoop cluster from CDH3 to CDH4 and integrated Hive with existing applications.
- Configured Ethernet bonding for all Nodes to double the network bandwidth.
- Handled importing data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop.
- Used Python and Shell scripts to automate the end-to-end ELT process
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive queries to process the data and generate the data cubes for visualizing.
- Performed data quality checks on data as per the business requirement.
- Performed data validation on target tables against the corresponding source tables (see the sketch after this list).
- Achieved high throughput and low latency for ingestion jobs by leveraging Sqoop.
- Transformed the raw data and loaded into stage and target tables.
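A minimal sketch of the source-versus-target validation referenced above, using Spark SQL from Scala. The stage and warehouse table names and the order_id key column are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object TargetValidation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("target-validation")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical stage (source) and warehouse (target) Hive tables
    val source = spark.table("stage.orders")
    val target = spark.table("warehouse.orders")

    // Row-count comparison
    val sourceCount = source.count()
    val targetCount = target.count()

    // Keys present in the source but missing from the target
    val missingKeys = source.select("order_id")
      .except(target.select("order_id"))
      .count()

    if (sourceCount == targetCount && missingKeys == 0)
      println(s"Validation passed: $sourceCount rows, no missing keys")
    else
      println(s"Validation FAILED: source=$sourceCount target=$targetCount missingKeys=$missingKeys")

    spark.stop()
  }
}
```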
Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Teradata, Cloudera Manager, Pig, Sqoop, Oozie, Python
Confidential
Java/J2EE Developer
Responsibilities:
- Involved in designing and developing modules at both Client and Server Side.
- Developed the UI using JSP, JavaScript and HTML.
- Responsible for validating the data at the client side using JavaScript.
- Interacted with external services to get the user information using SOAP web service calls
- Developed web components using JSP, Servlets and JDBC.
- Technical analysis, design, development and documentation with a focus on implementation and agile development.
- Developed a web-based reporting system with JSP, DAOs, and the Apache Struts Validator using the Struts framework.
- Designed the controller using Servlets.
- Accessed backend database Oracle using JDBC.
- Developed and wrote UNIX Shell scripts to automate various tasks.
- Developed user and technical documentation.
- Developed business objects, request handlers and JSPs for this project using Java Servlets and XML.
- Developed core Spring components for some of the modules and integrated them with the existing Struts framework.
- Actively participated in testing and designed user interface using HTML and JSPs.
- Implemented the database connectivity to Oracle using JDBC and designed and created tables using SQL (see the sketch after this list).
- Implemented the server-side processing using Java Servlets.
- Installed and configured the Apache web server and deployed JSPs and Servlets on the Tomcat server.
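The Oracle connectivity described above used the standard JDBC API; below is a minimal sketch of that pattern, written in Scala to keep the examples in one language (the java.sql calls are the same either way). The connection URL, credentials, table, and column names are hypothetical, and the Oracle JDBC driver is assumed to be on the classpath:

```scala
import java.sql.DriverManager

object OracleJdbcSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical connection details; real ones lived in the application's datasource config
    val url  = "jdbc:oracle:thin:@dbhost:1521:ORCL"
    val conn = DriverManager.getConnection(url,
      sys.env.getOrElse("DB_USER", ""), sys.env.getOrElse("DB_PASS", ""))

    try {
      // Parameterized query against a hypothetical table
      val stmt = conn.prepareStatement(
        "SELECT user_id, user_name FROM app_users WHERE status = ?")
      stmt.setString(1, "ACTIVE")

      val rs = stmt.executeQuery()
      while (rs.next()) {
        println(s"${rs.getLong("user_id")} -> ${rs.getString("user_name")}")
      }

      rs.close()
      stmt.close()
    } finally {
      conn.close()
    }
  }
}
```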
Environment: Java, Servlets, JSP, JavaScript, JDBC, Unix Shell scripting, HTML, Eclipse, Oracle 8i, WebLogic.