Hadoop Developer Resume
Columbus, OH
SUMMARY:
Seeking an opportunity to apply 5 years of programming, technology, and engineering expertise to application development, incorporating critical thinking, problem-solving, and leadership.
TECHNICAL SKILLS:
Hadoop Core Services: HDFS, MapReduce, Hadoop YARN
Hadoop Data Services: Apache Hive, Pig, Sqoop, Spark
Hadoop Distributions: Hortonworks, Cloudera
Hadoop Operational Services: Apache Zookeeper, Oozie
Cloud Computing Services: AWS (Amazon Web Services), Amazon EC2
IDE Tools: Eclipse, NetBeans
Programming Languages: Java, Python, C#
Operating Systems: Windows (XP,7,8,10), UNIX, LINUX, Ubuntu
Reporting Tools /ETL Tools: Powerview for Microsoft Excel, Tableau
Databases: Oracle, MySQL, DB2, Derby; NoSQL: HBase, Cassandra
Web Technologies: HTML5, JavaScript
Environmental Tools: SQL Developer, WinSCP, PuTTY, JIRA
Version Control Systems: Git, TortoiseSVN
EXPERIENCE:
Hadoop Developer
Confidential, COLUMBUS, OH
Responsibilities:
- Evaluated and extracted/transformed data for analytical purposes within a big data environment.
- In-depth understanding of Spark architecture, including Spark Core, Spark SQL, and DataFrames.
- Developed Spark applications in Python (PySpark) to transform data according to business rules.
- Used Spark SQL to load data into Hive tables and wrote queries to fetch data from them.
- Experienced in performance tuning of Spark applications: setting the correct level of parallelism and tuning memory.
- Experienced in handling large datasets using partitions, Spark's in-memory capabilities, broadcast variables, and effective, efficient joins.
- Sourced data from various systems into the Hadoop ecosystem using big data tools such as Sqoop.
- Worked with Oracle and Teradata for data import/export operations from different data marts.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Worked extensively on data migration, data cleansing, and data profiling.
- Tuned Hive to improve performance and resolved performance issues, applying an understanding of joins, grouping, and aggregation and how they translate into MapReduce jobs.
- Implemented partitioning, dynamic partitioning, and bucketing in Hive.
- Modeled Hive partitions extensively for data separation.
- Expertise in Hive queries; created user-defined aggregate functions and worked on advanced optimization techniques.
- Scheduled jobs with automation tools such as Oozie.
- Exported analyzed data to relational databases using Sqoop for visualization and report generation.
Environment: Apache Hadoop, Hive, MapReduce, Sqoop, Spark, Python, Cloudera Manager CM 5.1.1, HDFS, Oozie, PuTTY.
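The partitioning and bucketing work described above can be sketched in miniature. The following is an illustrative plain-Python toy (the real work used Hive and Spark SQL; the column names `region` and `customer_id` and the bucket count are hypothetical stand-ins), showing how Hive-style dynamic partitioning groups rows into one directory per partition value and assigns each row to a bucket by hashing the clustering key:

```python
# Toy sketch of Hive-style dynamic partitioning and bucketing.
# Illustrative only: the actual jobs used Hive/Spark SQL, and the
# column names and bucket count here are hypothetical.
from collections import defaultdict

NUM_BUCKETS = 4  # assumed CLUSTERED BY ... INTO 4 BUCKETS

def bucket_id(key, num_buckets=NUM_BUCKETS):
    """A row lands in bucket hash(clustering key) mod num_buckets."""
    return hash(key) % num_buckets

def partition_rows(rows, partition_col, cluster_col):
    """Group rows into partition-directory -> bucket -> rows,
    mimicking dynamic partitioning on `partition_col`."""
    layout = defaultdict(lambda: defaultdict(list))
    for row in rows:
        part_dir = f"{partition_col}={row[partition_col]}"
        layout[part_dir][bucket_id(row[cluster_col])].append(row)
    return layout

rows = [
    {"region": "east", "customer_id": 101, "amount": 25.0},
    {"region": "west", "customer_id": 202, "amount": 40.0},
    {"region": "east", "customer_id": 303, "amount": 15.5},
]
layout = partition_rows(rows, "region", "customer_id")
print(sorted(layout))  # ['region=east', 'region=west']
```

Because queries that filter on the partition column can skip whole directories, and bucketed tables support efficient sampling and bucket-map joins, this layout is the basis of the Hive performance tuning mentioned above.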
Hadoop Developer
Confidential, BENTONVILLE, AR
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Managed data coming from different sources and was involved in HDFS maintenance and loading of structured and unstructured data.
- Wrote multiple MapReduce programs in Java for data analysis.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Loaded data from various data sources into HDFS.
- Experienced in tuning HiveQL to minimize query response time.
- Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
- Worked on SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Experienced in handling different types of joins in Hive.
- Responsible for performing extensive data validation using Hive.
- Created Sqoop jobs and Pig and Hive scripts for data ingestion from relational databases and comparison with historical data.
- Used Pig as an ETL tool for transformations, event joins, filtering, and some pre-aggregations.
- Worked on different file formats and compression codecs.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Oozie, Java, Linux, Maven, Teradata, Zookeeper, Git, Autosys, HBase.
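The MapReduce programs mentioned above were written in Java against Hadoop; as a language-neutral sketch of the map/shuffle/reduce flow they follow, here is the classic word-count pattern in plain Python (a stand-in example, since the actual analyses and inputs are not specified in the resume):

```python
# Minimal map/shuffle/reduce sketch of the pattern Hadoop MapReduce
# jobs follow. Illustrative only: the real jobs ran in Java on a
# cluster; word count and this toy input are stand-ins.
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Emit (word, 1) for every word, as a Hadoop Mapper would.
    for word in line.lower().split():
        yield (word, 1)

def shuffle(pairs):
    # Hadoop's shuffle phase: sort by key, then group values per key.
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [v for _, v in group]

def reducer(key, values):
    # Sum the counts for each word, as a Hadoop Reducer would.
    return key, sum(values)

lines = ["hadoop map reduce", "map reduce map"]
pairs = [kv for line in lines for kv in mapper(line)]
counts = dict(reducer(k, vs) for k, vs in shuffle(pairs))
print(counts)  # {'hadoop': 1, 'map': 3, 'reduce': 2}
```

The performance tuning described above (analyzing Hadoop log files) largely comes down to balancing these phases: too few reducers overloads the shuffle, while skewed keys concentrate work in a single reduce task.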
Java Developer
Confidential
Responsibilities:
- Analysis, design, and development of an application based on J2EE using Struts and Hibernate.
- Involved in interacting with the Business Analyst and Architect during the Sprint Planning Sessions.
- Developed a rich user interface using RIA, HTML, JSP, JSTL, JavaScript, jQuery, CSS, YUI, and AUI on the Liferay portal.
- Worked on a new portal theme for the website using Liferay and customized its look and feel.
- Developed the user interface using Struts tags and performed core Java development involving concurrency/multithreading, Struts-Hibernate integration, and database operation tasks.
- Implemented core Java functionality such as collections, multithreading, and exception handling.
- Performed code optimization and rewrote database queries to resolve performance-related issues in the application.
- Involved in writing SQL and PL/SQL stored procedures using PL/SQL Developer.
- Used Eclipse as IDE for application development.
- Supported production deployments and validated the application flow after each deployment.
- Used PL/SQL for creating triggers, packages, procedures, and functions.