BigData/Hadoop Developer Resume
Cary, NC
PROFESSIONAL SUMMARY:
- Hadoop Developer with 7+ years of experience in Big Data analytics, Hadoop and Java development.
- Hands-on experience with major Big Data components such as Apache Kafka, Apache Storm, Kerberos, HDFS, Hive, Sqoop, Oozie and HBase.
- Hands-on experience in designing, testing and deploying MapReduce applications in a Hadoop ecosystem.
- Hands-on experience in installing and configuring Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Knox, Tez, Storm, Kafka, Oozie and HBase using Ambari and Ambari Blueprints.
- Experienced in Big Data solutions and Hadoop ecosystem technologies. Well versed in Big Data solution planning, design, development and POCs.
- Developed enterprise applications using Scala.
- Experience working with various Hadoop ecosystem tools such as MapReduce, Hive, Pig, Sqoop, Flume and HBase.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Experience with NoSQL column-oriented databases such as HBase and Cassandra and their integration with a Hadoop cluster.
- Experience in storing and analyzing data using HiveQL, HBase and custom MapReduce programs in Java.
- Experience with development and design of solutions using Java in a test driven approach.
- Excellent Object Oriented Programming (OOP) skills with C++ and Java and in-depth understanding of data structures and algorithms.
- Hands-on experience with SQL and MySQL.
- Strong statistical, mathematical and predictive modeling skills and experience.
- Experience in web services using XML, HTML and SOAP.
- Strong command over relational databases: MySQL, Oracle, SQL Server and MS Access.
- Experience working with NoSQL databases such as HBase, MongoDB, Cassandra.
- Working knowledge of Python.
- Experience working with Hadoop clusters using Cloudera and Hortonworks distributions.
- Integrated data using the Talend ETL tool.
- Implemented event sourcing using Akka.
- Experienced in integrating Java-based web applications in a UNIX environment.
- Experience in writing SQL queries, stored procedures, functions, triggers and packages.
- Worked with HTML, Servlets, JSP, Struts, ANT, JUnit, JavaScript, EJB, JSF, XML, XSD schemas, Hibernate, Spring and Ajax.
- Ability to work independently and with a group of peers in a results-driven environment.
- Strong experience with XML-related technologies such as XML, XSL, XSLT and XHTML.
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, MapReduce, Hive, Impala, Pig, Sqoop, Flume, Oozie, ZooKeeper, Ambari, Hue, Spark, Storm, Ganglia
Project Management / Tools / Applications: All MS Office suites (incl. 2003), MS Exchange & Outlook, Lotus Domino Notes, Citrix Client, SharePoint, MS Internet Explorer, Firefox, Chrome, Apache, IIS
Web Technologies: HTML, XML, CSS, JavaScript
NoSQL Databases: HBase, Cassandra, MongoDB
Databases: Oracle 8i/9i/10g, MySQL
Languages: Java, Scala, SQL, PL/SQL, Ruby, Shell Scripting
Operating Systems: UNIX (OS X, Solaris), Windows, Linux (CentOS, Fedora, Red Hat)
IDE Tools: Eclipse, NetBeans
Application Server: Apache Tomcat
WORK EXPERIENCE:
BigData/Hadoop Developer
Confidential - Cary, NC
Responsibilities:
- Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Used Pig extensively for data cleansing.
- Worked with the Teradata analysis team using Big Data technologies to gather the business requirements.
- Created partitioned tables in Hive.
- Developed Pig UDFs to pre-process the data for analysis.
- Worked with business teams and created Hive queries for ad hoc access.
- Evaluated the use of Oozie for workflow orchestration.
- Developed Python scripts.
- Mentored the analyst and test teams in writing Hive queries.
- Designed the application based on J2EE architecture and designed the front end based on the Struts and Tiles frameworks.
- Generated reports and predictions using the BI tool Tableau; integrated data using Talend.
- Built a framework for storing and processing input data from various sources.
- Maintained job status and configuration in a relational table (MySQL) for tracking, and stored them in HBase.
- Worked with the NoSQL database Cassandra.
- Purged records older than the business-defined number of days and archived those records into a file.
- Involved in various life cycle phases, from requirement analysis to implementation.
- Performed data analytics to derive profile attributes using business rules.
- Performed data standardization on successfully validated data and stored it in Hive.
- Performed mandatory data and field validation on the incoming feed.
- Worked with HTML, Servlets, JSP, Struts, ANT, JUnit, JavaScript, EJB, JSF, XML, XSD schemas, Hibernate, Spring and Ajax.
- Addressed code review comments after lead review and fixed the issues raised.
Environment: Hadoop, MapReduce, HDFS, Spark, Hive, Java, Scala, Oozie, MySQL, J2EE Servlet, JSP, XML, Spring 3.0, Struts 1.1, Hibernate 3.0.
BigData/Hadoop Developer
Confidential - Minneapolis, MN
Responsibilities:
- Developed Java MapReduce programs to transform log data into a structured form for finding user location, age group and time spent.
- Developed optimal strategies for distributing the web log data over the cluster, importing and exporting the stored web log data into HDFS and Hive using Sqoop.
- Collected and aggregated large amounts of web log data from different sources such as webservers, mobile and network devices using Apache Flume and stored the data into HDFS for analysis.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Implemented JavaScript for handling front-end pop-ups and validations.
- Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data-based analytical solutions from disparate sources.
- Developed MapReduce jobs on AWS for improved security and scalability.
- Monitored multiple Hadoop cluster environments using Ganglia.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Developed Pig scripts for the analysis of semi-structured data.
- Developed industry-specific UDFs.
- Used the JavaMail API, web services, TortoiseSVN and CVS during development.
- Developed Python scripts.
- Performed database tuning; worked with SQL, Java and WebSphere.
- Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most purchased product on the website.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
Environment: Hadoop 1.x, HDFS, MapReduce, Hive 10.0, Pig, Sqoop, Ganglia, HBase, Shell Scripting, JavaScript, JDK, JSP, Servlets, EJB, JBoss, SOA/Web Services, SVN, XML, SAML, Spring, jQuery, ANT, Hibernate, iReport tool, SQL Developer.
Hadoop Developer
Confidential
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in gathering requirements and participating in Agile planning meetings in order to finalize the scope of each development.
- Developed simple to complex MapReduce programs to analyze the datasets as per the requirement.
- Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Configured periodic incremental imports of data from MySQL into HDFS using Sqoop.
- Responsible for migrating tables from traditional RDBMS into Hive tables using Sqoop and later generating the required visualizations and dashboards using Tableau.
- Responsible for loading, aggregating and moving large amounts of log data using Flume.
- Involved in loading data from the UNIX file system to HDFS.
- Worked on loading and transformation of large datasets of structured, semi-structured and unstructured data into the Hadoop ecosystem.
- Responsible for managing data coming from different data sources.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Implemented business logic by writing custom UDFs in Java and used various UDFs from Piggybank and other sources.
- Created Hive tables, loaded data and wrote custom Hive UDFs.
- Created Partitions, Dynamic Partitions and Buckets for granularity and optimization using HiveQL.
- Involved in identifying job dependencies to design workflows for Oozie and resource management for YARN.
- Responsible for maintaining and implementing code versions using CVS for the entire project.
- Coordinated with testing teams to resolve issues during QA testing.
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Impala, Cassandra, Cloudera Manager, Sqoop, Flume, Oozie, ZooKeeper, Java (JDK 1.6), MySQL, Eclipse, Tableau.
Java Developer
Confidential
Responsibilities:
- Developed the application under JEE architecture; designed dynamic, browser-compatible user interfaces using JSP, custom tags, HTML, CSS and JavaScript.
- Deployed and maintained the JSP and Servlet components on WebLogic 8.0.
- Developed the application server's persistence layer using JDBC and SQL.
- Used JDBC to connect the web applications to databases.
- Implemented test-first unit testing using JUnit.
- Developed and utilized J2EE services and JMS components for messaging communication in WebLogic.
- Configured the development environment using the WebLogic application server for developer integration testing.
Environment: Java/J2EE, SQL, Oracle 10g, JSP 2.0, EJB, AJAX, JavaScript, WebLogic 8.0, HTML, JDBC 3.0, XML, JMS, log4j, JUnit, Servlets, MVC, MyEclipse
