Sr. Hadoop Developer Resume
Rocky Hills, CT
SUMMARY
- 7+ years of professional experience in software development and requirement analysis in Agile environments, with 3+ years of Big Data ecosystem experience in the ingestion, storage, querying, processing, and analysis of Big Data.
- Experience with Apache Hadoop components such as HDFS, MapReduce, HiveQL, HBase, Pig, Sqoop, Oozie, Mahout, Cassandra, and MongoDB, and with Big Data analytics.
- Good understanding of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
- Software development experience in Java application development, client/server applications, and Internet/intranet database applications; developed, tested, and implemented application environments using C++, J2EE, JDBC, JSP, Servlets, Web Services, Oracle, PL/SQL, and relational databases.
- Worked with various data sources such as flat files and RDBMS (Teradata, SQL Server 2005, and Oracle). Extensive work in ETL processes consisting of data transformation, data sourcing, mapping, conversion, and loading.
- Exceptional ability to quickly master new concepts; capable of working in groups as well as independently.
- Excellent interpersonal skills and the ability to work as a part of a team.
- Experience in debugging, troubleshooting production systems, profiling and identifying performance bottlenecks.
- Excellent SQL programming and visual analytics skills.
- Hands-on experience with Apache Storm, RabbitMQ, and Kafka.
- Hands-on experience with Big Data PaaS.
- Good Knowledge of Apache Spark and Scala.
- Hands-on experience with AWS, using S3 for storage and running instances on EC2.
- Experience providing support to customers and other stakeholders.
- Good knowledge of virtualization; worked with VMware Virtual Center.
- Excellent working knowledge of different statistical analysis tools like SPSS and Microsoft Excel.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, and Flume (a minimal MapReduce sketch follows this list).
- Good Knowledge on Hadoop Cluster architecture and monitoring the cluster.
- In-depth understanding of data structures and algorithms.
- Experience in managing and troubleshooting Hadoop related issues.
- Expertise in setting up standards and processes for Hadoop based application design and implementation.
- Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and Core Java design patterns.
- Experience in managing Hadoop clusters using Cloudera Manager.
- Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases.
- Hands-on experience with VPN, PuTTY, WinSCP, VNC Viewer, etc.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
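To illustrate the Hadoop MapReduce experience above, a minimal sketch of a MapReduce job using the org.apache.hadoop.mapreduce API; this is the canonical word count, with illustrative class names rather than code from any project listed here.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map: emit (word, 1) for each token in the input line.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }
    }

    // Reduce: sum the counts emitted for each word.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class); // summing is associative, so reuse the reducer
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```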
TECHNICAL SKILLS
Programming Languages: C++, Java, Hive, Pig
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, ZooKeeper, Apache Cassandra, MongoDB
Web/Application Servers: Apache Tomcat, Sun Java Application Server
Web Design: HTML, CSS, and XML
Frameworks: Struts, Spring, Hibernate
Scripting: Bash, JavaScript, ksh
Operating Systems: Windows, Linux, UNIX
Debugging: Eclipse, NetBeans, and Visual Studio
OO Modeling: UML
Protocols: SNMP, FTP, SFTP, JDBC, ODBC
Databases: Oracle 10g, DB2, MySQL, PL/SQL, MongoDB, CouchDB
Web Technologies: Struts, JUnit, MRUnit, ODBC, JDBC, XML, XSL, XSD, CSS, JavaScript, Hibernate, Spring, Ajax, jQuery, JSP, Servlets, Java Swing, Java Beans, EJB, MVC, JavaMail, HTML
PROFESSIONAL EXPERIENCE
Confidential, Rocky Hills, CT
Sr. Hadoop Developer
Responsibilities:
- Worked on importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into MapR-FS.
- Used Pig and Hive to analyze data.
- Worked on a POC to port an already existing project to Apache Storm.
- Wrote Java programs to implement business logic using Apache Storm (a minimal bolt sketch follows this section).
- Worked on a reading agent to read data from various sources and pass it to RabbitMQ.
- Used Impetus StreamAnalytix (real-time streaming analytics) for a POC.
- Evaluated business requirements and prepared detailed specifications, following project guidelines, for the programs to be developed.
- Wrote Unix scripts to generate customized Hive tables automatically.
- Designed and presented a plan for the POC on Apache Storm.
- Worked on transferring data from Oracle, DB2, and MySQL into Hive.
- Involved in loading data from UNIX file systems to MapR-FS.
- Handled importing data from Teradata into MapR-FS.
- Worked on data validation.
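A minimal sketch of the kind of Storm bolt the Java business-logic work above might involve, assuming the org.apache.storm API; the class name, field names, and logic are illustrative only, not the actual project code.

```java
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Hypothetical bolt: normalizes a "message" field fed by an upstream
// (e.g., RabbitMQ-backed) spout and re-emits it for downstream bolts.
public class NormalizeMessageBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String message = input.getStringByField("message");
        // Business logic placeholder: trim and uppercase the payload.
        collector.emit(new Values(message.trim().toUpperCase()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("normalized"));
    }
}
```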
Confidential, New York City, NY
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Analyzed data using Hadoop components Hive and Pig.
- Worked hands-on with the ETL process.
- Tested Hadoop programs hands-on using JUnit/MRUnit.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Provided quick responses to ad hoc internal and external client requests for data; experienced in creating ad hoc reports.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
- Involved in loading data from UNIX file system to HDFS.
- Responsible for creating Hive tables, loading data, and writing Hive queries (a minimal Hive JDBC sketch follows this section).
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Extracted data from Teradata into HDFS using Sqoop.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs that run independently, triggered by time and data availability.
- Analyzed the data by running Hive queries and Pig scripts to understand user behavior, such as shopping enthusiasts, travelers, and music lovers.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Developed Hive queries to process the data and generate data cubes for visualization.
Environment: Hadoop cluster, HDFS, Hive, Pig, Sqoop, Hadoop MapReduce, HBase, Linux, shell scripting.
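A minimal sketch of querying Hive from Java over JDBC, as in the Hive-table work above; the host, credentials, and table name are illustrative assumptions, not the actual environment.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Hypothetical host and table; HiveServer2 listens on port 10000 by default.
public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hive-host:10000/default", "user", "");
             Statement stmt = conn.createStatement()) {
            // Aggregate query; Hive compiles this down to MapReduce jobs.
            ResultSet rs = stmt.executeQuery(
                "SELECT category, COUNT(*) FROM user_events GROUP BY category");
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}
```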
Confidential, Topeka KS
Hadoop Developer
Responsibilities:
- Applied a solid understanding of Hadoop HDFS, MapReduce, and other ecosystem projects.
- Installed and configured the Hadoop cluster.
- Worked with the Cloudera support team to fine-tune the cluster.
- Worked closely with the SA team to make sure all hardware and software were properly set up for optimal resource usage.
- Developed a custom FileSystem plugin for Hadoop so it can access files on the Hitachi Data Platform.
- The plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly; it also provides data locality for Hadoop across host nodes and virtual machines.
- Wrote data ingesters and MapReduce programs.
- Developed MapReduce jobs to analyze data and produce heuristic reports.
- Good experience writing data ingesters and complex MapReduce jobs in Java for data cleaning and preprocessing, and fine-tuning them per data set.
- Performed extensive data validation using Hive and wrote Hive UDFs (a minimal UDF sketch follows this section).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Extensive scripting (Python and shell) to provision and spin up virtualized Hadoop clusters.
- Added, decommissioned, and rebalanced nodes.
- Created a POC to store server log data in Cassandra to identify system alert metrics.
- Rack-aware configuration.
- Configured client machines.
- Configured monitoring and management tools.
- HDFS support and maintenance.
- Cluster HA setup.
- Applied patches and performed version upgrades.
- Incident management, problem management, and change management.
- Performance management and reporting.
- Recovered from NameNode failures.
- Scheduled MapReduce jobs using FIFO and fair-share scheduling.
- Installed and configured other open-source software such as Pig, Hive, HBase, Flume, and Sqoop.
- Integrated with RDBMSs using Sqoop and JDBC connectors.
- Worked with the dev team to tune jobs; knowledge of writing Hive jobs.
Environment: Windows 2000/2003, UNIX, Linux, Java, Apache Hadoop (HDFS, MapReduce), Pig, Hive, HBase, Flume, Sqoop, Cassandra, NoSQL.
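A minimal sketch of a Hive UDF of the kind mentioned above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF base class; the function name and cleaning logic are illustrative only.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes free-text fields during data validation.
// Registered in Hive with (illustrative jar/function names):
//   ADD JAR my-udfs.jar;
//   CREATE TEMPORARY FUNCTION clean_text AS 'CleanTextUDF';
public class CleanTextUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Trim, lowercase, and collapse runs of whitespace.
        String cleaned = input.toString().trim().toLowerCase().replaceAll("\\s+", " ");
        return new Text(cleaned);
    }
}
```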
Confidential, Newark NJ
Software Developer
Responsibilities:
- Designed and developed a UI that presents the engineer with a form to submit a solution to a particular problem.
- Designed and developed a UI that allows the end user to query on the problem; it makes a JDBC connection to the database and retrieves the details for the call number as well as the current status of the submitted problem.
- Developed class diagram and object diagram for a clear depiction of various classes, objects and their functionalities.
- Designed and developed Servlets that present the end user with a form to submit the details of the problem.
- Developed Servlets that store user information in the database; each makes a JDBC connection and inserts the details into the database (a minimal sketch follows this section).
- Executed SQL statements for the effective retrieval and storage of data from the database.
- Involved in the Unit Testing of the Application.
Environment: Java 6, HTML, JavaScript, JSP 2.2, Spring, AJAX, Hibernate 3, WebLogic Application Server 10g, XML, Eclipse 3.7, MS SQL Server 5.5, Maven 3.0, JUnit, ANT, Rational ClearCase, Log4j.
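A minimal sketch of the servlet-plus-JDBC pattern described above; the connection URL, table, and column names are illustrative assumptions, and a real application would use a pooled DataSource rather than DriverManager.

```java
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet: stores a submitted problem report via JDBC.
public class SubmitProblemServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String userName = req.getParameter("userName");
        String problem = req.getParameter("problem");
        // Illustrative connection URL, credentials, and table name.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:sqlserver://db-host:1433;databaseName=support", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO problem_reports (user_name, description) VALUES (?, ?)")) {
            ps.setString(1, userName);
            ps.setString(2, problem);
            ps.executeUpdate();
            resp.getWriter().println("Problem submitted.");
        } catch (java.sql.SQLException e) {
            throw new ServletException("Failed to store problem report", e);
        }
    }
}
```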