Hadoop Developer Resume
Kent, WA
SUMMARY:
- 7+ years of IT experience, including designing, implementing and configuring Hadoop ecosystems, with expertise in delivering network optimization solutions.
- 3+ years of experience as a Hadoop developer with extensive knowledge of Hive, Pig, Sqoop, Flume, Spark, Scala, HBase, Oozie, ZooKeeper and Cassandra.
- 2 years’ experience in installing, configuring, monitoring, testing and troubleshooting Hadoop ecosystem components.
- Experience in developing custom MapReduce and YARN programs in Java and Python to process large volumes of data as required.
- Extensive knowledge of importing and exporting data between RDBMS (relational database management systems) and HDFS using Sqoop.
- Hands-on experience writing custom UDFs (User Defined Functions), UDTFs (User Defined Table-Generating Functions) and UDAFs (User Defined Aggregating Functions) for Hive and Pig (a brief sketch follows this summary).
- Worked on various file formats such as JSON, CSV, Avro, SequenceFile, text and XML.
- Good knowledge of NoSQL databases such as HBase, Cassandra and MongoDB.
- Experience in cluster administration using Ambari and Cloudera Manager.
- Good working knowledge of creating Oozie workflows for scheduled (cron-style) jobs.
- Experience building dashboards and generating reports in Tableau by connecting to tables in Hive and HBase.
- Experience analyzing data using Pig Latin scripts and Hive Query Language (HiveQL).
- Sound knowledge of cloud integration with Microsoft Azure, Amazon EC2, Amazon S3 and EMR.
- Worked on backend application projects involving ETL and data migration, and developed shell scripts to automate DBA tasks.
- Good Linux administration skills, including IP addressing, subnetting, Ethernet bonding and static IP configuration.
- Hands-on knowledge of commissioning and decommissioning nodes in a Hadoop cluster; familiar with security measures such as Kerberos integration.
- Experience deploying Hadoop clusters in public and private cloud environments such as Amazon AWS and OpenStack.
- Experience monitoring and managing a 200+ node Hadoop cluster.
- Hands-on knowledge of VMware vSphere; familiar with Cisco routing and switching concepts.
- Good knowledge of Windows Server 2008 and 2012.
- Good knowledge of implementing applications using Java, Python, J2EE, JSP and web-based development tools.
- Strong conceptual, business and analytical skills, with the ability to present capacity solutions and results-oriented analysis.
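A minimal sketch of the kind of custom Hive UDF described above; the class name and masking logic are hypothetical, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that masks all but the last four characters of a
    // sensitive field (e.g. an account number) before it reaches reports.
    public class MaskUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // let Hive pass NULLs through unchanged
            }
            String s = input.toString();
            int keep = Math.min(4, s.length());
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < s.length() - keep; i++) {
                out.append('*');
            }
            out.append(s.substring(s.length() - keep));
            return new Text(out.toString());
        }
    }

In Hive the compiled jar would be added with ADD JAR and the function registered with CREATE TEMPORARY FUNCTION before use.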
TECHNICAL SKILLS:
Hadoop Technologies: MapReduce, HDFS, YARN, Hive, Pig, Sqoop, Flume, Zookeeper, HBase, Spark, Kafka, Oozie
Programming Languages: Java, Python, SQL, PL/SQL, C, UNIX shell scripting, HTML
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x, Struts 1.x/2.x
Database Systems: Oracle, MySQL, PostgreSQL, Teradata, HBase, Cassandra, MongoDB
Web Technologies: WebLogic, WebSphere, HTML5, CSS, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, SOAP and REST web services
IDE Tools: Eclipse, NetBeans, RAD
Visualization Tools: Tableau
Monitoring Tools: Apache Tomcat Monitoring & Reporting tools, Ambari, Cloudera Manager, Ganglia, Nagios
Build Tools: Maven, ANT, Jenkins
Operating Systems: Windows XP/NT, Windows Server 2008/2012, Linux, UNIX, Mac
PROFESSIONAL EXPERIENCE:
Confidential, Kent, WA
Hadoop Developer
Responsibilities:
- Performed joins, group-bys and other operations in MapReduce using Java and Pig (a brief sketch follows this section).
- Wrote and executed Pig scripts using the Grunt shell.
- Used the REST API to access HBase data for analytics.
- Designed and implemented ETL processes in Hadoop.
- Migrated ETL jobs to Pig scripts to perform transformations, joins and pre-aggregations before storing the data in HDFS.
- Created UDFs to encode sensitive client data, stored it in HDFS, and performed evaluation using Pig.
- Reviewed HDFS usage and system design for future scalability and fault tolerance.
- Designed the web-based framework for business analytics and data visualization in the Hadoop ecosystem, integrating Tableau with Hadoop to visualize and analyze data.
- Analyzed Cassandra and compared it with other open-source NoSQL databases to determine which best suited the streaming requirements.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Worked on the Oozie workflow engine for job scheduling.
- Worked on custom Pig loaders and storage classes to support a variety of data formats such as JSON and XML.
- Experienced in storing and transforming large sets of structured, semi-structured and unstructured data.
- Participated effectively in the team's big data tasks, delivered projects on time, and learned optimal approaches for processing a variety of workloads.
Environment: Apache Hadoop 2.2.0, HDP 2.2, Ambari, MapReduce, Hive, HBase, HDFS, Cassandra, Pig, Sqoop, Oozie, Java 1.7, UNIX, Shell Scripting, XML.
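A minimal sketch of the kind of group-by job described in the first bullet above, assuming the standard Hadoop 2.x MapReduce API; the input layout (category,amount CSV lines) and class names are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Groups hypothetical "category,amount" records by category and sums the amounts.
    public class GroupBySum {
        public static class M extends Mapper<LongWritable, Text, Text, LongWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] parts = value.toString().split(",");
                if (parts.length == 2) {
                    // Emit (category, amount) so the shuffle groups by category.
                    ctx.write(new Text(parts[0]), new LongWritable(Long.parseLong(parts[1].trim())));
                }
            }
        }
        public static class R extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) sum += v.get();
                ctx.write(key, new LongWritable(sum));
            }
        }
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "group-by-sum");
            job.setJarByClass(GroupBySum.class);
            job.setMapperClass(M.class);
            job.setCombinerClass(R.class);
            job.setReducerClass(R.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

The reducer doubles as a combiner here because summation is associative, cutting the data shuffled between map and reduce.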
Confidential, Bloomfield, CT
Hadoop Developer
Responsibilities:
- Provided technical designs and architecture; supported automation, installation and configuration tasks; and planned system upgrades of the Hadoop cluster.
- Developed a data pipeline using Flume and Sqoop to ingest customer behavioral data and financial histories into HDFS for analysis.
- Maintained Hadoop clusters for dev/staging/production; trained the development, administration, testing and analysis teams on the Hadoop framework and ecosystem.
- Involved in the complete implementation lifecycle; specialized in writing custom MapReduce, Pig and Hive programs.
- Successfully integrated Hive tables and MongoDB collections, and developed a web service that queries MongoDB collections and returns the required data to the web UI.
- Collected and aggregated large amounts of web log data from different sources such as web servers, mobile and network devices using Apache Flume and stored the data into HDFS for analysis.
- Implemented HBase coprocessors to notify the support team when data is inserted into HBase tables (a brief sketch follows this section).
- Developed UNIX shell scripts to create reports from Hive data.
- Used Java MapReduce to compute metrics that define user experience, revenue, etc.
- Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS.
- Defined business and technical requirements, designed a proof of concept for evaluating AFMS agencies' data evaluation criteria and scoring, and helped select data integration and information management tooling.
- Integrated big data technologies and analysis tools into the overall architecture.
Environment: Hadoop, Cassandra, HBase, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Java, JSP, RMI, JNDI, JDBC, Tomcat, Apache, Shell Scripting.
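A minimal sketch of the kind of HBase coprocessor notification described above, assuming the HBase 1.x region observer API; the notification hook itself is a hypothetical placeholder:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    // Region observer that fires after each Put so a support team can be alerted.
    public class PutNotifier extends BaseRegionObserver {
        @Override
        public void postPut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                            Put put, WALEdit edit, Durability durability) throws IOException {
            // notifySupport is a hypothetical hook; a real implementation would
            // enqueue an alert (JMS, HTTP, email) rather than block the write path.
            notifySupport(put.getRow());
        }

        private void notifySupport(byte[] rowKey) {
            // Placeholder for the actual notification mechanism.
        }
    }

The coprocessor would be attached to a table via its descriptor or hbase-site.xml; keeping the hook asynchronous avoids slowing down every write.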
Confidential
Hadoop Administrator
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop; worked hands-on with ETL processes using Pig.
- Worked on data analysis in HDFS using MapReduce, Hive and Pig jobs.
- Worked on MapReduce programming and HBase.
- Knowledge of installation and configuration of Cloudera Hadoop clusters.
- Worked on setting up of environment and re-configuration activities.
- Involved in creating external tables and partitioning and bucketing tables (a brief sketch follows this section).
- Ensured adherence to guidelines and standards in the project process.
- Facilitated testing across different dimensions.
- Used crontab to automate scripts.
- Wrote and modified stored procedures to load and modify data according to business rule changes.
- Worked in a production support environment.
- Extracted the data from Teradata into HDFS using Sqoop.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Implemented Kerberos security to safeguard the cluster.
- Worked on a stand-alone as well as a distributed Hadoop application.
Environment: Apache Hadoop, Cloudera, Pig, Hive, Sqoop, Flume, Java/J2EE, Oracle 11g, crontab, JBoss 5.1.0 Application Server, Linux OS, Windows OS, AWS.
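A minimal sketch of the kind of external, partitioned, bucketed Hive table mentioned above, issued through the Hive JDBC driver; the table definition, columns and connection string are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Creates a hypothetical external Hive table partitioned by date
    // and bucketed by customer id.
    public class CreateExternalTable {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
                 Statement stmt = conn.createStatement()) {
                stmt.execute(
                    "CREATE EXTERNAL TABLE IF NOT EXISTS sales (" +
                    "  customer_id BIGINT, amount DOUBLE) " +
                    "PARTITIONED BY (sale_date STRING) " +
                    "CLUSTERED BY (customer_id) INTO 16 BUCKETS " +
                    "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' " +
                    "STORED AS TEXTFILE " +
                    "LOCATION '/data/sales'");
            }
        }
    }

Partitioning by date keeps scans narrow, while bucketing by customer_id supports sampling and bucketed joins; EXTERNAL means dropping the table leaves the HDFS data in place.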
Confidential
JAVA/J2EE Developer
Responsibilities:
- Designed and developed front-end, middleware and back-end applications.
- Played a key role in the design and development of a new application using J2EE, Servlets and Spring frameworks, following a Service-Oriented Architecture (SOA).
- Implemented the Web Services functionality in the application to allow external applications to access data.
- Prepared use-case diagrams, class diagrams and sequence diagrams as part of requirement specification documentation.
- Developed an API to write XML documents from a database; utilized XML and XSL Transformations (XSLT) for dynamic web content and database connectivity.
- Involved in the development of the presentation layer and GUI framework in JSP; client-side validations were done using JavaScript.
- Used JDBC to connect to the backend database and developed stored procedures.
- Developed Java Servlets to control and maintain the session state and handle user requests.
- Implemented search queries using the Hibernate Criteria interface (a brief sketch follows this section).
- Worked closely with QA, business and architecture teams to resolve defects quickly and meet deadlines.
Environment: Java, Spring Core, JMS, web services, JDK, SVN, Maven, Mule ESB, JUnit, WAS 7, jQuery, Ajax, SAX.
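A minimal sketch of the kind of Hibernate Criteria search mentioned above, assuming the Hibernate 3.x Criteria API; Employee is a hypothetical mapped entity with department, lastName and hireDate properties:

    import java.util.Date;
    import java.util.List;
    import org.hibernate.Criteria;
    import org.hibernate.Session;
    import org.hibernate.criterion.Order;
    import org.hibernate.criterion.Restrictions;

    // Hypothetical mapped entity; the Hibernate mapping itself is omitted.
    class Employee {
        private Long id;
        private String department;
        private String lastName;
        private Date hireDate;
        // getters and setters omitted for brevity
    }

    public class EmployeeSearch {
        // Finds employees in a department whose last name matches a prefix,
        // ordered by hire date.
        @SuppressWarnings("unchecked")
        public List<Employee> search(Session session, String dept, String namePrefix) {
            Criteria criteria = session.createCriteria(Employee.class)
                    .add(Restrictions.eq("department", dept))
                    .add(Restrictions.ilike("lastName", namePrefix + "%"))
                    .addOrder(Order.asc("hireDate"));
            return criteria.list();
        }
    }

Building the query from Restrictions objects lets optional filters be added conditionally, which is the usual reason to prefer Criteria over concatenated HQL strings.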
Confidential
Network Engineer
Responsibilities:
- Developed project user instruction documents to transfer knowledge to new testers, and a quick-fix repository document that provides effective resolutions for previously encountered issues, reducing the number of null defects.
- Established and implemented firewall rules; validated rules with vulnerability scanning tools.
- Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method; monitored system metrics and logs for problems.
- Worked with file systems including the UNIX file system and the Network File System (NFS); planned, scheduled and implemented OS patches on both Solaris and Linux.
- Supported Windows Server infrastructure systems including Windows Server 2003, 2003 R2 and 2008: IIS, TCP/IP, DNS, VPN, network routes, bandwidth usage, disk management, high-availability systems, and diagnosis of hardware issues.
- Performed Active Directory user management for new employees, providing access to resources and file permissions.
- Remotely supported clients and client networks through VPN connections.
- Ran crontab jobs to back up data; applied operating system updates, patches and configuration changes.
Environment: Windows Server 2003/2003 R2/2008, UNIX Shell Scripting, Red Hat Linux, CentOS, MS Access, NoSQL, Linux/UNIX, PuTTY Connection Manager, PuTTY, SSH.