Hadoop Developer and Data Architect Resume
Atlanta, Georgia
SUMMARY
- Seven years of experience with an emphasis on Big Data technologies and the development and design of Java-based enterprise applications.
- Extensive experience in the development of Big Data projects using Hadoop, MapReduce, Pig, Hive, Sqoop, Flume, and Oozie.
- Good experience working with the Hortonworks and Cloudera distributions.
- Experience in installing, configuring, supporting, and managing Hadoop clusters.
- Implemented standards and processes for Hadoop-based application design and implementation.
- Responsible for writing MapReduce programs using Java.
- Logical implementation of and interaction with HBase.
- Developed MapReduce jobs to automate transfer of data from HBase.
- Performed data analysis using Hive and Pig.
- Loaded streaming log data from various web servers into HDFS using Flume.
- Successfully loaded files to Hive and HDFS from Oracle and SQL Server using Sqoop.
- Assisted with the addition of Hadoop processing to the IT infrastructure.
- Worked in multiple environments on installation and configuration.
- Documented and explained implemented processes and configurations during upgrades.
- Supported development, testing, and operations teams during new system deployments.
- Evaluated and proposed new tools and technologies to meet the needs of the organization.
- Experience in using Sqoop, Oozie, and Cloudera Manager.
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Implemented stand-alone installation, file-system management, backups, process control, user administration, and device management in a networked environment.
- An excellent team player and self-starter with good communication skills and a proven ability to finish tasks ahead of target deadlines.
TECHNICAL SKILLS
Programming Languages: Java 1.4, C++, C, SQL, Pig Latin, PL/SQL.
Java Technologies: JDBC.
Frameworks: Jakarta Struts 1.1, JUnit, JTest, LDAP.
Databases: Oracle 8i/9i, NoSQL (HBase), MySQL, MS SQL Server.
IDEs & Utilities: Eclipse, JCreator, NetBeans.
Web Dev. Technologies: HTML, XML.
Protocols: TCP/IP, HTTP, and HTTPS.
Operating Systems: Linux, macOS, Windows 98/2000/NT/XP.
Hadoop Ecosystem: Hadoop MapReduce, Sqoop, Hive, Pig, HBase, HDFS, ZooKeeper, Lucene, Sun Grid Engine administration
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, Georgia
Hadoop Developer and Data Architect
Responsibilities:
- Analyzed data and used Pig and Hive to load it into HDFS.
- Made extensive use of shell scripting for loading data into HDFS.
- Worked with the QA and production teams on the data-loading process.
- Worked with TWS for job scheduling.
- Data processing involved working on Ant builds.
- Data loading involved creating Hive tables and partitions based on the requirements.
- Worked on various types of SerDes.
- Worked on SSAs (standard source adapters), a standard set of Java and Python libraries that Confidential is building to ensure consistency of load and extract job code for manageability, scalability, and maintenance efficiency.
- Worked on HCatalog, which allows Pig and MapReduce to take advantage of the SerDe data-format transformation definitions written for Hive.
- Worked on different UDFs; these are used in the MES solution to give DAs and developers a way to encrypt or decrypt data (see the sketch after this list).
- Worked on different file formats (ORCFile, RCFile, SequenceFile, TextFile) and different compression codecs (Gzip, Snappy, LZO).
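A minimal sketch of an encryption UDF of this kind, written against the classic Hive UDF API (the class name, key handling, and cipher choice are illustrative, not from the original project):

    import javax.crypto.Cipher;
    import javax.crypto.spec.SecretKeySpec;
    import java.util.Base64;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical Hive UDF: AES-encrypts a column value and returns it Base64-encoded.
    // Registered in Hive with: CREATE TEMPORARY FUNCTION encrypt_col AS 'EncryptUDF';
    public class EncryptUDF extends UDF {
        private static final byte[] KEY = "0123456789abcdef".getBytes(); // demo key only

        public Text evaluate(Text input) throws Exception {
            if (input == null) return null;
            Cipher cipher = Cipher.getInstance("AES");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(KEY, "AES"));
            byte[] encrypted = cipher.doFinal(input.toString().getBytes("UTF-8"));
            return new Text(Base64.getEncoder().encodeToString(encrypted));
        }
    }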
Environment: Hadoop, HDFS, Pig, Sqoop, Hortonworks distribution, shell scripting, Ubuntu, Red Hat Linux, JSON, Python
Confidential, Austin, TX
Hadoop Developer
Responsibilities:
- Worked on analyzing data and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from the edge node to HDFS using shell scripting.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slot configuration.
- Created HBase tables to store variable data formats of PII data coming from different portfolios (see the sketch after this list).
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Implemented best-income logic using Pig scripts and UDFs.
- Implemented test scripts to support test-driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Responsible for managing data coming from different sources.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Experience in managing and reviewing Hadoop log files.
- Managed jobs using the Fair Scheduler.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Responsible for cluster maintenance: adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
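A minimal sketch of writing one such row into HBase with the classic Java client (the table name, column family, and fields are illustrative, not from the original project):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SysprinWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
            HTable table = new HTable(conf, "sysprin");       // hypothetical table name
            Put put = new Put(Bytes.toBytes("account-0001")); // row key
            // A single column family ("d") holds the variable-format fields.
            put.add(Bytes.toBytes("d"), Bytes.toBytes("portfolio"), Bytes.toBytes("retail"));
            put.add(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("active"));
            table.put(put);
            table.close();
        }
    }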
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, shell scripting, Ubuntu, Red Hat Linux.
Confidential, Philadelphia, PA
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Gained very good business knowledge of health insurance, claim processing, fraud-suspect identification, the appeals process, etc.
- Developed a custom file-system plug-in for Hadoop so it can access files on the data platform. This plug-in allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Wrote a recommendation engine using Mahout.
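A minimal sketch of the shape of such a data-cleaning job in the Java MapReduce API (the delimiter, column count, and filter rule are illustrative, not from the original project):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical map-only cleaning step: trims fields and drops malformed records.
    public class CleanRecordMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length != 5) return; // drop records with a bad column count
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) cleaned.append(',');
                cleaned.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(cleaned.toString()));
        }
    }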
Environment: Java 6 (JDK 1.6), Eclipse, Subversion, Hadoop (Hortonworks and Cloudera distributions), MapReduce, HDFS, Hive, HBase, DataStax, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Linux, Windows NT, UNIX shell scripting.
Confidential, Newark, NJ
Hadoop Developer
Responsibilities:
- Responsible for architecting Hadoop clusters.
- Assisted with the addition of Hadoop processing to the IT infrastructure.
- Performed data analysis using Hive and Pig.
- Loaded log data into HDFS using Flume and Kafka.
- Monitored the Hadoop cluster using tools such as Nagios, Ganglia, and Cloudera Manager.
- Wrote automation scripts to monitor HDFS and HBase through cron jobs (see the sketch after this list).
- Planned, designed, and implemented the processing of massive amounts of marketing information, complete with information enrichment, text analytics, and natural language processing.
- Prepared a multi-cluster test harness to exercise the system for performance and failover.
- Developed a high-performance cache, making the site stable and improving its performance.
- Created a complete processing engine, based on Cloudera's distribution, tuned for performance.
- Provided administrative support for parallel-computation research on a 24-node Fedora Linux cluster.
- Built and supported standards-based infrastructure capable of supporting tens of thousands of computers in multiple locations.
- Negotiated and managed projects related to designing and deploying this architecture.
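A minimal sketch of the HDFS side of such a cron-driven check, using the Hadoop FileSystem API (the path and the alert threshold are illustrative, not from the original project):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsStatus;
    import org.apache.hadoop.fs.Path;

    // Hypothetical monitor run from cron: exits non-zero when HDFS looks
    // unhealthy, so the surrounding script can raise an alert.
    public class HdfsHealthCheck {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FsStatus status = fs.getStatus();
            double usedRatio = (double) status.getUsed() / status.getCapacity();
            boolean landingZoneOk = fs.exists(new Path("/data/incoming")); // illustrative path
            System.out.printf("HDFS used: %.1f%%, landing zone present: %b%n",
                    usedRatio * 100, landingZoneOk);
            if (usedRatio > 0.85 || !landingZoneOk) System.exit(1); // alert threshold
        }
    }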
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, Altova XMLSpy, PuTTY, and Eclipse.
Confidential, Albany, NY
J2EE Developer
Responsibilities:
- Involved in presentation-tier development using the JSF framework and ICEfaces tag libraries.
- Involved in business-requirements gathering and technical specifications.
- Implemented J2EE standards and MVC2 architecture using the JSF framework.
- Implemented servlets, JSP, and Ajax to design the user interface.
- Extensive experience in building GUIs (graphical user interfaces) using JSF and ICEfaces.
- Developed rich enterprise applications using ICEfaces and portlet technologies.
- Experience using ICEfaces tag libraries to develop user-interface components.
- Used JSF, JSP, JavaScript, HTML, and CSS for manipulating, validating, and customizing error messages in the user interface.
- Used EJBs (session beans) to implement the business logic, JMS for sending updates to various other applications, and MDBs for routing priority requests.
- All business logic in all modules is written in core Java.
- Wrote web services using SOAP for sending data to and getting data from the external interface.
- Developed web-based reporting for the monitoring system with HTML and Tiles using the Struts framework.
- The middleware services layer is implemented using stateless EJBs (Enterprise JavaBeans) in a WebSphere environment.
- Used design patterns such as Business Delegate, Service Locator, Model-View-Controller, Session Facade, and DAO.
- Funds transfers are sent to another application asynchronously using JMS.
- Involved in implementing JMS (Java Message Service) for asynchronous communication.
- Involved in writing JMS publishers to post messages (see the sketch after this list).
- Involved in writing MDBs (message-driven beans) as subscribers.
- Created stored procedures using PL/SQL for data modification (DML insert, update, delete) in Oracle.
- Interaction with the Oracle database is implemented using Hibernate.
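A minimal sketch of such a JMS publisher using the classic JMS 1.1 API (the JNDI names and payload are illustrative, not from the original project):

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.jms.Topic;
    import javax.naming.InitialContext;

    // Hypothetical publisher: posts a funds-transfer update to a topic for
    // asynchronous consumption by MDB subscribers.
    public class TransferPublisher {
        public void publish(String payload) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Topic topic = (Topic) ctx.lookup("jms/FundsTransferTopic");
            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(topic);
                TextMessage message = session.createTextMessage(payload);
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }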
Environment: J2EE, EJB, JSF, ICEfaces, web services, XML, XSD, Agile, Microsoft Visio, ClearCase, Oracle 9i/10g, WebLogic 8.1/10.3, RAD, Log4j, servlets, JSP, Unix.
Confidential
J2EE Developer
Responsibilities:
- Involved in designing the application and prepared use-case, class, and sequence diagrams.
- Developed servlets and JSPs based on the MVC pattern using the Struts action framework.
- Used Tiles for setting the header, footer, and navigation, and the Apache Validator framework for form validation.
- Used resource and properties files for i18n support.
- Involved in writing Hibernate queries and Hibernate-specific configuration and mapping files.
- Used the Log4j logging framework to write log messages at various levels (see the sketch after this list).
- Involved in fixing bugs and making minor enhancements to the front-end modules.
- Used the JUnit framework for writing test classes.
- Used Ant for starting up the application server in various modes.
- Used ClearCase for version control.
- Followed the SDLC life cycle.
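A minimal sketch of leveled logging with Log4j 1.x as described above (the class and messages are illustrative, not from the original project):

    import org.apache.log4j.Logger;

    // Hypothetical service class writing Log4j messages at various levels;
    // which levels reach the log is controlled in log4j.properties.
    public class OrderService {
        private static final Logger LOG = Logger.getLogger(OrderService.class);

        public void processOrder(String orderId) {
            LOG.debug("Entering processOrder for " + orderId);
            try {
                LOG.info("Processing order " + orderId);
                // ... business logic ...
            } catch (RuntimeException e) {
                LOG.error("Failed to process order " + orderId, e);
                throw e;
            }
        }
    }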
Environment: Java JDK 1.4, EJB 2.x, Hibernate 2.x, Jakarta Struts 1.2, JSP, servlets, JavaScript, MS SQL Server 7.0, Eclipse 3.x, WebSphere 6, Ant, Windows XP, Unix, Excel macro development.
Confidential
J2EE Developer
Responsibilities:
- Involved in requirements analysis, development, and documentation.
- Used MVC architecture (the Jakarta Struts framework) for the web tier.
- Participated in developing the form beans and action mappings required for the Struts implementation, and in validation using the Struts Validator framework.
- Developed front-end screens with JSP using Eclipse.
- Involved in development of the Medical Records module; responsible for developing the functionality using Struts and EJB components.
- Coded DAO objects using JDBC, following the DAO pattern (see the sketch after this list).
- XML and XSDs are used to define data formats.
- Implemented J2EE design patterns (Value Object, Singleton, DAO) for the presentation-tier, business-tier, and integration-tier layers of the project.
- Involved in bug fixing and functionality enhancements.
- Designed and developed a logging mechanism for each order process using Log4j.
- Involved in writing Oracle SQL queries.
- Involved in the check-in and check-out process using CVS.
- Developed additional functionality in the software as per business requirements.
- Involved in requirements analysis and complete development of client-side code.
- Followed Sun coding and documentation standards.
- Participated in project planning with business analysts and team members to analyze business requirements and translate them into working software.
- Developed software application modules using a disciplined software development process.
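A minimal sketch of such a JDBC-backed DAO (the JNDI name, table, and columns are illustrative, not from the original project):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    // Hypothetical DAO: JDBC access kept behind a data-access class,
    // per the DAO pattern described above.
    public class MedicalRecordDao {
        private final DataSource dataSource;

        public MedicalRecordDao() throws Exception {
            InitialContext ctx = new InitialContext();
            dataSource = (DataSource) ctx.lookup("java:comp/env/jdbc/AppDS");
        }

        public String findDiagnosis(long recordId) throws Exception {
            Connection conn = dataSource.getConnection();
            try {
                PreparedStatement ps =
                    conn.prepareStatement("SELECT diagnosis FROM medical_record WHERE id = ?");
                ps.setLong(1, recordId);
                ResultSet rs = ps.executeQuery();
                return rs.next() ? rs.getString("diagnosis") : null;
            } finally {
                conn.close(); // return the pooled connection
            }
        }
    }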
Environment: Java, J2EE, JSP, EJB, Ant, Struts 1.2, Log4j, WebLogic 7.0, JDBC, MyEclipse, Windows XP, CVS, Oracle.