Hadoop Developer Resume
Kansas City, KS
SUMMARY
- IT professional with 10 years of overall experience, including 3 years on Big Data projects using Hadoop and other open-source tools and technologies.
- Hands-on experience designing and developing parallel-processing applications using Hadoop 2.0 components: YARN, Oozie, HDFS, MapReduce, HBase, Solr, ZooKeeper, Hive, Sqoop, Pig, Apache Crunch and Spark.
- Extensive knowledge of the Apache, Cloudera and Hortonworks distributions and of Hadoop architecture and its components: HDFS, NameNode, JobTracker, TaskTracker, DataNode and MapReduce concepts.
- Hands-on experience scheduling jobs with Oozie; monitored and supported production issues.
- Built, supported and maintained continuous integration and continuous delivery (CI/CD) pipelines for software delivery; experience creating deployment guidelines and authoring cookbooks.
- Good exposure to the design and development of Apache Spark applications in the big data ecosystem.
- Varied experience across domains including healthcare, telecom and airlines.
- Ingested final datasets into HP Vertica with appropriate data modelling; integrated Vertica data with front-end reporting applications (SAP BO and Tableau) for detailed, advanced and custom reporting and analytics.
- Experience with databases such as Oracle 9i, PostgreSQL and MySQL, and with writing SQL queries, triggers and stored procedures.
- Responsible for guiding the full lifecycle of a Big Data (Hadoop) solution, including requirements analysis, technical architecture design, application design and development, testing and deployment.
- Experience writing Hadoop MapReduce programs in Java, using Git for source control and Maven for project and build management.
- Experienced with the complete lifecycle of ETL, analysis and reporting tools such as Informatica, IBM Cognos, Tableau and SAP BO for data ETL, analysis and visualization.
- Experienced in preparing data lakes and building data ingestion frameworks to pull data from heterogeneous data platforms.
- Implemented unit testing with JUnit across projects.
- Experienced in Waterfall and Agile approaches, including Extreme Programming, pair programming and test-driven development.
- Communicated with diverse client communities offshore and onshore, dedicated to client satisfaction and quality outcomes; extensive experience coordinating offshore development activities.
- Strong analytical, problem solving and troubleshooting skills, willingness and ability to quickly adapt to new environments and learn new technologies.
- Effective written and verbal communication and presentation skills.
TECHNICAL SKILLS
Big Data components: HDFS, YARN, MapReduce, HBase, Pig, Hive, Sqoop, Oozie, Spark SQL and ZooKeeper; Apache, Cloudera and Hortonworks distributions.
Java: Java 1.7, MapReduce, HBase API, Java EE, Maven, JUnit, XML & Solr.
Operating Systems: Windows, Solaris, Linux & Unix.
RDBMS: Oracle 10g/11g, MS SQL Server, MySQL
Source Code Control: Git, CVS, Microsoft VSS.
Tools/Utilities: Eclipse, IntelliJ, Maven, Oracle SQL Developer, JIRA, Confluence, Visio.
Languages/Other: C, C++, SQL, VB scripting and Shell scripting
PROFESSIONAL EXPERIENCE
Confidential - Kansas City, KS
Hadoop Developer
Environment: Cloudera CDH 4.5 (Hadoop), HBase, Apache Crunch, Solr, Oozie, HDFS, MapReduce, Hive, Pig, Eclipse, Java, Git, Maven, JUnit, PowerMock, Chef, Linux, OS X, Jenkins/Crucible/JIRA.
Responsibilities:
- Processed data using Hadoop technologies: HDFS, MapReduce, Crunch, Solr, HBase, Hive and Pig.
- Built and managed distributed indexes in Apache Solr to support search and aggregation operations.
- Developed a deep understanding of data retrieval, sharding, caching and retrieval optimization.
- Analyzed performance bottlenecks and implemented Solr configuration changes to improve query times.
- Developed, debugged and tuned MapReduce jobs in the Hadoop environment.
- Wrote ETL jobs using Pig Latin and Hive/Spark SQL to generate ad hoc reports; loaded final results into an HP Vertica database for consumption by the reporting layer (BO and Tableau).
- Validated final datasets against RDBMS source systems by writing SQL, Hive and Solr queries.
- Designed and created Avro schemas and implemented Avro serialization to write object data to and from HBase (see the serialization sketch after this list).
- Created Hive tables and analyzed large flat files in Hive using HiveQL.
- Wrote, tested and ran MapReduce pipelines using Apache Crunch.
- Implemented joins and data aggregation with Apache Crunch (see the pipeline sketch after this list).
- Developed multi-core CPU pipeline applications to analyze large data sets.
- Created custom MapReduce programs using Hadoop for big data processing.
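A minimal, hypothetical sketch of the Avro-to-HBase serialization pattern referenced above; the schema, table and column names are illustrative assumptions rather than the project's actual code, and it targets the HBase 0.94-era client API that ships with CDH 4.x.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AvroToHBaseExample {

    // Illustrative Avro schema for a simple "member" record (hypothetical).
    private static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Member\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"string\"},"
        + "{\"name\":\"name\",\"type\":\"string\"}]}");

    public static void main(String[] args) throws IOException {
        // Build an Avro record and serialize it to a byte array.
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("id", "m-100");
        record.put("name", "Jane Doe");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
        encoder.flush();

        // Store the serialized bytes in HBase, keyed by the record id.
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "members");            // illustrative table name
        Put put = new Put(Bytes.toBytes("m-100"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("avro"), out.toByteArray());
        table.put(put);
        table.close();
    }
}
```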
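The pipeline sketch below is a minimal Apache Crunch example of the keyed aggregation referenced above; the input/output paths and the "memberId,amount" record layout are assumptions for illustration only.

```java
import org.apache.crunch.MapFn;
import org.apache.crunch.PCollection;
import org.apache.crunch.PTable;
import org.apache.crunch.Pair;
import org.apache.crunch.Pipeline;
import org.apache.crunch.fn.Aggregators;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.types.writable.Writables;

public class CrunchAggregateExample {
    public static void main(String[] args) {
        Pipeline pipeline = new MRPipeline(CrunchAggregateExample.class);

        // Read raw lines; assume each line looks like "memberId,amount".
        PCollection<String> lines = pipeline.readTextFile("/data/claims/input");

        // Key each record by member id with a count of 1.
        PTable<String, Long> keyed = lines.parallelDo(
            new MapFn<String, Pair<String, Long>>() {
                @Override
                public Pair<String, Long> map(String line) {
                    String[] parts = line.split(",");
                    return Pair.of(parts[0], 1L);
                }
            },
            Writables.tableOf(Writables.strings(), Writables.longs()));

        // Group by key and sum the counts per member id.
        PTable<String, Long> counts = keyed.groupByKey()
                                           .combineValues(Aggregators.SUM_LONGS());

        pipeline.writeTextFile(counts, "/data/claims/output");
        pipeline.done();
    }
}
```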
Confidential - Dallas, TX
Java & Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Responsible for managing data coming from different sources.
- Supported Java MapReduce programs running on the cluster.
- Involved in HDFS maintenance and loading of structured and unstructured data.
- Wrote MapReduce jobs using the Java API (a minimal example is sketched after this list).
- Installed and configured Pig and wrote Pig Latin scripts.
- Used Sqoop to import data from MySQL into HDFS on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet the business requirements.
- Designed and created Hive tables, partitions and buckets for optimal HiveQL performance.
- Used Agile Scrum methodology to help manage and organize a team of four developers, with regular code review sessions.
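A minimal example of a MapReduce job written against the Java API, in the spirit of the jobs referenced above; it is a standard word count over text input, with the class names and command-line paths chosen for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountJob {

    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);   // emit (token, 1) per word
                }
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));  // total count per word
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountJob.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```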
Environment: Hortonworks Hadoop, Java, MapReduce, HDFS, Hive, Pig, Linux, unstructured data and server logs.
Confidential - Green Bay, WI
Hadoop Developer
Environment: Hortonworks Hadoop distribution, MapReduce, Solr, HDFS, Hive, Oozie, Flume, unstructured data, CSV files, Linux, Java, Eclipse.
Responsibilities:
- Worked in a multi-cluster Hadoop ecosystem environment.
- Analyzed new requirements and prepared feasibility study documents for major enhancements.
- Designed and developed MapReduce jobs in Java and Pig; defined job flows and managed and reviewed Hadoop log files.
- Used the Apache Lucene/Solr search server to speed up searches of the transaction logs; created an XML schema for Solr based on the database schema (see the indexing sketch after this list).
- Loaded and transformed large sets of unstructured data from UNIX systems into HDFS.
- Prepared a data lake using the Hadoop ecosystem; ingested data from mainframes and RDBMS systems.
- Ran Hadoop Streaming jobs to process terabytes of CSV-format data.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Exported data to Oracle DB using Sqoop.
- Wrote Pig scripts to analyze server logs.
- Created Hive tables, loaded them with data and wrote Hive queries that run internally as MapReduce jobs.
- Participated in project planning and prioritization across new requirements, enhancements, bugs raised from the field and customer queries, and in preparing release schedules.
- Served as the single point of technical contact for different application teams and for development, QA and line managers.
- Provided after-hours and weekend/holiday technical support to various teams for critical cases from the field.
- Coordinated with business and technical managers to gather new requirements and converted them into functional specification documents.
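The indexing sketch below is a minimal, hypothetical example of adding a transaction-log record to Solr with the SolrJ client, as referenced above; the Solr URL, core name and field names are assumptions for illustration, and it uses the SolrJ 4.x-era API.

```java
import java.io.IOException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrIndexExample {
    public static void main(String[] args) throws IOException, SolrServerException {
        // Hypothetical Solr core for transaction logs.
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/transactions");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "txn-0001");
        doc.addField("account_id", "A-42");
        doc.addField("message", "payment posted");

        solr.add(doc);       // send the document to Solr
        solr.commit();       // make it searchable
        solr.shutdown();
    }
}
```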
Sr Software Engineer
Responsibilities:
- Enhanced the existing systems using Java, J2EE, JSP, Spring, Hibernate, RESTful web services and JavaBeans.
- Designed and developed the DN2 cross-connections, DB2 batching, EKSOS and Q3 stack modules using Spring, Hibernate, JMS, JavaScript, servlets, CSS and XML.
- Enhanced the cross-connection and path-finding process using Java multi-threading techniques for DN2 and DB2 nodes (see the threading sketch after this list).
- Enhanced synchronization of Q3 stack values with the EKSOS NM sync-up using threading concepts.
- Involved in requirements gathering and design, server-side coding using Spring and Hibernate (DAOs, actions, filters and handlers), and JSPs using HTML, CSS, JavaScript and Ajax.
- Tuned database performance by tuning queries and creating indexes and stored procedures.
- Involved in design for major enhancements to the existing systems
- Wrote approach-note documents for all NM upgrades and client-specific re-branding activities.
- Helped the manager with risk assessment and subsequently created contingency and mitigation plans.
- Ensured the resolution of queries, incidents and bugs within the agreed SLA time frames.
- Ensured client satisfaction by providing support during off-hours and holidays.
- Planned and implemented the project plan, defining project scope, goals, deliverables and tasks.
- Tracked project deliverables at all milestones defined for the project.
- Set and met realistic deadlines; forecast changes and communicated current and projected issues.
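The threading sketch below is a minimal illustration of the kind of Java multi-threading referenced above, fanning per-node work out over a fixed thread pool; the task class and node identifiers are invented for illustration and do not reflect the actual DN2/DB2 code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelPathFinder {

    // Placeholder for one unit of path-finding work on a single node.
    static class NodeTask implements Callable<String> {
        private final String nodeId;
        NodeTask(String nodeId) { this.nodeId = nodeId; }

        @Override
        public String call() {
            // The real system would compute cross-connection paths for the node here.
            return "paths computed for " + nodeId;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<String>> results = new ArrayList<Future<String>>();

        for (String nodeId : new String[] {"DN2-1", "DN2-2", "DB2-1"}) {
            results.add(pool.submit(new NodeTask(nodeId)));
        }
        for (Future<String> f : results) {
            System.out.println(f.get());   // wait for and print each result
        }
        pool.shutdown();
    }
}
```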
Environment: Java, JSP, Struts, Spring, Hibernate, Oracle 8i, WebLogic 9, Eclipse, Linux and Solaris.
Confidential
Software Engineer
Responsibilities:
- Participated in the re-design of the application using Java, JSP, servlets, RESTful web services, JavaBeans, XML, AdventNet SNMP and MySQL technologies.
- Wrote implementation proposals with design alternatives for the ENUM+ and IPWorks 5.0 upgrade work packages; configured MySQL Cluster across four Solaris systems and integrated it with IPWorks.
- Designed and developed new ENUM+ objects within the existing architecture.
- Designed and developed ENUM+ object storage in MySQL Cluster, synchronized with the DNS server using Java multi-threading concepts.
- Performed extensive work for the MySQL Cluster migration (InnoDB database engine to MyISAM database engine).
- Fixed critical trouble reports across different work packages and quickly resolved legacy problems carried over from IPWorks 4.2 (a previous version), such as SNMP alarm issues.
- Received the "Feather in My Cap" award for outstanding work.
Environment: Core Java, JavaBeans, JSP, Solaris, Apache Tomcat, MySQL and AdventNet SNMP.