Hadoop Consultant Resume
New York
SUMMARY
- Proactive IT developer with 8+ years of experience designing and building high-performance, scalable systems using big data technologies and Java.
- Highly dedicated, results-oriented Hadoop developer with 4 years of strong end-to-end experience in Hadoop development using a range of big data tools.
- Experience in different Hadoop distributions like Cloudera and Hortonworks (HDP).
- Experience in Hadoop 2.0 (MRv2) YARN architecture.
- Expertise in importing data into and exporting data from HDFS and Hive using Sqoop.
- Experience in using Flume to load log data from multiple sources directly into HDFS.
- Experience in using different file formats like Avro, Parquet, RCFile, SequenceFile, and CSV.
- Experience in writing MapReduce programs in Java and Python.
- Expert in data cleansing operations using Pig Latin transformations.
- Experience in migration projects to transform and develop using Big Data utilities.
- Experience in optimizing and improving query performance in Hive and MapReduce.
- Experience in extending Hive and Pig core functionality by writing custom UDFs.
- Hands-on experience in NoSQL databases like HBase, Cassandra, MongoDB, and Neo4j.
- Experience in developing Spark jobs using Java and Scala.
- Experience in using Hive on Spark for data movement to upstream search.
- Expert in automation of jobs with time-driven and data-driven workflows using Oozie.
- Experience in indexing data using Lucene libraries in Apache Solr or Cloudera Search.
- Ability to implement a distributed messaging queue using Apache Kafka.
- Knowledge of Amazon Web Services (AWS) projects.
- Experience in integrating Hive and Impala with visualization tools like QlikView.
- Experience in migrating EDW (Enterprise Data Warehouse) into Big Data.
- Ability to set up migration from mainframes to HDFS using Spark and Informatica jobs.
- Extensive knowledge of data ingestion, data processing, and batch analytics.
- Elementary knowledge of machine learning and NLP techniques such as k-means clustering.
- Knowledge of Talend integration with big data technologies.
- Experience in tools and utilities like Eclipse, WSAD, RAD, Ant, Maven.
- Experience in RESTful web services, XML, and SOAP.
- Command of Java, Unix shell scripting, Linux, JIRA, and SQL Developer.
- Strong working experience in Agile development, including the Scrum Master role.
TECHNICAL SKILLS
Hadoop Distributions: Cloudera, Hortonworks
Hadoop Technologies: HDFS, Hive, Pig, Oozie, MapReduce
Ingestion Tools: Sqoop, Kafka Streaming
In-memory/MPP/Search: Impala, Solr, Spark
Column oriented Data Stores: HBase, Cassandra
Graph DBs: Neo4j
BI Tools: QlikView, Tableau
Cloud Platform: Amazon Web Services (AWS)
Relational DBs: SQL Server, Teradata, Oracle, PostgreSQL
Programming Languages: Java, Scala, JavaScript, Python
ETL: Informatica, Talend
Operating Systems: Windows, UNIX, CentOS
Application Servers: JBoss, Tomcat
Web Services: REST, SOAP
Frameworks: Hibernate, Spring, JMS, Struts
Web Technologies: ASP.NET, HTML, jQuery, Servlets
PROFESSIONAL EXPERIENCE
Confidential, New York
Hadoop Consultant
Responsibilities:
- Worked on data ingestion to HDFS using Sqoop from SQL and Teradata.
- Applied transformations and aggregations using Pig Latin.
- Developed business analytics scripts using Hive SQL.
- Automated event-driven and time-based jobs using Oozie workflows.
- Created and scheduled AutoSys jobs for event-based and time-based processing.
- Converted binary-format data from DB2 to text format and loaded it into HDFS and Hive using Spark.
- Worked on integrating Impala with visualization tools like QlikView.
- Developed analytical scripts on top of Impala and delivered data to data scientists to run machine learning algorithms for recommendations.
- Worked closely with data scientists on data cleansing to perform k-means clustering on log data.
- Involved in loading data from DB2 into Hive and indexing it using Apache Solr.
- Worked on a POC for data movement from Kafka and Storm to HDFS and Hive.
- Worked on various file formats like Avro, Parquet, Text.
- Worked on developing custom Spark code to move data from the staging zone to the refined zone.
- Involved in building analytics on top of the data lake using Hive, Spark, and Neo4j.
- Experienced in supporting Hadoop-based projects in production.
Environment: CDH 5.4, HDFS, Hive 1.10, Sqoop 1.4.5, Pig 0.12, Spark 1.3, Oozie 4.0, Impala, Solr, MapReduce, Avro, Parquet, Teradata, DB2, QlikView, Kafka, Java, Python.
Confidential, Little Rock, Arkansas
Hadoop Developer
Responsibilities:
- Extracted data from Teradata and MySQL into HDFS using Sqoop import/export.
- Developed Sqoop jobs with incremental load to populate Hive external tables.
- Used MapReduce design patterns to convert business data into custom formats.
- Experienced in handling different compression codecs like LZO, GZIP, and Snappy.
- Expert in optimizing performance in Hive using partitioning and bucketing.
- Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce jobs.
- Experience in using Hive dynamic partitions to work around the Hive locking mechanism.
- Developed UDFs in Java as needed for use in Hive queries.
- Developed crontab entries for scheduling and orchestrating the ETL process.
- Involved in indexing Hive data using Solr and preparing custom tokenizer formats for querying.
- Involved in designing a real-time computation engine using Kafka.
- Worked on a POC to set up Spark Streaming of data into Solr and index it.
- Experienced in writing build jobs using Maven and integrating them with Jenkins.
Environment: HDP 2.1, Hive 1.13, Sqoop 1.4.1, Pig 0.12, Spark 1.3, crontab, Tez 0.4.0, Solr 4.7.2, MapReduce, ORC, SQL, Tableau, Java, Storm, Python.
Confidential, Hartford, Connecticut
Hadoop Developer
Responsibilities:
- Experience in developing shell scripts for system management and for automating routine tasks.
- Worked on Hadoop MapReduce tasks in Java to convert JSON-format logs to text format.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Analyzed the data in HBase to get real-time analytics using the Java API.
- Imported bulk data into HBase using MapReduce integration.
- Implemented performance tuning using map joins, resizing the mappers/reducers, etc.
- Implemented Hive/Pig UDFs for common operations.
- Performed real-time analytics on time-series data using HBase and the Hadoop ecosystem.
- Experienced in performing ETL operations using Pig Latin scripts.
- Parsed JSON and XML files in Pig using Pig loader functions and extracted meaningful information from Pig relations using regular expressions with Pig's built-in functions.
- Experienced in processing Avro data files using Avro tools and MapReduce programs.
- Used Flume to collect, aggregate, and store web log data from different sources, such as web servers, mobile devices, and network devices, and pushed it to HDFS.
- Integrated Spring schedulers with the Oozie client to schedule nightly cron jobs.
- Took part in a POC effort to help build new Hadoop clusters.
- Gained good knowledge of AWS services such as EMR and EC2, which provide fast and efficient processing of big data.
- Experienced in writing low-level and high-level design documents according to business requirements using Visio.
Environment: CDH 4.4, HDFS, Hive, Sqoop, HBase, Shell Scripting, Impala, Solr, MapReduce, Java, Python.
Confidential, Atlanta, Georgia
Senior Application Developer
Responsibilities:
- Responsible for gathering business and functional requirements from the users.
- Analyzed use case diagrams and created various UML diagrams, such as class and sequence diagrams.
- Implemented business components using Spring Core and navigation using Spring MVC.
- Implemented the persistence layer using Hibernate core interfaces.
- Implemented internationalization using Spring MVC interceptors.
- Leveraged the homegrown framework to handle exceptions.
- Implemented message-driven beans to get log events from a queue.
- Worked on Action classes, Request processor, Business Delegate, Business Objects, Service classes, and JSP pages.
- Designed the presentation tier components by customizing the Struts framework components such as configuring web modules, request processors and error handling components.
- Developed JSP pages using Struts custom tags.
- Developed the components for parsing XML documents using SAX and DOM parsers.
- Implemented design patterns such as DAO, Session Facade and Value Objects.
- Implemented web services functionality in the application to allow external applications to access the data.
- Utilized Apache Axis for the web service framework and created and deployed clients using SOAP and WSDL.
- Developed and implemented several test cases using JUnit and performed load testing.
- Used Hibernate as the ORM tool and defined the mappings and relationships for each table in the database.
- Coordinated with QA team to ensure the quality of the application.
Environment: Java, J2EE, Hibernate, Spring MVC, JMS, RESTful, SQL, PL/SQL
Confidential
Application Developer
Responsibilities:
- Developed sequence, use case, and process flow diagrams using Rational Rose.
- Implemented design patterns like Session Façade, Singleton, Factory, Service Locator and DAO.
- Extensively involved in writing stored procedures for data retrieval, storage, and updates in an Oracle database using JDBC.
- Implemented Business components using Struts Action class.
- Implemented PL/SQL stored procedures, functions, and triggers for the persistence layer.
- Extensively used Log4j for logging throughout the application.
- Produced a web service using the WSDL/SOAP standards.
- Used SVN for source code versioning and code repository.
- Developed stateless session EJBs for server-side processing.
- Involved in design and implementation of the front-end controller using the Struts framework.
- Implemented validation utilities using the Struts validation framework.
- Experienced in writing Ant scripts for the build process.
- Experienced in creating UI screens using JSP, JavaScript, HTML, and CSS.
Environment: Core Java, JSP, JavaScript, HTML, CSS, SOAP, SQL, PL/SQL
