
Hadoop Developer/Business Intelligence Data Architect Resume


SUMMARY

  • 8+ years of IT experience in software development, with 3+ years' work experience as a Big Data/Hadoop Developer and good knowledge of the Hadoop framework.
  • Expertise in Hadoop architecture and various components such as HDFS, YARN, High Availability, Job Tracker, Task Tracker, Name Node, Data Node, and MapReduce programming paradigm.
  • Experience with all aspects of development from initial implementation and requirement discovery, through release, enhancement and support (SDLC & Agile techniques).
  • Keen on building knowledge of emerging technologies in Analytics, Information Management, Big Data, and related areas, and on providing the best business solutions.
  • Hands-on experience with the Hadoop stack (MapReduce, HDFS, Sqoop, Pig, Hive, HBase, Oozie, and ZooKeeper).
  • Well versed in configuring and administering Hadoop clusters using major Hadoop distributions such as Apache Hadoop and Cloudera.
  • Hands-on experience importing and exporting data from relational databases to HDFS, Hive, and HBase using Sqoop.
  • Analyzed large amounts of data sets writing Pig scripts and Hive queries.
  • Evaluated new technologies in the Big Data, Analytics, and NoSQL space.
  • Extensive experience with the Hadoop ecosystem and its components, including HDFS, MapReduce, YARN, Hive, Sqoop, Kafka, Spark, Oozie, Azkaban, and Airflow.
  • Strong Experience in working with Databases like Teradata and proficiency in writing complex SQL, PL/SQL for creating tables, views, indexes, stored procedures and functions.
  • Experienced in transporting and processing real-time event streams using Kafka and Spark Streaming (a minimal sketch follows this summary).
  • Experienced in creating and analyzing Software Requirement Specifications (SRS) and Functional Specification Document (FSD).
  • Excellent working experience in Scrum / Agile framework, Iterative and Waterfall project execution methodologies.
  • Expertise in creating cursors, functions, procedures, packages, and triggers per business requirements using PL/SQL.
  • Experienced in working with RDBMS, OLAP, and OLTP concepts.
  • Excellent understanding of Data modeling (Dimensional & Relational).
  • Capable of organizing, coordinating and managing multiple tasks simultaneously.
  • Experienced in working in multi-cultural environments, both within a team and individually, as per project requirements.
  • Excellent communication and inter-personal skills, self-motivated, organized and detail-oriented, able to work well under deadlines in a changing environment and perform multiple tasks effectively and concurrently.
  • Strong analytical skills with ability to quickly understand client’s business needs. Involved in meetings to gather information and requirements from the clients.
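
As a brief illustration of the Kafka and Spark Streaming work summarized above, the following is a minimal PySpark Structured Streaming sketch that reads a Kafka topic and lands the raw events on HDFS. The broker address, topic name, and paths are placeholders rather than details from any specific engagement, and the Spark Kafka connector package is assumed to be on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    # Minimal sketch: read a Kafka topic and land the raw events on HDFS as Parquet.
    # Broker, topic, and paths are placeholders.
    spark = (SparkSession.builder
             .appName("kafka-event-ingest")
             .getOrCreate())

    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092")   # placeholder broker
              .option("subscribe", "events")                       # placeholder topic
              .option("startingOffsets", "latest")
              .load()
              .select(col("key").cast("string"),
                      col("value").cast("string"),
                      col("timestamp")))

    query = (events.writeStream
             .format("parquet")
             .option("path", "hdfs:///data/raw/events")            # placeholder output path
             .option("checkpointLocation", "hdfs:///checkpoints/events")
             .outputMode("append")
             .start())

    query.awaitTermination()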

TECHNICAL SKILLS

Hadoop/Big Data ecosystems: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Oozie, Impala, ZooKeeper, Cloudera Manager, Spark, Scala

NoSQL Database: HBase, Cassandra

Tools and IDEs: Eclipse, NetBeans, Toad, PuTTY, Maven, DbVisualizer, VS Code, Qlik Sense, QlikView

Languages: SQL, PL/SQL, Java, Scala, Python

Databases: Oracle, SQL Server, MySQL, DB2, PostgreSQL, Teradata

Version Control and Tracking Tools: SVN, Git

ETL Tools: OFSAA, IBM DataStage

Cloud Technologies: AWS, Azure

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Developer/Business Intelligence Data Architect

Responsibilities:

  • Involved in the full life cycle of the project, from design, analysis, and logical and physical architecture modeling through development, implementation, and testing.
  • Conferring with data scientists and other qlikstream developers to obtain information on limitations or capabilities for data processing projects
  • Designed and developed automation test scripts using Python
  • Created data pipelines using Azure Data Factory.
  • Automated jobs using Python.
  • Created tables and loaded data in Azure MySQL Database.
  • Created Azure Functions and Logic Apps for automating the data pipelines using Blob triggers.
  • Analyzed SQL scripts and designed solutions to implement them using PySpark.
  • Developed Spark code using Python (PySpark) for faster processing and testing of data.
  • Used the Spark API to perform analytics on data in Hive.
  • Optimized and tuned Hive and Spark queries using data layout techniques such as partitioning, bucketing, and other advanced techniques (see the sketch after this list).
  • Performed data cleansing, integration, and transformation using Pig.
  • Involved in exporting and importing data from the local file system and RDBMS to HDFS.
  • Designed and coded the pattern for inserting data into the data lake.
  • Moved data from on-prem HDP clusters to Azure.
  • Built, installed, upgraded, and migrated petabyte-scale big data systems.
  • Fixed data-related issues.
  • Loaded data into a DB2 database using DataStage.
  • Monitored the functioning of big data and messaging systems such as Hadoop, Kafka, and Kafka MirrorMaker to ensure they operate at peak performance at all times.
  • Created Hive tables and loaded and analyzed data using Hive queries.
  • Communicating regularly with the business teams to ensure that any gaps between business requirements and technical requirements are resolved.
  • Read and translated data models, queried data, identified data anomalies, and provided root cause analysis.
  • Supported Qlik Sense reporting to gauge performance of various KPIs/facets and assist top management in decision-making.
  • Engaged in project planning and delivered on commitments.
  • Performed POCs on new technologies (Snowflake) available in the market to determine the best fit for the organization's needs.
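
A minimal sketch of the Hive/Spark layout tuning referenced above (partitioning and bucketing), written in PySpark. The database, table, and column names are illustrative placeholders, not actual project objects.

    from pyspark.sql import SparkSession

    # Minimal sketch of partitioning and bucketing a Hive table for query tuning.
    # Database, table, and column names are placeholders.
    spark = (SparkSession.builder
             .appName("hive-layout-tuning")
             .enableHiveSupport()
             .getOrCreate())

    txns = spark.table("staging.transactions")   # placeholder source table

    # Partition by load date and bucket by customer_id so that date-filtered,
    # customer-keyed queries can prune partitions and reduce shuffles on joins.
    (txns.write
         .mode("overwrite")
         .partitionBy("load_date")
         .bucketBy(32, "customer_id")
         .sortBy("customer_id")
         .format("parquet")
         .saveAsTable("analytics.transactions_tuned"))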

Confidential

Data Warehouse Architect - Hadoop/SQL Developer

Responsibilities:

  • Involved in the full life cycle of the project, from design, analysis, and logical and physical architecture modeling through development, implementation, and testing.
  • Moved data from Oracle to HDFS using Sqoop.
  • Performed data profiling on critical tables periodically to check for abnormalities.
  • Created Hive tables, loaded transactional data from Oracle using Sqoop, and worked with highly unstructured and semi-structured data.
  • Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data.
  • Created and ran Sqoop jobs with incremental loads to populate Hive external tables (a minimal sketch follows this list).
  • Wrote scripts to distribute queries for performance-test jobs in the Amazon data lake.
  • Developed optimal strategies for distributing web log data over the cluster; imported and exported the stored web log data into HDFS and Hive using Sqoop.
  • Installed and configured Apache Hadoop on multiple nodes on AWS EC2.
  • Developed Pig Latin scripts to replace the existing legacy process on Hadoop, with the data fed to AWS S3.
  • Designed and developed automation test scripts using Python
  • Integrated Apache Storm with Kafka to perform web analytics and to move clickstream data from Kafka to HDFS.
  • Analyzed the SQL scripts and designed the solution to implement them using PySpark.
  • Implemented Hive GenericUDFs to incorporate business logic into Hive queries.
  • Responsible for developing a data pipeline on AWS to extract data from weblogs and store it in HDFS.
  • Uploaded streaming data from Kafka to HDFS, HBase, and Hive by integrating with Storm.
  • Supported data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud; performed export and import of data into S3.
  • Involved in designing the row key in HBase to store text and JSON as key values, and designed the row key so that data can be retrieved/scanned in sorted order.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
  • Created Hive tables and worked on them using HiveQL.
  • Designed and implemented static and dynamic partitioning and bucketing in Hive.
  • Developed multiple POCs using PySpark and deployed them on the YARN cluster; compared the performance of Spark with Hive and SQL.
  • Developed syllabus/Curriculum data pipelines from Syllabus/Curriculum Web Services to HBASE and Hive tables.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Involved in building applications using Maven and integrating with CI servers like Jenkins to build jobs.
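
A minimal sketch of the Sqoop incremental-load pattern mentioned in this list, wrapped in Python for scheduling. The JDBC URL, credentials file, table, check column, and target directory are placeholders; in practice the last imported value would be tracked by a saved Sqoop job or a metadata table rather than hard-coded.

    import subprocess

    # Minimal sketch: incremental-append Sqoop import into the HDFS directory
    # that backs a Hive external table. All connection details are placeholders.
    def incremental_import(last_value):
        cmd = [
            "sqoop", "import",
            "--connect", "jdbc:oracle:thin:@dbhost:1521/ORCL",  # placeholder JDBC URL
            "--username", "etl_user",
            "--password-file", "/user/etl/.db_password",
            "--table", "ORDERS",                                 # placeholder source table
            "--target-dir", "/data/ext/orders",                  # dir backing the Hive external table
            "--incremental", "append",
            "--check-column", "ORDER_ID",
            "--last-value", str(last_value),
            "--num-mappers", "4",
        ]
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        incremental_import(0)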

Confidential

Hadoop Analyst

Responsibilities:

  • Participated in SDLC Requirements gathering, Analysis, Design, Development and Testing of application developed using AGILE methodology.
  • Developed managed, external, and partitioned tables as per requirements.
  • Ingested structured data into appropriate schemas and tables to support the rule and analytics.
  • Developed custom User Defined Functions (UDFs) in Hive to transform large volumes of data per business requirements.
  • Developed Pig scripts, Pig UDFs, Hive scripts, and Hive UDFs to load data files.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from the edge node to HDFS using shell scripting (see the sketch after this list).
  • Implemented scripts for loading data from UNIX file system to HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Actively participated in Object-Oriented Analysis and Design sessions of the project, which was based on MVC architecture using the Spring Framework.
  • Developed the presentation layer using HTML, CSS, JSPs, Bootstrap, and AngularJS.
  • Adopted J2EE design patterns like DTO, DAO, Command and Singleton.
  • Implemented Object-relation mapping in the persistence layer using hibernate framework in conjunction with spring functionality.
  • Generated POJO classes to map to the database table.
  • Configured Hibernate's second level cache using EHCache to reduce the number of hits to the configuration table data.
  • Used the ORM tool Hibernate to represent entities and define fetching strategies for optimization.
  • Implemented transaction management in the application by applying Spring Transaction and Spring AOP methodologies.
  • Wrote SQL queries and stored procedures for the application to communicate with the database.
  • Used the JUnit framework for unit testing of the application.
  • Used Maven to build and deploy the application.
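
A minimal sketch of the edge-node-to-HDFS load referenced in this list, done here in Python around the hdfs CLI. The local landing directory and HDFS target path are placeholders.

    import glob
    import subprocess

    LOCAL_GLOB = "/data/landing/*.csv"      # placeholder edge-node landing files
    HDFS_DIR = "/user/etl/raw/landing"      # placeholder HDFS target directory

    def load_to_hdfs():
        # Ensure the target directory exists, then push each landed file.
        subprocess.run(["hdfs", "dfs", "-mkdir", "-p", HDFS_DIR], check=True)
        for path in glob.glob(LOCAL_GLOB):
            subprocess.run(["hdfs", "dfs", "-put", "-f", path, HDFS_DIR], check=True)

    if __name__ == "__main__":
        load_to_hdfs()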

Confidential

Java Developer

Responsibilities:

  • Participated in gathering business requirements, analyzing the project, and creating use cases and class diagrams.
  • Interacted and coordinated with the design team, business analysts, and end users of the system.
  • Created sequence diagrams, collaboration diagrams, class diagrams, use cases and activity diagrams using Rational Rose for the Configuration, Cache & logging Services.
  • Implemented a Tiles-based framework to present layouts to the user. Created the web UI using Struts, JSP, servlets, and custom tags.
  • Designed and developed Caching and Logging service using Singleton pattern, Log4j.
  • Coded different action classes in Struts and maintained deployment descriptors such as struts-config.xml, ejb-jar.xml, and web.xml.
  • Used JSP, JavaScript, custom tag libraries, Tiles, and the validations provided by the Struts framework.
  • Wrote authentication and authorization classes and managed them in the front controller for all users according to their entitlements.
  • Developed and deployed Session Beans and Entity Beans for database updates.
  • Implemented caching techniques, wrote POJO classes for storing data and DAOs to retrieve the data, and did other database configurations using EJB 3.0.
  • Developed stored procedures and complex packages extensively using PL/SQL and shell programs.
  • Used the Struts Validator framework for all front-end validations of form entries.
  • Developed SOAP based Web Services for Integrating with the Enterprise Information System Tier.
  • Design and development of JAXB components for transfer objects.
  • Prepared EJB deployment descriptors using XML.
  • Involved in Configuration and Usage of Apache Log4J for logging and debugging purposes.
  • Wrote Action Classes to service the requests from the UI, populate business objects & invoke EJBs.
  • Used JAXP (DOM, XSLT) and XSD for XML data generation and presentation.
