Hadoop Developer Resume
SUMMARY:
- 9+ years of IT experience with extensive knowledge of the software development life cycle (SDLC), covering requirements gathering, design, architecture, analysis, development, maintenance, and implementation.
- 4+ years of exclusive experience in Hadoop and its components, including HDFS, MapReduce, Hive, Apache Pig, HCatalog, Kafka, Sqoop, HBase, Flume, ZooKeeper, Spark, and Oozie.
- Proficient in data architecture, data warehousing, big data, data integration, data governance, and metadata management using custom, open-source, or off-the-shelf tools.
- Experience implementing the data lake concept.
- Involved in project bidding on the Hadoop platform.
- Experience with distributed message brokers such as Apache Kafka.
- Working experience in creating complex data ingestion pipelines, data transformations, data management, and data governance in a centralized enterprise data hub.
- Experience with distributed processing frameworks such as MapReduce, Spark, and Tez.
- Extensive experience in developing Pig Latin scripts and using Hive Query Language (HiveQL) for data analytics.
- Experience architecting, solutioning, and modeling data integration platforms using Sqoop, Flume, Kafka, and Spark.
- Hands-on experience using Sqoop to import data from various RDBMS sources into HDFS and vice versa.
- Good knowledge of Kafka for streaming real-time feeds from external REST applications into Kafka topics (a minimal sketch follows this summary).
- Experience working with Cloudera, Hortonworks, and Confidential Big Insights distributions.
- Experience using the NumPy and pandas packages in Python.
- Experience in indexing log files using Elasticsearch.
- Able to assess business rules, collaborate with stakeholders, and perform source-to-target data mapping, design, and review.
- Experience with data warehouse concepts and implementation of large-scale data warehouses.
- Knowledge of implementing the ELK stack (Elasticsearch, Logstash, and Kibana).
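The Kafka streaming experience above can be illustrated with a minimal sketch: a Python loop that polls an external REST endpoint with requests and publishes each record to a topic with the kafka-python client. The endpoint URL, topic name, and broker address are hypothetical placeholders, not values from any particular engagement.

```python
# Minimal sketch: poll an external REST endpoint and publish each record to a
# Kafka topic. The endpoint URL, topic name, and broker address below are
# hypothetical placeholders.
import json
import time

import requests
from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers=["localhost:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    resp = requests.get("https://example.com/api/events", timeout=30)
    resp.raise_for_status()
    for event in resp.json():
        # Key-less send; Kafka round-robins the record across partitions.
        producer.send("events-topic", value=event)
    producer.flush()
    time.sleep(60)  # poll interval
```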
TECHNICAL SKILLS:
Technologies: Hadoop, Hive, Pig, HBase, Sqoop, AWS, HCatalog, Flume, ZooKeeper, Kafka, Spark, Elasticsearch, Ambari, Zeppelin, Hue, Oozie, Logstash, Kibana, Maven, GitHub, Jenkins.
ETL Tools: DataStage 7.5x2 (PX) and 8.5
Database: Oracle 9i, DB2, Teradata.
Languages: C, SQL, Scala, Python.
Tools & Utilities: Control-M, SQL Developer, WebSphere MQ, Eclipse, IntelliJ, PyCharm
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Developer
Modules Environment: Hortonworks 6.2, Sqoop, Vertabelo, Hive, GitHub, SourceTree.
- Analyze the various bank applications and the structure of their data prior to data modeling, and determine the data transformations required before loading.
- Perform design for all lines of business (LOBs) of the bank as per the risk requirements.
- Create and manage conceptual, logical, and physical models for the risk data store.
- Develop data ingestion jobs into the Hadoop platform based on business requirements.
- Prepare high-level design documents, translating elicited requirements into technical specifications for the different modules.
- Develop scripts for managing and scheduling data ingestion jobs into the Hadoop cluster (a minimal sketch follows this list).
- Review project and task statuses and issues with the business, ensuring on-time completion of the project.
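A minimal sketch of the kind of ingestion script referenced above, assuming Sqoop-based ingestion driven from a Python wrapper; the JDBC URL, credentials path, table list, and HDFS target directories are hypothetical placeholders.

```python
# Minimal sketch of an ingestion driver: builds and runs a `sqoop import` for
# each source table, writing to a per-table HDFS directory. Connection string,
# credentials, and table names are hypothetical placeholders.
import subprocess

JDBC_URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL"   # placeholder
TABLES = ["CUSTOMERS", "ACCOUNTS", "TRANSACTIONS"]  # placeholder table list

def ingest(table: str) -> None:
    cmd = [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", "/user/etl/.db_password",
        "--table", table,
        "--target-dir", f"/data/raw/{table.lower()}",
        "--num-mappers", "4",
        "--as-parquetfile",
    ]
    subprocess.run(cmd, check=True)  # fail fast so the scheduler can retry

if __name__ == "__main__":
    for table in TABLES:
        ingest(table)
```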
Confidential
Hadoop Admin & Developer
Modules Environment: CentOS 6.7, Cloudera 5.7, Solr, Sqoop, Kafka, Spark, Hive, R, GitHub, Jenkins, Maven, D3.
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Handled importing of data from various data sources, performed transformations, loaded data into HDFS.
- Worked on custom Pig loader and storage classes to handle a variety of data formats such as JSON, CSV, etc.
- Developed Spark applications to perform data transformations on data coming from multiple sources.
- Created Spark applications to perform data cleansing, validation, transformation, and summarization activities according to the requirements (see the sketch after this list).
- Explored MLlib algorithms in Spark to understand the possible Machine Learning functionalities that can be used for our use case.
- Used both the Hive context and the SQL context of Spark for initial testing of Spark jobs.
- Executed Hive queries through Spark SQL within the integrated Spark environment.
- Worked on designing D3 dashboards and mentored junior team members.
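A minimal PySpark sketch of the cleansing, validation, transformation, and summarization pattern described above, shown with the newer SparkSession entry point rather than the HiveContext/SQLContext APIs mentioned in the bullets; the input path, column names, and view name are hypothetical placeholders.

```python
# Minimal PySpark sketch of the cleansing / validation / summarization pattern.
# Paths, column names, and the temporary view name are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("customer-summary")
    .enableHiveSupport()          # lets spark.sql() also read Hive tables
    .getOrCreate()
)

# Cleansing: drop malformed rows and normalize a few fields.
raw = spark.read.json("/data/raw/customers")
clean = (
    raw.dropna(subset=["customer_id", "event_ts"])
       .withColumn("event_date", F.to_date("event_ts"))
       .filter(F.col("amount") >= 0)       # simple validation rule
)

# Summarization: daily aggregates per customer.
summary = (
    clean.groupBy("customer_id", "event_date")
         .agg(F.count("*").alias("events"),
              F.sum("amount").alias("total_amount"))
)

# Expose the result to analysts via SQL.
summary.createOrReplaceTempView("daily_summary")
spark.sql("SELECT * FROM daily_summary ORDER BY total_amount DESC LIMIT 10").show()
```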
Confidential
Hadoop Developer
Modules Environment: CentOS 6.7, Cloudera 5.5, Hive, Sqoop, Kafka, Pig, Flume, HCatalog, Hue, Oozie
Responsibilities:
- Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
- Wrote Sqoop scripts to ingest data from different RDBMS data sources.
- Used Flume and Sqoop extensively to gather and move data files from application servers to the Hadoop Distributed File System (HDFS).
- Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
- Implemented Hive partitioning, joins, and bucketing.
- Converted Hive/SQL queries into Spark transformations using Spark DataFrames (see the first sketch after this list).
- Analyzed the Spark SQL scripts and designed the solution for implementation.
- Used Spark SQL to process large volumes of structured data.
- Set up Zeppelin and developed Spark jobs for data analysts.
- Implemented a near-real-time data pipeline using a framework based on Kafka and Spark (see the second sketch after this list).
- Wrote Oozie workflows to schedule jobs, including Pig scripts and HiveQL.
- Experience in building and tuning real-world systems under scale and performance constraints.
- Prioritized daily workflow and balanced demands on quality, time, and resources.
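First sketch: converting a HiveQL aggregation into a Spark DataFrame job and writing the result to a partitioned Hive table, illustrating the partitioning and Hive-to-DataFrame bullets above; database, table, and column names are hypothetical placeholders.

```python
# Convert a HiveQL aggregation into a Spark DataFrame job and write the result
# to a partitioned Hive table. Table and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("hive-to-dataframe").enableHiveSupport().getOrCreate()

# Equivalent of:
#   SELECT txn_date, branch_id, SUM(amount) AS total
#   FROM raw_db.transactions GROUP BY txn_date, branch_id;
txns = spark.table("raw_db.transactions")
daily = (
    txns.groupBy("txn_date", "branch_id")
        .agg(F.sum("amount").alias("total"))
)

# Partition the output by date so downstream queries can prune partitions.
(
    daily.write
         .mode("overwrite")
         .partitionBy("txn_date")
         .saveAsTable("curated_db.daily_branch_totals")
)
```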
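Second sketch: a near-real-time Kafka-to-HDFS pipeline, shown here with Spark Structured Streaming (older Spark releases would use the DStream KafkaUtils API instead); the broker address, topic name, and output paths are hypothetical placeholders.

```python
# Near-real-time pipeline: read JSON events from Kafka, parse a couple of
# fields, and append them to HDFS as Parquet. Broker, topic, and paths are
# hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-nrt-pipeline").getOrCreate()

events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "events-topic")
         .load()
)

# Kafka delivers key/value as binary; cast the value to string and parse fields.
parsed = (
    events.selectExpr("CAST(value AS STRING) AS json")
          .select(F.get_json_object("json", "$.customer_id").alias("customer_id"),
                  F.get_json_object("json", "$.amount").cast("double").alias("amount"))
)

query = (
    parsed.writeStream
          .format("parquet")
          .option("path", "/data/stream/events")
          .option("checkpointLocation", "/data/stream/_checkpoints")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```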
Confidential
Developer
Modules Environment: CentOS 6.7, Kafka, Elasticsearch, Logstash, Kibana
Responsibilities:
- Designed and implemented data processing pipelines for different kinds of data sources, formats, and content.
- Responsible for the setup and intake of logs using Apache Kafka.
- Wrote Logstash configurations to process logs from Kafka into Elasticsearch.
- Developed Elasticsearch scripts for indexing and searching log data (see the sketch after this list).
- Developed visualization dashboards using Kibana.
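A minimal sketch of indexing and searching log documents, using the current elasticsearch-py client API (older client versions pass a body= argument instead); the index name, fields, and host are hypothetical placeholders.

```python
# Index a log document and query it back with the Python Elasticsearch client.
# Index name, fields, and host are hypothetical placeholders.
from datetime import datetime

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Index a single log document.
doc = {
    "service": "payments-api",
    "level": "ERROR",
    "message": "timeout calling downstream service",
    "@timestamp": datetime.utcnow().isoformat(),
}
es.index(index="app-logs", document=doc)

# Search: recent ERROR-level entries for one service.
resp = es.search(
    index="app-logs",
    query={
        "bool": {
            "must": [
                {"match": {"service": "payments-api"}},
                {"match": {"level": "ERROR"}},
            ]
        }
    },
    size=10,
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["message"])
```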
Confidential
Sr Developer
Modules Environment: Unix, Teradata and DataStage 8.5
Responsibilities:
- Interacted with the end-user community to understand business requirements and identify data sources.
- Prepared and maintained the HDD document related to MBI.
- Designed and developed BTEQ scripts in Teradata based on business requirements.
- Designed and developed the ELT module using DataStage.
- Worked on the Control-M scheduling tool for automating the ETL process.
- Documented ETL test plans and test cases based on design specifications for unit and system testing, and prepared test data.
Confidential
Sr Developer
Modules Environment: Unix, Teradata and DataStage 8.5
Responsibilities:
- Worked and coordinated with subject matter experts and business data quality teams on data requirement and technical documents.
- Prepared and maintained the AAA document.
- Designed and developed BTEQ scripts in Teradata based on business requirements.
- Designed and developed the ELT module using DataStage.
- Prepared a Visio design document for the ETL job flow.
- Worked on the Control-M scheduling tool for automating the ETL process.
- Involved in unit testing, integration testing, and UAT.
Confidential
Sr Developer
Modules Environment: Unix, DB2, and DataStage 7.5
Responsibilities:
- Worked and coordinated with subject matter experts and business data quality teams on data requirement and technical documents.
- Prepared and maintained the AAA document.
- Designed and developed the ELT module using DataStage.
- Prepared a Visio design document for the ETL job flow.
- Worked on the Control-M scheduling tool for automating the ETL process.
- Involved in unit testing, integration testing, and UAT.
Confidential
Responsibilities:
- Responsible for the creation, execution, and maintenance of DB2 databases that tracked employee access, and assisted in report creation.
- Conducted and participated in sessions with project managers, the business analysis team, finance, and development teams to gather, analyze, and document business and reporting requirements.
- Generated daily, weekly, and monthly inventory reports utilizing Cognos.
- Provided technical assistance in identifying, evaluating, and developing cost effective systems and procedures that met business requirements.
- Created ad hoc reports for management to support key decision-making.
- Involved in documenting the project steps and presenting to the team members.
- Maintained weekly department reporting utilizing Cognos and Excel.