Sr Hadoop Developer Resume
Bellevue, WA
PROFESSIONAL SUMMARY:
- Over 9 years of professional IT experience in Business Analysis, Design, Data Modeling, Development and Implementation of various client/server and decision support system environments, with a focus on Big Data, Data Warehousing, Business Intelligence and Database Applications.
- Over 5 years of experience with Apache Hadoop components such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, NiFi, Oozie, Spark (Scala), Apache Presto, DynamoDB, Python, Kafka, Confidential Azure, ADF, ADLS, Azure Blob, AWS S3 and Big Data Analytics.
- 3 years of experience handling real-time data using Kafka.
- 3 years of experience with the Spark and Scala frameworks.
- Experience in working with cloud infrastructure like Amazon Web Services (AWS) and Rackspace.
- Hands-on experience with DynamoDB and Apache Presto.
- Experience in understanding security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, realm/domain creation and management.
- Working knowledge of Amazon’s Elastic Cloud Compute (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as Storage mechanism.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Integrated Apache Storm with Kafka to perform web analytics; uploaded clickstream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
- Created Informatica BDM jobs for data comparison between tables across different databases, identifying and reporting discrepancies to the respective teams.
- Involved in daily Scrum meetings to discuss development progress and was active in making the meetings more productive.
- Experienced in using integrated development environments such as Eclipse, NetBeans and IntelliJ.
- Experienced in building real-time data streaming solutions using Apache Kafka and data pipelines that store big datasets into Postgres and SQL Server tables.
- Developed a real-time solution using Apache Kafka and Spark with Scala to parse real-time event log data and store it into SSI and Postgres tables to generate reports (see the sketch after this summary).
- Experience in installation, configuration, support and monitoring of Hadoop clusters using Apache and Cloudera distributions.
- Experience in understanding clients' Big Data business requirements and transforming them into Hadoop-centric solutions.
- Experience in understanding customers' multiple data sets, including behavioral data, customer profile data, usage data and product data.
- Experience analyzing clients' existing Hadoop infrastructure, identifying performance bottlenecks and tuning performance accordingly.
- Experience in writing DB2 PL/SQL static and dynamic stored procedures.
- Experienced with Spark SQL, a fast and general engine for large-scale data processing.
- Strong experience in analyzing large data sets by writing Pig scripts and Hive queries.
- Experience in configuring and maintaining Oracle Exadata and Oracle GoldenGate.
- Experience in using Confidential Azure, ADF, ADLS, Azure Blob.
- Good experience in building pipelines using Azure Data Factory (ADF) and moving the data into Azure Data Lake (ADL) Store.
- Completed end-to-end design and development of an Apache NiFi flow that acts as the agent between the middleware and EBI teams and executes the actions mentioned above.
- Developed process workflows using Apache NiFi to extract, transform and load raw data into HDFS and then process it into Hive tables.
- Created Hive tables using Apache NiFi and loaded data into them using Hive Query Language.
- Implemented NiFi processes to automate data movement between disparate data sources and systems, making data ingestion fast, easy and secure.
- BNFS Payments Level 1 certified, with domain knowledge of Banking and Financial Services.
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimates, designing custom solutions, development, leading developers, producing documentation, and production support.
- Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills.
- Ensured Quality assurance as per client guidelines.
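The following is a minimal, hedged sketch of the kind of Kafka + Spark (Scala) streaming pipeline described in this summary; the topic name, event schema and Postgres connection details are hypothetical placeholders, and the SSI target is represented here simply as a JDBC table.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object EventLogStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-log-stream")
      .getOrCreate()
    import spark.implicits._

    // Illustrative JSON layout of an event-log record (assumed schema).
    val eventSchema = new StructType()
      .add("eventId", StringType)
      .add("eventType", StringType)
      .add("eventTime", TimestampType)
      .add("payload", StringType)

    // Read the raw event stream from Kafka (placeholder brokers/topic).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "event-logs")
      .option("startingOffsets", "latest")
      .load()
      .select(from_json($"value".cast("string"), eventSchema).as("event"))
      .select("event.*")

    // Write each micro-batch to a Postgres reporting table over JDBC.
    val query = events.writeStream
      .foreachBatch { (batch: DataFrame, batchId: Long) =>
        batch.write
          .format("jdbc")
          .option("url", "jdbc:postgresql://dbhost:5432/reports") // placeholder connection
          .option("dbtable", "event_log_reports")
          .option("user", sys.env.getOrElse("DB_USER", ""))
          .option("password", sys.env.getOrElse("DB_PASS", ""))
          .mode("append")
          .save()
      }
      .option("checkpointLocation", "/tmp/checkpoints/event-logs")
      .start()

    query.awaitTermination()
  }
}
```

Writing each micro-batch through foreachBatch keeps the JDBC sink simple, while the checkpoint location gives the stream at-least-once delivery on restart.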
TECHNICAL SKILLS:
Big Data Ecosystems: Hadoop, HDFS, Sqoop, Hive, Pig, Apache NiFi, HBase, Scala, Oozie, Spark SQL, Kafka, Apache Storm, Zookeeper, Oracle GoldenGate, Oracle ODI, Control-M, Confidential Big Data Platform, U-SQL, HDInsight, Confidential Azure, ADF, ADLS, Azure Blob, AWS S3
IDEs: Eclipse, IntelliJ IDEA
Programming Languages: Java, Scala, Python, COBOL, JCL, PL/SQL
Version Control / Build Tools: Bitbucket, Git, Jenkins, SVN, Maven
Databases: MySQL, MS SQL Server, IBM DB2, Oracle, Postgres, Exadata, Teradata
Environments / Platforms: Unix, CentOS, Windows
Tools: Cognos, PL/SQL stored procedures, TOAD, SQL Developer, Informatica BDM, HP Quality Center, Rally, JIRA
Methodologies: Agile, Scrum, Kanban Board
Distributed Platforms: Hortonworks, Cloudera
PROFESSIONAL EXPERIENCE:
Confidential, Bellevue, WA
Sr Hadoop Developer
Responsibilities:
- Interacting with business stakeholders to understand business requirements and translate them into functional and technical specifications.
- Develop and demonstrate proofs of concept (POCs) for projects.
- Developed POCs and test cases to benchmark and verify data flow for Big Data and NiFi applications.
- Develop custom software components using Hive, Spark or Scala (e.g. specialized UDFs) and analytics applications.
- Develop real-time data processing applications using Scala and implement Apache Spark Streaming from streaming sources such as Kafka and RabbitMQ.
- Develop programs for Big Data applications using the Hadoop ecosystem (HDFS, Sqoop, Hive, Oozie, Spark, NiFi) and Scala.
- Create NiFi workflows for data ingestion into the Hadoop data lake from Oracle, Teradata, MySQL and Postgres.
- Develop Apache NiFi workflows for data ingestion pipelines that fetch data from external APIs and RabbitMQ into the AWS S3 data lake bucket.
- Develop ETL jobs using the data integration tool Informatica BDM to build data warehouses and data marts.
- Develop data ingestion pipelines that fetch data from various relational sources (e.g., Postgres, Teradata) into the AWS S3 data lake bucket using Sqoop (see the sketch after this list).
- Create Pig Latin scripts and automate Pig command-line transformations for data joins in the data lake.
- Develop simple to complex MapReduce jobs using Hive, Sqoop and Pig, and schedule them through Apache Oozie.
- Develop API programs in Java, Python and Unix bash scripting for Big Data applications.
- Schedule meetings with the functional team and product owners to discuss the Informatica Big Data Edition technical upgrade plan, additional functionality and bug fixes in the new version.
- Work closely with the quality and user acceptance teams to fix issues that occur as part of integration testing.
- Provide demos to all stakeholders and business teams upon completion of critical milestones in the project.
- Write and execute system and unit test cases and test scripts for the project.
- Apply data validations on PII and PCI data and implement encryption using the Voltage tool.
- Follow the CI/CD (Continuous Integration/Continuous Deployment) process using Git, Jenkins, Maven, Bitbucket, etc. to promote programs from one environment to another.
- Actively participate in daily Scrum meetings to produce quality deliverables on time.
- Use the Informatica BDM (Big Data Management) CLI to automate initiating and controlling workflows and deploying and promoting components across environments.
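As a rough illustration of the relational-source-to-S3 ingestion described in this list, here is a hedged Scala sketch using a Spark JDBC read in place of the Sqoop CLI; the source table, credentials and bucket path are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object RelationalToS3Ingest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("relational-to-s3-ingest")
      .getOrCreate()

    // Pull one source table from Postgres (Teradata would differ only in the JDBC URL/driver).
    val orders = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/sales")   // placeholder source
      .option("dbtable", "public.orders")
      .option("user", sys.env.getOrElse("DB_USER", ""))
      .option("password", sys.env.getOrElse("DB_PASS", ""))
      .option("fetchsize", "10000")
      .load()

    // Land the extract in the S3 data lake bucket, partitioned by load date.
    orders
      .withColumn("load_date", org.apache.spark.sql.functions.current_date())
      .write
      .mode("append")
      .partitionBy("load_date")
      .parquet("s3a://example-datalake-bucket/raw/orders/")    // placeholder bucket

    spark.stop()
  }
}
```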
Environment: Hadoop, HDFS, Hive, Sqoop, Oozie, Pig, HBase, Kafka, Storm, Zookeeper, Apache NiFi, AWS S3, Oracle, Control-M, SQL Developer, Postgres, MySQL, Teradata, Oracle GoldenGate, Scala, Python, Spark SQL, Azure Data Factory, Azure Data Lake, Azure Blob, Jenkins, Git, Bitbucket, HP QC, Informatica BDM
Confidential, Baltimore, MD
Big Data Developer
Responsibilities:
- Analyzed requirements and proposed possible approaches and solutions.
- Analyzed the impact of the requirements on the existing system and documented it.
- Developed detailed designs per the requirements received from the business and got them approved.
- Involved in creating Hive tables, loading them with data and writing Hive queries to process the data (see the sketch after this list).
- Imported data using Sqoop from Oracle to HDFS on a regular basis.
- Imported data using GoldenGate from Oracle and DB2 to HDFS on a regular basis.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Conducted data analysis and created data mappings.
- Developed test cases for unit testing.
- Interacted with various team members to resolve defects.
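A minimal sketch of the Hive table creation and processing pattern mentioned above, written with Spark SQL in Scala; the database, table and column names are hypothetical examples.

```scala
import org.apache.spark.sql.SparkSession

object HiveTableExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-table-example")
      .enableHiveSupport()
      .getOrCreate()

    // External Hive table over data already landed in HDFS (e.g. by Sqoop/GoldenGate).
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS staging.claims (
        |  claim_id     STRING,
        |  member_id    STRING,
        |  claim_amount DECIMAL(12,2),
        |  claim_date   DATE
        |) STORED AS PARQUET
        |LOCATION '/data/raw/claims'""".stripMargin)

    // Example processing query: monthly claim totals written to a reporting table.
    val monthly = spark.sql(
      """SELECT date_format(claim_date, 'yyyy-MM') AS claim_month,
        |       SUM(claim_amount)                  AS total_amount
        |FROM staging.claims
        |GROUP BY date_format(claim_date, 'yyyy-MM')""".stripMargin)

    monthly.write.mode("overwrite").saveAsTable("reporting.claims_monthly")

    spark.stop()
  }
}
```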
Environment: Hadoop, HDFS, Hive, Sqoop, DB2, Oracle, SQL Developer, Oracle GoldenGate, Spark Data Frames, Spark SQL, Exadata
Confidential
Big Data Analyst
Responsibilities:
- Explored U-SQL functions to perform operations on the data.
- Worked on ADF to automate the movement and transformation of data.
- Involved in creating U-SQL tables, loading data and writing U-SQL queries for data analysis.
- Actively participated in Daily status meetings with technical collaborators and clients.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Responsible for managing data coming from different sources.
Environment: Confidential Big Data Platform, U-SQL, Azure Data Lake, HDInsight, Hive, Pig, Hadoop, HDFS
Confidential
Technical Lead
Responsibilities:
- Analyzed the business functionality of the existing application and restructured it onto a new technology to enrich the user experience and provide enhanced functionality.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs for data preprocessing (see the sketch after this list).
- Defined migration plans to import and export data between RDBMS and HDFS.
- Extracted data from relational databases using Sqoop.
- Developed Pig scripts for data processing according to business rules.
- Involved in creating Hive tables, loading them with data and writing Hive queries to process the data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Conducted data analysis of financial transactions.
- Developed test cases for Unit Testing.
- Interacted with various team members to resolve defects.
- Maintained good coordination with the team.
- Delivered the project under stringent timelines and high pressure.
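A hedged sketch of the kind of data-preprocessing step described above, shown here with Spark DataFrames in Scala rather than a hand-written MapReduce job; the file paths and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object TransactionPreprocess {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("transaction-preprocess")
      .getOrCreate()

    // Raw financial-transaction extract landed in HDFS (e.g. by Sqoop).
    val raw = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///data/raw/transactions/")

    // Basic cleansing: drop incomplete rows, normalize types, de-duplicate.
    val cleaned = raw
      .na.drop(Seq("txn_id", "account_id", "amount"))
      .withColumn("amount", col("amount").cast("decimal(12,2)"))
      .withColumn("txn_date", to_date(col("txn_date"), "yyyy-MM-dd"))
      .dropDuplicates("txn_id")

    // Curated output for downstream Hive/Spark SQL analysis.
    cleaned.write.mode("overwrite").parquet("hdfs:///data/curated/transactions/")

    spark.stop()
  }
}
```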
Environment: Hadoop, MapReduce, HDFS, Hive, Sqoop, Java, MySQL, Oracle, Spark SQL, Spark Data Frames.
Confidential
Technical Lead
Responsibilities:
- Involved in interacting with business stakeholders, client meetings, analysis, code modification and testing.
- Enhancement of the business logic.
- Impact analysis for proposed enhancements / modifications to a module / program / database.
- Developing test strategies and test cases.
- Providing support to the users and testing teams during Integration, UAT & Regression testing by resolving the issues encountered during testing.
- Preparing the implementation plan for the project roll-out.
- Providing production support by monitoring daily processing jobs until the end of the warranty period.
- Developed test cases for Unit Testing.
- Actively participated in Weekly status meetings with technical collaborators and colleagues.
Environment: COBOL, JCL, DB2, IMS, VSAM, Manage Now (tool used by Amex to monitor tickets), BMCADM, InfoMan, ChangeMan, Easytrieve.
Confidential
Technical Lead
Responsibilities:
- Gathering functional requirements from the onshore coordinator.
- Interacting with the onsite coordinator to analyze requirements and propose possible approaches and solutions.
- Analyze the impact of the requirements on the existing system and document it.
- Develop detailed designs per the requirements received from the business and get them approved.
- Calculate the effort required for the changes and analyze risks, issues and delays in completing the tasks.
- Perform code review, testing, test case review and test results review for the created/modified components.
- Software implementation and post install support.
- Prepare status reports and conduct weekly status meetings with the onshore coordinator.
- Knowledge transition to new resources joining the team.
- Participated in the SDLC: coding, code reviews, testing and ensuring coding standards for the development project.
- Impact analysis for proposed enhancements / modifications to a module / program / database.
Environment: COBOL, JCL, DB2, IMS, VSAM, Manage Now (tool used by Amex to monitor tickets), BMCADM, InfoMan, ChangeMan, Easytrieve.