
Sr Hadoop Developer/Lead Resume


Sunnyvale, CA

SUMMARY

  • 13+ years of professional IT experience in technical architecture, proposing solutions, and designing and implementing solutions using the Big Data Hadoop technology stack.
  • Worked in presales and architecture consulting with enterprise architect teams.
  • 2 years of hands-on experience with Spark Core, Spark SQL, Spark Streaming, and Kafka using Scala. Experienced in Spark, with in-depth knowledge of Spark SQL, RDDs, lazy transformations, and actions. Created a data ingestion pipeline for real-time data analytics using Flume, Kafka, and Spark Streaming.
  • Developed Kafka producers and Kafka consumers from scratch per business requirements. Implemented offset management to keep track of messages along with their offsets in persistent storage. Created and deleted Kafka topics, published and consumed messages, and checked message lag during the unit testing phase.
  • Handled various file formats: Parquet, Protocol Buffers (protobuf), Avro, Sequence, JSON, XML, and CSV.
  • Worked with compression codecs (Snappy, Gzip) in Spark, Hive, and Pig development.
  • 3+ years of hands-on experience with HDFS, Pig, Hive, HBase, Sqoop, Oozie, and MapReduce.
  • Created Hive queries and applied performance-tuning techniques for processing high volumes of data. Used Sqoop to import and export data between HDFS/Hive/HBase and RDBMS. Automated Hive, Pig, and Sqoop jobs using Oozie scheduling.
  • Understand the Hadoop 2.x YARN architecture: ResourceManager, NodeManager, containers, ApplicationMaster, map tasks, and reduce tasks. Set up multi-node, high-availability clusters and Kerberos security using Cloudera Manager.
  • Installed and configured Hadoop components and services, and managed data nodes using Cloudera Manager. Experienced in troubleshooting root causes of issues with the help of logs and fixing them quickly. Hands-on experience with the Cloudera (CDH 5.6.x), MapR, and AWS Hadoop distribution frameworks.
  • Experienced in SBT (Simple Build Tool) scripting and configuration with IntelliJ and Eclipse, and in POM file configuration for Maven repositories to build and deploy Scala- and Java-based applications. Used SSH and PuTTY to connect to secure clusters.
  • Designed the operational architecture for migrating existing infrastructure to the Amazon Web Services (AWS) cloud platform: EC2, S3, EBS, RDS, AWS Lambda, AWS API Gateway, AWS CodeCommit, AWS VPC, AWS IAM roles, and security groups.
  • Experienced with the scripting languages JavaScript, PowerShell, and Python on some projects. Knowledge of Cassandra, MongoDB, Storm, Flink, Akka, and reactive programming. Executed Java/J2EE projects using IBM WebSphere, Eclipse, and IntelliJ IDEA.
  • Designed and implemented websites using Microsoft C#, ASP.NET, WCF, and Entity Framework. Experienced in writing stored procedures, functions, and triggers for SQL Server and MySQL.
  • Worked on DevOps (continuous integration, continuous delivery, continuous deployment) using Team Foundation Server and Jenkins on one project. Applied OOP concepts and design patterns in designing software development projects.
  • Understand execution architecture, development architecture, and infrastructure architecture, and used these as reference models when designing project architectures. Created test scenarios and reviewed test cases to satisfy business flows.
  • Worked with the project management tools Microsoft Project and Jira for project/resource planning. Implemented code analysis, cyclomatic complexity, and code coverage features using SonarQube. Executed projects using various software development methodologies, including Agile/Scrum and Waterfall.
  • Provided technical solutions in project architectures to meet non-functional requirements. Proposed cost-effective, reusable solutions to problem statements provided by clients. Worked with clients in the USA, UK, Europe, and Asia on various project implementations.
  • Coordinated with business analysts, developers, and technical teams to define project requirements and roadmaps. Good exposure to all phases of the Software Development Life Cycle.
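As a minimal illustration of the multi-format handling mentioned above (JSON, CSV, etc.), here is a hypothetical Python sketch that normalizes newline-delimited JSON records into CSV. The field names are invented for illustration and do not come from any actual project described here.

```python
import csv
import io
import json

def json_records_to_csv(json_lines, fields):
    """Parse newline-delimited JSON records and write the chosen
    fields out as CSV, silently skipping malformed lines."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for line in json_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop records that are not valid JSON
        writer.writerow({f: record.get(f, "") for f in fields})
    return out.getvalue()

# Hypothetical input records, one JSON document per line.
lines = ['{"id": 1, "event": "click"}', 'not json', '{"id": 2, "event": "view"}']
print(json_records_to_csv(lines, ["id", "event"]))
```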

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop 2.x, Spark 1.6/2.x, HBase 0.98.0/2.0.4, MapReduce v2.6.x, YARN 2.6.x/1.5.2, Hive 0.12/0.14.0, Pig 0.12/0.14.0, Sqoop 1.2.1/1.4.4, Oozie 4.0.1, HDFS 2.6.x, Zookeeper 3.4.6

Amazon Web Services: EC2, EBS, S3, AWS RDS, AWS VPC, AWS Lambda, AWS API Gateway, AWS CodeCommit, AWS security groups, and IAM roles

DevOps Tools: TFS, Jenkins

Programming Language: Core Java, Scala, SQL, C/C#

Front-End Technology: HTML5, CSS, JavaScript, ASP.NET, XML, AJAX

Databases: MySQL 5.0, SQL Server, HBASE

Distribution Frameworks: CDH 5.6.x, MapR, AWS

Streaming Technologies: Kafka 0.9.x, 0.10.x, Spark streaming, Flume

PROFESSIONAL EXPERIENCE

Confidential, Sunnyvale, CA

Sr Hadoop Developer/Lead

Responsibilities:

  • Evaluated client needs and translated business requirements into functional specifications, thereby onboarding clients onto the Hadoop ecosystem.
  • Implemented a pipeline using the Spark Streaming direct-streaming approach with exactly-once semantics.
  • Analyzed existing jobs in the Teradata-to-Hadoop replication framework and created additional jobs for new business requirements.
  • Created HBase tables to store ConsumerRecords, offsets, and messages. Also created Hive tables on top of the HBase source tables for querying conditional data extracts.
  • Implemented manual offset management with external persistent storage (HBase), extracting the offset and a unique combination of message payload fields from each ConsumerRecord and storing these details in HBase.
  • Identified offset ranges for missing messages from the HBase offset-tracking table and reprocessed the messages in those ranges as part of the reconciliation feature.
  • Transformed ConsumerRecords into a custom class to extract the required fields from the complete message.
  • Processed every JSON message from the Kafka broker into HDFS as well as HBase.
  • Created a Python script to generate dynamic test data based on configuration values and load it into a partitioned Hive table on the cluster.
  • Responsible for creating, modifying, and deleting topics (Kafka queues). Implemented Spark Streaming and Kafka to ingest data in real time and applied RDD transformations in Scala.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
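The core of the offset-reconciliation logic described above can be sketched roughly as follows. This is a simplified, hypothetical Python version with a plain set standing in for the HBase offset-tracking table; the original implementation used Scala and HBase.

```python
def find_missing_ranges(committed_offsets, start, end):
    """Given the set of Kafka offsets recorded in the tracking store,
    return (first, last) ranges of offsets in [start, end] that were
    never committed and therefore need to be reprocessed."""
    missing = sorted(o for o in range(start, end + 1) if o not in committed_offsets)
    ranges, i = [], 0
    while i < len(missing):
        j = i
        # Extend the current run while offsets stay consecutive.
        while j + 1 < len(missing) and missing[j + 1] == missing[j] + 1:
            j += 1
        ranges.append((missing[i], missing[j]))
        i = j + 1
    return ranges

# A set standing in for the HBase offset-tracking table.
tracked = {0, 1, 2, 5, 6, 9}
print(find_missing_ranges(tracked, 0, 9))  # [(3, 4), (7, 8)]
```

Each returned range would then be fed back to the consumer as a seek/replay request against the topic partition.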

Environment: Hortonworks, MapR, Hive, HBase, Spark 2.x, Kafka 0.10.0.1, Java, Scala

Confidential

Sr Hadoop/Spark Developer

Responsibilities:

  • Analyzed the client's problem statement and business requirements and created a technical solution plan.
  • Developed architectures defining the required service components.
  • Worked with the client and business teams to understand functional and non-functional requirements.
  • Administered Hadoop service installation, and troubleshot and resolved operational issues.
  • Created Linux shell scripts for configuring, starting, stopping, and restarting Hadoop services.
  • Set up and configured a multi-node Hadoop cluster (CDH 5.x using Cloudera Manager 5.7.x).
  • Created a Kafka producer for publishing live streaming data to the Kafka broker.
  • Created Spark Streaming jobs to offload data from Kafka, process it, and transfer it to HDFS.
  • Wrote RDD and Spark SQL code in Scala to query and filter data using DataFrames.
  • Worked on the Kerberos security setup for a CDH multi-node cluster.
  • Worked on performance tuning to improve throughput.
  • Analyzed the latest releases of Kafka and Spark and created a roadmap to upgrade the existing architecture.
  • Explored Kafka Streams as a replacement for the Flume and Kafka components.
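The per-micro-batch transform step described above (parse, filter, project) can be illustrated with a small, hypothetical Python stand-in for the Scala RDD transformations; the message schema and threshold are invented for the example.

```python
import json

def transform_batch(raw_messages, min_amount):
    """Parse one micro-batch of raw JSON messages, drop malformed or
    below-threshold records, and project only the fields of interest.
    (Field names here are illustrative, not from the original project.)"""
    out = []
    for raw in raw_messages:
        try:
            msg = json.loads(raw)
        except json.JSONDecodeError:
            continue  # discard messages that fail to parse
        if msg.get("amount", 0) >= min_amount:
            out.append({"user": msg.get("user"), "amount": msg["amount"]})
    return out

batch = ['{"user": "a", "amount": 10}', '{"user": "b", "amount": 3}', 'garbled']
print(transform_batch(batch, 5))  # [{'user': 'a', 'amount': 10}]
```

In the actual pipeline this logic would run inside a streaming job's foreach-batch handler rather than over an in-memory list.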

Environment: CDH 5.x, Zookeeper, Sqoop, Hive, HBase, Flume, Spark 1.6.5, Kafka 0.8.x, Scala

Confidential

Sr Hadoop Developer

Responsibilities:

  • Proposed the Hadoop technology stack and the operational architecture for the system implementation.
  • Analyzed requirements to design the development architecture with the required Hadoop components.
  • Worked on data ingestion and data processing flows to manage real-time streaming data.
  • Used Kafka for real-time data streaming as a high-throughput, low-latency, fault-tolerant system.
  • Implemented the data ingestion layer using a combination of Spark Streaming, Flume, and Kafka.
  • Worked on data ingestion using Sqoop to import structured data sources into HBase.
  • Implemented the data processing layer using Spark Core and Spark SQL based on a pre-defined algorithm.
  • Created Hive tables with partitioning and bucketing, using SerDe options specific to each data source format.
  • Wrote Hive queries using map-side and skewed joins to extract the required results efficiently.
  • Worked with the NoSQL database HBase for random reads/writes against huge volumes of data.
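A map-side join, as used in the Hive queries above, broadcasts the small table to every mapper and joins in memory while the large table is streamed. A minimal Python sketch of the idea (table contents are invented for illustration):

```python
def map_side_join(small_table, large_rows, key):
    """Build an in-memory hash map from the small table (as a map-side
    join effectively broadcasts it to each mapper), then stream the
    large table through it, emitting merged rows on key matches."""
    lookup = {row[key]: row for row in small_table}
    joined = []
    for row in large_rows:  # the large side is streamed, never loaded whole
        match = lookup.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined

# Hypothetical dimension (small) and fact (large) tables.
dims = [{"id": 1, "region": "west"}, {"id": 2, "region": "east"}]
facts = [{"id": 1, "sale": 100}, {"id": 3, "sale": 50}]
print(map_side_join(dims, facts, "id"))  # [{'id': 1, 'sale': 100, 'region': 'west'}]
```

This avoids the shuffle phase of a reduce-side join, which is why it pays off when one side of the join is small enough to fit in memory.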

Environment: CDH 5.x, Zookeeper, Hive, HBase, Spark 1.x, Kafka, Scala

Confidential

Big Data Developer

Responsibilities:

  • Worked on technical discussions, architecture design, estimations, and reviews.
  • Discussed non-functional requirements and incorporated them into the proposed architecture.
  • Proposed the Hadoop technology stack and the operational architecture for the system implementation.
  • Worked on Pig and Hive for data processing, transformation, and analytics.
  • Created a detailed technical design document with the proposed architecture and integrations.
  • Created high-level and low-level design documents based on the functional specs provided.
  • Designed a multi-node cluster on the Amazon cloud with Amazon EMR, Amazon S3, and AWS Data Pipeline.

Environment: Apache Pig, Talend, EC2, AWS EMR, AWS S3

Confidential

Technical Manager

Responsibilities:

  • Worked as solution manager for various PMI affiliate markets in the Latin America and Asia Pacific regions.
  • Worked on RFPs for PMI clients as a technical SME for estimations, technical solutions, and architectures.
  • Managed and launched campaign websites, email campaigns, and digital magazines.
  • Responsible for technical solutions, requirements gathering, and estimations for new enhancements.
  • Analyzed business requirements and created design documents using functional flows and process flows.
  • Provided technical solutions aligned with standard security policies to safeguard consumer data.
  • Managed and developed websites with responsive designs compatible with desktop, mobile, and tablet devices.
  • Evaluated market-leading cloud platforms and migrated project infrastructure to the Amazon cloud.
  • Worked on Big Data analytics research to provide statistical reports requested by the client.

Environment: Visio, C#, Asp.Net, WCF, AJAX, MVC, Angular JS, JQuery, Bootstrap, HTML5

Confidential, Syracuse, NY

Technical Architect

Responsibilities:

  • Produced detailed technical designs and reviewed them with all stakeholders to align all functional flows and modules.
  • Consulted third-party vendors about their services, offerings, and pricing to estimate the overall project budget.
  • Proposed technical solutions for seamless integration with the finalized third-party services.
  • Reviewed project architecture changes with the Confidential enterprise architecture team.
  • Created a technical design document based on the functional requirements provided by the client.
  • Held daily sync-ups with the client's enterprise architects to brainstorm technical solutions and design.
  • Held regular sync-ups with the client's business and database teams to document changes.
  • Coordinated with the offshore development team to implement the proposed technical architecture and solution.

Environment: Visio, Asp.Net, WCF, C#, IPDetection, Questline, Straita Email services, Pitney Bowes

Confidential

Technical Lead

Responsibilities:

  • Participated in scrum calls, planning, and retrospective meetings.
  • Prioritized stories, estimated and planned the tasks for each story, and created and assigned tasks to the team.
  • Reviewed code after task completion.
  • Delivered planned user stories and presented a demo to stakeholders after every sprint release.
  • Set up continuous integration and check-in policies, and ensured automated deployment of error-free code to the build server.

Environment: Visual Studio 2008, C#, SQL Server 2008, Team Foundation Server 2008

Confidential, Richmond, VA

Technical Lead

Responsibilities:

  • Participated in scrum calls, planning, and retrospective meetings.
  • Analyzed and implemented Team Foundation Server at the enterprise level.
  • Worked on implementing code analysis (FxCop) and code coverage features in the project.
  • Administered Team Foundation Server; explored branching strategies and implemented branches for every release, feature, and bug fix.
  • Designed and implemented release/deployment scripts using PowerShell.
  • Implemented check-in policies, automated unit test cases, and automated build features.

Environment: Visual Studio 2005, C#, ASP.Net, PowerShell, Team Foundation Server 2008

Confidential

Senior Developer

Responsibilities:

  • Re-engineered a Ruby on Rails application into a .NET application involving AJAX and Google Maps. Analyzed and populated the Confidential SOA roadmap with tools and technologies.
  • Explored the underlying components of MOSS 2007 and evaluated it for future implementation.
  • Tested BPEL interoperability across BizTalk Server 2006, IBM WID, and Oracle JDeveloper.
  • Developed an SOA-based invoice application using an ESB with Microsoft BizTalk and Oracle JDeveloper.
  • Gained knowledge of BPM processes by evaluating various tools, including IBM WID and the Oracle BPA suite.
  • Explored BPM tools and submitted feature comparisons for Pega, IBM, and Oracle products.
  • Developed a simple Python-based CRM application using the Python web frameworks Django and TurboGears.

Environment: C#, ASP.Net, JavaScript, BizTalk, BPEL, ESB, SharePoint, JDeveloper, Oracle Application Server, IBM WID, IBM Rational Application Developer, MySQL, SQL Server.
