Hadoop Developer / Cloud Architect Resume

Englewood Cliffs, NJ

PROFILE:

  • Over 10 years of technical experience in developing, designing and implementing IT projects.
  • Hadoop/Spark developer with extensive experience in ingestion, storage, querying, processing and analysis of big data.
  • Experienced with Spark Streaming, Spark SQL, Kafka & AWS Kinesis for real-time data processing.
  • Excellent Programming skills at a higher level of abstraction using Scala, Java and Python.
  • Extensive experience working with various Hadoop distributions, such as enterprise versions of Cloudera, and good knowledge of Amazon’s EMR (Elastic MapReduce).
  • Hands-on experience designing row keys and schemas for NoSQL databases like HBase, Cassandra and DynamoDB (AWS).
  • Experience in using DStreams, accumulators, broadcast variables and RDD caching for Spark Streaming.
  • Experienced in implementing job scheduling using Oozie, crontab and IBM TWS.
  • Extensive experience in importing and exporting streaming data to and from HDFS using stream processing platforms like Flume and the Kafka messaging system.
  • Experienced in writing ad hoc queries using Cloudera Impala, including Impala analytical functions.
  • Worked extensively with Spark using Scala on a cluster for computational analytics; installed it on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL/Oracle.
  • Well-versed in Spark components like Spark SQL and Spark Streaming.
  • Used Scala and Python to convert Hive/SQL queries into RDD transformations in Apache Spark.
  • Expertise in writing Spark RDD transformations, actions, DataFrames and case classes for the required input data, and in performing data transformations using Spark Core.
  • Experience in integrating Hive queries into Spark environment using Spark SQL.
  • Expertise in performing real time analytics on big data using HBase and Cassandra.
  • Experience in working with different file formats like Parquet, Avro, XML, JSON, CSV and ORC, and with compression codecs like gzip, Snappy and LZO.
  • Experienced in working with monitoring tools to check cluster status using Cloudera Manager and Ganglia.
  • Experience in developing data pipeline using Pig, Sqoop, and Flume to extract the data from weblogs and store in HDFS.
  • Accomplished in developing Pig Latin scripts and using Hive Query Language for data analytics.
  • Exposure to Data Lake implementation using Apache Spark; developed data pipelines and applied business logic using Spark.
  • Experience in writing complex SQL queries, PL/SQL, views, stored procedures, triggers, etc.
  • Worked with GUI-based Hive interaction tools like Hue for querying data.
  • Experience in optimizing MapReduce algorithms using Mappers, Reducers, combiners and partitioners to deliver the best results for the large datasets.
  • Experienced in migrating data from various sources using the pub-sub model with Kafka producers and consumers, and in preprocessing data using Storm topologies.
  • Competent in using the Chef and Puppet configuration and automation tools. Configured and administered CI tools like Jenkins for automated builds.
  • Working knowledge of Amazon’s Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
  • Knowledge of installation, configuration, supporting and managing Hadoop Clusters using Apache, Cloudera distributions and on Amazon web services (AWS).
  • Designed and managed public/private cloud infrastructures using Amazon Web Services (AWS), including EC2, S3, CloudFront, Elastic File System, RDS, VPC, Direct Connect, Route53, CloudWatch, CloudTrail, CloudFormation, Elastic Load Balancer, Auto-Scaling and IAM policies.
  • Experienced in using build tools like Maven, Ant, SBT to build and deploy applications into the server.
  • Hands-on knowledge of Core Java concepts like Exceptions, Collections, Data-structures, I/O, Multi-threading, Serialization and deserialization of streaming applications.
  • Experience in Linux and using Shell, Perl, Python scripting to automate processes.
  • Worked with various programming languages using IDEs like Eclipse, NetBeans and IntelliJ.
  • Extensive experience in using MVC architecture, Struts and Hibernate for developing web applications using Java, JSPs, JavaScript, HTML, jQuery, AJAX, XML and JSON.
  • Experience in design and development of web-based applications using HTML, DHTML, CSS, JavaScript, jQuery, JSF, Struts, JSP and Servlets.
  • Experienced in ticketing tools like JIRA for tracking issues and code-related bugs, and in GitHub for code reviews; worked with version control tools like CVS, Git and SVN.
  • Experience in maintaining an Apache Tomcat, MySQL, LDAP, LAMP, web service environment.
  • Good working experience in importing data using Sqoop and performing transformations on it using Hive, Pig and Spark.
  • Designed ETL workflows on Tableau; deployed data from various sources to HDFS.
  • Experience in working with different data sources like flat files, XML files and databases. Domain experience in areas like ERP and software quality process.
  • Expertise in designing and implementing applications using IBM MQ Series, IIB, Message Broker, IBM B2B Integrator & Java and related technologies.
  • Experience in complete Software Development Life Cycle (SDLC) in both Waterfall and Agile methodologies.
  • Good understanding of all aspects of testing such as Unit, Regression, Liberation and Agile.
  • Experience with best practices of web services development and integration (REST and SOAP).
  • Experience in automated scripts using Unix shell scripting to perform database activities.
  • Working experience with Unix/Linux systems like AIX, Red Hat, Ubuntu and CentOS.
  • Good analytical, communication and problem-solving skills; enjoys learning new technical and functional skills.
  • Experienced in Agile Scrum, Waterfall and Test-Driven Development methodologies.
  • Experience in IT Application Architecture Design, Solution Design and Optimization, requirement analysis, end-to-end solution and application integration design, Sizing, On-premises, Cloud & Hybrid Application Infrastructure Design, UML and Interface Conceptual Design documentation.
  • Possesses excellent analytical, problem-solving and leadership skills and has a keen interest in emerging technologies.
  • Enjoys working as a team player and can complete a task with minimum supervision.
  • Excellent communication skills and a motivated, organized team player with high aptitude for learning and leadership skills.
  • Follows the organization's coding standards.
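
The MapReduce tuning mentioned in the profile (mappers, combiners, partitioners and reducers) can be illustrated with a small in-memory sketch. This is plain Python rather than Hadoop code; the function names, the two-reducer setup and the sample input are assumptions made only for illustration.

```python
from collections import Counter, defaultdict
import zlib


def mapper(line):
    # Emit a (word, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]


def combiner(pairs):
    # Pre-aggregate counts on the map side to shrink the shuffle volume.
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partitioner(key, num_reducers):
    # Deterministic hash so the same key always reaches the same reducer.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer(pairs):
    # Sum the partial counts received for each key.
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(splits, num_reducers=2):
    # Each split is a list of lines handled by one (simulated) map task.
    buckets = [[] for _ in range(num_reducers)]
    for split in splits:
        mapped = [pair for line in split for pair in mapper(line)]
        for word, count in combiner(mapped):
            buckets[partitioner(word, num_reducers)].append((word, count))
    result = {}
    for bucket in buckets:
        result.update(reducer(bucket))
    return result


word_counts = run_job([["spark hive spark"], ["hive pig", "spark"]])
```

The combiner is the main lever here: each map task sends one pair per distinct word instead of one pair per occurrence, which is the kind of reduction that matters on large datasets.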

TECHNICAL PROFICIENCIES:

Big Data Ecosystem: HDFS, YARN, Hive, Pig, Impala, Sqoop, Storm, Flume, Spark, Apache Kafka, Zookeeper, Oozie.

NO SQL Databases: HBase, Cassandra, Amazon DynamoDB.

Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5), Apache & MapR

Platforms: UNIX/AIX, Linux, Windows, Oracle, SAP, AWS, WebServices (SOAP/REST), Cloud Computing, Enterprise Application Integration

Cloud Platform: EC2, VPC, Identity and Access Management (IAM), EC2 Container Service, Elastic Beanstalk, Lambda, S3, CloudFront, Glacier, RDS, DynamoDB, ElastiCache, Redshift, Direct Connect, Route53, CloudWatch, CloudFormation, CloudTrail, OpsWorks, AWS IoT, SNS, API Gateway, SES, SQS, SWF

Languages: Java, C, C++, Scala, Python, XHTML, HTML, AJAX, SQL, PL/SQL, Pig Latin, HiveQL, JavaScript, Shell Scripting, VBS

Java Technologies: Hibernate, Java Beans, Swing, Core Java, Multithreading

Middleware Technologies: Apache Kafka, AWS Kinesis, IBM MQ Series, Message Broker

Internet Applications: JSF, JSP, J2EE, Servlets, JDBC

Framework: MVC, JSF

Web/Application Servers: Apache Tomcat, Websphere Application Server, JBoss

Scripting Languages: JavaScript, Shell Script (ksh, csh, bash), CSS, jQuery, Node.js, VB Script, Perl

Markup Languages: HTML, XML

XML: DOM, DTD, SAX

Database Apps: Oracle, MSSQL Server, MySQL, DynamoDB, Redshift

Methodologies: SDLC, Design Patterns

Version Control Tools: GitHub, VSS (Visual Source Safe), SVN

Tools: IntelliJ, Eclipse, NetBeans, Toad, PL/SQL Developer.

PROFESSIONAL EXPERIENCE:

Confidential, Englewood Cliffs, NJ

Hadoop Developer / Cloud Architect

Responsibilities:

  • Analysis of functional requirement from Global CD team, preparation of project proposal
  • Studied the existing infrastructure landscape, matched cloud products, designed the cloud architecture, ran Proofs of Concept (PoC), proposed design improvements, estimated costs and implemented AWS cloud infrastructure, recommending application migrations to public and private clouds.
  • Architecture design of scalable Hadoop cluster on AWS for Cloudera framework
  • Utilized Apache Spark with Python & Scala to develop and execute Big Data Analytics
  • Built analytical data pipelines to port data in and out of Hadoop/HDFS from structured and unstructured sources.
  • Designed and implemented security and privacy strategy for data in HDFS. Created data compression and data compaction best practices and techniques.
  • Hadoop cluster design, sizing and AWS cost estimation
  • Designing of Redshift Cluster including node types, compute nodes and cluster type
  • Creating Data Source for application to connect to Redshift Cluster. Securing Redshift cluster using access management.
  • Configuration and administration of Load Balancers and Auto scaling for high availability.
  • Sentiment Analysis for retail brands derived from final scoring table created by the Perfect score
  • Used Amazon IAM service to grant permission to assigned users and manage their roles. Created database users and assigned access as per roles.
  • Designed a managed file transfer solution using AWS S3 buckets in different regions, which helped resolve transfer issues for large files with no latency.
  • Created CloudFormation templates for creating S3 buckets and assigning IAM user policies for any new partner onboarding which reduced partner onboarding from 10 days to 5 hours.
  • Created lifecycle management policies for data archival, which helped reduce the cost of long-term data storage.
  • Wrote Python scripts to migrate data from the old system into AWS Redshift
  • Scripting for 3rd party data fetch and data load to HDFS cluster with Oozie for multiple workflows
  • Spark scripts for manipulation with joins and multiple conditions.
  • Created auto-scaling groups based on memory and CPU usage to handle workload spikes and underuse without manual intervention
  • Coordination with multi-location Global team of different functional areas.

Environment: Hadoop Framework, Cloudera, HDFS, Hive, Spark, Sqoop, Oozie, HUE, JSON, SQL Server, Python, Scala, Tableau, AWS EC2, S3, IAM, Lambda
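
As an illustration of the lifecycle management policies for data archival listed above, a rule of that kind might be sketched as follows. The rule ID, prefix, day counts and bucket name are hypothetical; the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument, and the actual call is left as a comment so the sketch runs without AWS credentials.

```python
def archival_lifecycle(prefix, glacier_after_days, expire_after_days):
    """Build an S3 lifecycle configuration that archives, then expires, objects."""
    if expire_after_days <= glacier_after_days:
        raise ValueError("objects must expire after they are archived")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to cheaper archival storage after N days.
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                # Delete objects once the retention period has passed.
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Hypothetical values: archive partner files after 30 days, delete after a year.
policy = archival_lifecycle("partner-data/", 30, 365)

# With boto3 installed and credentials configured, this could be applied as:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=policy)
```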

Confidential, Englewood Cliffs, NJ

Technical Analyst: Big Data

Responsibilities:

  • Analysis, design and development of an ePOS Reporting System to track multiple product KPIs
  • Created solution architecture, technology architecture and implemented solutions for Big Data initiatives for POS data.
  • Implemented critical application components using technologies including Hive, MapReduce, HDFS, HBase, Kafka, Sqoop, Python
  • Created Partitioned Hive tables and worked on them using HiveQL.
  • Designed and developed Job flows using Oozie, managing and reviewing log files.
  • Collected data from distributed sources into Avro models; applied transformations and standardizations and loaded the results into HBase for further processing.
  • Imported log files into HDFS using Flume and loaded them into Hive tables to query the data
  • Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
  • Developed Unix shell scripts to load files into HDFS from Linux File System.
  • Wrote Apache Kafka producers and consumers to receive data and send it as DStreams into HDFS.
  • Architecture design for Hadoop cluster for Cloudera
  • Data Modeling, ETL Architecture & Data Pipelining Design for Hadoop with HDFS, Hive, and Pig
  • Data load to HDFS cluster and Hive tables
  • Installed and configured Pig and wrote Pig Latin scripts to convert data from text files to Avro format.
  • Developed UDFs in Java as needed for use in Pig and Hive queries. Developed MapReduce jobs for data cleaning and preprocessing.
  • Data extract for users for Tableau reports
  • Topline analysis reports for Retailers and Products

Environment: Hadoop Framework, Apache Kafka, HDFS, Hive, MapReduce, Sqoop, HUE, SQL Server, Python, Business Objects, Tableau, Excel Macro, VB Scripts
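
The partitioned Hive tables mentioned above rely on Hive's convention of storing each partition in a `column=value` subdirectory of the table location. The sketch below shows that layout in plain Python; the table location, column names and sample rows are hypothetical, and no Hive or HDFS call is made.

```python
from collections import defaultdict


def partition_dir(table_location, partition_values):
    # Hive-style layout: one "column=value" directory level per partition column.
    parts = [f"{col}={val}" for col, val in partition_values]
    return "/".join([table_location.rstrip("/")] + parts)


def group_by_partition(rows, partition_cols):
    groups = defaultdict(list)
    for row in rows:
        key = tuple((col, row[col]) for col in partition_cols)
        # Partition columns are encoded in the path, not kept in the data files.
        data = {k: v for k, v in row.items() if k not in partition_cols}
        groups[key].append(data)
    return groups


# Hypothetical ePOS rows partitioned by sale date.
rows = [
    {"store": "A", "units": 5, "sale_date": "2016-01-01"},
    {"store": "B", "units": 3, "sale_date": "2016-01-01"},
    {"store": "A", "units": 7, "sale_date": "2016-01-02"},
]
groups = group_by_partition(rows, ["sale_date"])
layout = {
    partition_dir("/warehouse/epos_sales", key): data
    for key, data in groups.items()
}
```

A query that filters on `sale_date` only needs to read the matching directory, which is why partitioning on the most common filter column tends to pay off.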

Confidential, Englewood Cliffs, NJ

Technical Architect & SME

Responsibilities:

  • Requirement gathering from all Confidential teams regarding files and systems to be integrated. This was an ongoing effort for different releases of the project.
  • Architecture design for A2A, A2B & B2B file transfer across Confidential Landscape and with 3rd party vendors and systems.
  • Developed IBM MQ Series objects and configured Datamover (File2File, SFTP, FTP), Queue2File and File2Queue adaptors.
  • Developed processes in IBM Message Broker (IIB) to transform flat files to IDocs and IDocs to flat files.
  • Developed flows in IBM B2B integrator and onboarded 400+ partners for EDI file transfer.
  • Configured cluster and DR Environment for IBM B2B integrator & IBM MQ servers.
  • Configured JMS and REST-based communication channels to connect Europe servers to servers and applications in North America, Brazil, South America and Middle Americas.
  • Experience with a large-scale system transferring more than 300K messages per day, with message/file sizes ranging from 1 KB to 100 GB, and supporting solutions on 400+ client servers.
  • Coordinated with various teams (Unix/Wintel/Network) to decommission servers from the old architecture, ensuring no impact on the business and aiming for zero downtime.
  • Designed a central logging system and dashboard for support teams and business users to identify issues proactively.
  • Worked on web services, developing SOAP envelopes and calling remote functions using the SOAP protocol.
  • Wrote customized Shell, Perl and VBS scripts for housekeeping jobs.
  • Wrote customized Java code using FTP and SFTP APIs to monitor all 3rd party systems for file transfer errors.
  • Proficient in configuring and using different monitoring tools like IBM Tivoli and BMC QPasa
  • Took part in different test phases (Unit, UAT, Liberation & Regression), production go-live planning & calls.
  • Good understanding of network compliance and security aspects used in big corporates.
  • Using BMC Remedy tool & HP ticketing system to report problem and incident tickets.
  • Managing offshore team of 25 and getting work done in time.
  • Attending regular meetings regarding project updates and performance reporting.
  • Good understanding of how business SLAs work.
  • Comfortable in talking to CTOs, managers and managing large teams both on-shore and off-shore.

Confidential

Software Engineer

Responsibilities:

  • Developed major UI screens in JSF 1.2
  • Proper error handling and handling of page navigation for the error code received through different database calls
  • Wrote customized HTML logs for error handling using Log4j.
  • Coding/Deployment of different time based utilities using EJB3.
  • Coding and Deployment of Rule Execution Engine using EJB 3.0 Timer services to include real time rule execution capability.
  • Involved in writing XML Generator and Parser.
  • Handled Executing calls with XML received
  • Developed Ajax and Javascript based search modules.
  • Involved in the installation and configuration of the system in JBoss application server.
  • Coordinating with the QA team for development of test scenarios and reviewing of defects reported by them
  • Interacted with Business users to meet the requirements of the interface.
  • Development of class diagrams and sequence diagrams

Environment: Java Core, JSF, AJAX, JSP, SQL Server 2000, Oracle 9i, JBoss 4.0.2, EJB 3.0 Timer Services

Confidential

Software Engineer

Responsibilities:

  • Complete study of available products in market of same nature
  • Complete feasibility study and System Development life cycle
  • Interacted with Business users to meet the requirements of the interface.
  • Requirement gathering, preparation of Use Cases and designing of the application
  • Development of class diagrams and sequence diagrams
  • Creation of multiple threads which act as virtual users and virtual processes.
  • Developed the whole module for displaying the process map in a Java Swing interface using the Java 2D and 3D image APIs.
  • Developed Technical Design for various interfaces which covered the process flow.
  • Coordinating with the QA team for development of test scenarios and reviewing of defects reported by them
  • Coordinating with other module teams for proper integration of features of other modules
  • Designed and developed Model class hierarchy and object hierarchy
  • Developed major user interface modules in Swing.
  • Customized existing graphical components according to project requirement
  • Very comfortable in hardcore coding
  • Used JDBC API for database handling, with databases like SQL Server 2000 and Oracle
  • Also worked with JDBC native drivers
  • Used Java Collection API
  • Exported graphical reports to Excel using Apache class files
  • Created powerful media controls to handle snapshots for OmniFlow process Simulator
  • Controlling hundreds of threads with minimum CPU utilization.

Environment: Java Core, Java Swing, SQL Server 2000, Apache class files, JDBC

Confidential

Software Engineer

Responsibilities:

  • Complete study of International Compliance.
  • Implemented Single Sign On system for Confidential.
  • Developed major UI screens in JSF.
  • Designed a tool, built in JSF, HTML and JavaScript, to create Data Entry Forms at run time as and when required by the user. The tool supports four input modes: checkbox, radio button, combo box and textbox. It automates storing data in and retrieving data from the database server.
  • Handled managed beans and wrote EJB calls to fetch the information needed from SQL Server 2005/Oracle 10g.
  • Proper error handling and handling of page navigation for the error code received through different database calls
  • Wrote customized HTML logs for error handling.
  • Coding/Deployment of different time based utilities using EJB3.0 timer services.
  • Involved in writing XML Generator and Parser.
  • Handled Executing calls with XML received
  • Fully involved in installing and deploying OmniDocs (document archival software) on the JBoss application server
  • Involved in the installation and configuration of the system in JBoss application server
  • Coordinating with the QA team for development of test scenarios and reviewing of defects reported by them
  • Interacted with Business users to meet the requirements of the interface.
  • Development of class diagrams and sequence diagrams

Environment: Java Core, JSF (Java Server Faces), JSP, SQL Server 2000, Oracle 9i, JBoss 4.0.2, EJB 3.0 Timer Services, AJAX

Confidential

Software Engineer

Responsibilities:

  • Complete study of International Compliance, Audit Management and Operational Risk laws
  • Developed major UI screens in JSF with Managed Beans.
  • Handled managed beans and wrote EJBs to fetch the information needed from SQL Server.
  • Proper error handling and handling of page navigation for the error code received through different database calls
  • Involved in writing XML Generator and Parser.
  • Handled Executing calls with XML received
  • Fully involved in installing and deploying OmniDocs (document archival software) on the JBoss application server
  • Involved in the installation and configuration of the system in JBoss application server
  • Coordinating with the QA team for development of test scenarios and reviewing of defects reported by them
  • Complete study of available products in market of same nature
  • Complete feasibility study and System Development life cycle
  • Interacted with Business users to meet the requirements of the interface.
  • Development of class diagrams and sequence diagrams

Environment: Java Core, JSF (Java Server Faces), JSP, AJAX, SQL Server 2000, Oracle 10g, JBoss 4.0.2
