Bigdata Cloud Engineer Resume

EXECUTIVE SUMMARY

  • 7+ years of experience as a lead design architect, designing and implementing Big Data PaaS architecture and data services strategy.
  • Experience in the development, support and maintenance of distributed systems, workflow and web applications.
  • 4 years of experience in Apache Hadoop, MapReduce, Pig, Hive, HBase and Sqoop.
  • 2+ years of hands-on experience with Apache Spark Core, Spark SQL, Spark Streaming and Apache Kafka.
  • 1 year of experience with AWS S3, Redshift and big data on EMR clusters.
  • Working experience with Tableau and the Power BI Desktop app, connecting to Redshift through the Redshift connector in both import and DirectQuery modes.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce concepts, and experience writing MapReduce programs on Apache Hadoop to analyse large data sets efficiently.
  • Hands-on experience with ecosystem tools such as Hive, Pig, Sqoop, MapReduce, Flume and Oozie.
  • Strong knowledge of Pig and Hive's analytical functions, and of extending Hive and Pig core functionality by writing custom UDFs.
  • Developed custom MapReduce jobs in Java for pre-processing and data cleaning.
  • Experienced in providing technical solutions to the business on applications that are developed on Hadoop and its ecosystem.
  • Thrives on challenge and works well under pressure, with technical expertise to learn new environments quickly, locate inefficiencies in code, and provide quick solutions.
  • Good knowledge of GemFire XD.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Basic knowledge of Kudu, NiFi, Kylin and Zeppelin with Apache Spark.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as Cassandra, HBase and Redis.
  • Experience querying data in a Cassandra cluster using CQL.
  • Good analytical and development skills using technologies including Core Java, Python, Scala, Oracle and SQL Server 2005, and in-depth knowledge of Servlets, JSP, XML, JMS, JavaMail and JDBC.
  • Good knowledge of object-oriented analysis and design and exposure to various phases of the Software Development Life Cycle, including designing, developing, testing and rolling out complex Utilities, Pharma, Retail, Insurance and Telecom software.
  • Sound knowledge of application development, support and maintenance processes and models; actively involved in providing value-added services to the customer through proactive process improvement initiatives.
  • In-depth understanding of Data Structures and Design Analysis of Algorithms.
  • Good Interpersonal and Communication skills.
  • Ability to adapt to evolving technology, keen sense of responsibility and accomplishment
  • Excellent Team Player with good problem-solving capability, quick learner and a good initiator.
  • Advanced analytical, problem-solving, negotiation and organizational skills, with demonstrated ability to multitask, organize, prioritize and meet deadlines.
  • Ability to multitask and work multiple projects concurrently.
  • Ability to work independently and as part of a team.
  • Experienced in using the Waterfall project execution methodology in all phases of application development, from planning all the way to delivery of the product.
  • Results-oriented, self-driven, highly motivated, smart and hungry to learn new technologies, methodologies, strategies and processes.

TECHNICAL SKILLS

Frameworks: Apache Hadoop, Apache Spark

Hadoop Ecosystem: MapReduce, HDFS, Hive, Pig, HBase, Zookeeper, Sqoop.

Streaming Technologies: Apache Storm, Spark Streaming, Apache Kafka

Platform Distributions: Cloudera, Hortonworks, MapR, AWS EMR

Relational Database: MySQL, Oracle 9i, SQL Server 2005

NoSQL Database: Cassandra, Redis, MongoDB, Kudu

Languages: Java 8

Scripting Language: Python

Functional Language: Scala

MVC Frameworks: Struts 2.0

J2EE Technologies: Servlets, JSP 2.0, JDBC, EJB.

Web Technologies: HTML, XML.

Data Visualization: Tableau, QlikView

Application Servers: Tomcat 5.5.27, WebSphere 6.1

IDEs: WSAD 5.1, Eclipse, RAD 7.0, IntelliJ IDEA 11.1, PyCharm

Operating System: MS-DOS, Windows 7, Ubuntu 12.04, CentOS 6.4

Concepts: Data Structures, Operating System, RDBMS, OOAD

Transport Protocols: JMS, WebSphere MQ, TCP/IP, HTTP, FTP

Build Tools: Maven, Ant

Version Control Tools: VSS, SVN, CVS, GitHub

Software Tools: Toad, Jira, Microsoft Visio, SourceTree, PuTTY, WinSCP

PROFESSIONAL EXPERIENCE

Confidential

Bigdata Cloud Engineer

Responsibilities:

  • Design and develop data acquisition, transformations, and data integration schemas for large scale media consumption and advertising analytics.
  • Developed solutions using commercial and open-source software such as MiNiFi and NiFi to interface big data and relational solutions.
  • Loaded data from a SQL Server database on Azure into the Redshift data warehouse using Spark Structured Streaming.
  • Worked on S3 lifecycle rule management and Redshift inline policy management to load, unload or copy data from S3.
  • Worked on Hive with S3 data store optimizations.
  • Design and implement solutions for metadata, data quality and privacy management.
  • Collaborate with subject-matter experts to design and enable ad hoc data analysis and a robust data consumption platform.
  • Supported the analytics team with data presentation and reporting that brings data into Power BI in import or DirectQuery mode, through the Redshift connector or an ODBC driver over SSL.
  • Design, develop and deploy repeatable processes to enable end-to-end automation, with emphasis on Continuous Integration and Deployment (CI/CD).
  • Collaborate with architects and engineers to understand the technology solution roadmap and technical requirements, with a focus on business outcomes.
  • Involved in design discussions, provided optimized and cost-effective solutions and prepared mapping and design documents.

Confidential, Pittsburgh, PA

Lead Spark Engineer

Responsibilities:

  • Developed the proposed solution architecture and validated that it supports the stated and implied business requirements of the project.
  • Developed a Spark application (Spark 2.0.0, Java 8, Scala 2.11, Apache Kafka 0.8, YARN, EMRFS, HBase, Spark-HBase connector, Elasticsearch) to read the transaction data and apply the business rules, reporting errors and a transaction summary that feeds market-basket analytics.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Read the data from Kafka and processed it using Spark.
  • Adopted innovative architectural approaches to leverage in-house data integration capabilities consistent with the architectural goals of the enterprise.
  • Participated in architectural discussions and designed solutions by leveraging big data capabilities
  • Analysed existing processes and prepared functional and requirements documents.
  • Upgraded all application jobs from Spark 2.0.0 to Spark 2.2.0 and from Kafka 0.8 to 0.10.
  • Configured Spark Streaming to process the weekly data received via SFTP/Portal into MapR-FS and store the streamed data in a Kafka topic.
  • Developed multiple Spark Streaming and Spark Core jobs with Kafka as the data pipeline system.
  • Loaded the DStream data into Spark RDDs and performed in-memory computation to generate the output response.
  • Worked with NoSQL databases such as HBase/MapR-DB, creating tables to load large sets of JSON data using the Spark-HBase connector.
  • Loaded the HBase data into the Redshift cluster using Spark Structured Streaming.
  • Worked with Hive external tables on HBase using INSERT OVERWRITE, with S3 as data storage.
  • As part of support, responsible for troubleshooting the developed Spark jobs.
  • Experience in managing and reviewing YARN application logs, Spark event logs and metrics sink CSV files.
  • Worked in 2-week sprints following agile methodology, using the Atlassian stack (Git, Jira, Bitbucket, Bamboo).
  • Improved the performance and optimization of the existing Spark jobs.

Confidential, Chicago, IL

Lead Big Data Developer

Responsibilities:

  • Developed a Spark application (Spark Core, Java 8, Spark Streaming, Apache Kafka, YARN, HDFS) to read incoming files in near real time and process them within a few seconds.
  • Wrote an MR2 batch job to fetch the required data from the DB and store it in a CSV (static file).
  • Developed a Spark job to move the files parsed by the checker component into a backup directory.
  • Developed a Spark job to process the files from Vision EMS and AMN Cache, identify violations and send them to Smarts as SNMP traps.
  • Automated workflows using shell scripting to schedule Spark jobs (crontab).
  • Explored Spark for improving the performance and optimization of the existing algorithms in Hadoop, using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
  • Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data.
  • Experience in deploying data from various sources into HDFS and building reports using Tableau.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Performed real-time analysis on the incoming data.
  • Developed portal screens using JSP, Servlets and the Struts framework.
  • Re-engineered n-tiered architecture involving technologies like EJB, XML and Java into distributed applications.
  • Explored the possibilities of using technologies like JMX for better monitoring of the system.
  • Configured, deployed and maintained multi-node Dev and Test Kafka clusters.
  • Performed transformations, cleaning and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Loaded the data into Spark RDDs and performed in-memory data computation to generate the output response.
  • Loaded data into HBase using bulk load and non-bulk load.
  • Created HBase column families to store various data types coming from various sources.
  • Loaded data into the cluster from dynamically generated files.
  • Assisted in upgrading, configuration and maintenance of various Hadoop infrastructures.
  • Created common audit and error-logging processes and a job monitoring and reporting mechanism.
  • Troubleshooting performance issues with ETL/SQL tuning.
  • Hired for the Hadoop developer role; primary responsibilities included designing, implementing and maintaining applications that receive transaction-based and product mix data generated from McDonald’s restaurants across all US locations.
  • Job duties involved the design and development of various modules in Hadoop Big Confidential and processing data using MapReduce, Hive, Pig, Sqoop and Oozie.
  • Designed, developed and tested MapReduce programs on Mobile Offers Redemptions and sent the output to downstream applications like HAVI. Scheduled this Java MapReduce job through an Oozie workflow.
  • Ingested a huge number of XML files into Hadoop by utilizing DOM parsers within MapReduce. Extracted Daily Sales, Hourly Sales and Product Mix of the items sold in McDonald’s restaurants and loaded them into the Global Data Warehouse.
  • Wrote and tested the Map/Reduce code to do aggregations on identified and validated data. Processed Mobile Offers data for McDonald’s stores across various locations in the USA and ingested the data into Hadoop Hive tables.
  • Scheduled multiple MapReduce jobs in Oozie. Involved in extracting the promotions data for McDonald’s stores within the USA by writing the MapReduce jobs and automating them with a UNIX shell script.
  • Involved in setting up and managing training sessions. Currently responsible for mentoring peers and leading technical teams.
  • Coordinated and took part in client meetings to clarify the requirements for ingesting the data for McDonald’s APMEA markets.
  • Fixed code review comments, ran the Jenkins builds and supported code deployment into production. Fixed post-production defects so that the Java Map/Reduce code worked as expected.

Confidential, San Jose, CA

Associate Tech Specialist

Responsibilities:

  • Worked on a multi-cluster environment and set up the Cloudera Hadoop ecosystem.
  • Designed and developed multiple MapReduce jobs in Java for complex analysis. Imported and exported data using Sqoop from HDFS to relational database systems and vice versa. Developed UDFs for Hive and wrote complex queries in Hive for data analysis.
  • Job duties involved the design and development of various modules in Hadoop Big Confidential and processing data using MapReduce, Hive, Pig, Sqoop, Oozie, Kafka and Storm.
  • Integrated Apache Storm with Kafka to perform web analytics. Uploaded clickstream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Configured Flume to transport web server logs into HDFS. Also used the Kite logging module to upload web server logs into HDFS.
  • Installed Hadoop in fully distributed and pseudo-distributed modes for a POC in the early stages of the project.
  • Processed large data sets utilizing the Hadoop cluster. The data stored on HDFS was pre-processed and validated using Pig, then stored in the Hive warehouse, which enabled business analysts to get the required data from Hive.
  • Used Oozie to automate data loading into the Hadoop Distributed File System. Designed and implemented Java Map/Reduce programs to support distributed data processing.
  • Developed Hive queries to join clickstream data with the relational data to determine the interaction of search guests on the website.
  • Developed a well-structured and efficient ad hoc environment for functional users.
  • Developed the entire application implementing MVC architecture.
  • Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
  • Developed web services for data transfer from client to server and vice versa using a REST API.
  • Used Sqoop to import data from RDBMS into HDFS and Hive.
  • Generated a monitoring document using HiveQL, based on Investigator ID, Trial No, Country code, Unit no, Patient no and Visit no, with the respective visit report name.
  • Combined data using HiveQL to create the data needed for event triggering.
  • Exported the refined data to RDBMS using Sqoop and sent an email to the investigator regarding the trial status / visit report status to alert the investigator regularly.

Confidential

Associate Support Analyst / Developer

Key Accountabilities:

  • Investigate and progress both customer and internal issues at a technical level, including the identification of workarounds.
  • Discuss issues with customers.
  • Understanding of and adherence to the Software Development Life Cycle.
  • Understanding of and adherence to programming and design standards.
  • Project management of releases
  • Adhere to SOPs relevant to role
  • Designed and developed various modules of the application with J2EE design architecture and frameworks such as Spring MVC and Spring Bean Factory, using IoC and AOP concepts.
  • Followed agile software development with Scrum methodology.
  • Wrote the application front end with HTML, JSP, JSF, Ajax, jQuery and XHTML.
  • Implemented Java/J2EE design patterns such as Factory, DAO, Session Façade and Singleton.
  • Developed the Form Beans and Data Access Layer classes.
  • Used XML to transfer the data between different layers.
  • Involved in writing complex sub-queries and used Oracle for generating on-screen reports.
  • Worked on database interaction layer for insertions, updating and retrieval operations on data. Used JMS for messaging.
  • Work with colleagues throughout the company, both onshore and offshore.
  • Cover for more senior members of the team in times of absence.
  • Create and execute involved PL/SQL scripts.
  • Expertise in areas of the supported product suite.
  • Ability to work independently on all but a few issues, using initiative as necessary.
  • Sound understanding of some of the toolsets used and the competence to use them.
  • Take part in and influence meetings
  • Ability to work independently with support from team

Confidential

Project Engineer

Primary Responsibilities

  • Providing offshore support for customers and handling incidents and service requests.
  • Handling Change requests and Service volume constraints.
  • Used the monitoring system to improve the performance of the system.
  • Proactively performed daily checks to ensure the system was running fine before business hours started.
  • Troubleshot all raised problems and worked towards fixing them.
  • Managed users by creating roles and profiles.
  • Monitored the logs for any errors and warnings and took corrective action if required.
  • Monitored the scheduled jobs and re-ran them in case of failure.
  • Proactively monitored the database's health and took preventive or corrective action as required.
  • Involved in designing and developing the user interface.
  • Designed and developed the business layer of the application using Java.
  • Created tables, views and procedures in SQL Server 2008.
  • Analyzed the functional requirement document and created a test case template.
  • Performed end-to-end testing in development and integration.
  • Ensured that unit testing was performed and that the unit test documents were prepared.
  • Utilized Agile methodologies to manage full life-cycle development of the project.
  • Implemented the MVC design pattern using the Struts framework.
  • Created tile definitions, Struts config files, validation files and resource bundles for all modules using the Struts framework.
  • Developed web application using JSP custom tag libraries, Struts Action classes and Action.
  • Designed Java Servlets and objects using J2EE standards.
  • Used JSP for the presentation layer; developed a high-performance object/relational persistence and query service for the entire application.
  • Implemented various J2EE design patterns such as Singleton, Service Locator, Business Delegate, DAO, Transfer Object and SOA.
  • Developed the XML Schema and web services for the data maintenance and structures.
  • Worked with various style sheets such as Cascading Style Sheets (CSS).
  • Designed the database, created tables and wrote complex SQL queries and stored procedures as per the requirements.
  • Performing database backups (Logical and Physical) and performing restoration and recovery when necessary.
  • Maintained a key learning document to work on known issues ahead of time.
  • Proactively took many initiatives to improve system performance and to make the application more reliable for users.
  • Maintained a key focus checklist alongside business as usual and worked on those areas to fix them before the business came with suggestions.
