Sr. Hadoop/Spark Consultant Resume

Charlotte, NC

SUMMARY

  • Around 8 years of overall experience in a variety of industries including 3+ years of experience in Big Data Technologies (Apache Hadoop stack and Apache Spark), 4+ years of experience in Java Technologies and 1+ years of experience in Dot Net technologies
  • In-depth knowledge of Apache Hadoop architecture (1.x and 2.x) and Apache Spark 1.x architecture
  • Hands-on experience with Cloudera and Hortonworks
  • Hands-on experience with Apache Hadoop ecosystem components such as Hadoop, Spark, HDFS, MapReduce, YARN, Tez, HBase, Pig, Hive, Sqoop, Flume, Oozie, and Kafka
  • Hands-on experience in writing MapReduce jobs in Java, Pig, and Python
  • Experience working with SQL on Hadoop using Apache Hive
  • Hands-on experience in writing Apache Spark SQL and Spark Streaming programs in Scala and Python
  • Experience in implementing real-time streaming and analytics using Spark Streaming and Kafka
  • Experience in data ingestion using Sqoop from RDBMS to HDFS and Hive and vice-versa
  • Proficient in Java/J2EE technologies - Core Java, JSP, Java Beans, Java Servlets, Ajax, JDBC, ODBC, Web Services, Swing, Hibernate, Spring, Struts, XML and XSLT
  • Proficient in Dot Net Technologies - C# .Net, ASP .Net, Entity Framework, WCF, Ajax, and MVC
  • Good Experience in MVC architecture using Spring, Struts and ASP .Net frameworks
  • Performed data analysis using MySQL, SQL Server Management Studio and Oracle
  • Experience with ETL tools including Informatica, Talend and SSIS
  • Experience with RESTful services and Amazon Web Services
  • Hands-on experience with Amazon EC2, EMR and S3
  • Conversant with web/application servers - Tomcat, WebSphere, WebLogic and IIS
  • Experience in writing Maven and SBT scripts to build and deploy Java and Scala applications
  • Implemented unit testing with JUnit and MRUnit
  • Expertise in web application development with JSP, HTML, CSS, JavaScript, ASP .Net, C# .Net and JQuery

TECHNICAL SKILLS

Big Data Skills: Hadoop, MapReduce, YARN, Tez, Spark, Hive, Pig, R, Hue, Sqoop, Flume, Talend, Oozie, Zeppelin, Kafka, HBase, ORC, Avro, Parquet

Programming languages: Java, Python, Scala, ASP .Net, C# .Net, MVC, UML and SharePoint

Tools: Maven, SBT, IntelliJ, NetBeans, Eclipse, PyCharm, PyDev, RStudio, Shiny, Revolution R, MS-Office, MS-Visio, MS-Project, Visual Studio, SSMS and BIDS

Operating Systems: Windows, Windows Server, Linux, UNIX, Ubuntu and CentOS

Web: HTML 5, CSS 3, AngularJS, JavaScript, JSON and JQuery

Databases: SQL Server, Oracle, Netezza, MySQL, SSIS, MongoDB (basics), and Cassandra (basics)

Version Control: Team Foundation Server, GitHub, SVN

Development Methodologies: Agile/Scrum, Waterfall

PROFESSIONAL EXPERIENCE

Confidential, Charlotte, NC

Sr. Hadoop/Spark Consultant

Responsibilities:

  • Involved in creating Hive tables, and loading and analyzing data using Hive queries
  • Analyzed large data sets by running Hive queries and Pig scripts
  • Developed simple to complex MapReduce jobs using Hive and Pig
  • Involved in running Hadoop jobs for processing millions of records of text data
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
  • Involved in loading data from the Linux file system to HDFS
  • Responsible for managing data from multiple sources
  • Extracted data from relational databases with Sqoop, placed it in HDFS, and processed it
  • Ran Hadoop Streaming jobs to process terabytes of XML-format data
  • Loaded and transformed large sets of structured, semi-structured and unstructured data
  • Responsible for managing data coming from different sources
  • Assisted in exporting analyzed data to relational databases using Sqoop
  • Managed and reviewed Hadoop Log files
  • Loaded log data into HDFS using Flume
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS
  • Used JDBC for database connectivity with MySQL Server
  • Extensive work in ETL processes consisting of data sourcing, mapping, transformation, conversion and loading using Talend
  • Able to understand and migrate ETL and BI code across multiple ETL and BI tools such as Talend
  • Experienced in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java
  • Diverse experience in utilizing Java and Python tools in business, web and client-server environments including Java platform, JSP, Servlet, Java Beans, JSTL, JSP custom tags, EL, JSF and JDBC
  • Deep JVM knowledge and heavy experience with functional programming languages such as Scala
  • Involved in converting Hive/SQL queries into Spark transformations and actions using Spark SQL (RDDs and DataFrames) in Python and Scala
  • Implemented Spark SQL queries with Scala for faster testing and processing of data
  • Implemented Spark Streaming to read real-time data from Kafka in parallel, process it in parallel, and save the results in Parquet format in Hive
  • Built an analytics POC to analyze outpatient details with R and SparkR (using a logistic regression algorithm)
  • Installed Zeppelin in Cloudera Dev environment and executed Spark programs
  • Developed applications using Eclipse
  • Used Hadoop Streaming to write jobs in Python
  • Expertise in writing shell scripts to monitor Hadoop jobs

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Sqoop, Flume, Java, Python, Oracle 10g, MySQL, Ubuntu, Agile, XML, SQL Server, YARN, Cloudera, Teradata, Talend, UNIX Shell Scripting, Oozie, Scala, Spark, R, Maven, SBT, Zeppelin, Eclipse, IntelliJ

Confidential, Charlotte, NC

Sr. Hadoop Consultant

Responsibilities:

  • Worked on analyzing the Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis
  • Worked on debugging and performance tuning of Hive and Pig jobs
  • Worked on tuning the performance of Pig queries
  • Involved in loading data from the Linux file system to HDFS
  • Imported and exported data between different relational databases and HDFS/Hive using Sqoop, and performed transformations using MapReduce and Hive
  • Analyzed data by running Hive queries and Pig scripts to study behavior in a particular aspect
  • Processed unstructured data using Pig and Hive
  • Used UDFs to implement business logic in Hadoop
  • Supported MapReduce programs running on the cluster
  • Gained experience in managing and reviewing Hadoop log files
  • Created HBase tables to store various data formats coming from different applications
  • Developed ETL Scripts for Data acquisition and Transformation using Talend
  • Extensive experience with Talend source & connections configuration, credentials management, context management
  • Implemented and assisted with Talend installations and Talend server setup, including the MDM server
  • Implemented a proof of concept to analyze streaming data using Apache Spark with Scala and Python; used Maven and SBT to build and deploy the Spark programs
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs
  • Developed simple to complex MapReduce jobs using Java, Pig and Hive
  • Developed applications using Eclipse and used Maven as the build and deploy tool
  • Exported the analyzed data to the relational databases using Sqoop for visualization

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Java, Oracle 10g, MySQL, SQL Server, Ubuntu, Agile, YARN, Spark, Hortonworks, Teradata, Talend, UNIX Shell Scripting, Oozie, Maven, Eclipse

Confidential, Memphis, TN

Hadoop Consultant

Responsibilities:

  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
  • Manipulated, transformed, and analyzed data from various types of databases
  • Worked extensively in creating MapReduce jobs to power data for search and aggregation
  • Extensively used Pig with Tez for data cleansing
  • Created HBase tables to store various data formats coming from different applications
  • Designed a data warehouse using Hive
  • Strong understanding of dynamic partitioning in Hive
  • Created partitioned and bucketed tables in Hive to provide representative samples during predictive modeling
  • Worked with business teams and created Hive queries for ad hoc access
  • Created several UDFs in Pig and Hive to give additional support for the project
  • Performed analytics with Hive queries
  • Worked extensively with Sqoop for importing data from Oracle and Netezza
  • Created ETL Scripts for Data acquisition and Transformation using Talend
  • Able to understand and migrate ETL and BI code across multiple ETL and BI tools such as Talend
  • Developed applications using Eclipse and used Maven as the build and deploy tool
  • Evaluated usage of command line Oozie/Hue for workflow orchestration
  • Mentored analysts and the test team in writing Hive queries
  • Used R for analytics, predictive modeling and regression analysis
  • Implemented test scripts to support test driven development and continuous integration

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, HBase, Flume, Java, Oracle 10g, Netezza, MySQL, Ubuntu, Agile, Cloudera, UNIX Shell Scripting, Oozie, Maven, Eclipse

Confidential

Sr. Technology Analyst

Responsibilities:

  • Led a five-member development team
  • Worked as a senior developer for the project
  • Created UML class diagrams that depict the code’s design and its compliance with the functional requirements
  • Performed analysis, design, development and unit testing of the modules
  • Used the JavaMail notification mechanism to send confirmation emails to applied companies
  • Wrote JSPs, JavaScript and Servlets to generate dynamic web pages and web content
  • Developed various Java classes, SQL queries and procedures to retrieve and manipulate the data from backend Oracle database using JDBC
  • Used Enterprise Java Beans as a middleware in developing a three-tier distributed application
  • Developed Session Beans and Entity Beans for business and data processing
  • Implemented Web Services with REST
  • Implemented field level validations with AngularJS, JavaScript and JQuery
  • Prepared unit test scenarios and unit test cases
  • Branded the site with CSS
  • Performed code reviews and unit testing of the code
  • Involved in unit testing using JUnit
  • Implemented Log4j to trace logs and track information
  • Involved in project discussions with clients and analyzed complex project requirements as well as prepared design documents
  • Organized and presented technical sessions for a group of 30 project members

Environment: Java, JSP, EJB, JMS, JavaScript, JSF, XML, JBoss, WebSphere, WebLogic, Hibernate, Spring, SQL, PL/SQL, CSS, Log4j, JUnit, Eclipse, Oracle 11g, LoadRunner, TFS

Confidential

Technology Analyst

Responsibilities:

  • Interacted with clients to gather functional requirements such as SEO requirements, Captcha implementation, and consultation form implementation
  • Involved in analysis, design, development and testing of the modules
  • Developed master pages and static pages
  • Developed consultation form with Captcha functionality and mailing functionality
  • Developed services with AngularJS
  • Implemented URL Rewrite and Redirection using URLRewriteFilter
  • Implemented English-to-French toggling functionality
  • Implemented Google Analytics for all pages by using Google Scripts
  • Implemented Log4j logging in the application
  • Hosted the web application in the testing environment and supported the network team in hosting it in the live environment

Environment: Java, JSP, EJB, JavaScript, JSF, XML, WebSphere, WebLogic, Hibernate, Spring, SQL, PL/SQL, CSS, Log4j, JUnit, Eclipse, Oracle 11g, TFS

Confidential

Software Development Engineer

Responsibilities:

  • Developed the HTML layout of the pages
  • Worked on Bing Maps integration into the site
  • Implemented validation for fields through JQuery
  • Worked on creating profile pages for volunteers, non-profits and enterprises with ASP .Net and C# .Net
  • Implemented RESTful web services with WCF
  • Handled database interactions in the business logic with the Entity Framework model
  • Implemented mailing functionality with SMTP and Outlook to send mails to volunteers
  • Deployed the application on IIS Server
  • Imported data into and exported data from SQL Server with SSIS
  • Actively participated in project discussions with senior team members
  • Used Team Foundation Server for version control and configuration management

Environment: ASP .Net, C# .Net, SQL Server, SSIS, SharePoint, Entity Framework, Outlook, SMTP Mailing, HTML, JavaScript, JSON, JQuery, CSS, Visual Studio, SQL Server Management Studio, Team Foundation Server, XML, and IIS
