
Sr. Hadoop / Big Data Consultant Resume

SUMMARY:

  • Nearly 10 years of IT consulting experience in Java/Scala/Python/Hadoop/Big Data real-time, ETL, and reporting projects.
  • Architected and implemented Java/Hadoop/Big Data ETL projects, setting up ingestion and reports using Java, Scala, Python, Spark, Spark SQL, Kafka, HBase, Pig, Hive, Sqoop, Oozie, and Tableau.
  • Wrote UDFs in Hive and Pig.
  • Developed online dashboards using HBase and Java web services.
  • Wrote Pig and Hive scripts and scheduled them using Oozie workflows and crontab.
  • Wrote Spark Scala ingestion scripts that bring data into Hadoop and store it in Hive and HBase for reporting (see the ingestion sketch after this list).
  • Worked on real-time Big Data streaming using Kafka and Spark Streaming (see the streaming sketch after this list).
  • Wrote Spark Scala, Pig, Hive, and Java UDFs to preprocess complex JSON, storing the results in Hive as ORC files via Spark.
  • Bundled Spark reporting jobs written in Scala using SBT.
  • Developed and deployed daily, weekly, and monthly reports using Spark, Scala, and SBT, scheduled via UNIX crontab.
  • Designed and developed Tableau dashboard reports over Hive data previously parsed with Spark Scala.
  • Worked extensively on Big Data ingestion and reporting using Spark, Scala, Java, Pig, Hive, Sqoop, Oozie, Python, Kafka, and UNIX.
  • Developed Python RESTful API interfaces for data science model building and model scoring using Flask, assisting data scientists.
  • Generated request and response JSON for model scoring results through Python RESTful APIs.
  • Involved in data wrangling and data preparation for model training and scoring, and in data migration from relational databases to Hadoop.
  • Designed and developed JSON structures for Elasticsearch APIs and data science projects.
  • Worked on Big Data ingestion and transformation using Pig, loading the transformed data into HBase through its storage handler.
  • Managed Big Data ingestion, transformation, and loading into AWS S3.
  • Built AWS EC2 instances to spin up Hadoop clusters.
  • Wrote Hive scripts for ad-hoc analysis and reporting.
  • Designed and developed a Big Data warehouse system using Hadoop ecosystem tools.
  • Set up Kafka broker servers behind AWS Elastic Load Balancing on EC2 instances to manage Pfizer IT operational server logs and message queues from Pfizer BT on Demand web portals.
  • Involved in code development using NoSQL technologies such as HBase and Elasticsearch.
  • Worked on Cloudera, Hortonworks (HDP), and MapR distributions across multiple projects.
  • Wrote scripts to automate tasks related to Hadoop infrastructure setup.
  • Installed and managed Kafka through the Ambari server on a Hortonworks HDP cluster on AWS.
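
The Spark Scala ingestion work above (JSON sources landed in Hive as ORC) might look like the minimal sketch below. The input path, database, and table names are hypothetical, and a real feed would need feed-specific parsing and schema handling:

```scala
import org.apache.spark.sql.SparkSession

object JsonIngest {
  def main(args: Array[String]): Unit = {
    // Hive support so the ORC output is registered as a queryable Hive table.
    val spark = SparkSession.builder()
      .appName("JsonIngest")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical landing path; multiLine handles pretty-printed JSON files.
    val raw = spark.read
      .option("multiLine", true)
      .json("hdfs:///landing/events/*.json")

    // Flatten one level of nesting; complex feeds would need deeper parsing.
    val flat = raw.select("id", "payload.*")

    // Persist as ORC into a Hive table for downstream reporting.
    flat.write.mode("overwrite").format("orc").saveAsTable("reporting.events")

    spark.stop()
  }
}
```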
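The Kafka / Spark Streaming bullet could be sketched as below using the Structured Streaming Kafka source (the original jobs may well have used the older DStream API instead); the broker, topic, and output paths are placeholders:

```scala
import org.apache.spark.sql.SparkSession

object KafkaStreamIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("KafkaStreamIngest").getOrCreate()

    // Placeholder broker list and topic name.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "operational-logs")
      .load()

    // Kafka delivers binary key/value columns; cast the payload to a string.
    val messages = stream.selectExpr("CAST(value AS STRING) AS message")

    // Land the raw messages as ORC in HDFS; a real job would parse and enrich first.
    val query = messages.writeStream
      .format("orc")
      .option("path", "hdfs:///staging/logs")
      .option("checkpointLocation", "hdfs:///checkpoints/logs")
      .start()

    query.awaitTermination()
  }
}
```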

TECHNICAL SKILLS:

Hadoop Ecosystem Tools: HDFS, MapReduce, Hive, Pig, Sqoop, HBase, Oozie, and ZooKeeper.

Real Time Streaming: IBM InfoSphere Streams, Kafka, and MQ.

Cloud Platforms: Azure and AWS

Java Technologies: Core Java, JSP, Servlet, J2EE, JAX-RS RESTful web services, JSON, JAXB, JDBC, and Java Persistence API (JPA)

Frameworks: Spring MVC, Hibernate

Python Tools: IPython, Anaconda, PyCharm

Database: MySQL, Oracle

Mainframe: JCL Batch Jobs, TSO/ISPF, DB2, Control-M, File-AID

Tools: Eclipse, Maven, HP Quality Center, HP Service Center

Data Reports: SQL, PL/SQL, Tableau and OBIEE

Scripting: Windows Shell, Excel VBA and UNIX Shell Scripting

Client-side Technologies: Ajax, HTML, JavaScript, CSS

Application Servers: Apache Tomcat, JBoss, WebLogic

PROFESSIONAL EXPERIENCE:

Confidential

Sr. Hadoop / Big Data Consultant

Responsibilities:

  • Developing daily ad-hoc and historical Big Data reports using Spark, Scala, and Hive.
  • Writing Spark Scala, Hive, and Pig UDFs in Java and Python (see the UDF sketch below).
  • Scheduling Oozie workflows.
  • Performing Big Data ETL using Spark and Scala.
  • Bundling the Spark jobs into JARs with SBT and scheduling them on UNIX crontab.
  • Designing and developing Tableau reports.
  • Developed RESTful APIs using Python and Java and generated model scoring results in JSON.
  • Performed data ingestion and reporting using Spark, Scala, Pig, Sqoop, Hive, and Oozie.
  • Developed RESTful Python code for data science model scoring.
  • Worked on IBM InfoSphere Streams for real-time streaming analytics.
  • Designed and developed Avro file formats (JSON schemas) for data from Elasticsearch APIs.
  • Wrote Java RESTful API code that uses the Kafka producer and consumer client APIs to publish and subscribe to log messages on Kafka topics and partitions.
  • Performed Hadoop streaming on Hive output using Python.
  • Performed data transformation using Pig.
  • Generated enterprise data warehouse feeds using Pig and Hive.
  • Created Hive ORC and external tables.

Technologies used: Apache Spark (Scala), Java, Python, JSP, Servlet, RESTful web services, Spring MVC, Pig, Hive, MapReduce, Sqoop, and Oozie.
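
As a rough illustration of the UDF work above, here is a minimal Spark Scala sketch that registers a UDF for use from Spark SQL / HiveQL; the normalize function and the table and column names are made up for the example:

```scala
import org.apache.spark.sql.SparkSession

object UdfExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("UdfExample")
      .enableHiveSupport()
      .getOrCreate()

    // Toy cleansing function; the production UDFs preprocessed complex JSON fields.
    spark.udf.register("normalize",
      (s: String) => Option(s).map(_.trim.toLowerCase).orNull)

    // Hypothetical Hive table and column, queried through the registered UDF.
    spark.sql("SELECT normalize(member_name) AS member_name FROM claims.members").show()

    spark.stop()
  }
}
```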

Confidential

Hadoop / Big Data Consultant

Responsibilities:

  • Wrote SQL queries to generate healthcare feeds using Big Data ETL tools such as Spark, Pig, and Hive.
  • Wrote Spark ETL ingestion scripts in Scala.
  • Brought data into the Big Data lake using Pig, Sqoop, and Hive.
  • Wrote MapReduce jobs for change data capture on HBase.
  • Wrote Java programs and developed a Big Data API to perform searches on Elasticsearch.
  • Set up Elasticsearch JSON query templates (see the sketch below).
  • Installed and managed Kafka through the Ambari server on a Hortonworks HDP cluster on AWS.
  • Designed and developed Avro file formats (JSON schemas) for data from Elasticsearch APIs.
  • Wrote Java RESTful API code that uses the Kafka producer and consumer client APIs to publish and subscribe to log messages on Kafka topics and partitions.
  • Performed Hadoop streaming on Hive output using Python.
  • Developed Elasticsearch RESTful APIs using Java and JSON.
  • Performed data transformation using Pig.
  • Generated vendor feeds using Hive.
  • Created Hive ORC and external tables.

Technologies used: Java, JSP, Servlet, RESTful web services, Spring MVC, Pig, Hive, Sqoop, MapReduce, Python, and Elasticsearch.
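
The Elasticsearch JSON query templates mentioned above can be pictured with the small sketch below. The original code was Java; this Scala version simply POSTs a parameterized template to the standard _search endpoint using the JDK HTTP client, with a made-up host, index, and field name (a production API would more likely use the Elasticsearch Java REST client):

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object EsTemplateSearch {
  def main(args: Array[String]): Unit = {
    // A JSON query template with a placeholder that is filled in per request.
    val template = """{ "query": { "match": { "vendor_name": "%s" } } }"""
    val body = template.format("acme")

    // Hypothetical Elasticsearch host and index name.
    val request = HttpRequest.newBuilder()
      .uri(URI.create("http://es-host:9200/vendors/_search"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build()

    // Synchronous search call; the hits come back as a JSON string.
    val response = HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())

    println(response.body())
  }
}
```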

Confidential

Hadoop / Big Data Analyst

Responsibilities:

  • Involved in architectural design, setup, and development of Big Data technologies on the cloud.
  • Worked extensively on setting up Hadoop ETL flows using Pig, Hive, and HBase.
  • Served as a strong contributor on the BI reporting team.
  • Wrote Java code to create Kafka consumers and producers for Kafka topics and partitions (see the sketch below).
  • Designed and developed Avro JSON schemas.
  • Wrote REST Java API code that uses the Kafka producer and consumer client APIs to publish and subscribe to log messages on Kafka topics and partitions.
  • Implemented the entire Big Data Hadoop cluster on AWS EC2 instances.
  • Implemented the Kafka messaging system at Pfizer on AWS.
  • Involved in data preparation, ingestion, analysis, and reporting.
  • Wrote Pig scripts and UDFs for data transformation and cleansing.
  • Generated daily operational data discrepancy reports using Hive query scripts, working closely with the predictive analytics and statistics teams.
  • Wrote and executed MapReduce jobs and Sqoop data transfers, automated with Oozie workflows, to load data from multiple RDBMS sources into HDFS for processing by Pig and Hive into a more usable data store.
  • Wrote Python and Windows PowerShell scripts for data wrangling.
  • Involved in generating reports and dashboards.

Technologies used: Java, JSP, Servlet, RESTful web services, Spring MVC, Pig, Hive, Sqoop, MapReduce, Python, Elasticsearch, AWS, and Kafka.
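
The Kafka producer/consumer bullets above follow the standard org.apache.kafka.clients API; the sketch below publishes one message and polls it back. The broker address, topic, and group id are placeholders:

```scala
import java.time.Duration
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object KafkaRoundTrip {
  def main(args: Array[String]): Unit = {
    val brokers = "broker1:9092"   // placeholder broker list
    val topic   = "server-logs"    // placeholder topic

    val producerProps = new Properties()
    producerProps.put("bootstrap.servers", brokers)
    producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    // Publish a single log line to the topic.
    val producer = new KafkaProducer[String, String](producerProps)
    producer.send(new ProducerRecord[String, String](topic, "host1", "service started"))
    producer.close()

    val consumerProps = new Properties()
    consumerProps.put("bootstrap.servers", brokers)
    consumerProps.put("group.id", "log-readers")          // placeholder group id
    consumerProps.put("auto.offset.reset", "earliest")
    consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

    // Subscribe, poll once, and print whatever arrived.
    val consumer = new KafkaConsumer[String, String](consumerProps)
    consumer.subscribe(Collections.singletonList(topic))
    consumer.poll(Duration.ofSeconds(5)).forEach(r => println(s"${r.key}: ${r.value}"))
    consumer.close()
  }
}
```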

Confidential

Sr. Developer

Responsibilities:

  • Analyzed business and functional requirements and use case documents.
  • Developed and configured US healthcare benefit plans, options, and coverages for health and welfare plans using Java technologies.
  • Designed the framework, suite, and strategy, and created the test plan and test case design.
  • Performed peer reviews and hand-offs.
  • Performed unit testing and tracked and resolved defects in HP Quality Center.

Technologies: Core Java, JSP, Servlet, Spring MVC, Hibernate, HTML, CSS, JavaScript, Ajax, etc.
