
Hadoop Engineer Resume


Boston, MA

SUMMARY:

  • Working as an Architect on Big Data solutions involving Cloudera and Microsoft Azure HDInsight.
  • Cloudera Certified Developer for Apache Hadoop with 16+ years of IT experience in development, design, application support and maintenance, and managing IT application projects.
  • Experience in planning and defining scope, developing schedules, budgeting, cost estimation, team leadership, and monitoring and reporting progress.
  • Experience with Java design patterns using open-source products.
  • Experience in end-to-end solutions using Hadoop HDFS, MapReduce, Storm, Solr, Kafka, Scala, Pig, Hive, HBase, Sqoop, Oozie, and ZooKeeper, and in performance tuning of Hadoop clusters.
  • Experience programming in Python on Spark (a minimal sketch follows this list).
  • Experience in real-time streaming using Kafka and Storm (POC).
  • Experience installing, configuring, and upgrading Hadoop clusters using Cloudera Manager.
  • Importing and exporting data between HDFS/Hive and relational databases using Sqoop.
  • Experience working with large databases such as Oracle, MySQL, and DB2.
  • Experience in data migration and data modeling using NoSQL databases.
  • Knowledge of data analytics using R, Apache Hadoop, and MapReduce.
  • Knowledge of data analysis techniques such as clustering, classification, regression, forecasting, and prediction using R.
  • Knowledge of BI and data warehouse processes and techniques.
  • Expert in multi-threaded applications using VC++ and C++ in Windows, UNIX, and Solaris environments.
  • Experience in various activities of Agile methodology, UMF, UML, and design patterns.
  • Experience in various phases of software development, such as study, analysis, development, testing, implementation, and maintenance of real-time systems.
  • Experience managing and leading projects in an onsite/offshore model.
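
As a minimal illustration of the Python-on-Spark work listed above, the sketch below reads a delimited file from HDFS and aggregates it with PySpark; the path and column names are hypothetical placeholders, not taken from an actual engagement.

    # Minimal PySpark sketch: aggregate records read from HDFS.
    # The HDFS path and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("sample-aggregation").getOrCreate()

    # Read a pipe-delimited file from HDFS into a DataFrame.
    df = spark.read.csv("hdfs:///data/raw/events.psv", sep="|", header=True)

    # Count rows per key and show the result.
    df.groupBy("event_type").agg(F.count("*").alias("cnt")).show()

    spark.stop()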

TECHNICAL SKILLS:

Programming Languages: Java, Python, R, MapReduce, Pig Latin, Spark, Scala, C++, VC++, C# (.NET 4.0).

Tools: MS Project, Visual Studio, CVS, Sqoop, Oozie, ZooKeeper, Spark, Storm, Kafka, AWS, PowerShell.

Frameworks: Hadoop HDFS (Apache, Cloudera, Hortonworks), Cassandra, Microsoft Azure HDInsight, MVC.

Databases: MySQL, Oracle, Teradata, Hive, HBase, MongoDB, Sybase, DB2.

Operating Systems: Windows, UNIX, Linux, Solaris

Theoretical Knowledge: Flume, Solr, MicroStrategy.

PROFESSIONAL EXPERIENCE:

Confidential, Boston, MA

Hadoop Engineer

Responsibilities:

  • Architecting and designing the data pipeline.
  • Interacting with data modelers and creating refined data scripts.
  • Curating and scrubbing data with Pig and loading it into the cloud environment.
  • Creating Hive external tables for data scientists' analytics (see the sketch below).
  • Involved in data optimization and performance tuning of Hive queries on Spark.
  • Developing and resolving queries on the Spark cluster using Python.
  • Supporting the visualization team in creating dashboards in QlikView.
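
A sketch of the external-table step, issued here through PySpark's SQL interface with Hive support; the database, schema, and wasb:// storage path are hypothetical placeholders.

    # Sketch: expose curated data to data scientists as a Hive external table.
    # The database, columns, and wasb:// path are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("create-external-table")
             .enableHiveSupport()
             .getOrCreate())

    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS analytics.customer_events (
            customer_id STRING,
            event_ts    STRING,
            amount      DOUBLE
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION 'wasb:///curated/customer_events/'
    """)

    spark.stop()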

Environment: Microsoft Azure HDInsight on Windows and Linux, Spark, Python, Hive, Azure Blob Storage, ETL (Informatica), QlikView, PowerShell, Pig.

Confidential, New York

Developer /Architect

Responsibilities:

  • Architecting and designing the project from the start.
  • Interacting with data scientists and gathering requirements.
  • Installing and performance-tuning the Spark cluster.
  • Developing and integrating applications on Spark (RDDs) using the Hadoop cluster.
  • Coding and implementing Hive tables with monthly partitions (see the sketch below).
  • Importing data from Oracle using Sqoop and other files through FTP.
  • Developing and scripting in Python.
  • Developing and running scripts in the Linux production environment.
  • Reporting and visualization using Business Objects and Tableau.
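
A sketch of the monthly-partition load: Sqoop-landed data is read from HDFS and appended into a Hive table partitioned by month. Table, column, and path names are hypothetical placeholders.

    # Sketch: write Sqoop-imported Oracle data into a Hive table
    # partitioned by month. All names and paths are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("monthly-partition-load")
             .enableHiveSupport()
             .getOrCreate())

    # Data previously landed in HDFS by a Sqoop import job.
    txns = spark.read.parquet("hdfs:///staging/oracle/transactions/")

    # Derive a yyyy-MM partition key and append to the partitioned table.
    (txns.withColumn("month", F.date_format("txn_date", "yyyy-MM"))
         .write.mode("append")
         .partitionBy("month")
         .saveAsTable("warehouse.transactions"))

    spark.stop()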

Environment: Cloudera 5.3, RHEL, R, Sqoop, SparkR, Oozie, Hive, Python, BO, Kafka, Spark Streaming.

Confidential

Developer /Architect

Responsibilities:

  • Analyze the Barclays enterprise architecture.
  • Propose a solution for migrating from Teradata to the Hadoop platform.
  • Design the actual solution.
  • Analyze the required tools and technologies, and the support for Hadoop HDFS provided by the latest versions of the existing software.
  • Propose technological upgrades.
  • Size the Hadoop cluster.
  • Develop a solution for end-to-end data flow from Oracle to HDFS, accessing HDFS data from Informatica, and pushing/pulling data between Teradata and HDFS (see the sketch below).
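
A sketch of the Oracle-to-HDFS leg only: a thin Python wrapper that shells out to Sqoop (the Teradata push/pull would use a separate connector and is not shown). The connection string, credentials, table, and target directory are hypothetical placeholders.

    # Sketch: land an Oracle table into HDFS by invoking Sqoop.
    # All connection details and names are hypothetical placeholders.
    import subprocess

    sqoop_cmd = [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
        "--username", "etl_user",
        "--password-file", "/user/etl/.oracle_pwd",
        "--table", "TRANSACTIONS",
        "--target-dir", "/staging/oracle/transactions",
        "--num-mappers", "4",
    ]

    subprocess.run(sqoop_cmd, check=True)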

Environment: Java 1.7, Cloudera 4, Sqoop 1.4.3, Hive 0.96, RHEL, Eclipse Helios, Informatica 9.5, Teradata, Oozie, ZooKeeper, Oracle.

Confidential

Developer /Architect

Responsibilities:

  • Responsible for performing in-depth analysis and conceptualization of a Retail Banking Customer 360-degree view.
  • Responsible for creating use cases/functional requirements.
  • Responsible for designing the user interface.
  • Responsible for creating the entity model design and user interface using AppBuilder.
  • Responsible for the overall delivery of the solution developed using InfoSphere Data Explorer.
  • Created the estimation and work breakdown structure for the solution.
  • Planned the development and helped the team resolve technical issues.
  • Worked with the Data Explorer product development team to resolve technical issues and identify solutions.
  • Published the solution offering document for this solution.
  • Responsible for gathering business requirements from the banking SME for processing credit card historical data.
  • Responsible for loading credit card historical data into Hadoop.
  • Responsible for designing and developing Map/Reduce programs that process the credit card historical data for analytics and generate output, which is then indexed through an API for visualization (see the sketch below).
  • Involved in setting up a 50-node Hadoop cluster for executing the solution.
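
A sketch of the kind of analytics step described above, shown as a Python Hadoop Streaming pair that sums transaction amounts per card; the actual programs were Java Map/Reduce, and the card_id,amount input layout is a hypothetical placeholder.

    # Sketch: sum transaction amounts per card with Hadoop Streaming.
    # The actual jobs were Java Map/Reduce; the CSV layout is assumed.
    import sys

    def mapper():
        for line in sys.stdin:
            card_id, amount = line.strip().split(",")[:2]
            print(f"{card_id}\t{amount}")

    def reducer():
        current_key, total = None, 0.0
        for line in sys.stdin:
            key, value = line.strip().split("\t")
            if key != current_key:
                if current_key is not None:
                    print(f"{current_key}\t{total}")
                current_key, total = key, 0.0
            total += float(value)
        if current_key is not None:
            print(f"{current_key}\t{total}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()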

Environment: Hortonworks, DB2 9.7, Linux, Hive, HBase, ZooKeeper.

Confidential

Project Lead

Responsibilities:

  • Developed classes for an end-to-end framework for interacting with Hadoop.
  • Worked on Hive queries to accomplish inserts and updates on data through joins and functions, including UDFs (see the sketch below).
  • Participated extensively in the design and development of the project.
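
A sketch of the insert-through-join pattern described above, submitted here through PyHive purely for illustration; the project used Hive directly, and the host, tables, and UDF name are hypothetical placeholders.

    # Sketch: rewrite a target table from a join, applying a custom UDF.
    # PyHive, the host, the tables, and the UDF are all assumptions.
    from pyhive import hive

    conn = hive.connect(host="hive-server", port=10000)
    cur = conn.cursor()

    cur.execute("""
        INSERT OVERWRITE TABLE mart.customer_summary
        SELECT c.customer_id,
               normalize_name(c.full_name),   -- custom UDF
               SUM(t.amount)
        FROM raw.customers c
        JOIN raw.transactions t ON t.customer_id = c.customer_id
        GROUP BY c.customer_id, normalize_name(c.full_name)
    """)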

Environment: Apache Hadoop, Hive, Java.

Confidential

Developer / Project Lead

Responsibilities:

  • De-duplicate incoming data using MapReduce written in Java (see the sketch below).
  • Write transformation queries in Hive.
  • Write UDFs in Hive.
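
A sketch of the de-duplication idea, shown as a Python Hadoop Streaming pair rather than the original Java MapReduce: the mapper emits each record as a key, and since the shuffle sorts keys, the reducer emits each distinct record once.

    # Sketch: record-level de-duplication via Hadoop Streaming.
    # The original job was Java MapReduce; this shows the same idea.
    import sys

    def mapper():
        for line in sys.stdin:
            print(line.strip())        # the record itself is the key

    def reducer():
        previous = None
        for line in sys.stdin:
            record = line.strip()
            if record != previous:     # keys arrive sorted; skip repeats
                print(record)
                previous = record

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()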

Environment: Hadoop, Hive, Amazon Web Services, Java.

Confidential - Dallas, TX

Project Lead

Responsibilities:

  • Managing and leading the project team
  • Detailed project planning and controlling
  • Managing project deliverables in line with the project plan
  • Coach, mentor and lead personnel within a technical team environment
  • Recording and managing project issues and escalating where necessary
  • Monitoring project progress and performance
  • Providing status reports to the project sponsor
  • Managing project training within the defined budget

Environment: Java, UNIX.

Confidential

Project Lead

Responsibilities:

  • Enhancing the application
  • Developing and maintaining a detailed project plan
  • Understanding SRS and Functional Specification documents
  • Preparing design documents, code, and test cases
  • Client interaction and status reporting to the manager and above
  • Liaising with, and updating progress to, the project steering board/senior management
  • Managing project evaluation and dissemination activities

Environment: VC++, MySQL, Windows XP

Confidential

Project Lead

Responsibilities:

  • Involved in R&D on interfacing with Excel from Python (see the sketch below).
  • Understanding SRS and Functional Specification documents
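
A sketch of Excel interfacing from Python using openpyxl, one of several possible libraries (the library actually evaluated in the R&D is not recorded here; file and sheet names are placeholders).

    # Sketch: write and read an Excel workbook with openpyxl.
    # File and sheet names are hypothetical placeholders.
    from openpyxl import Workbook, load_workbook

    # Write a small workbook.
    wb = Workbook()
    ws = wb.active
    ws.title = "report"
    ws.append(["metric", "value"])
    ws.append(["rows_processed", 12345])
    wb.save("report.xlsx")

    # Read it back.
    wb2 = load_workbook("report.xlsx")
    for row in wb2["report"].iter_rows(values_only=True):
        print(row)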

Environment: Python

Confidential

Responsibilities:

  • Analysis
  • Understanding SRS and Functional Specification documents
  • Design and coding of the product

Environment: Visual C++ 2005

Confidential

Team Lead

Responsibilities:

  • Analysis
  • Understanding SRS and Functional Specification documents
  • Design and coding of the product

Environment: Visual C++ 6.0, WinCVS, Oracle 10g.

Confidential

Developer /Team Lead

Responsibilities:

  • Involved in interaction with the customer for study and analysis of the system requirements.
  • Responsible for analysis and design of the modules using RequisitePro and Rational Rose. This involved highly interactive displays, a convenient programming interface for coordinating among multiple threads of execution, and structured exception-handling techniques for handling errors.
  • Handled the Critical Analysis module, which deals with real-time results.
  • Responsible for testing the functionality of devices at the customer site.
  • Responsible for developing various modules using VC++ in a Windows NT environment. Integrated the modules into the project using Visual SourceSafe 5.0.
  • Involved in integration and integration testing of the total system.
  • Tested the developed modules using real-time software simulators and in a real-time environment with all the devices, using tools such as Rational Purify and Quantify.
  • Involved in designing the architecture of the overall system.
  • Involved in designing and developing the communication mechanism, which deals with real-time background processes.

Environment: Windows NT, Solaris, Visual C++ 6.0, VSS, pSOS (RTOS).
