Project Engineer Resume

Irving, Texas

PROFESSIONAL SUMMARY:

  • Expertise in Hadoop ecosystem components (HDFS, MapReduce, YARN, HBase, Pig, Sqoop, Spark, Spark SQL, Spark Streaming, and Hive) for scalable, distributed, high-performance computing.
  • Experienced in installing, maintaining, and configuring Hadoop clusters.
  • Strong knowledge of creating and monitoring Hadoop clusters on Amazon EC2, VMs, Hortonworks Data Platform 2.1 and 2.2, and CDH3/CDH4 with Cloudera Manager on Linux (including Ubuntu).
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Good knowledge of single-node and multi-node cluster configurations.
  • Expertise in the Scala programming language and Spark Core.
  • Worked with AWS-based data ingestion and transformations.
  • Experienced in job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
  • Good knowledge of Amazon EMR, Amazon RDS, S3 buckets, DynamoDB, and Redshift.
  • Analyze data, interpret results, and convey findings in a concise and professional manner.
  • Partner with the Data Infrastructure team and business owners to implement new data sources and ensure consistent definitions are used in reporting and analytics.
  • Promote a full-cycle approach including request analysis, creating/pulling datasets, report creation and implementation, and providing final analysis to the requestor.
  • Very good understanding of SQL, ETL, and data warehousing technologies.
  • Expert in T-SQL, creating and using stored procedures, views, and user-defined functions, and implementing business intelligence solutions using SQL Server 2000/2005/2008.
  • Developed Web-Services module for integration using SOAP and REST.
  • Good experience with Kafka and Storm.
  • Knowledge of Java virtual machines (JVM) and multithreaded processing.
  • Good exposure to IDE tools like Eclipse, NetBeans, and iReport.
  • Excellent exposure to database designing and modeling using E/R diagrams.
  • Java developer with extensive experience with various Java libraries, APIs, and frameworks.
  • Hands-on development experience with RDBMS, including writing complex SQL queries, stored procedures, and triggers.
  • Sound knowledge of designing data warehousing applications using tools like Teradata, Oracle, and SQL Server.
  • Experience using the Talend ETL tool.
  • Strong in databases like Sybase, DB2, Oracle, MS SQL, Clickstream.
  • Strong working experience with Snowflake.

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Oozie, Impala, Spark, Splunk, ZooKeeper, and Kafka

NoSQL Database: HBase, Cassandra

Monitoring and Reporting: Tableau, custom shell scripts

Hadoop Distribution: AWS, Hortonworks, Cloudera, MapR

Build Tools: SQL Developer

Programming & Scripting: Java, SQL, Shell Scripting, Python, Scala

Java Technologies: Servlets, JavaBeans, JDBC, Spring, Hibernate, SOAP/REST services

Databases: Oracle, MySQL, MS SQL Server, Teradata

Web Dev. Technologies: HTML, XML, JSON, CSS, jQuery, JavaScript, AngularJS

Version Control: SVN, CVS, Git

Operating Systems: Linux, Unix, Mac OS X, CentOS, Windows 10, Windows 8, Windows 7, Windows Server 2008/2003

PROFESSIONAL EXPERIENCE:

Confidential, Irving, Texas

Project Engineer

Responsibilities:

  • Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from an Oracle database into HDFS using Sqoop queries.
  • Implemented MapReduce programs to get top-K results by following MapReduce design patterns.
  • Installed, configured, and maintained Apache Hadoop clusters for analytics and application development, along with Hadoop tools like Hive, HSQL, Pig, HBase, OLAP, ZooKeeper, Avro, Parquet, and Sqoop on Linux ARCH.
  • Experienced in structured modeling of unstructured data models.
  • Developed data pipeline using Flume, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in loading the created HFiles into HBase for faster access to a large customer base without a performance hit.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
  • Used Pig as an ETL tool to do transformations, event joins, and some pre-aggregations before storing the data in HDFS.
  • Wrote test cases, analyzed test results, and reported them to product teams.
  • Worked with AWS Data Pipeline.
  • Managed Hadoop workflows using Oozie.
  • Responsible for developing data pipelines using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs; used Sqoop to move data between HDFS and RDBMS in both directions for visualization and report generation.
  • Involved in migrating ETL processes from Oracle to Hive to evaluate ease of data manipulation.
  • Worked on functional, system, and regression testing activities using Agile methodology.
  • Worked on a Python plugin for MySQL Workbench to upload CSV files.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.

Confidential

Project Engineer

Responsibilities:
  • Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS.
  • Assisted with performance tuning and monitoring.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Supported code/design analysis, strategy development and project planning.
  • Created reports for the BI team using Sqoop to export data into HDFS and Hive.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Assisted with data capacity planning and node forecasting.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Administered Pig, Hive, and HBase, installing updates, patches, and upgrades.

Confidential

Project Trainee

Responsibilities:
  • The project developed a web-based application to eliminate paperwork in the hospital and laboratories by reading data from different instruments, storing it in a relational database, and generating business intelligence reports for management.
  • Designed and implemented the training and reports modules of the application using Servlets, JSP and Ajax.
  • Developed custom JSP tags for the application.
  • Wrote queries for fetching and manipulating data using the ORM software iBatis.
  • Used Quartz schedulers to run jobs sequentially at given times.
  • Implemented design patterns like Filter, Cache Manager and Singleton to improve the performance of the application.
  • Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.
  • Deployed the application at the client's location on Tomcat Server.

Environment: HTML, JavaScript, Ajax, Servlets, JSP, iBatis, Tomcat Server, PostgreSQL, Jasper Reports.
