
Sr Big Data Engineer Resume


Farmington Hills, MI

SUMMARY

  • Over 10 years of experience in the design, development, and implementation of software applications and BI/DWH solutions. Experience in data discovery and advanced analytics, and in building business solutions, with knowledge of developing strategic approaches for deploying Big Data solutions in both cloud and on-premises environments to efficiently meet Big Data processing requirements.
  • Built advanced analytics applications on different ecosystems: MapR, Cloudera, Hortonworks (HWX), GCP, Azure, and AWS.
  • Strong understanding of distributed systems, RDBMS, large- and small-scale non-relational data stores, MapReduce systems, database performance, data modeling, and multi-terabyte data warehouses.
  • Extensively used Hadoop open-source tools like Hive, HBase, Sqoop, and Spark for ETL on Hadoop clusters.
  • Worked with several data integration and replication tools such as Informatica BDM, SAP BODS, and Attunity Replicate.
  • Strong knowledge of system development lifecycles and project management on BI implementations.
  • Extensively used RDBMS like Oracle and SQL Server for developing different applications.
  • Built several data lakes to help different clients perform advanced analysis on big data.
  • Worked with data science teams to provide and feed data for AI, ML, and deep learning projects.
  • Real-time experience with the Hadoop Distributed File System, the Hadoop framework, and parallel processing implementations (MapR, AWS EMR, Cloudera), with hands-on experience in HDFS, MapReduce, Pig/Hive, HBase, YARN, Sqoop, Spark, Java, RDBMS, Linux/Unix shell scripting, and Linux internals.
  • Experience in writing UDFs and MapReduce programs in Java for Hive and Pig (a minimal UDF sketch follows this list).
  • Procedural knowledge in cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Created Kafka data pipelines for the Google Ads platform to consume the latest customer profiles.
  • Experience in data visualization using the Oracle Big Data Discovery tool and IBM Cognos.
  • Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology, with good knowledge of J2EE design patterns and Core Java design patterns.
  • Experience in creating scripts and macros using Microsoft Visual Studio to automate tasks.
  • Experience in working with GitHub Repository.
  • Experienced in designing secure software systems that enforce authentication, authorization, confidentiality, data integrity, accountability, availability, and non-repudiation.
  • Experience in web design, web hosting, and DNS configuration.
  • Experience working with web design tools like Adobe Dreamweaver CC, WordPress, and Joomla.
  • Proficient in Manual, Functional and Automation testing.
  • Also experienced in Smoke, Integration, Regression, Functional, Front End and Back End Testing.
  • Capable of developing/writing test plans, test cases, and test scripts based on user requirements and SAD documentation.
  • Highly experienced in writing test cases and executing them in HP testing tools: Quality Center and QuickTest Professional (QTP).
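
Below is a minimal, hypothetical sketch of the kind of Hive UDF in Java referenced above; the class name, function name, and normalization logic are illustrative assumptions, not taken from a specific project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: normalizes a free-text field before analysis.
// Registered in Hive with, e.g.:
//   ADD JAR my-udfs.jar;
//   CREATE TEMPORARY FUNCTION clean_text AS 'CleanTextUDF';
public class CleanTextUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Trim, collapse whitespace, and lowercase the value.
        String cleaned = input.toString().trim().replaceAll("\\s+", " ").toLowerCase();
        return new Text(cleaned);
    }
}
```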

TECHNICAL SKILLS

Reporting Tools: Tableau 8.1

Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase, CAWA, Spark, Spark SQL, Impala, MapR-DB, Azure, VOCI, Oracle Big Data Discovery, Kafka, NiFi

Hadoop Distributions: MapR, Cloudera, AWS EMR, Hortonworks.

Servers: Application servers (WAS, Tomcat), Web servers (IIS 6/7, IHS).

Operating Systems: Windows 2003 Enterprise Server, XP, 2000, UNIX, Red Hat Enterprise Linux Server release 6.7

Databases: SQL Server 2005/2008, Oracle 9i/10g, DB2, MS Access 2003, Teradata.

Languages: C, C++, Java, XML, JSP/Servlets, Struts, Spring, HTML, Python, PHP, JavaScript, jQuery, Web services, Scala.

Data Modeling: Star-Schema and Snowflake-schema.

ETL Tools: Working knowledge of Informatica, IBM DataStage 8.1, and SSIS.

PROFESSIONAL EXPERIENCE

Confidential, Farmington Hills, MI

Sr Big Data Engineer

Responsibilities:

  • Design, plan, implement, and own Sales and Finance data pipelines into the data lake (S3) and data warehouse (Redshift).
  • Process data using EMR (see the Spark sketch after this list).
  • Participate in the design, architecture, and implementation of CCPA compliance on S3 and Redshift.
  • Write code in Java for the pipelines to ingest data from different sources to flow through S3 and Redshift.
  • Deploy code to dev, int and prod.
  • Collaborate with different teams to communicate, negotiate and implement end to end solutions.
  • Use EMR to process heavy load batch processing.
  • Support BI team for analytics reports.
  • Implemented system wide monitoring and alerts.
  • Installed and configured Hive, Impala, Oracle Big Data Discovery, Hue, Apache Spark, Tika, Tika Tesseract, Sqoop, Spark SQL, etc.
  • Imported and exported data into MapR-FS and Hive using Sqoop.
  • Used Bash shell scripting, Sqoop, Avro, Hive, Impala, HDP, Pig, Java, and MapReduce daily to develop ETL, batch processing, and data storage functionality.
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Worked on loading all tables from the reference source database schema through Sqoop. Designed, coded, and configured server-side J2EE components such as JSPs and Java services integrated with AWS.
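
As referenced in the EMR bullet above, the following is a minimal, hypothetical sketch of a Spark batch job on EMR written in Java; the bucket names, paths, and column names are assumptions for illustration, not the actual pipeline.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

// Hypothetical EMR batch job: reads raw sales CSVs from S3, filters bad records,
// and writes Parquet back to S3 as a staging area for a Redshift COPY.
public class SalesBatchJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("sales-batch-job")
                .getOrCreate();

        Dataset<Row> raw = spark.read()
                .option("header", "true")
                .csv("s3://example-data-lake/raw/sales/");

        // Keep only rows with a populated order id before staging.
        Dataset<Row> cleaned = raw.filter("order_id is not null");

        cleaned.write()
                .mode(SaveMode.Overwrite)
                .parquet("s3://example-data-lake/staging/sales/");

        spark.stop();
    }
}
```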

Environment: AWS Redshift, S3, Java Spring, PostgreSQL, MS SQL, Python, AWS EMR, GitHub, Jenkins, Veracode, Scala, Talend, Jaws

Confidential

Sr Big Data Developer/ Independent Consultant

Responsibilities:

  • Helped the client understand performance issues on the cluster by analyzing Cloudera stats.
  • Designed and implemented Optum Data Extracts and HCG Grouper Extracts on AWS.
  • Improved memory and time performances for several existing pipelines.
  • Improved Solr Data Ingestion, data quality for Medley Pipeline.
  • Owned Member Sphere and Mosaic; designed and developed the Optum and HCG pipelines.
  • Built pipelines using Scala, Spark, Spark SQL, Hive, and HBase, orchestrated pipelines with Airflow on AWS, and explored the power of distributed computing on AWS EMR.
  • Loaded processed data into different consumption points such as Apache Solr, HBase, and AtScale cubes for visualization and search.
  • Automated the workflow using Talend Big Data.
  • Scheduled jobs using Autosys.
  • Experienced in managing and reviewing Hadoop log files.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Developed Sqoop scripts to import/export data from relational sources and handled incremental loading of customer and transaction data by date.
  • Developed simple and complex MapReduce programs in Java for data analysis on different data formats (a minimal sketch follows this list).
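
A minimal sketch of the kind of MapReduce program in Java referenced in the last bullet; the input layout (customer id in the first comma-delimited field), class names, and paths are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical MapReduce job: counts transactions per customer id
// from comma-delimited input files.
public class TransactionCount {

    public static class TxnMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text customerId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                customerId.set(fields[0]); // assume customer id is the first column
                context.write(customerId, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "transaction-count");
        job.setJarByClass(TransactionCount.class);
        job.setMapperClass(TxnMapper.class);
        job.setCombinerClass(SumReducer.class); // safe: the sum is associative
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```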

Environment: Attunity, AWS, Oracle SQL, Cloudera, Spark, Talend workload automation, Jenkins, Git, Anvizent Tool

Confidential

Sr Big Data Consultant

Responsibilities:

  • Responsible for moving all the production jobs from HortonWorks to Cloudera.
  • Leading a team of 2 Onsite and 4 Offshore.
  • Improved the performance of Teradata queries wherever needed during the migration.
  • Built a Java API to automate Google AdWords campaigns.
  • Maintained weekly TM1 cube refreshes to populate the latest data from PROD.
  • Built models containing query subjects, query items, and namespaces from imported metadata.
  • Created Ad-hoc reports using Query Studio.
  • Fine-tuned and enhanced queries for the performance of the reports and Models.
  • Ability to work under stringent deadlines, both within teams and independently.
  • Led the Undisputed Leader project.
  • Created Kafka streaming for our Google Ads platform to stream real-time changes to customer profiles into HBase, so the Google Ads app can target customers based on the latest profile (see the sketch after this list).
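
A minimal, hypothetical sketch of the kind of Kafka-to-HBase consumer referenced in the last bullet; the broker address, topic, table, and column family names are illustrative assumptions.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical bridge: consumes customer profile updates keyed by customer id
// and upserts them into an HBase table so downstream apps read the latest profile.
public class ProfileUpdateConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("group.id", "profile-updates");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Configuration conf = HBaseConfiguration.create();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             Connection hbase = ConnectionFactory.createConnection(conf);
             Table table = hbase.getTable(TableName.valueOf("customer_profile"))) {

            consumer.subscribe(Collections.singletonList("customer-profile-updates"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    if (record.key() == null) {
                        continue; // skip records without a customer id key
                    }
                    // Row key is the customer id; the latest profile JSON overwrites the cell.
                    Put put = new Put(Bytes.toBytes(record.key()));
                    put.addColumn(Bytes.toBytes("p"), Bytes.toBytes("profile_json"),
                            Bytes.toBytes(record.value()));
                    table.put(put);
                }
            }
        }
    }
}
```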

Environment: SAP, Teradata 12.0, Hortonworks, Cloudera, IBM mainframe, Oracle DB, SQL DB, Control-M workload automation

Confidential

Sr Big Data Advance Analytics Consultant

Responsibilities:

  • Worked collaboratively with the MapR vendor and the client to manage and build out large data clusters.
  • Helped design big data clusters and administered them.
  • Worked both independently and as an integral part of the development team.
  • Communicated all issues and participated in weekly strategy meetings.
  • Administered back-end services and databases in the virtual environment.
  • Ran several benchmark tests on Hadoop SQL engines (Hive, Spark SQL, Impala) and on different data formats (Avro, SequenceFile, Parquet) using compression codecs such as Gzip and Snappy.
  • Worked on extracting text from emails, images, and voice recordings, and created data pipelines.
  • Worked on sentiment analysis and structured content programs for creating text analytics app.
  • Created and Implemented applications on Oracle Big Data Discovery for Data visualization, Dashboard and Reports.
  • Collected data from different databases (i.e., Oracle, MySQL) into Hadoop. Used CA Workload Automation for workflow scheduling and monitoring.
  • Worked on designing and developing ETL workflows using Java for processing data in MapR-FS/HBase using Oozie.
  • Experienced in managing and reviewing Hadoop log files. Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Developed Sqoop scripts to import/export data from relational sources such as Teradata and handled incremental loading of customer and transaction data by date.
  • Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Worked on partitioning Hive tables and running scripts in parallel to reduce their run time. Worked on data serialization formats, converting complex objects into sequences of bytes using Avro, Parquet, JSON, and CSV formats.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on the data. Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources (see the sketch after this list).
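
A minimal, hypothetical sketch of a Pig UDF in Java of the kind referenced in the last bullet; the masking use case, class name, and registration snippet are illustrative assumptions.

```java
import java.io.IOException;
import java.util.Arrays;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF, similar in spirit to Piggybank string functions:
// masks all but the last four characters of an account number.
// Registered in Pig with, e.g.: REGISTER my-udfs.jar; DEFINE MaskAccount MaskAccount();
public class MaskAccount extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String account = input.get(0).toString();
        if (account.length() <= 4) {
            return account;
        }
        // Replace everything except the last four characters with '*'.
        char[] mask = new char[account.length() - 4];
        Arrays.fill(mask, '*');
        return new String(mask) + account.substring(account.length() - 4);
    }
}
```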

Environment: MapR ecosystem, ODI, Oracle Endeca, Oracle Big Data Discovery, CA Workload Automation

Confidential

Data Engineer

Responsibilities:

  • Worked on various healthcare tracks such as Membership, Benefits, Billing, and Finance.
  • Analyzed business requirements and identified discrepancies in the initial stages to minimize cost and time.
  • Developed test plans, test cases, test scenarios, and expected results, and prioritized tests for modules such as membership, 834 EDI, finance, and benefits.
  • Wrote test cases, test conditions and test scripts in MS-Excel and exported to Quality Center.
  • Hands-on experience in maintaining the Change Request list and updating the testing process.
  • Good understanding of the physical and logical data modeling, dimensional and relational schemas.
  • Actively participated in validation of transformations applied on source data to load target tables.
  • Extensively used SQL for retrieving data for the data warehouse and data-driven tests to validate the same scenario with different test data.
  • Designed Test Plan and Test Strategy by studying and analyzing Business Requirements of the Project in detail.
  • Analyzing requirement specifications and SAD documentation to design Test Scenarios and Test Cases.
  • Identified, raised, and tracked defects.
  • Responsible for closing defects once they were fixed.
  • Tested Web services on SoapUI.
  • Regularly interacted with the onsite and development teams to ensure quality and speedy resolution of defects.
  • Worked on complete integration testing between several third-party systems and applications such as BETS, CM, Xcelys, TMS, and FS.
  • Presented functional demos to the client regarding the defects and the working of the application.

Confidential

Java Developer

Responsibilities:

  • Designed and developed web services using Java/J2EE in a WebLogic environment. Developed web pages using Java Servlets, JSP, CSS, JavaScript, DHTML, HTML5, and HTML. Added extensive Struts validation.
  • Involved in the analysis, design, development, and testing of business requirements.
  • Developed business logic in Java/J2EE technology.
  • Implemented business logic and generated WSDL for those web services using SOAP.
  • Worked on developing JSP pages.
  • Implemented the Struts framework.
  • Developed business logic using Java/J2EE.
  • Modified stored procedures in the MySQL database.
  • Developed the application using Spring Web MVC framework.
  • Worked with Spring Configuration files to add new content to the website.
  • Worked on the Spring DAO module and ORM using Hibernate. Used HibernateTemplate and HibernateDaoSupport for Spring-Hibernate communication.
  • Configured association mappings such as one-to-one and one-to-many in Hibernate (see the sketch after this list).
  • Worked with JavaScript calls; the search is triggered through JS calls when a search key is entered in the search window.
  • Worked on analyzing other Search engines to make use of best practices.
  • Collaborated with the Business team to fix defects.
  • Worked on XML, XSL and XHTML files.
  • Interacted with project management to understand, learn, and analyze the search techniques.
  • Used Ivy for dependency management.
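
A minimal, hypothetical sketch of a Hibernate one-to-many association mapping of the kind referenced above, written with JPA annotations; entity and column names are illustrative assumptions (the original project may have used XML mapping files instead).

```java
import java.util.ArrayList;
import java.util.List;

import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;

// Hypothetical parent entity: one customer owns many purchase orders.
@Entity
public class Customer {
    @Id
    @GeneratedValue
    private Long id;

    private String name;

    // Inverse side of the association; saves and deletes cascade to the children.
    @OneToMany(mappedBy = "customer", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
    private List<PurchaseOrder> orders = new ArrayList<>();

    // getters and setters omitted for brevity
}

// Hypothetical child entity: owning side of the association.
@Entity
class PurchaseOrder {
    @Id
    @GeneratedValue
    private Long id;

    private double amount;

    // Maps to the customer_id foreign key column on the child table.
    @ManyToOne
    @JoinColumn(name = "customer_id")
    private Customer customer;

    // getters and setters omitted for brevity
}
```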
