Big Data Engineer Resume CA - Hire IT People

SUMMARY:

5+ years of experience across various IT technologies, including hands-on experience with Big Data technologies
Proficient in installing, configuring, and using Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Flume, YARN, HBase, Sqoop, Spark, Storm, Kafka, Oozie, and ZooKeeper
Strong understanding of Hadoop daemons and MapReduce concepts
Used Informatica PowerCenter for extraction, transformation, and loading (ETL) of data from numerous sources such as flat files, XML documents, and databases
Experienced in developing UDFs for Pig and Hive
Strong knowledge of Spark with Scala for large-scale data processing, including streaming workloads
Hands-on experience developing UDFs, DataFrames, and SQL queries in Spark SQL
Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing
Worked with NoSQL databases such as HBase, Cassandra, and MongoDB to store and extract large volumes of data
Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, and tuple stores
Ability to develop MapReduce programs using Java and Python
Good understanding of and exposure to Python programming
Exported and imported data to and from Oracle using SQL Developer for analysis
Developed PL/SQL programs (Functions, Procedures, Packages and Triggers)
Good experience in using Sqoop for traditional RDBMS data pulls
Worked with different Hadoop distributions such as Hortonworks and Cloudera
Strong database skills in IBM DB2 and Oracle; proficient in database development, including constraints, indexes, views, stored procedures, triggers, and cursors
Extensive use of open-source software such as the Eclipse 3.x IDE and the Apache Tomcat 6.0 web/application server
Experience designing components from requirements using UML diagrams: Use Case, Class, Sequence, Deployment, and Component
Involved in report development using reporting tools such as Tableau. Used Excel sheets, flat files, and CSV files to generate ad hoc Tableau reports
Broad design, development, and testing experience with Talend Integration Suite, and knowledge of performance tuning of mappings
Experience with cluster monitoring tools such as Apache Ambari and Hue
Solid technical foundation, strong analytical skills, team player, and goal-oriented, with a commitment to excellence
Outstanding communication and presentation skills; willing to learn and adapt to new technologies and third-party products

PROFESSIONAL EXPERIENCE:

Big Data Engineer

Confidential, CA

Responsibilities:

Used Spark API over Cloudera Hadoop YARN to perform analytics on data
Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN
Worked on batch processing of data sources using Apache Spark and Elasticsearch
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala
Worked on migrating Pig scripts and MapReduce programs to the Spark DataFrames API and Spark SQL to improve performance
Assessed business rules, collaborated with stakeholders, and performed source-to-target data mapping, design, and review
Created scripts for importing data into HDFS/Hive using Sqoop from DB2
Loaded data from different sources (databases and files) into Hive using Talend
Conducted POCs for ingesting data using Flume
Used all major ETL transformations to load the tables through Informatica mappings
Created Hive queries and tables that helped the line of business identify trends by testing strategies against historical data before promoting them to production
Worked with SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning to improve Hive performance and storage efficiency
Developed Pig scripts to parse raw data, populate staging tables, and store the refined data in partitioned DB2 tables for business analysis
Managed and reviewed Hadoop log files. Tested and reported defects within an Agile methodology
Conducted and participated in project team meetings to gather status and discuss issues and action items
Provided support for research and resolution of testing issues
Coordinated with the business for UAT sign-off

Environment: Hadoop, Cloudera, Talend, Scala, Spark, HDFS, Hive, Pig, Sqoop, DB2, SQL, Linux, Yarn, NDM, Quality Center 9.2, Informatica, Windows & Microsoft Office

Data Analyst

Confidential, NJ

Responsibilities:

Worked as a Data Analyst to generate data models using Oracle and developed relational database systems
Involved in data analysis, primarily identifying datasets, source data, metadata, data formats, and data definitions
Installed and worked with R and Tableau in creating visualizations for the data
Documented the complete process flow to describe program development, logic, testing, implementation and application integration
Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs
Devised procedures that solve complex business problems with due consideration for hardware/software capacity and limitations, operating times, and desired results
Analyzed large data sets to determine the optimal way to aggregate and report on them
Involved in the implementation of metadata repository, maintaining data quality, data cleaning procedures, data transformations, stored procedures, triggers and execution plans
Responsible for data extraction, data aggregation, building of centralized data solutions and quantitative analysis to generate business insights
Designed and created reports that use gathered metrics to draw logical conclusions about past and future behavior
Worked hands-on with ETL processes
Worked closely with ETL, SSIS, and SSRS developers to explain the logic behind the data transformations
Prepared the workspace for Markdown; performed data analysis and statistical analysis; generated reports, listings, and graphs

Environment: Oracle, Tableau, R, MS Excel, SQL, MS-SQL Databases

Big Data Engineer

Confidential, OH

Responsibilities:

Used Spark API over Cloudera Hadoop YARN to perform analytics on data
Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN
Worked on batch processing of data sources using Apache Spark and Elasticsearch
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala
Worked on migrating Pig scripts and MapReduce programs to the Spark DataFrames API and Spark SQL to improve performance
Performed data analysis and statistical analysis; generated reports and listings using SAS/SQL, SAS/ACCESS, SAS/EXCEL, pivot tables, and graphs
Assessed business rules, collaborated with stakeholders, and performed source-to-target data mapping, design, and review
Created scripts for importing data into HDFS/Hive using Sqoop from DB2
Loaded data from different sources (databases and files) into Hive using Talend
Conducted POCs for ingesting data using Flume
Used all major ETL transformations to load the tables through Informatica mappings
Created Hive queries and tables that helped the line of business identify trends by testing strategies against historical data before promoting them to production
Worked with SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning to improve Hive performance and storage efficiency
Managed and reviewed Hadoop log files. Tested and reported defects within an Agile methodology
Conducted and participated in project team meetings to gather status and discuss issues and action items
Involved in report development using reporting tools such as Tableau. Used Excel sheets, flat files, and CSV files to generate ad hoc Tableau reports
Provided support for research and resolution of testing issues

Environment: Hadoop, Cloudera, Talend, Python, Spark, HDFS, Hive, Pig, Sqoop, DB2, SQL, Linux, Yarn, NDM, SAS/SQL, SAS/EXCEL, JIRA, Informatica, Windows & Microsoft Office, Tableau
