Big Data Engineer Resume
Lafayette, LA
SUMMARY:
- Around 8 years of professional IT experience across all phases of the Software Development Life Cycle, including hands-on experience with Hadoop ecosystem technologies and Java/J2EE technologies.
- Over 4 years of hands-on experience using Hadoop ecosystem components like HDFS, MapReduce, Hive, Impala, Sqoop, Pig, Flume, and Spark.
- Over 3 years of experience in Java programming with hands-on work in the Spring, Struts, and Hibernate frameworks.
- Experienced with Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, and Spark MLlib.
- Good knowledge of implementing Apache Spark applications with Scala.
- Strong hands-on experience with PySpark, using Spark libraries through Python scripting for data analysis.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance (see the sketch after this summary).
- Experienced in collecting real-time streaming data and building pipelines for raw data from different sources using Kafka, then storing the data into HDFS and NoSQL stores using Spark.
- Experience in developing Pig Latin scripts and Hive Query Language (HQL) queries.
- Working knowledge of Oozie, a workflow scheduler system used to manage jobs that run on Pig, Hive, and Sqoop.
- Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing of big data.
- Experience in programming SQL, PL/SQL stored procedures, and triggers in Oracle and SQL Server.
- Well versed in core Java concepts such as collections, multithreading, serialization, and JavaBeans.
- Experience in implementing web services based on Service-Oriented Architecture (SOA) using SOAP and RESTful web services.
- Hands-on experience with version control tools like GitHub and SVN.
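A minimal PySpark sketch of the Hive managed/external table and partitioning approach mentioned above; the table names, columns, and HDFS path are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

# Hive support lets Spark create and query Hive managed and external tables.
spark = (SparkSession.builder
         .appName("hive-table-sketch")
         .enableHiveSupport()
         .getOrCreate())

# External table: the data lives at an HDFS path we control, partitioned by
# load date so queries can prune partitions. Names and path are placeholders.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS sales_ext (
        order_id BIGINT,
        customer_id BIGINT,
        amount DOUBLE
    )
    PARTITIONED BY (load_date STRING)
    STORED AS ORC
    LOCATION 'hdfs:///data/warehouse/sales_ext'
""")

# Managed table: Hive owns the data and removes it when the table is dropped.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_managed (
        order_id BIGINT,
        customer_id BIGINT,
        amount DOUBLE
    )
    PARTITIONED BY (load_date STRING)
    STORED AS ORC
""")

# Bucketing: when created directly in Hive, either table could additionally be
# declared CLUSTERED BY (customer_id) INTO 16 BUCKETS to speed up joins and
# sampling; some Spark versions cannot create bucketed Hive tables themselves.
```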
TECHNICAL SKILLS:
- Hadoop
- Spark
- Kafka
- Python
- Scala
- Java
- CA Workload Automation (CA WA)
- Unix Shell scripting
- Apache Maven
- SQL Server
- Oracle
- MySQL
- GitHub
- SVN
- Hadoop Developer
- Java Developer
- Jenkins
- uDeploy (IBM UrbanCode Deploy)
- HTML
- CSS
- JavaScript
EXPERIENCE:
Confidential, Lafayette, LA
Big Data Engineer
Responsibilities:
- Developed a data pipeline using Kafka, HBase, Spark, and Hive to ingest, transform, and analyse data (a streaming sketch follows this role's Environment line).
- Applied transformations to raw input data consumed from Kafka topics and published the transformed data to new topics for further processing.
- Developed Scala scripts using both DataFrames/Datasets/Spark SQL and RDDs/MapReduce in Spark for data aggregation and queries, writing data back into the OLTP system through Sqoop.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Involved in creating Hive tables, loading and analysing data with Hive queries, and writing complex Hive queries to transform the data.
- Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for large volumes of data.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Python.
- Extensively worked on data export from Data Lake to target RDBMS.
- Extracted data from SQL Server and Oracle into HDFS using Sqoop; created and ran Sqoop jobs with incremental loads to populate Hive external tables (a wrapper sketch also follows this role's Environment line).
- Solved performance issues in Hive scripts by understanding joins, grouping, and aggregation and how they translate into MapReduce jobs.
- Extensively involved in query optimization in Hive query language.
- Involved in assimilating different structured and unstructured data and using Hive and Impala to aggregate and transform data required for reporting.
- Worked on automation scripting and monitoring using shell scripts.
- Defined, monitored, and managed scheduled and event-based workloads through ESP jobs.
- Worked with Jenkins for code builds and IBM UrbanCode Deploy for Hadoop code deployment.
- Worked with GitHub.
Environment: Apache Hadoop, Apache Spark, Scala, Kafka, HBase, Hive, Pig, Oozie, Python, SQL, CA WA Workstation (ESP), Toad, GitHub.
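A minimal sketch of the kind of Kafka-to-HDFS pipeline described above, using PySpark Structured Streaming; the broker, topic, schema, and paths are hypothetical placeholders, and the job assumes the spark-sql-kafka package is available.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = (SparkSession.builder
         .appName("kafka-to-hdfs-sketch")
         .getOrCreate())

# Assumed JSON layout of the raw events; the real schema would differ.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("amount", DoubleType()),
])

# Consume the raw topic; broker and topic names are placeholders.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")
       .option("subscribe", "raw_events")
       .load())

# Parse the Kafka value bytes into columns and drop malformed records.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), schema).alias("e"))
          .select("e.*")
          .filter(col("event_id").isNotNull()))

# Land the cleaned stream on HDFS as Parquet for downstream Hive/Spark jobs.
query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/landing/events")
         .option("checkpointLocation", "hdfs:///checkpoints/events")
         .outputMode("append")
         .start())

query.awaitTermination()
```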
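The Sqoop incremental loads were driven from the command line; the sketch below is a hypothetical Python wrapper that builds that kind of command, with the connection string, credentials, table, and paths as placeholders.

```python
import subprocess

def sqoop_incremental_import(last_value: str) -> None:
    """Append rows newer than last_value from the source database into the
    HDFS directory backing a Hive external table. All identifiers are placeholders."""
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:sqlserver://dbhost:1433;databaseName=sales",
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop_pwd",
        "--table", "ORDERS",
        "--incremental", "append",
        "--check-column", "ORDER_ID",
        "--last-value", last_value,
        "--target-dir", "/data/warehouse/orders_ext",
        "--num-mappers", "4",
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # In practice the last processed value would come from a state file or metastore.
    sqoop_incremental_import("1000000")
```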
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities:
- Worked on migrating MapReduce programs into Spark transformations using Spark and Python (PySpark).
- Imported data from different sources like HDFS/HBase into Spark RDDs.
- Implemented log-aggregation and transforming data for analytics using Apache Kafka.
- Developed Spark programs (Spark streaming and Spark SQL) in Scala for in-memory data processing.
- Used Scala to write the code for all Spark use cases, performed data analytics on the Spark cluster, and performed map-side joins on RDDs (a broadcast-join sketch follows this role's Environment line).
- Developed Python code to gather data from HBase and designed the solution for implementation using PySpark.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data.
- Imported and exported data into HDFS and Hive using Sqoop and Flume.
- Performed big data processing using Hadoop, MapReduce, Sqoop, Oozie, and Impala.
- Developed Hive queries for the analysts.
- Responsible for data ingestion using tools like Flume.
- Imported and exported data between HDFS and relational databases using Sqoop.
- Applied various core Java concepts when developing MapReduce jobs.
- Optimized MapReduce algorithms using combiners and partitioners to deliver the best results.
- Involved in creating Hive Tables, loading with data and writing Hive queries, which will invoke and run MapReduce jobs in the backend.
- Built alert and monitoring scripts for applications and servers using Python and shell scripts (a monitoring sketch also follows this role's Environment line).
- Worked with GitHub.
Environment: Hadoop, Spark, Scala, Kafka, Hive, Pig, Sqoop, Flume, Oozie, Impala, MapReduce, Python, HBase, Cassandra, Shell scripting, SQL, Oracle 11g, Linux, GitHub.
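A minimal sketch of the map-side join idea from this role, done here as a PySpark broadcast join; the dataset names and values are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("map-side-join-sketch").getOrCreate()
sc = spark.sparkContext

# Small dimension data broadcast to every executor (placeholder values).
country_lookup = sc.broadcast({"US": "United States", "IN": "India"})

# Large fact data as an RDD of (user_id, country_code) pairs.
events = sc.parallelize([("u1", "US"), ("u2", "IN"), ("u3", "FR")])

# Map-side join: each partition joins against the broadcast copy locally,
# so the large RDD never has to be shuffled.
joined = events.map(
    lambda kv: (kv[0], country_lookup.value.get(kv[1], "Unknown"))
)

print(joined.collect())  # [('u1', 'United States'), ('u2', 'India'), ('u3', 'Unknown')]
```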
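The alert and monitoring scripts were specific to the applications involved; this is a small hypothetical Python sketch of the pattern, with the threshold, sender, recipient, and mail relay as placeholders.

```python
import shutil
import smtplib
from email.message import EmailMessage

DISK_THRESHOLD = 0.90            # alert when the filesystem is more than 90% full
ALERT_TO = "oncall@example.com"  # placeholder recipient

def disk_usage_ratio(path: str = "/") -> float:
    """Return the used/total ratio for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total

def send_alert(subject: str, body: str) -> None:
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "monitor@example.com"
    msg["To"] = ALERT_TO
    msg.set_content(body)
    with smtplib.SMTP("localhost") as smtp:  # assumes a local mail relay
        smtp.send_message(msg)

if __name__ == "__main__":
    ratio = disk_usage_ratio("/")
    if ratio > DISK_THRESHOLD:
        send_alert("Disk usage alert", f"Root filesystem at {ratio:.0%}")
```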
Confidential, Detroit, MI
Hadoop Developer
Responsibilities:
- Developed Pig UDFs to process the data for analysis (a UDF sketch follows this role's Environment line).
- Involved in loading data from the Linux file system to HDFS.
- Involved in running ad-hoc queries through Pig Latin, Hive, or Java MapReduce.
- Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analysed the imported data using Hadoop components.
- Proficient in using Cloudera Manager, an end-to-end tool to manage Hadoop services.
- Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
- Developed Hive queries for the analysts.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Experienced in defining job flows using Oozie.
- Developed shell scripts to perform data profiling on the ingested data with the help of Hive bucketing.
- Working knowledge of NoSQL databases like HBase and Cassandra.
- Generated property lists for every application dynamically using Python.
- Managed batch jobs using UNIX shell and Perl scripts.
- Used SVN and GitHub as version control tools.
Environment: JDK 1.6, HDFS, MapReduce, Spark, YARN, Hive, Pig, Sqoop, Flume, Oozie, Impala, Cloudera, NoSQL (HBase, Cassandra), Oracle 11g, Python, Shell scripting, Perl, Linux, SVN, GitHub.
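The Pig UDFs here were project-specific; below is a minimal hypothetical Python (Jython) UDF of the kind Pig can register, with the function and field names assumed for illustration.

```python
# udfs.py - a hypothetical Pig UDF written in Python (Jython).
# It would be registered from Pig Latin roughly like:
#   REGISTER 'udfs.py' USING jython AS myudfs;
#   cleaned = FOREACH raw GENERATE myudfs.normalize_state(state);

try:
    # Pig's Jython engine predefines outputSchema when it loads the script.
    outputSchema
except NameError:
    def outputSchema(schema):  # no-op fallback so the file also runs standalone
        def wrap(func):
            return func
        return wrap

@outputSchema("state:chararray")
def normalize_state(value):
    """Trim and upper-case a state code; return None for empty input."""
    if value is None:
        return None
    value = value.strip().upper()
    return value or None

if __name__ == "__main__":
    print(normalize_state("  la "))  # LA
```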
Confidential
Java/J2EE Developer
Responsibilities:
- Involved in different phases to gather requirements, document the functional specifications, design, data modeling and development of the applications.
- Developed J2EE front-end and back-end components supporting business logic, integration, and persistence.
- Used JSP with Spring Framework for developing User Interfaces.
- Developed the front-end user interface using J2EE, Servlets, JDBC, HTML, DHTML, CSS, XML, XSL, XSLT and JavaScript as per Use Case Specification.
- Integrated Security Web Services for authentication of users.
- Used Hibernate Object/Relational mapping and persistence framework as well as a Data Access abstraction Layer.
- Data Access Objects (DAO) framework is bundled as part of the Hibernate Database Layer.
- Designed Data Mapping XML documents that are utilized by Hibernate, to call stored procedures.
- Implemented web services to integrate different applications (internal and third-party components) using SOAP and RESTful services with Apache CXF.
- Developed and published web-services using SOAP.
- Developed efficient PL/SQL packages for data migration and involved in bulk loads, testing and reports generation.
- Developed complex SQL queries and stored procedures to process and store the data.
- Used CVS version control to maintain the Source Code.
Environment: Java, J2EE, JSPs, Struts, EJB, Spring, RESTful, SOAP, Apache CXF, WebSphere, PL/SQL, Hibernate, HTML, XML, Oracle 9i, Swing, JavaScript, CVS.