Big Data Engineer Resume
Wilmington, DE
SUMMARY:
- 8+ years of strong experience in software development using Big Data, Hadoop, Apache Spark, Java/J2EE, Python & Scala technologies.
- Strong hands-on experience with Spark Core, Spark SQL, Spark Streaming and Spark ML (Machine Learning).
- Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing.
- Sound knowledge of using Apache Solr to search against structured and unstructured data.
- Solid understanding of RDD operations in Apache Spark i.e., Transformations & Actions, Persistence (Caching), Accumulators, Broadcast Variables, Optimizing Broadcasts.
- Work experience with cloud infrastructure such as Amazon Web Services (AWS).
- Good expertise in using AWS services like EMR, EC2 and S3 to run Apache Spark development and production jobs.
- Experience in pulling data from Amazon S3 cloud to HDFS & vice versa.
- Used Amazon Kinesis Video Streams to capture, process, and store video streams for analytics and machine learning.
- Experienced in running queries using Impala, and used BI tools to run ad-hoc queries directly on Hadoop.
- Well versed in workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Used ZooKeeper with a distributed HBase deployment for cluster configuration and management.
- Developed various MapReduce and Spark applications to perform ETL workloads on terabytes of data.
- Experience in managing and reviewing Hadoop log files.
- Good knowledge of integrating Talend with Hadoop.
- Good working experience with NoSQL databases such as HBase, Cassandra and MongoDB.
- Experience in working with Flume to load log data from multiple sources directly into HDFS.
- Experience with the Avro data serialization system.
- Experience in fine-tuning MapReduce jobs for better scalability and performance.
- Experience in installing, configuring and managing Hadoop clusters and data science tools.
- Excellent Java development skills using J2EE, Servlets and JUnit, and familiar with popular frameworks such as Spring, MVC and AJAX.
- Extensive experience in PL/SQL, developing stored procedures with optimization techniques.
- Adept at web development, with experience in developing front-end applications using JavaScript, CSS and HTML.
- Expertise in Waterfall and Agile (Scrum) methodologies.
- Proficient with Core Java, AWT and with mark-up languages and related technologies such as HTML, XHTML, DHTML, CSS, XML, XSL, XSLT, XPath, XQuery and AngularJS.
- Worked with version control systems such as Subversion and Git to provide a common platform for all the developers.
- Quick learner with a strong desire to master new technologies.
- Highly motivated, dedicated and hardworking, with strong analytical and logical development skills.
TECHNICAL SKILLS:
Languages & Scripting: Python, Scala, Java & JavaScript
Big Data Frameworks: Apache Hadoop, Apache Spark, Hive, Impala, Avro, Oozie, Sqoop, Zookeeper, HBase, Flume, Kafka, MongoDB, Cassandra
Cloud Technologies: Amazon EC2, S3, EMR, DynamoDB, Lambda, Kinesis
IDEs & Utilities: Eclipse, NetBeans, Mahout, Log4j
Database/RDBMS: MySQL, MS SQL Server, DB2, Oracle 11g/10g/9
Web Development: HTML, XML, JavaScript, AJAX, SOAP, WSD
Operating Systems: Unix, Linux, Windows, Mac
Version Control: Git, SVN, WinCVS, VSS
BI Tools: Tableau, Pentaho
PROFESSIONAL EXPERIENCE:
Confidential, Wilmington, DE
Big Data Engineer
Responsibilities:
- Interacted with multiple teams to understand their business requirements and design flexible, common components.
- Validated the source files for data integrity and data quality by reading header and trailer information and performing column validations.
- Used Spark SQL to create DataFrames and performed transformations on them, such as adding schemas manually, casting and joining, before storing them.
- Used Spark SQL to access Hive tables from Spark for faster processing of data.
- Worked on Spark Streaming with Apache Kafka for real-time data processing.
- Used Hive to perform transformations, joins, filters and some pre-aggregations before storing the data in HDFS.
- Created external Hive tables to store the loaded data.
- Used Sqoop for importing and exporting data between Netezza, Teradata and HDFS/Hive.
- Worked with three data storage layers: raw, intermediate and publish.
- Applied optimization techniques including partitioning and bucketing.
- Used the Avro file format compressed with Snappy in intermediate tables for faster processing of data.
- Used Kinesis Data Analytics to analyse data streams with SQL.
- Used the Parquet file format for published tables and created views on the tables.
- Created Sentry policy files to give business users access to the required databases and tables through Impala.
- Automated the jobs with Oozie and scheduled them with Autosys.
- Used Amazon AWS to spin up EMR clusters to process large volumes of data stored in Amazon S3 and push it to HDFS.
- Participated in evaluation and selection of new technologies to support system efficiency.
- Participated in development and execution of system and disaster recovery processes.
Environment: Hadoop, Cloudera, Amazon AWS, HDFS, Hive, Impala, Apache Spark, Autosys, Kafka, DynamoDB, Lambda, S3, SQS, SNS, Sqoop, Java, Scala, Eclipse, Tableau, Teradata, UNIX, Maven, SBT.
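Illustrative sketch: the header/trailer and column validation mentioned in the responsibilities above might look roughly like the following. This is a minimal Python example, not the code used on the project. The record layout (a header line starting with "H|", a trailer line of the form "T|<record_count>", pipe-delimited detail records) and the function name validate_feed are assumptions made purely for illustration.

```python
from typing import List


def validate_feed(lines: List[str], expected_columns: int, delimiter: str = "|") -> List[str]:
    """Return a list of validation errors; an empty list means the file passed."""
    errors: List[str] = []
    if len(lines) < 2:
        return ["file must contain at least a header and a trailer"]

    header, *details, trailer = lines

    # Hypothetical layout: header is "H|<file_date>", trailer is "T|<record_count>".
    if not header.startswith("H" + delimiter):
        errors.append("missing or malformed header record")

    trailer_parts = trailer.split(delimiter)
    if trailer_parts[0] != "T" or len(trailer_parts) != 2 or not trailer_parts[1].isdigit():
        errors.append("missing or malformed trailer record")
    elif int(trailer_parts[1]) != len(details):
        errors.append(
            f"trailer count {trailer_parts[1]} does not match {len(details)} detail records"
        )

    # Column validation: every detail record must have the expected number of fields.
    for line_no, record in enumerate(details, start=2):
        field_count = len(record.split(delimiter))
        if field_count != expected_columns:
            errors.append(
                f"line {line_no}: expected {expected_columns} fields, found {field_count}"
            )

    return errors
```

Calling validate_feed(["H|2024-01-31", "1|alice|10.5", "T|1"], expected_columns=3) returns an empty list, while a mismatched trailer count or a short record adds one error message per problem.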
Confidential, Irving, TX
Sr. Hadoop/Spark Developer
Responsibilities:
- Implemented data loading using Spark, Storm, Kafka and Elasticsearch.
- Stored data in AWS S3 in the same way as in HDFS and ran EMR jobs on the data stored in S3.
- Exported the analysed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Migrated MapReduce jobs to Spark jobs to achieve better performance.
- Designed and implemented real-time applications using Apache Storm, Storm Trident and Kafka.
- Used the Spark DataFrame API to process structured and semi-structured files and load them back into an S3 bucket.
- Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
- Created Hive external tables, loaded data into them and queried the data using HQL.
- Responsible for managing data coming from different sources.
- Good knowledge of cloud integration with Amazon Elastic MapReduce (EMR).
- Experience in integrating Cassandra with Elasticsearch and Hadoop.
- Extensive experience in Spark Streaming through the core Spark API, running Scala, Java & Python scripts to transform raw data from several data sources into baseline data.
- Hands-on expertise in running Spark & Spark SQL on Amazon EMR.
- Implemented Spark batch jobs on AWS EC2 instances through Amazon Simple Storage Service (Amazon S3).
- Tuned Spark Streaming performance, e.g. by setting the right batch interval and the correct level of parallelism, and by selecting the correct serialization and memory settings.
- Created HBase tables to store variable data formats of input data coming from different portfolios.
- Involved in loading large volumes of data into HBase rows and columns.
- Used Spark API over Hadoop YARN to perform analytics on data in Hive.
- Developed Spark code using Scala and Spark SQL for batch processing of data.
- Involved in the requirements and design phases of a streaming Lambda Architecture for real-time streaming using Spark and Kafka.
- Hands-on experience working with NoSQL databases such as HBase.
- Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
Environment: Hadoop, MapReduce, Spark, Spark SQL, Kafka, Storm, HDFS, Hive, Sqoop, Oozie, Java, SQL, Shell script.
Confidential, Chicago, IL
Sr. Java Developer
Responsibilities:
- Development involved technologies such as Java, J2EE (Multithreading, JNDI, XML parsers), JSP, Servlets and Spring MVC.
- Implemented business logic using Session Beans, Servlets and stored procedures.
- Created user-friendly GUIs and web pages using HTML and DHTML embedded in JSP.
- Deployed to different environments such as DB-DEV, DEV and Stable, and worked with the testing team on staging deployments.
- Developed JSP pages and Controller, Helper & Validator classes for the application.
- Designed customized GUI pages based on the business requirements.
- Worked with client-side and server-side validations for the web pages developed.
- Prepared all related documentation (Master Test Plan and Test Project Schedule).
- Worked closely with Application Architects, Business Analysts and Project Managers to discuss business requirements and the overall status of assigned projects.
- Used JIRA for defect tracking and for raising defects of different severities.
- Worked on cross-browser compatibility issues for Chrome, Firefox, Safari and IE 11.
- Worked in an Agile (Scrum) environment and used Git (through Git Bash) for version control.
- Worked on REST web services to specify the business behaviours and backend calls.
- Opened CRQs to deploy the application in Stable and Production environments.
- Deployed the application in Development, Stable and Production environments and validated the application upon successful deployment.
- Performed manual testing on the changes made to the application.
Environment: SoapUI 5.0, Spring Framework, JIRA, Maven, Crucible, Confluence, HTML/HTML5, CSS/CSS3, XML, JavaScript, XPath, XSLT, Web Services, SQL Server 2008, Command Editor, Hibernate ORM, Java 7 & 8, JEE, Multithreading, JAX-RS, JSON, GitHub, JUnit, IntelliJ.
Confidential
Sr. Java Developer
Responsibilities:
- Involved in the analysis, design, development and testing phases of the application.
- Involved in creating and maintaining the backend services using Spring, Hibernate, SQL Server and Oracle.
- Developed web pages using JSPs with tag libraries, HTML and JavaScript.
- Wrote J2EE code using Spring and Hibernate to upload input CSV files for credit risk data.
- Implemented the Dependency Injection (IoC) feature of the Spring framework to inject dependencies into objects, and used AOP for logging.
- Designed and developed the persistence layer on an ORM framework using Hibernate.
- Implemented various design patterns such as Business Delegate, Data Transfer Object (DTO), Service Locator, Session Facade and Data Access Object (DAO).
- Involved in writing SQL, stored procedures and PL/SQL for the back end.
- Used views and functions on the Oracle database side.
- Developed various documents within the application using XML, with Eclipse as the IDE.
- Developed SOAP requests to interact with the billing schedule system.
- Used web services (SOAP & WSDL) to send information or remote procedure calls encoded as XML.
- Integrated and deployed the application on WebLogic application server using Ant.
- Developed user interfaces for presenting the expense reports and transaction details using JSP, XML, HTML and JavaScript.
- Used Log4j for logging the application exceptions and debugging statements.
- Proficient in doing Object Oriented Design using UML-Rational Rose.
Environment: Java, JSP, Servlets, WebSphere Application Server, Eclipse, JavaScript, Web Services (SOAP & WSDL), Microsoft VSS, Oracle, PL/SQL and JDBC.
Confidential
Java/J2EE Developer
Responsibilities:
- Maintained the UI screens using web technologies such as HTML, JavaScript, jQuery and CSS.
- Involved in requirements analysis, design, development and testing.
- Designed, deployed and tested multi-tier applications using Java technologies.
- Involved in front end development using JSP, HTML and CSS.
- Documented the changes for future development projects.
- Responsible for writing the service classes and utility APIs used across the framework.
- Used Axis to implement Web Services for integration of different systems.
- Used a MySQL database to store data and execute SQL queries on the backend.
- Exposed various capabilities as Web Services using SOAP/WSDL.
- Used SoapUI to test the RESTful web services by sending SOAP requests.
- Used Hibernate as the persistence framework, mapping objects to tables using Hibernate annotations.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
- Used JDBC connections to store and retrieve data from the database.
- Developed web services components using XML, WSDL and SOAP with a DOM parser to transfer and transform data between applications.
- Involved in production support: monitoring server and error logs, foreseeing potential issues and escalating them to higher levels.
Environment: Java, J2EE, JSP, Servlets, Spring, Custom Tags, Java Beans, JMS, Hibernate, IBM MQ Series, Ajax, JUnit, JNDI, Oracle, XML, SAX, Rational Rose, UML.
