
Sr. Big Data Engineer Resume


Chicago, IL

SUMMARY:

  • Over 8 years of experience in the analysis, development, integration, and testing of applications using Java/J2EE and Big Data technologies.
  • Hands-on experience with the Hadoop ecosystem, including Spark, Kafka, HBase, Scala, Pig, Impala, Sqoop, Oozie, Flume, and Storm; worked with Spark SQL, Spark Streaming, and the core Spark API to build data pipelines.
  • Extensive programming experience in developing web-based applications using Java 6/7/8, J2EE 1.4/1.5/1.6/1.7, JSP, Servlets 2.4/3.1/4.0, EJB, Struts 1.x/2.x, Spring 4.2/3.2, Hibernate 4.3/4.2/4.1/3.5, JDBC, JavaScript, HTML, JavaScript libraries, and web services.
  • Very good knowledge of and experience with Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing of big data analytics workloads.
  • Around 4 years of experience in big data using the Hadoop framework and related technologies such as HDFS, HBase, MapReduce, Hive, Pig, Impala, Flume, Oozie, Sqoop, and Spark, as well as EC2 cloud computing with AWS.
  • Good experience in Tableau for data visualization and analysis of large data sets, drawing various conclusions; leveraged and integrated Google Cloud Storage and BigQuery applications, which connected to Tableau for end-user web-based dashboards and reports.
  • Strong hands-on experience with AWS services, including but not limited to EMR, S3, EC2, Route 53, RDS, ELB, DynamoDB, and CloudFormation.
  • Strong experience in UI and client-side validation using HTML5, CSS3, JavaScript, JSP, AJAX, JSON, XML, XSLT, and JavaScript frameworks like jQuery.
  • Experienced in designing and developing applications in Spark using Scala and comparing the performance of Spark with Hive and SQL/Oracle.
  • Expertise in data development on the Hortonworks HDP platform and Hadoop ecosystem tools such as Hadoop, HDFS, Spark, Zeppelin, Hive, HBase, Sqoop, Flume, Atlas, Solr, Pig, Falcon, Oozie, Hue, Tez, Apache NiFi, and Kafka.
  • Experienced in developing MapReduce programs on Apache Hadoop for working with big data.
  • Good knowledge of and experience with AWS services such as EMR and EC2; successfully loaded files into HDFS from Oracle, SQL Server, Teradata, and Netezza using Sqoop.
  • Experience in installing, configuring, supporting, and managing the Cloudera Hadoop platform, including CDH4 and CDH5 clusters.
  • Experienced in data analysis using Hive, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Experienced in writing complex MapReduce programs that work with different file formats such as Text, Sequence, XML, Parquet, and Avro.
  • Experience with data formats such as JSON, Avro, Parquet, RC, and ORC, and compression codecs such as Snappy and bzip2.
  • Worked extensively with Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS, and VPC.
  • Experienced in using various modules of the Spring Framework, such as ORM support, Dependency Injection, Spring MVC, and Spring JDBC, and in integrating with Struts and Hibernate.
  • Excellent understanding of Hadoop (Gen-1 and Gen-2) and its various components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and ResourceManager (YARN).
  • Good knowledge of AWS CloudFormation templates; configured the SQS service through the Java API to send and receive information.
  • Experienced in implementing Object-Relational Mapping (ORM) frameworks like Hibernate for persistence.
  • Involved in producing and consuming SOAP-based and RESTful web services using WSDL, SOAP, JAX-WS, JAX-RS, and Axis.
  • Experience in importing and exporting data using Sqoop from HDFS to relational database systems (RDBMS) and from RDBMS to HDFS.
  • Experienced in implementing RESTful web services using Spring's REST support.
  • Expertise in DB design, DB normalization, and writing SQL queries and PL/SQL stored procedures, functions, triggers, sequences, indexes, and views.
  • Ability to spin up different AWS instances, including EC2-Classic and EC2-VPC, using CloudFormation templates.
  • Experienced in big data technologies such as Hadoop, Pig, Hive, UNIX, R, Azure, and core Java.
  • Expertise in unit testing with JUnit test cases; used Mockito for stubbing.
  • Proficient in developing web services and related technologies and frameworks (WSDL, SOAP, REST, JAX-WS, JAXB, JAX-RPC, Axis, Jersey, SOAP UI) and in generating clients using Eclipse for web service consumption.
  • Experienced in performing validations using the jQuery Validator and Hibernate Validator.
  • Expertise in using J2EE application servers such as JBoss 4.2, WebLogic 8.1/9.2/10.3.3, and WebSphere 8.5, and web servers such as Tomcat 6.x/7.x.
  • Experience in all stages of the SDLC (Agile, Waterfall): writing technical design documents, development, testing, and implementation of enterprise-level data marts and data warehouses.
  • Experienced in creating Use Case models, Use Case diagrams, Class diagrams, and Sequence diagrams, and in working with continuous integration tools like Jenkins.
  • Experienced in load testing using JMeter; expertise in using IDEs such as MyEclipse, Eclipse 3.x, RAD, and IntelliJ.
  • Good knowledge of deploying WAR, JAR, and EAR files on WebLogic, WebSphere, and JBoss application servers in Linux/UNIX/Windows environments.
  • Expert in core Java concepts such as Collections, multithreading, serialization, and exception handling.
  • Experienced with build, deployment, and release of applications using Maven and Ant; good working knowledge of relational databases including DB2, Oracle, and MySQL.

TECHNICAL SKILLS:

Languages: JAVA, SQL, PL/SQL, Scala, Python.

J2EE Technologies: Servlets, JSP, Struts Framework, Spring, JSF, CORBA, Hibernate, Java Beans, JDBC, Java 1.6

Web Technologies: JavaScript, JQuery, XML, DOM, CSS, HTML, Restful web services.

Frameworks: Spring MVC, Struts, J2EE Design Patterns, Hibernate, GWT, Hadoop.

Application/Web servers: IBM WebSphere 5.x/4.x, SOA, WebLogic 8.x/7.x, JBoss, Tomcat.

Databases: Oracle 9i/10g/11g/12c, DB2, SQL Server 2000.

IDE Tools: IBM WebSphere Application Developer (WSAD), JBuilder, Eclipse, Visual Studio .NET 2003/2005/2008.

Operating Systems: UNIX, Windows NT/2000/95/98/Me/XP, Sun Solaris.

Tools and Utilities: Toad, SQL Navigator, SQL Loader, CVS, Maven, JUnit, Log4j, ANT, Macromedia tools, Jira.

Big Data Technologies: Hive, Pig, Sqoop, Cloudera, HDFS, MongoDB, NoSQL, HBase, Hadoop

AWS: AWS SDK, AWS EMR, S3, EC2, AWS Cloud.

PROFESSIONAL EXPERIENCE:

Confidential, Chicago IL

Sr. Big Data Engineer

Responsibilities:

  • Implemented solutions for ingesting data from various sources and processing the data at rest utilizing big data technologies such as Hadoop, MapReduce frameworks, HBase, and Hive.
  • Responsible for building scalable distributed data solutions using Hadoop; handled job management using the Fair Scheduler and developed job-processing scripts using Oozie workflows.
  • Used the Spark Streaming APIs to perform the necessary transformations and actions on the fly for building the common learner data model, which gets its data from Kafka in near real time and persists it into Cassandra (a minimal sketch follows this list).
  • Developed Spark scripts using Scala shell commands as required, and configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Performed data analysis, feature selection, and feature extraction using the Apache Spark machine learning and streaming libraries in Python.
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive, and developed Shell, Perl, and Python scripts to automate and provide control flow to Pig scripts.
  • Developed Scala scripts and UDFs using both DataFrames/SQL/Datasets and RDDs/MapReduce in Spark 1.6 for data aggregation and queries, writing data back into the OLTP system through Sqoop.
  • Involved in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to collect and analyze the logs produced by the Spark cluster.
  • Worked on setting up and configuring AWS EMR clusters, and used Amazon IAM to grant users fine-grained access to AWS resources.
  • Evaluated deep learning algorithms for text summarization using Python, Keras, TensorFlow, and Theano on a Cloudera Hadoop system.
  • Developed data pipeline programs with the Spark Python APIs and data aggregations with Hive, and formatted data (JSON) for visualization, generating Highcharts views such as outlier, data distribution, and correlation/comparison charts.
  • Involved in handling large datasets using partitions, Spark in-memory capabilities, broadcasts, and effective and efficient joins and transformations during the ingestion process itself.
  • Performed data cleaning, feature scaling, and feature engineering using the Pandas and NumPy packages in Python.
  • Designed, developed, and maintained data integration programs in Hadoop and RDBMS environments, working with both traditional and non-traditional source systems as well as RDBMS and NoSQL data stores for data access and analysis.
  • Worked on a POC comparing the processing time of Impala with Apache Hive for batch applications, with a view to adopting Impala in the project.
  • Developed Spark RDD transformations, actions, DataFrames, case classes, and Datasets for the required input data, and performed the data transformations using Spark Core.
  • Responsible for developing a data pipeline on Amazon AWS to extract data from weblogs and store it in HDFS; worked extensively with Sqoop for importing metadata from Oracle.
  • Involved in creating Hive tables, loading and analyzing data using Hive queries, and developing Hive queries to process the data and generate data cubes for visualization.
  • Implemented schema extraction for the Parquet and Avro file formats in Hive, and worked with Talend Open Studio to design ETL jobs for data processing.
  • Used AWS Data Pipeline to schedule an Amazon EMR cluster to clean and process web server logs stored in an Amazon S3 bucket.
  • Implemented partitioning, dynamic partitions, and buckets in Hive, and worked with continuous integration of the application using Jenkins.
  • Worked on mainframe decommissioning: wrote COBOL-equivalent programs in Python, Hive, Pig, Spark Core, Hive on Spark SQL, and UNIX shell scripting.
  • Used reporting tools like Tableau, connected to Hive, for generating daily data reports.
  • Collaborated with the infrastructure, network, database, application, and BI teams to ensure data quality and availability.
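For illustration, a minimal sketch of the Kafka-to-Spark Streaming-to-Cassandra flow described above, using Spark 1.6-era Java APIs (spark-streaming-kafka and the DataStax Spark Cassandra connector). The broker address, topic, keyspace/table names, and the LearnerEvent record layout are illustrative assumptions, not details from the project:

```java
// Minimal sketch (assumptions noted above): Kafka -> Spark Streaming -> Cassandra,
// using Spark 1.6-era APIs and Java 8 lambdas.
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

import com.datastax.spark.connector.japi.CassandraStreamingJavaUtil;
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class LearnerEventStream {

  // Assumed learner data model, mapped to an analytics.learner_events table.
  public static class LearnerEvent implements Serializable {
    private String id;
    private String eventType;
    private long ts;
    public LearnerEvent() {}
    public LearnerEvent(String id, String eventType, long ts) {
      this.id = id; this.eventType = eventType; this.ts = ts;
    }
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public String getEventType() { return eventType; }
    public void setEventType(String eventType) { this.eventType = eventType; }
    public long getTs() { return ts; }
    public void setTs(long ts) { this.ts = ts; }
  }

  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf()
        .setAppName("LearnerEventStream")
        .set("spark.cassandra.connection.host", "127.0.0.1"); // assumed Cassandra host
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

    Map<String, String> kafkaParams = new HashMap<>();
    kafkaParams.put("metadata.broker.list", "localhost:9092"); // assumed broker
    Set<String> topics = Collections.singleton("learner-events"); // assumed topic

    // Receiver-less "direct" stream of <key, value> records from Kafka.
    JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
        jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);

    // Transform raw comma-separated values into the learner data model.
    JavaDStream<LearnerEvent> events = stream.map(record -> {
      String[] f = record._2().split(",");
      return new LearnerEvent(f[0], f[1], Long.parseLong(f[2]));
    });

    // Persist each micro-batch into Cassandra.
    CassandraStreamingJavaUtil.javaFunctions(events)
        .writerBuilder("analytics", "learner_events", mapToRow(LearnerEvent.class))
        .saveToCassandra();

    jssc.start();
    jssc.awaitTermination();
  }
}
```

The direct (receiver-less) stream keeps a one-to-one mapping between Kafka partitions and RDD partitions, which is what makes the batch-interval and parallelism tuning mentioned above meaningful knobs.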

Environment: Hadoop YARN, Spark Core, Spark Streaming, Spark SQL, Scala, Python, Kafka, Hive, Sqoop, Amazon AWS, Elasticsearch, Impala, Cassandra, Tableau, Talend, Oozie, Jenkins, ETL, Data Warehousing, SQL, NiFi, Cloudera, Oracle 12c, Linux.

Confidential, Raleigh, NC

Sr. Big Data Engineer

Responsibilities:

  • Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs in Java for data cleaning and processing (a minimal sketch follows this list).
  • Worked on different modules in parallel, using a SCRUM (Agile) development environment with tight schedules.
  • Implemented RESTful web services using Spring that support JSON data formats, and consumed RESTful web services using jQuery AJAX.
  • Developed microservices using Python and Docker on an Ubuntu Linux platform with HTTP/REST interfaces, deployed into a multi-node Kubernetes environment.
  • Developed a data pipeline using Kafka, Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Migrated an in-house database to the AWS cloud; designed, built, and deployed a multitude of applications utilizing the AWS stack (including EC2 and RDS), focusing on high availability and auto-scaling.
  • Implemented Spark using Python and Spark SQL for faster testing and processing of data, and developed Spark scripts using PySpark shell commands as required.
  • Implemented the persistence layer using the Hibernate (ORM) framework and integrated it with the Spring service layer.
  • Involved in building all domain pipelines using Spark DataFrames and Spark batch processing.
  • Developed Spark SQL to load tables into HDFS and run select queries on top of them, and developed Spark and Spark SQL/Streaming code for faster testing and processing of data.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs in Scala and Python.
  • Developed an API using AWS Lambda to manage the servers and run the code in AWS.
  • Worked with Spring Core, Spring AOP, and the Spring Integration Framework with JDBC.
  • Developed the GUI using HTML, XHTML, AJAX, CSS3, and JavaScript (jQuery), and used CSS extensively to modify the layout and design of the web pages.
  • Involved in support for Amazon AWS and RDS to host static/media files and the database in the Amazon cloud.
  • Used MongoDB (NoSQL) on an AWS Linux instance in parallel with RDS MySQL for storage and analysis, and migrated an existing on-premises application to AWS, using AWS services such as EC2 and S3 for small data sets.
  • Consumed web services called via the JAX-RPC SOAP protocol, WSDL descriptor files, and the Universal Description, Discovery and Integration (UDDI) registry.
  • Updated Python scripts to match training data with our database stored in AWS CloudSearch, so that we could assign each document a response label for further classification.
  • Used an Oracle 12c/10g database and SQL to perform data mapping and backend testing, and documented all the SQL queries for future testing purposes.
  • Developed Spark code in Scala and Python for a regular-expression (regex) project in the Hadoop/Hive environment, on Linux/Windows, for big data resources.
  • Created Avro and Parquet Hive tables with Snappy compression, loaded data, and wrote Hive queries that invoke and run MapReduce tasks in the backend.
  • Wrote Apache Pig scripts to process the HDFS data and analyzed large structured datasets using Hive's data warehousing infrastructure.
  • Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster processing of data, and worked on Apache Spark with Python to develop and execute big data analytics.
  • Designed a NoSQL schema to store data in MongoDB and imported data from MySQL using Sqoop.
  • Implemented the user interface using HTML5, JavaScript/jQuery, and Backbone, and minified scripts using Grunt.
  • Loaded log data into HDFS using Flume, and worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Involved in NoSQL database design, integration, and implementation; loaded data into the NoSQL database HBase, integrated the Google API in the application to render Google Maps, and used AWS Cognito for web identity federation.
  • Developed a workflow in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
  • Deployed the application on a WebSphere server using BART, and wrote Sqoop scripts to transfer data between HDFS and RDBMS.
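For illustration, a minimal sketch of the kind of map-only MapReduce cleaning job described above. The pipe-delimited record layout, the five-field validity rule, and the normalization step are illustrative assumptions:

```java
// Minimal sketch: a map-only MapReduce job that drops malformed records and
// normalizes the rest before writing back to HDFS.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanRecordsJob {

  public static class CleanMapper
      extends Mapper<LongWritable, Text, NullWritable, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      // Assumed pipe-delimited records with exactly five fields.
      String[] fields = value.toString().split("\\|", -1);
      if (fields.length != 5 || fields[0].trim().isEmpty()) {
        ctx.getCounter("clean", "dropped").increment(1); // count bad records
        return;                                          // map-side filter
      }
      // Normalize: trim whitespace and lower-case the join key.
      StringBuilder out = new StringBuilder(fields[0].trim().toLowerCase());
      for (int i = 1; i < fields.length; i++) {
        out.append('|').append(fields[i].trim());
      }
      ctx.write(NullWritable.get(), new Text(out.toString()));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "clean-records");
    job.setJarByClass(CleanRecordsJob.class);
    job.setMapperClass(CleanMapper.class);
    job.setNumReduceTasks(0); // map-only: cleaned records go straight to HDFS
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Setting the reducer count to zero skips the shuffle entirely, which is the usual choice for record-at-a-time cleaning where no aggregation is needed.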

Environment: Java/J2EE, Hibernate, Spring 4.2, RESTful web services, Sqoop, WebSphere Server, HDFS, DB2, HTML5, CSS3, JSP, Hadoop, Spark (Scala/Python), AWS, AWS Lambda, EC2, S3, Hive, Pig, MapReduce, Impala, Cloudera, NoSQL, HBase, Shell Scripting, Linux, MySQL, Apache Kafka, AJAX, jQuery, JUnit, JDK 1.7, RAD IDE.

Confidential, Chicago IL

Sr. Java/Big Data Engineer

Responsibilities:

  • Developed a J2EE application based on a Service-Oriented Architecture.
  • Implemented the Rating and Policy RESTful web services using Spring 3.0, supporting the XML and JSON data formats.
  • Designed and developed MapReduce modules, with Java as the programming language, on highly unstructured and semi-structured data.
  • Handled the import of data from various data sources, performed transformations using MapReduce and Spark, and loaded data into HDFS.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs, and used design patterns such as Singleton, Factory, Session Facade, and DAO.
  • Designed and developed the framework to consume the web services hosted on Amazon EC2 instances.
  • Used Spring bean inheritance to derive beans from already developed parent beans.
  • Used Impala to read, write, and query Hadoop data in HDFS, from HBase, or from Cassandra, and configured Kafka to read and write messages from external programs.
  • Imported data from different sources such as HDFS/HBase into Spark RDDs, loaded files into HDFS, and wrote Hive queries to process the required data.
  • Ran MySQL database queries from Python using the Python-MySQL connector and the MySQLdb package to retrieve information.
  • Developed screens using jQuery, JSP, JavaScript, AJAX, ExtJS, HTML5, and CSS.
  • Involved in full-lifecycle object-oriented application development: object modeling, database mapping, and GUI design.
  • Used the AWS SDK to connect to Amazon S3 buckets, which served as the object storage service for storing and retrieving the media files related to the application.
  • Developed Python web services for processing JSON and interfacing with the data layer.
  • Involved in loading data into HBase using the HBase shell, the HBase client API, Pig, and Sqoop (a minimal client API sketch follows this list), and in continuously monitoring and managing the Hadoop cluster through Cloudera Manager.
  • Designed and modified database tables, and used HBase queries to insert and fetch data from tables.
  • Created microservices using AWS Lambda and API Gateway with a REST API.
  • Performed job functions using the Spark APIs in Scala for real-time analysis and for fast querying purposes.
  • Used Spring JDBC templates in the DAO layer for some reporting queries.
  • Implemented an MVC architecture for front-end development using the Spring MVC Framework, and implemented the backend using various Spring Framework modules.
  • Developed different components of the Hadoop ecosystem process involving MapReduce and Hive.
  • Implemented the user interface using HTML5, JSP, CSS3, and JavaScript/jQuery, performed validations using JavaScript libraries, and was responsible for managing data files in formats such as Avro, JSON, and CSV.
  • Involved in multi-tiered J2EE design utilizing an MVC architecture (Struts Framework) and Hibernate.
  • Worked in a SCRUM (Agile) development environment with tight schedules.
  • Stored results directly from Hue into HDFS, into a new table, or in .csv/.tsv/.txt format.
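For illustration, a minimal sketch of writing and reading a row with the HBase client API (HBase 1.x-style Connection/Table interfaces), as referenced above. The table name, the "d" column family, and the qualifiers are illustrative assumptions:

```java
// Minimal sketch: one Put and one Get through the HBase client API.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PolicyStore {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from classpath
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("policies"))) { // assumed table

      // Put: one row keyed by policy id, with two columns in family "d".
      Put put = new Put(Bytes.toBytes("policy-1001"));
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("ACTIVE"));
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("premium"), Bytes.toBytes("250.00"));
      table.put(put);

      // Get: fetch the row back and read a single cell.
      Result row = table.get(new Get(Bytes.toBytes("policy-1001")));
      String status = Bytes.toString(row.getValue(Bytes.toBytes("d"), Bytes.toBytes("status")));
      System.out.println("status = " + status);
    }
  }
}
```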

Environment: J2EE, Hibernate 4.2, Spring 3.0, RESTful web services, WebSphere Server 8.5, Oracle 11g, Hive, AJAX, web services, HDFS, Pig, SOAP, XML, Java Beans, AWS Lambda, AWS SDK, S3, Hadoop, MapReduce, YARN, HBase, Oozie, Sqoop, Flume, Core Java, Cloudera HDFS, Eclipse, HTML5, JSP, CSS3, jQuery, JUnit, JMeter, JDK 1.7, Cloudera, IBM ClearCase, Eclipse IDE.

Confidential, Norcross, GA

Java/J2EE Developer

Responsibilities:

  • Responsible for implementing RESTful web services using Spring; JAXB was used for marshalling and unmarshalling Java objects (a minimal controller sketch follows this list).
  • Developed applications using Java 1.7, Spring MVC, Apache Camel, and Oracle.
  • Designed and implemented the application using Spring, JNDI, Spring IoC, Spring annotations, Spring AOP, Spring transactions, JDBC, SQL, JMS, Oracle, and WebLogic.
  • Developed the necessary front-end user interfaces in JSPs, HTML5, JavaScript, XQuery, CSS3, and AngularJS, integrated via the Spring MVC Framework.
  • Used the Hibernate Query Language (HQL) for accessing data from the database, and created Hibernate POJOs mapped using Hibernate annotations.
  • Developed front-end content using FTL, HTML, and CSS, and client-side validations using JavaScript.
  • Used web services (SOAP) for the transmission of large blocks of XML data over HTTP.
  • Developed Data Access Objects to access middleware web services.
  • Wrote the Hibernate configuration file and Hibernate mapping files, and defined persistence classes to persist data into the Oracle database.
  • Used Spring Framework modules: Spring IoC, Spring AOP, Spring Boot, and Spring JDBC.
  • Involved in designing test plans and test cases and in the overall unit testing of the system.
  • Responsible for overseeing performance testing and analyzing the test results.
  • Used JAX-WS web services to interact with other applications using SOAP and WSDL files.
  • Worked with technologies and standards including UML, HTML, XML, XSLT, and SQL stored procedures.
  • Used the Struts and Spring frameworks for backend development; the work involved design, implementation, and coding in XML, Java Servlets, JSP, and JavaScript.
  • Used Eclipse Juno as the IDE, deployed into the JBoss 6.3 application server, and used the Maven build tool to add functionality to the build process.
  • Implemented the front-end application using the Struts 2 MVC architecture.
  • Implemented MBeans to manage the application during production maintenance.
  • Used SVN for source control management and the Tomcat server for deployment.
  • A modified Agile/Scrum methodology was used for the development of this application.
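For illustration, a minimal sketch of a Spring MVC REST endpoint whose JAXB-annotated model can be served as XML (via Spring's JAXB message converter) or JSON (via Jackson, if on the classpath), in the spirit of the services described above. The /orders resource and the Order fields are illustrative assumptions:

```java
// Minimal sketch: Spring MVC controller + JAXB-annotated response model.
import javax.xml.bind.annotation.XmlRootElement;

import org.springframework.http.MediaType;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
@RequestMapping("/orders")
public class OrderController {

  // JAXB annotation lets Spring marshal this model to XML; Jackson handles JSON.
  @XmlRootElement
  public static class Order {
    private String id;
    private double amount;
    public Order() {}
    public Order(String id, double amount) { this.id = id; this.amount = amount; }
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public double getAmount() { return amount; }
    public void setAmount(double amount) { this.amount = amount; }
  }

  // Content negotiation picks XML or JSON based on the client's Accept header.
  @RequestMapping(value = "/{id}", method = RequestMethod.GET,
      produces = { MediaType.APPLICATION_JSON_VALUE, MediaType.APPLICATION_XML_VALUE })
  @ResponseBody
  public Order getOrder(@PathVariable("id") String id) {
    return new Order(id, 99.95); // a real service would look this up
  }
}
```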

Environment: Java, J2EE, Spring Framework 4.0, WebSphere 7.0, Maven, Log4j, SVN, JUnit, Tomcat, WebLogic, Oracle 11g, Spring Batch, XML, HTML5, CSS, JSP, JSON, AJAX, JMS, JPA, JVM, JDK 1.8, jQuery, Angular 2, Node.js, Eclipse, SOAP, SOA, Hibernate, AWS (S3, EC2, VPC, Cloud Deploy, EBS, Auto Scaling), Eclipse IDE.

Confidential, Phoenix, AZ

Java Developer

Responsibilities:

  • Worked on designing and developing multi-tier enterprise-level web applications using various J2EE technologies, including Servlets 2.x, JDBC, Apache Ant 1.5, HTML, XHTML, DHTML, CSS, JavaScript, JSP, and XML.
  • Implemented the “Work Flow” for the “Approval Process of RMA Request” using Spring Framework dependency injection.
  • Implemented DAOs using Spring JDBC support to interact with the RMA database (a minimal DAO sketch follows this list); the Spring Framework was used for transaction handling.
  • Involved in designing front-end screens using JavaScript, JSP, JSF, AJAX, HTML, CSS, and DHTML.
  • Involved in designing and developing with object-oriented methodologies using UML; created Use Case, Class, and Sequence diagrams.
  • Developed data access classes using Hibernate.
  • Involved in writing stored procedures, functions, and triggers.
  • Used HTML, XHTML, DHTML, JavaScript, AJAX, jQuery, XML, XSLT, XPath, JSP, and tag libraries to develop view pages.
  • Implemented EJB components using stateless and stateful session beans.
  • Used various web/application servers, including Apache Tomcat, WebLogic, WebSphere, and JBoss.
  • Created XML-based configuration and property files for the application, and developed parsers using the JAXP, SAX, and DOM technologies.
  • Developed the persistence layer using the Hibernate ORM to transparently store objects in the database.
  • Implemented the user interface using HTML, CSS, JSPs, and Struts 2 tags.
  • Applied design patterns including the MVC, Abstract Factory, DAO, and Singleton patterns.
  • Implemented SOAP web services using Axis for delivering returns data to an internal system.
  • Implemented the user interface using the Struts 2 MVC Framework, the Struts tag library, HTML, CSS, and JSP.
  • Worked with the WebLogic server for deployment.
  • Used JUnit for unit testing the application.
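For illustration, a minimal sketch of a DAO built on Spring's JdbcTemplate, as referenced above. The rma_request table, its columns, and the query shapes are illustrative assumptions:

```java
// Minimal sketch: a Spring JDBC DAO with a RowMapper for an assumed rma_request table.
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;

public class RmaRequestDao {

  // Assumed shape of an RMA request row.
  public static class RmaRequest {
    public long id;
    public String status;
  }

  private final JdbcTemplate jdbc;

  // The DataSource is injected by the Spring container.
  public RmaRequestDao(DataSource dataSource) {
    this.jdbc = new JdbcTemplate(dataSource);
  }

  // Maps one result-set row to the RmaRequest object.
  private static final RowMapper<RmaRequest> MAPPER = new RowMapper<RmaRequest>() {
    @Override
    public RmaRequest mapRow(ResultSet rs, int rowNum) throws SQLException {
      RmaRequest r = new RmaRequest();
      r.id = rs.getLong("id");
      r.status = rs.getString("status");
      return r;
    }
  };

  public List<RmaRequest> findByStatus(String status) {
    return jdbc.query("SELECT id, status FROM rma_request WHERE status = ?",
        MAPPER, status);
  }

  public int approve(long id) {
    return jdbc.update("UPDATE rma_request SET status = 'APPROVED' WHERE id = ?", id);
  }
}
```

JdbcTemplate handles connection acquisition, statement cleanup, and exception translation, leaving the DAO with just SQL and row mapping; transaction boundaries would sit in the Spring-managed service layer.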

Environment: Java 1.5, J2EE, Spring Framework 3.0, Struts 2, Axis web services, MySQL 5.1, JavaScript, Tomcat 6.0, HTML, CSS, JDK 1.6, SVN, Oracle, WebLogic server, Eclipse IDE.

Confidential

Java Developer

Responsibilities:

  • Developed Use Case diagrams, Class diagrams, and Sequence diagrams.
  • Implemented the web-based application, based on the Struts MVC framework, for multiple roles, providing different interfaces for the various types of administrators and users.
  • Used the Hibernate Framework for persisting data into the Oracle database, integrating it with the Struts 2 framework.
  • Implemented the view layer of the MVC framework using Struts, JSPs, the JSP Standard Tag Library (JSTL), HTML, and JavaScript.
  • Implemented front-end validations using JavaScript.
  • Implemented the statements module to support the PDF and Excel formats using the iText and Apache POI libraries, respectively.
  • Implemented DAOs using Hibernate 3.0 to extract data from the database (a minimal DAO sketch follows this list).
  • Responsible for using an AJAX framework with jQuery, Dojo, and ExtJS implementations for widgets and event handling.
  • Customized third-party vendor information using web services (SOAP and WSDL).
  • Developed the GUI using JSP and Spring Web Flow, following the Spring Web MVC pattern.
  • Implemented a web service client with Apache Axis for consuming services.
  • Used Maven scripts for builds.
  • Used Log4j for logging and debugging the application.
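For illustration, a minimal sketch of a Hibernate 3-style DAO like those described above. The Statement entity, its mapping, and the session-factory wiring are illustrative assumptions:

```java
// Minimal sketch: a Hibernate 3-style DAO with HQL and manual session handling.
import java.util.List;

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class StatementDao {

  private final SessionFactory sessionFactory; // configured via hibernate.cfg.xml

  public StatementDao(SessionFactory sessionFactory) {
    this.sessionFactory = sessionFactory;
  }

  // Fetch all statements for an account using HQL.
  @SuppressWarnings("unchecked")
  public List<Statement> findByAccount(long accountId) {
    Session session = sessionFactory.openSession();
    try {
      return (List<Statement>) session
          .createQuery("from Statement s where s.accountId = :accountId")
          .setParameter("accountId", accountId)
          .list();
    } finally {
      session.close();
    }
  }

  public void save(Statement statement) {
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    try {
      session.save(statement);
      tx.commit();
    } catch (RuntimeException e) {
      tx.rollback();
      throw e;
    } finally {
      session.close();
    }
  }
}

// Assumed mapped entity (hbm.xml mapping not shown).
class Statement {
  private Long id;
  private long accountId;
  public Long getId() { return id; }
  public void setId(Long id) { this.id = id; }
  public long getAccountId() { return accountId; }
  public void setAccountId(long accountId) { this.accountId = accountId; }
}
```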

Environment: Java 1.5, Struts 2, Hibernate 3.0, JDBC, HTML, JSP, JavaScript, JSTL, Eclipse IDE, Axis web services, WebSphere, Oracle.
