
Big Data Developer / Sr. Analyst Resume


Scottsdale, AZ

PROFESSIONAL SUMMARY:

  • Over 8 years of professional IT experience with Big Data technologies including Hadoop/YARN, Pig, Hive, HBase, MongoDB and Spark.
  • Hands on experience with Apache Spark, Spark SQL and Spark Streaming.
  • Worked with different distributions of Hadoop and Big Data technologies including Hortonworks and Cloudera.
  • Expertise in Big Data and Hadoop ecosystem components such as Flume, Hive, MongoDB, Sqoop, Oozie, Zookeeper and Kafka.
  • Well versed in developing and implementing MapReduce programs using Java and Python.
  • Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
  • Experience in NoSQL databases MongoDB and HBase.
  • Familiarity with real-time data streaming using Spark and Kafka.
  • Strong understanding of data warehouse concepts and ETL; data modeling experience using normalization, business process analysis, reengineering, dimensional data modeling, and physical and logical data modeling.
  • Experience in object-oriented programming with Java, including Core Java.
  • Experience in database design, entity relationships, database analysis, and programming SQL, PL/SQL, packages and triggers in Oracle and SQL Server on Windows and Linux.
  • Extensive experience working with Oracle, DB2, SQL Server and MySQL databases.
  • Major strengths include familiarity with multiple software systems, the ability to learn new technologies quickly and adapt to new environments, and being a self-motivated, focused team player with excellent interpersonal, technical and communication skills.

TECHNICAL SKILLS:

Big Data Technologies: Spark, Kafka, Hadoop, YARN, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Zookeeper and Cloudera.

Scripting Languages: Python, Shell

Programming Languages: Java, Scala, C, C++

Web Technologies: HTML, J2EE, CSS, JavaScript, JSP

Application Servers: IBM WebSphere Application Server, Apache Tomcat.

DB Languages: SQL, PL/SQL

Databases / ETL: Oracle 9i/10g/11g

NoSQL Databases: HBase, Cassandra, Elasticsearch, MongoDB, Phoenix

Operating Systems: Linux, UNIX

PROFESSIONAL EXPERIENCE:

Confidential, Scottsdale, AZ

Big Data Developer/Sr. Analyst

Responsibilities:

  • Worked on implementing logic to post application log messages to Kafka.
  • Built a Spark Streaming application that consumes log messages from the Kafka stream and indexes them into Elasticsearch (a sketch follows this list).
  • Wrote MapReduce jobs to generate reports on the number of activities created on a particular day from data dumped from multiple sources; the output was written back to HDFS.
  • Worked on Oozie workflows and cron jobs.
  • Provided cluster coordination services through Zookeeper.
  • Worked with Sqoop for importing and exporting data between HDFS and Oracle systems.
  • Designed a data warehouse using Hive. Created partitioned tables in Hive.
  • Developed Hive UDFs to pre-process the data for analysis.
  • Developed MapReduce jobs in Java for data processing after installing and configuring Hadoop and HDFS.
  • Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
  • Involved in scheduling Oozie workflow engine to run multiple Hive jobs.
  • Moved data from Hadoop to MongoDB using Bulk output format class.
  • Involved in the regular Hadoop Cluster maintenance such as patching security holes and updating system packages.
  • Automated the workflow using shell scripts.
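The Kafka-to-Elasticsearch path mentioned above can be sketched roughly as below. This is a minimal illustration only, assuming the spark-streaming-kafka-0-10 integration and the ES-Hadoop connector's JavaEsSparkStreaming helper; the broker address, topic name and index name are placeholders, not the project's actual values.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;
    import org.elasticsearch.spark.streaming.api.java.JavaEsSparkStreaming;

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    public class LogStreamToEs {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("log-stream-to-es")
                    .set("es.nodes", "es-host:9200");                    // placeholder Elasticsearch endpoint
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "kafka-host:9092");     // placeholder broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "log-consumer");
            kafkaParams.put("auto.offset.reset", "latest");

            // Direct stream from the application-log topic.
            JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                            Collections.singletonList("app-logs"), kafkaParams));

            // Index each JSON log message into a "logs/event" resource in Elasticsearch.
            JavaEsSparkStreaming.saveJsonToEs(stream.map(ConsumerRecord::value), "logs/event");

            jssc.start();
            jssc.awaitTermination();
        }
    }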

Environment: Hadoop, Spark, MapReduce, HDFS, Informatica, Hive, Zookeeper, Hortonworks, Oozie, Elasticsearch, Cassandra, Apache Phoenix.

Confidential, San Jose, CA

Big Data/ Hadoop Developer

Responsibilities:

  • Developed Spark SQL jobs that read data from the data lake using Hive transformations and save it to HBase.
  • Built a Java client responsible for receiving XML files via REST calls and publishing them to Kafka.
  • Built a Kafka + Spark Streaming job responsible for reading XML messages from Kafka and transforming them into POJOs using JAXB (a sketch follows this list).
  • Built a Spark + Drools integration that lets us develop Drools rules as part of the Spark Streaming job.
  • Built HBase DAOs responsible for querying the data that Drools needs from HBase.
  • Built logic to publish the output of Drools rules to Kafka for further processing.
  • Wrote MapReduce jobs to generate reports on the number of activities created on a particular day from data dumped from multiple sources; the output was written back to HDFS.
  • Worked on Oozie workflows and cron jobs.
  • Provided cluster coordination services through Zookeeper.
  • Worked with Sqoop for importing and exporting data between HDFS and RDBMS systems.
  • Designed a data warehouse using Hive. Created partitioned tables in Hive.
  • Developed Hive UDFs to pre-process the data for analysis.
  • Analyzed the data by running Hive queries and Pig scripts to understand artist behavior.
  • Developed MapReduce jobs in Java for data processing after installing and configuring Hadoop and HDFS.
  • Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
  • Involved in scheduling Oozie workflow engine to run multiple Hive and pig jobs.
  • Moved data from Hadoop to MongoDB using Bulk output format class.
  • Involved in the regular Hadoop Cluster maintenance such as patching security holes and updating system packages.
  • Automated the workflow using shell scripts.
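The XML-to-POJO step mentioned above can be sketched with plain JAXB as below. The Order class and its fields are hypothetical stand-ins for the actual payload schema; in the streaming job the unmarshal call would be applied to each Kafka message value.

    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.JAXBException;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.bind.annotation.XmlRootElement;
    import java.io.StringReader;

    public class XmlToPojo {

        // Hypothetical POJO for the XML payload; the real schema is not shown here.
        @XmlRootElement(name = "order")
        public static class Order {
            public String id;
            public double amount;
        }

        // Converts one XML document (received as a String from Kafka) into a POJO.
        static Order unmarshal(String xml) throws JAXBException {
            Unmarshaller u = JAXBContext.newInstance(Order.class).createUnmarshaller();
            return (Order) u.unmarshal(new StringReader(xml));
        }

        public static void main(String[] args) throws JAXBException {
            Order o = unmarshal("<order><id>42</id><amount>19.99</amount></order>");
            System.out.println(o.id + " -> " + o.amount);
        }
    }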

Environment: Hadoop, HDFS, Hive, Spark, Spark SQL, Spark Streaming, Kafka, HBase, MapReduce, Pig, Oozie, Sqoop, REST, OpenShift, Zookeeper, Cassandra, Drools.

Confidential, Walnut Creek, CA.

Hadoop Developer

Responsibilities:

  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Developed Sqoop jobs for extracting data from different databases, for both initial and incremental data loads.
  • Developed MapReduce jobs for cleaning up the ingested data, as well as calculating computed fields.
  • Designed Hive external tables for storing data extracted using Sqoop.
  • Developed Hive jobs for moving data from Avro to ORC format; ORC was used to speed up queries (see the HiveQL sketch after this list).
  • Created Hive external tables for derived data, loaded them and queried the data using HQL to calculate the claim fraud flags.
  • Designed Hive external tables with Elasticsearch as the storage format for storing the results of the claim flag calculation.
  • Implemented the workflows using Apache Oozie framework to orchestrate end to end execution.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
  • Exported analyzed data using Sqoop for generating reports.
  • Extensively used Pig for data cleansing. Developed Hive scripts to extract the data from the web server output files.
  • Worked on data lake concepts and converted all ETL jobs into Pig/Hive scripts.
  • Participated in the Oracle GoldenGate POC that would be used for bringing CDC changes into Hadoop using Flume.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Developed PIG Latin scripts to extract the data from the web server output files to load into HDFS.
  • Used Pig as an ETL tool to perform transformations, event joins and some pre-aggregations before storing the data in HDFS.
  • Responsible for developing data pipelines using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Wrote shell scripts to automate rolling day-to-day processes.
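An illustrative HiveQL sketch of the Avro-to-ORC conversion mentioned above; table names, columns and the location are hypothetical, not the project's actual schema.

    -- Source table as ingested (Avro).
    CREATE EXTERNAL TABLE claims_avro (
      claim_id   STRING,
      member_id  STRING,
      claim_amt  DOUBLE,
      claim_date STRING
    )
    STORED AS AVRO
    LOCATION '/data/raw/claims';

    -- Same data rewritten as ORC to speed up analytical queries.
    CREATE TABLE claims_orc (
      claim_id   STRING,
      member_id  STRING,
      claim_amt  DOUBLE,
      claim_date STRING
    )
    STORED AS ORC;

    INSERT OVERWRITE TABLE claims_orc
    SELECT claim_id, member_id, claim_amt, claim_date
    FROM claims_avro;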

Environment: Hadoop, Spark, MapReduce, HDFS, Flume, Sqoop, Hive, Zookeeper, Pig, Hortonworks, Oozie, Elasticsearch, NoSQL, UNIX/LINUX.

Confidential, Houston, TX

Hadoop Developer

Responsibilities:

  • Obtained the requirement specifications from the SMEs and Business Analysts in the BR and SR meetings for the corporate workplace project. Interacted with the business users to build the sample report layouts.
  • Involved in writing the HLDs along with the RTMs tracing back to the corresponding BRs and SRs, and reviewed them with the business.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Wrote MapReduce programs in Java to achieve the required Output.
  • Created Hive Tables and Hive scripts to automate data management.
  • Worked on debugging, performance tuning of Hive & Pig Jobs
  • Performed cluster coordination through Zookeeper.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Created POC to store Server Log data in MongoDB to identify System Alert Metrics.
  • Installed and configured Apache Hadoop and Hive/Pig Ecosystems.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis (a sample agent configuration follows this list).
  • Worked on debugging, performance tuning of Hive Jobs.
  • Installed and configured Hive and wrote Hive UDFs for transforming and loading data.
  • Created HBase tables to store various data formats of PII data coming from different portfolios.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
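A minimal Flume agent configuration for the log-collection flow mentioned above; the agent, source command, paths and sizing values are placeholders, not the project's actual settings.

    # Tails an application log and lands the events in HDFS.
    agent.sources  = weblog-src
    agent.channels = mem-ch
    agent.sinks    = hdfs-sink

    agent.sources.weblog-src.type = exec
    agent.sources.weblog-src.command = tail -F /var/log/httpd/access_log
    agent.sources.weblog-src.channels = mem-ch

    agent.channels.mem-ch.type = memory
    agent.channels.mem-ch.capacity = 10000

    agent.sinks.hdfs-sink.type = hdfs
    agent.sinks.hdfs-sink.channel = mem-ch
    agent.sinks.hdfs-sink.hdfs.path = /data/raw/weblogs/%Y-%m-%d
    agent.sinks.hdfs-sink.hdfs.fileType = DataStream
    agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    agent.sinks.hdfs-sink.hdfs.rollInterval = 300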

Environment: Hadoop, Oracle, HiveQL, Pig, Flume, MapReduce, Zookeeper, HDFS, Hbase, MongoDB, PL/SQL, Windows, Linux.

Confidential, Dallas, TX

J2EE Developer

Responsibilities:

  • Involved in Documentation and Use case design using UML modeling including development of Class diagrams, Sequence diagrams, and Use case Transaction diagrams.
  • Implemented an agile client delivery process, including automated testing, pair programming, and rapid prototyping.
  • Involved in developing EJB (Stateless Session Beans) for implementing business logic.
  • Involved in working with JMS Queues.
  • Accessed and manipulated XML documents using the XML DOM parser (a sketch follows this list).
  • Deployed the EJBs on JBoss Application Server.
  • Involved in developing Status and Error Message handling.
  • Used SOAP web services to transfer XML messages from one environment to another.
  • Implemented various HQL queries to access the database through the application workflow.
  • Involved in writing JUnit test cases using the JUnit testing framework.
  • Used Log4j for External Configuration Files and debugging.
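A minimal sketch of the DOM-based XML handling mentioned above; the document structure shown is hypothetical.

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import java.io.ByteArrayInputStream;
    import java.nio.charset.StandardCharsets;

    public class DomExample {
        public static void main(String[] args) throws Exception {
            String xml = "<statuses><status code=\"200\">OK</status>"
                       + "<status code=\"500\">Internal Error</status></statuses>";

            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // Walk the <status> elements and read their attributes and text content.
            NodeList statuses = doc.getElementsByTagName("status");
            for (int i = 0; i < statuses.getLength(); i++) {
                Element status = (Element) statuses.item(i);
                System.out.println(status.getAttribute("code") + " -> " + status.getTextContent());
            }
        }
    }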

Environment: Java, JDK, Junit, EJB, JMS, XML, XML Parsers (DOM), JBoss, Web Services, HTML, JavaScript, Oracle and Windows XP.

Confidential, Columbus, OH

Java Developer

Responsibilities:

  • Involved in requirement gathering, functional and technical specifications.
  • Monitored and fine-tuned IDM performance and made enhancements to the self-registration process.
  • Developed the OMSA GUI using MVC architecture, Core Java, Java Collections, JSP, JDBC, Servlets, ANT and XML within Windows and UNIX environments (a JDBC sketch follows this list).
  • Used Java collection classes such as ArrayList, Vector, HashMap and Hashtable.
  • Wrote requirements and detailed design documents, designed architecture for data collection.
  • Developed algorithms and coded programs in Java.
  • Involved in design and implementation using Core Java, Struts and JMS.
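A minimal JDBC sketch of the kind of data access used behind the MVC screens mentioned above; the connection URL, credentials, table and column names are hypothetical placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical DAO that a servlet controller would call.
    public class UserDao {
        private static final String URL = "jdbc:oracle:thin:@db-host:1521:ORCL";

        public List<String> findUserNames(String dept) throws Exception {
            List<String> names = new ArrayList<>();
            try (Connection con = DriverManager.getConnection(URL, "app_user", "app_password");
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT user_name FROM app_users WHERE dept = ?")) {
                ps.setString(1, dept);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        names.add(rs.getString("user_name"));
                    }
                }
            }
            return names;
        }
    }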

Environment: Java, Oracle, SQL, PL/SQL, JMS.
