
Big Data Developer / Sr. Analyst Resume


Scottsdale, AZ

PROFESSIONAL SUMMARY:

  • Over 8 years of professional IT experience with Big Data technologies including Hadoop/YARN, Pig, Hive, HBase, MongoDB and Spark.
  • Hands on experience with Apache Spark, Spark SQL and Spark Streaming.
  • Worked with different distributions of Hadoop and Big Data technologies including Hortonworks and Cloudera.
  • Expertise in Big Data and Hadoop ecosystem components such as Flume, Hive, MongoDB, Sqoop, Oozie, Zookeeper and Kafka.
  • Well versed in developing and implementing MapReduce programs using Java and Python.
  • Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
  • Experience in NoSQL databases MongoDB and HBase.
  • Familiarity with real-time data streaming using Spark and Kafka.
  • Strong understanding of data warehouse concepts and ETL; data modeling experience using normalization, business process analysis, reengineering, dimensional data modeling, and physical and logical data modeling.
  • Experience in object-oriented programming with Java, including Core Java.
  • Experience in database design, entity relationships, database analysis, and programming SQL, PL/SQL, packages and triggers in Oracle and SQL Server on Windows and Linux.
  • Extensive experience working with Oracle, DB2, SQL Server and MySQL databases.
  • Major strengths include familiarity with multiple software systems, the ability to learn new technologies quickly and adapt to new environments, and being a self-motivated, focused team player with excellent interpersonal, technical and communication skills.

TECHNICAL SKILLS:

Big Data Technologies: Spark, Kafka, Hadoop, YARN, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Zookeeper and Cloudera.

Scripting Languages: Python, Shell

Programming Languages: Java, Scala, C, C++

Web Technologies: HTML, J2EE, CSS, JavaScript, JSP

Application Servers: IBM WebSphere Application Server, Apache Tomcat.

DB Languages: SQL, PL/SQL

Databases / ETL: Oracle 9i/10g/11g

NoSQL Databases: HBase, Cassandra, Elasticsearch, MongoDB, Phoenix

Operating Systems: Linux, UNIX

PROFESSIONAL EXPERIENCE:

Confidential, Scottsdale, AZ

Big Data Developer/Sr. Analyst

Responsibilities:

  • Worked on implementing logic to post application log messages to Kafka.
  • Built a Spark Streaming application that consumes log messages from the Kafka stream and indexes them into Elasticsearch (a sketch follows this list).
  • Wrote MapReduce jobs to generate reports on the number of activities created on a particular day from data dumped from multiple sources; the output was written back to HDFS.
  • Worked on Oozie workflows and cron jobs.
  • Provided cluster coordination services through Zookeeper.
  • Worked with Sqoop for importing and exporting data between HDFS and Oracle systems.
  • Designed a data warehouse using Hive. Created partitioned tables in Hive.
  • Developed Hive UDFs to pre-process the data for analysis.
  • Developed MapReduce jobs in Java for data processing after installing and configuring Hadoop and HDFS.
  • Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
  • Involved in scheduling Oozie workflow engine to run multiple Hive jobs.
  • Moved data from Hadoop to MongoDB using Bulk output format class.
  • Involved in the regular Hadoop Cluster maintenance such as patching security holes and updating system packages.
  • Automated the workflow using shell scripts.
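The Kafka-to-Elasticsearch path mentioned above can be sketched roughly as below. This is a minimal illustration only, assuming the spark-streaming-kafka-0-10 integration and the ES-Hadoop connector's JavaEsSparkStreaming helper; the broker address, topic name and index name are placeholders, not the project's actual values.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;
    import org.elasticsearch.spark.streaming.api.java.JavaEsSparkStreaming;

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    public class LogStreamToEs {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("log-stream-to-es")
                    .set("es.nodes", "es-host:9200");                    // placeholder Elasticsearch endpoint
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "kafka-host:9092");     // placeholder broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "log-consumer");
            kafkaParams.put("auto.offset.reset", "latest");

            // Direct stream from the application-log topic.
            JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                            Collections.singletonList("app-logs"), kafkaParams));

            // Index each JSON log message into a "logs/event" resource in Elasticsearch.
            JavaEsSparkStreaming.saveJsonToEs(stream.map(ConsumerRecord::value), "logs/event");

            jssc.start();
            jssc.awaitTermination();
        }
    }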

Environment: Hadoop, Spark, MapReduce, HDFS, Informatica, Hive, Zookeeper, Hortonworks, Oozie, Elasticsearch, Cassandra, Apache Phoenix.

Confidential, San Jose, CA

Big Data/ Hadoop Developer

Responsibilities:

  • Developed Spark SQL jobs that read data from the data lake using Hive transformations and save it to HBase.
  • Built a Java client responsible for receiving XML files via REST calls and publishing them to Kafka.
  • Built a Kafka + Spark Streaming job responsible for reading XML messages from Kafka and transforming them into POJOs using JAXB (a sketch follows this list).
  • Built a Spark + Drools integration that lets us develop Drools rules as part of the Spark Streaming job.
  • Built HBase DAOs responsible for querying the data that Drools needs from HBase.
  • Built logic to publish the output of Drools rules to Kafka for further processing.
  • Wrote MapReduce jobs to generate reports on the number of activities created on a particular day from data dumped from multiple sources; the output was written back to HDFS.
  • Worked on Oozie workflows and cron jobs.
  • Provided cluster coordination services through Zookeeper.
  • Worked with Sqoop for importing and exporting data between HDFS and RDBMS systems.
  • Designed a data warehouse using Hive. Created partitioned tables in Hive.
  • Developed Hive UDFs to pre-process the data for analysis.
  • Analyzed the data by running Hive queries and Pig scripts to understand artist behavior.
  • Developed MapReduce jobs in Java for data processing after installing and configuring Hadoop and HDFS.
  • Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
  • Involved in scheduling Oozie workflow engine to run multiple Hive and pig jobs.
  • Moved data from Hadoop to MongoDB using Bulk output format class.
  • Involved in the regular Hadoop Cluster maintenance such as patching security holes and updating system packages.
  • Automated the workflow using shell scripts.
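The XML-to-POJO step mentioned above can be sketched with plain JAXB as below. The Order class and its fields are hypothetical stand-ins for the actual payload schema; in the streaming job the unmarshal call would be applied to each Kafka message value.

    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.JAXBException;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.bind.annotation.XmlRootElement;
    import java.io.StringReader;

    public class XmlToPojo {

        // Hypothetical POJO for the XML payload; the real schema is not shown here.
        @XmlRootElement(name = "order")
        public static class Order {
            public String id;
            public double amount;
        }

        // Converts one XML document (received as a String from Kafka) into a POJO.
        static Order unmarshal(String xml) throws JAXBException {
            Unmarshaller u = JAXBContext.newInstance(Order.class).createUnmarshaller();
            return (Order) u.unmarshal(new StringReader(xml));
        }

        public static void main(String[] args) throws JAXBException {
            Order o = unmarshal("<order><id>42</id><amount>19.99</amount></order>");
            System.out.println(o.id + " -> " + o.amount);
        }
    }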

Environment: Hadoop, HDFS, Hive, Spark, Spark SQL, Spark Streaming, Kafka, HBase, MapReduce, Pig, Oozie, Sqoop, REST, OpenShift, Zookeeper, Cassandra, Drools.

Confidential, Walnut Creek, CA.

Hadoop Developer

Responsibilities:

  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Developed Sqoop jobs for extracting data from different databases, for both initial and incremental data loads.
  • Developed MapReduce jobs for cleaning up the ingested data, as well as calculating computed fields.
  • Designed Hive external tables for storing data extracted using Sqoop.
  • Developed Hive jobs for moving data from Avro to ORC format; ORC was used to speed up queries (see the HiveQL sketch after this list).
  • Created Hive external tables for derived data, loaded them and queried the data using HQL to calculate the claim fraud flags.
  • Designed Hive external tables with Elasticsearch as the storage format for storing the results of the claim flag calculation.
  • Implemented the workflows using Apache Oozie framework to orchestrate end to end execution.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
  • Exported analyzed data using Sqoop for generating reports.
  • Extensively used Pig for data cleansing. Developed Hive scripts to extract the data from the web server output files.
  • Worked on data lake concepts and converted all ETL jobs into Pig/Hive scripts.
  • Participated in the Oracle GoldenGate POC that would be used for bringing CDC changes into Hadoop using Flume.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Developed PIG Latin scripts to extract the data from the web server output files to load into HDFS.
  • Used Pig as an ETL tool to perform transformations, event joins and some pre-aggregations before storing the data in HDFS.
  • Responsible for developing data pipelines using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Wrote shell scripts to automate rolling day-to-day processes.
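An illustrative HiveQL sketch of the Avro-to-ORC conversion mentioned above; table names, columns and the location are hypothetical, not the project's actual schema.

    -- Source table as ingested (Avro).
    CREATE EXTERNAL TABLE claims_avro (
      claim_id   STRING,
      member_id  STRING,
      claim_amt  DOUBLE,
      claim_date STRING
    )
    STORED AS AVRO
    LOCATION '/data/raw/claims';

    -- Same data rewritten as ORC to speed up analytical queries.
    CREATE TABLE claims_orc (
      claim_id   STRING,
      member_id  STRING,
      claim_amt  DOUBLE,
      claim_date STRING
    )
    STORED AS ORC;

    INSERT OVERWRITE TABLE claims_orc
    SELECT claim_id, member_id, claim_amt, claim_date
    FROM claims_avro;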

Environment: Hadoop, Spark, MapReduce, HDFS, Flume, Sqoop, Hive, Zookeeper, Pig, Hortonworks, Oozie, Elasticsearch, NoSQL, UNIX/LINUX.

Confidential, Houston, TX

Hadoop Developer

Responsibilities:

  • Obtained the requirement specifications from the SMEs and Business Analysts in the BR and SR meetings for the corporate workplace project. Interacted with the business users to build the sample report layouts.
  • Involved in writing the HLDs along with the RTMs tracing back to the corresponding BRs and SRs, and reviewed them with the business.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Wrote MapReduce programs in Java to achieve the required Output.
  • Created Hive Tables and Hive scripts to automate data management.
  • Worked on debugging, performance tuning of Hive & Pig Jobs
  • Performed cluster coordination through Zookeeper.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Created POC to store Server Log data in MongoDB to identify System Alert Metrics.
  • Installed and configured Apache Hadoop and Hive/Pig Ecosystems.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis (a sample agent configuration follows this list).
  • Worked on debugging, performance tuning of Hive Jobs.
  • Installed and configured Hive and wrote Hive UDFs for transforming and loading data.
  • Created HBase tables to store various data formats of PII data coming from different portfolios.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
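A minimal Flume agent configuration for the log-collection flow mentioned above; the agent, source command, paths and sizing values are placeholders, not the project's actual settings.

    # Tails an application log and lands the events in HDFS.
    agent.sources  = weblog-src
    agent.channels = mem-ch
    agent.sinks    = hdfs-sink

    agent.sources.weblog-src.type = exec
    agent.sources.weblog-src.command = tail -F /var/log/httpd/access_log
    agent.sources.weblog-src.channels = mem-ch

    agent.channels.mem-ch.type = memory
    agent.channels.mem-ch.capacity = 10000

    agent.sinks.hdfs-sink.type = hdfs
    agent.sinks.hdfs-sink.channel = mem-ch
    agent.sinks.hdfs-sink.hdfs.path = /data/raw/weblogs/%Y-%m-%d
    agent.sinks.hdfs-sink.hdfs.fileType = DataStream
    agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    agent.sinks.hdfs-sink.hdfs.rollInterval = 300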

Environment: Hadoop, Oracle, HiveQL, Pig, Flume, MapReduce, Zookeeper, HDFS, Hbase, MongoDB, PL/SQL, Windows, Linux.

Confidential, Dallas, TX

J2EE Developer

Responsibilities:

  • Involved in Documentation and Use case design using UML modeling including development of Class diagrams, Sequence diagrams, and Use case Transaction diagrams.
  • Implemented an agile client delivery process, including automated testing, pair programming, and rapid prototyping.
  • Involved in developing EJB (Stateless Session Beans) for implementing business logic.
  • Involved in working with JMS Queues.
  • Accessed and manipulated XML documents using the XML DOM parser (a sketch follows this list).
  • Deployed the EJBs on JBoss Application Server.
  • Involved in developing Status and Error Message handling.
  • Used SOAP web services to transfer XML messages from one environment to another.
  • Implemented various HQL queries to access the database through the application workflow.
  • Involved in writing JUnit test cases using the JUnit testing framework.
  • Used Log4j for External Configuration Files and debugging.
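A minimal sketch of the DOM-based XML handling mentioned above; the document structure shown is hypothetical.

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import java.io.ByteArrayInputStream;
    import java.nio.charset.StandardCharsets;

    public class DomExample {
        public static void main(String[] args) throws Exception {
            String xml = "<statuses><status code=\"200\">OK</status>"
                       + "<status code=\"500\">Internal Error</status></statuses>";

            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // Walk the <status> elements and read their attributes and text content.
            NodeList statuses = doc.getElementsByTagName("status");
            for (int i = 0; i < statuses.getLength(); i++) {
                Element status = (Element) statuses.item(i);
                System.out.println(status.getAttribute("code") + " -> " + status.getTextContent());
            }
        }
    }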

Environment: Java, JDK, Junit, EJB, JMS, XML, XML Parsers (DOM), JBoss, Web Services, HTML, JavaScript, Oracle and Windows XP.

Confidential, Columbus, OH

Java Developer

Responsibilities:

  • Involved in requirement gathering, functional and technical specifications.
  • Monitored and fine-tuned IDM performance and made enhancements to the self-registration process.
  • Developed the OMSA GUI using MVC architecture, Core Java, Java Collections, JSP, JDBC, Servlets, ANT and XML within Windows and UNIX environments (a JDBC sketch follows this list).
  • Used Java collection classes such as ArrayList, Vector, HashMap and Hashtable.
  • Wrote requirements and detailed design documents, designed architecture for data collection.
  • Developed algorithms and coded programs in Java.
  • Involved in design and implementation using Core Java, Struts and JMS.
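A minimal JDBC sketch of the kind of data access used behind the MVC screens mentioned above; the connection URL, credentials, table and column names are hypothetical placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical DAO that a servlet controller would call.
    public class UserDao {
        private static final String URL = "jdbc:oracle:thin:@db-host:1521:ORCL";

        public List<String> findUserNames(String dept) throws Exception {
            List<String> names = new ArrayList<>();
            try (Connection con = DriverManager.getConnection(URL, "app_user", "app_password");
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT user_name FROM app_users WHERE dept = ?")) {
                ps.setString(1, dept);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        names.add(rs.getString("user_name"));
                    }
                }
            }
            return names;
        }
    }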

Environment: Java, Oracle, SQL, PL/SQL, JMS.
