
Sr. Big Data Developer Resume


Alpharetta, GA

SUMMARY

  • Over 7 years of professional IT experience, with Big Data ecosystem experience in ingestion, storage, querying, processing and analysis of big data.
  • Hands-on experience in installing, configuring and using ecosystem components like Hadoop MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Flume and Cassandra on Cloudera and Hortonworks distributions.
  • Experience in building and maintaining multiple Hadoop clusters of different sizes and configurations, and setting up rack topology for large clusters.
  • Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode and DataNode, along with MapReduce concepts and the HDFS framework.
  • Well versed in developing and implementing MapReduce programs using Hadoop to work with Big Data.
  • Familiarity with NoSQL databases like HBase and Cassandra.
  • Detailed knowledge and experience in designing, developing and testing software solutions using Java and J2EE technologies.
  • Experience in database design, entity relationships, database analysis, programming SQL, stored procedures, PL/SQL, packages and triggers in Oracle and SQL Server on Windows and Linux.
  • Familiarity with real-time streaming data using Spark, Kafka and Kinesis.
  • Strong understanding of Data warehouse concepts, ETL, data modeling experience using Normalization, Business Process Analysis, Reengineering, Dimensional Data modeling, physical & logical data modeling.
  • Worked in Informatica on extraction, transformation and loading from various sources to the enterprise data warehouse; developed and tested ETL processes.
  • Worked in a hybrid role as Big Data developer and Jaspersoft reporting developer.
  • Involved in grooming sessions with Project Manager/Scrum Master/Technology Director.
  • Experience with front-end technologies like HTML, CSS and Javascript.
  • Involved in the core team for selecting the reporting tool and did analysis on new emerging tools like Qlik Sense.
  • Hands-on experience with tools such as Eclipse, JDeveloper, RAD, JCreator, Toad, XMLSpy, Rational Rose and the Linux vi editor, and version control tools like ClearCase and SVN.
  • Experience in writing shell scripts using ksh, bash and Perl for process automation of databases, applications, backups and scheduling.
  • Strong analytical skills with the ability to quickly understand clients' business needs; involved in meetings to gather information and requirements from clients; led the team and handled onsite/offshore coordination.
  • Research-oriented, motivated, proactive, self-starter with strong technical, analytical and interpersonal skills.

TECHNICAL SKILLS

Big Data Technologies: Hadoop, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Zookeeper, Cloudera, Amazon EC2, EMR, Redshift

Reporting Tools: Jaspersoft, Qlik Sense, Tableau

Scripting Languages: Python, Perl, Shell

Programming Languages: C, C++, Java

Web Technologies: HTML, J2EE, CSS, JavaScript, AJAX, Servlets, JSP, DOM, XML, XSLT, XPATH, JMS

Java Frameworks: Struts, Spring, Hibernate

Application Servers: WebLogic Server, Apache Tomcat

DB Languages: SQL, PL/SQL, Postgres, ParAccel

Databases /ETL: Oracle 9i/10g/11g, MySQL 5.2, DB2, Informatica v 8.x

NoSQL Databases: HBase, Cassandra, MongoDB

Operating Systems: Linux, UNIX, Windows 2003 Server

PROFESSIONAL EXPERIENCE

Confidential, Alpharetta, GA

Sr. Big Data Developer

Responsibilities:

  • Worked in a hybrid role as Big Data developer and Jaspersoft report developer.
  • Served as a senior technical resource on the Big Data Center of Excellence, creating technical guidance, roadmaps and strategies for delivering big data solutions throughout the organization.
  • Wrote technical design documents, deployment documents and supporting documentation.
  • Suggested ways to reduce project cost and worked on a spike for moving from Cloudera to Amazon AWS.
  • Initiated a schema-agnostic data ingestion approach, which generates automated scripts based on schema changes.
  • Provided support during deployments and helped the admin team with endpoint changes.
  • Performed daily status checks of Oozie workflows, implemented necessary tweaks, and monitored Cloudera Manager and DataNode status to ensure nodes were up and running.
  • Worked on historical load for the various sources.
  • Involved in grooming sessions with Project Manager/Scrum Master/Technology Director.
  • Extensively helped stakeholders and guided them for revenue benefits and business model implementations with partners.
  • Experienced in handling different data formats like JSON in Hive using Hive SerDes.
  • Experienced with different compression techniques like LZO, GZip and Snappy.
  • Extensively worked on Hive tables, partitions and buckets for analyzing large volumes of data.
  • Worked on MapReduce jobs for converting XML data into JSON format and storing it in HBase; a hedged sketch of this pattern follows this list.
  • Used Pig as an ETL tool for transformations, joins and aggregations before storing the data in HDFS.
  • Developed shell scripts for adding process dates to the source files.
  • Implemented workflows using Oozie and cron jobs.
  • Converted Impala scripts into self-service solutions that the analytics team could use to pull data.
  • Led teammates in converting one-time business reports into self-service solutions and automated reports.
  • Worked with project managers to ingest data into the Hadoop ecosystem from all possible sources within the organization and laid the platform for the analytics/R team.
  • Delivered numerous financial reports, becoming the go-to resource for the business.
  • Worked on data lake concepts and converted all ETL jobs into Pig/Hive scripts.
  • Reverse engineered the reports and identified the data elements (in the source systems), dimensions, facts and measures required for new report enhancements.
  • Created dashboards, reports, domains and ad hoc views, built data visualizations using Jaspersoft and provided analysis on the data.
  • Created an audit report showing all users' usage of reports.
  • Suggested process improvements for all process automation scripts and tasks.
  • Involved in the core team for selecting the reporting tool and did analysis on new emerging tools like Qlik Sense and Jethro Data.
  • Involved in the technology and architectural meetings across all teams and provided inputs to other teams.
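
The XML-to-JSON conversion into HBase mentioned above follows a common map-only pattern: parse each record in the mapper, build the JSON payload and emit an HBase Put. The sketch below is a hedged illustration only; the one-record-per-line input, the <event> field names and the table name "event_json" with column family "d" are assumptions for this example, not the actual project schema.

    import java.io.IOException;
    import java.io.StringReader;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class XmlToJsonHBaseJob {

      // Map-only job: each input line is assumed to hold one <event> XML record.
      public static class XmlToJsonMapper
          extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

        private static final byte[] FAMILY = Bytes.toBytes("d");
        private static final byte[] QUALIFIER = Bytes.toBytes("json");

        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc =
                builder.parse(new InputSource(new StringReader(value.toString())));

            String id = doc.getElementsByTagName("id").item(0).getTextContent();
            String type = doc.getElementsByTagName("type").item(0).getTextContent();
            String ts = doc.getElementsByTagName("timestamp").item(0).getTextContent();

            // Build the JSON payload by hand to avoid extra dependencies.
            String json = String.format(
                "{\"id\":\"%s\",\"type\":\"%s\",\"timestamp\":\"%s\"}", id, type, ts);

            Put put = new Put(Bytes.toBytes(id));
            // HBase 0.9x-era API; newer clients use put.addColumn(...).
            put.add(FAMILY, QUALIFIER, Bytes.toBytes(json));
            context.write(new ImmutableBytesWritable(Bytes.toBytes(id)), put);
          } catch (Exception e) {
            context.getCounter("xml2json", "malformed_records").increment(1);
          }
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "xml-to-json-hbase");
        job.setJarByClass(XmlToJsonHBaseJob.class);
        job.setMapperClass(XmlToJsonMapper.class);
        job.setNumReduceTasks(0); // mapper writes Puts directly to the table
        FileInputFormat.addInputPath(job, new Path(args[0]));
        TableMapReduceUtil.initTableReducerJob("event_json", null, job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

With zero reduce tasks, the mapper's Puts go straight to TableOutputFormat, which keeps the load path into HBase simple.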

Environment: Cloudera, MapReduce, HDFS, Pig, Hive, Sqoop, Oozie, Postgres, AWS, S3, Redshift, EC2, EMR, Jaspersoft Studio and Server, Tableau, SourceTree, Stash, IntelliJ, JIRA.

Confidential, New York, NY

Sr. Big Data Engineer

Responsibilities:

  • Experience in working with Sqoop for importing and exporting data between HDFS and RDBMS systems.
  • Designed a data warehouse using Hive. Created partitioned tables in Hive.
  • Developed Hive UDFs to pre-process the data for analysis; a hedged UDF sketch follows this list.
  • Analyzed the data by running Hive queries and Pig scripts to understand artist behavior.
  • Worked on historical load for the various feeds.
  • Worked on data lake concepts and converted all ETL jobs into Pig/Hive scripts.
  • Created and maintained the Data Model repository as per company standards.
  • Wrote MapReduce jobs to generate reports on the number of activities created on a particular day from data dumped from multiple sources; the output was written back to HDFS.
  • Worked on Oozie workflows and cron jobs.
  • Developed workflows in AWS Data Pipeline to automate the tasks of loading data into S3 and pre-processing it with Pig/Hive.
  • Worked with the Tableau team to create dashboards, built data visualizations using Tableau and provided analysis on the data.
  • Exported analyzed data to S3 using Sqoop for generating reports.
  • Extensively used Pig/Hive for data cleansing. Developed Pig Latin/Hive scripts to extract the data from the web server output files to load into S3.
  • Suggested improvement processes for all process automation scripts and tasks.
  • Provided technical assistance for configuration, administration and monitoring of Hadoop clusters
  • Participated in evaluation and selection of new technologies to support system efficiency
  • Assisted teammates in creating ETL processes for transforming data sources from the existing system.
  • Worked with analysts and the test team on writing Hive queries.
  • Worked extensively in creating MapReduce jobs to power data for search and aggregation
  • Wrote Pig Scripts to generate MapReduce jobs and performed ETL procedures on the data in S3.
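
A minimal sketch of a Hive UDF of the kind referenced above, assuming a hypothetical normalization function for a free-text name column; the actual UDF logic used on the project is not reproduced here.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: trims, collapses whitespace and lower-cases a name so
    // that Hive GROUP BYs treat differently formatted artist names as equal.
    public final class NormalizeName extends UDF {

      public Text evaluate(Text input) {
        if (input == null) {
          return null;
        }
        String cleaned = input.toString().trim().replaceAll("\\s+", " ").toLowerCase();
        return new Text(cleaned);
      }
    }

Once packaged into a jar, such a UDF is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, and then called like any built-in function inside a query.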

Environment: Hadoop 1.2, MapReduce, HDFS, Pig, Hive, Sqoop, AWS, S3, Redshift, ParAccel, EC2, EMR, Jaspersoft, Tableau

Confidential, Atlanta, GA

Sr. Hadoop Developer

Responsibilities:

  • Involved in architecture design, development and implementation of Hadoop deployment, backup and recovery systems.
  • Worked on the proof-of-concept for Apache Hadoop framework initiation.
  • Experience in HDFS, MapReduce and Hadoop Framework
  • Trained and guided the team on Hadoop framework, HDFS, MapReduce concepts.
  • Developed MapReduce jobs for Log Analysis, Recommendation and Analytics.
  • Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
  • Installed and configured Hadoop HDFS, MapReduce, Pig, Hive, Sqoop.
  • Wrote Pig Scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
  • Loaded large amounts of application server logs and MLM data into Cassandra using Sqoop.
  • Processed HDFS data and created external tables using Hive, in order to analyze visitors per day, page views and most purchased products.
  • Exported analyzed data to Oracle database using Sqoop for generating reports.
  • Used MapReduce and Sqoop to load, aggregate, store and analyze web log data from different web servers.
  • Developed Hive queries for the analysts.
  • Provided cluster coordination services through ZooKeeper.
  • Optimized MapReduce algorithms using combiners and partitioners to deliver the best results, and worked on application performance optimization for an HDFS/Cassandra cluster; a hedged combiner/partitioner sketch follows this list.
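
A hedged sketch of the combiner/partitioner optimization referenced in the last bullet, using a hypothetical page-view count over tab-separated web logs; the input layout and the host-based routing rule are assumptions, but the driver shows where the combiner and custom partitioner plug into the job.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Partitioner;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class PageViewCount {

      // Emits (pageUrl, 1) for every web log line; the tab-separated layout
      // with the URL in the first field is an assumption for this sketch.
      public static class ViewMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text page = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          String[] fields = value.toString().split("\t");
          if (fields.length > 0 && !fields[0].isEmpty()) {
            page.set(fields[0]);
            context.write(page, ONE);
          }
        }
      }

      // Used as both combiner and reducer: map-side pre-aggregation shrinks
      // the intermediate data that crosses the network during the shuffle.
      public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          context.write(key, new IntWritable(sum));
        }
      }

      // Custom partitioner: routes all URLs from the same host to the same
      // reducer so per-host rollups need no second shuffle (illustrative rule).
      public static class HostPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
          String url = key.toString();
          int schemeEnd = url.indexOf("//");
          int pathStart = url.indexOf('/', schemeEnd < 0 ? 0 : schemeEnd + 2);
          String host = pathStart > 0 ? url.substring(0, pathStart) : url;
          return (host.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "page-view-count");
        job.setJarByClass(PageViewCount.class);
        job.setMapperClass(ViewMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setPartitionerClass(HostPartitioner.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }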

Environment: MapReduce, Hive, Pig, Sqoop, Oracle, Cassandra, Cloudera Manager, ZooKeeper.

Confidential, Houston, TX

Hadoop Developer

Responsibilities:

  • Responsible for loading the customer’s data and event logs from MSMQ into HBase using REST API.
  • Responsible for architecting Hadoop clusters with CDH4 on CentOS, managing with Cloudera Manager.
  • Initiated and successfully completed a proof of concept on Flume for pre-processing, showing increased reliability and ease of scalability over traditional MSMQ.
  • Involved in loading data from LINUX file system to HDFS.
  • Imported and exported data into HDFS and Hive using Flume.
  • Used Hive to find correlations between customers' browser logs across different sites and analyzed them to build risk profiles for such sites.
  • End-to-end performance tuning of Hadoop clusters and Hadoop Map/Reduce routines against very large data sets.
  • Developed Pig UDFs to pre-process the data for analysis; a hedged UDF sketch follows this list.
  • Monitored Hadoop cluster job performance and performed capacity planning and managed nodes on Hadoop cluster.
  • Proficient in using Cloudera Manager, an end-to-end tool to manage Hadoop operations.
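
A minimal Pig UDF sketch for the pre-processing mentioned above, assuming a hypothetical NormalizeUrl function over a single chararray field; the project's actual UDFs are not reproduced here.

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical UDF: strips the query string and lower-cases the URL so that
    // browser-log entries from different sessions compare equal before the
    // Hive risk-profile analysis.
    public class NormalizeUrl extends EvalFunc<String> {

      @Override
      public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
          return null;
        }
        String url = input.get(0).toString();
        int q = url.indexOf('?');
        if (q >= 0) {
          url = url.substring(0, q);
        }
        return url.toLowerCase();
      }
    }

After the jar is REGISTERed in the Pig script, the function can be applied inside a FOREACH ... GENERATE statement like any built-in.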

Environment: Cloudera Distribution, CDH4, FLUME, HBase, HDFS, Pig, MapReduce, Hive

Confidential, Columbus, Ohio

Java/J2EE Developer

Responsibilities:

  • Involved in requirement gathering, functional and technical specifications.
  • Monitored and fine-tuned IDM performance.
  • Implemented enhancements to the self-registration process.
  • Fixed existing bugs in various releases.
  • Handled global deployment of the application and coordination between the client, the development team and the end users.
  • Set up users through reconciliations, bulk load and bulk link in all environments.
  • Wrote requirements and detailed design documents, designed architecture for data collection.
  • Developed OMSA UI using MVC architecture, Core Java, Java Collections, JSP, JDBC, Servlets, ANT and XML within a Windows and UNIX environment.
  • Used Java collection classes like ArrayList, Vector, HashMap and Hashtable.
  • Used design patterns such as MVC, Singleton, Factory and Abstract Factory.
  • Developed algorithms and coded programs in Java.
  • Coordinated with different IT groups and the customer.
  • Involved in design and implementation using Core Java, Struts and JMS.
  • Performed all types of testing, including unit testing and integration testing, across environments.
  • Worked on modifying an existing JMS messaging framework for increased loads and performance optimization; a hedged producer sketch follows this list.
  • Used a combination of client-side and server-side validation with the Struts validation framework.
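
A hedged sketch of the JMS send path touched by the framework changes above; the JNDI names "jms/ConnectionFactory" and "jms/EventQueue" are hypothetical placeholders, not the framework's real destinations.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class EventPublisher {

      public void publish(String payload) throws Exception {
        // Look up the connection factory and queue from JNDI (names are hypothetical).
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/EventQueue");

        Connection connection = factory.createConnection();
        try {
          // Non-transacted, auto-acknowledge session keeps the send path cheap
          // under higher load; delivery mode can be tuned on the producer.
          Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
          MessageProducer producer = session.createProducer(queue);
          TextMessage message = session.createTextMessage(payload);
          producer.send(message);
        } finally {
          connection.close();
        }
      }
    }

Keeping the session non-transacted with auto-acknowledge is one of the usual levers for reducing per-message overhead under load; delivery mode and connection pooling are the others.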

Environment: Java, STLs, Design Patterns, Oracle, SQL, PL/SQL, JMS.

Confidential

Java Developer and ETL Developer

Responsibilities:

  • Involved in Full Life Cycle Development in Distributed Environment Using Java and J2EE framework.
  • Responsible for developing and modifying the existing service layer based on the business requirements.
  • Involved in designing & developing web-services using SOAP and WSDL.
  • Involved in database design.
  • Created tables, stored procedures in SQL for data manipulation and retrieval, Database Modification using SQL, PL/SQL, Stored procedures, triggers, Views in Oracle 9i.
  • Created User Interface using JSF.
  • Involved in integration testing the Business Logic layer and Data Access layer.
  • Integrated JSF with JSP and used JSF Custom Tag Libraries to display the value of variables defined in configuration files.
  • Wrote stored procedures, functions and views to retrieve the data; a hedged JDBC call sketch follows this list.
  • Used Maven build to wrap around Ant build scripts.
  • Used CVS for version control of code and project documents.
  • Responsible for mentoring and working with team members to make sure standards and guidelines were followed and tasks were delivered on time.
  • Implemented History Load, Incremental Load and ETL logic using Informatica 8.6/9.1 as per the ETL design document and Technical design document.
  • Familiar with ETL standards and processes; developed ETL logic per standards for Source to Flat File, Flat File to Stage, Stage to Work, Work to Work Interim, and Work Interim to Target table loads.
  • Fixed issues from the source system to the downstream data mart during development until the code went live in production.
  • Deployed the code to QA as per the project plan, which included DDLs for source and target tables, ETL logic, JCT entries and collect stats for 3NF tables, and migrated the code to QA using Kintana.
  • Prepared ETL scripts for data acquisition and transformation; developed various mappings using transformations such as Source Qualifier, Joiner, Filter, Router, Expression and Lookup.
  • Analyzed existing ETL jobs to understand the flow.
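
A minimal sketch of how the Java service layer can invoke one of the stored procedures mentioned above through JDBC; the procedure name GET_ORDER_COUNT and its IN/OUT parameters are hypothetical.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Types;

    public class OrderCountDao {

      // Calls a hypothetical Oracle procedure GET_ORDER_COUNT(customer_id IN, count OUT).
      public int getOrderCount(String jdbcUrl, String user, String password, long customerId)
          throws Exception {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
             CallableStatement call = conn.prepareCall("{call GET_ORDER_COUNT(?, ?)}")) {
          call.setLong(1, customerId);                 // IN: customer id
          call.registerOutParameter(2, Types.INTEGER); // OUT: number of orders
          call.execute();
          return call.getInt(2);
        }
      }
    }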

Environment: jQuery, JSP, Servlets, JDBC, HTML, JUnit, JavaScript, XML, SQL, Maven, Web Services, UML, WebLogic Workshop, CVS, Informatica

Confidential

Java/UI Developer

Responsibilities:

  • Involved in the design, coding, deployment and maintenance of the project.
  • Involved in design and implementation of web tier using Servlets and JSP.
  • Performed client-side validations using JavaScript.
  • Used Apache POI for reading Excel files; a hedged sketch follows this list.
  • Written build scripts with Ant for deploying war and ear applications.
  • Configured connection pools and established connections with MySQL.
  • Used technologies like JSP, JSTL, JavaScript and Tiles for the presentation tier.
  • Involved in JUnit testing of the application using JUnit framework.
  • Worked on front end enhancements.
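
A minimal Apache POI sketch for the Excel reading mentioned above, assuming a hypothetical two-column .xls layout; DataFormatter is used so numeric and text cells both come back as display strings.

    import java.io.FileInputStream;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.poi.hssf.usermodel.HSSFWorkbook;
    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.DataFormatter;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.ss.usermodel.Sheet;
    import org.apache.poi.ss.usermodel.Workbook;

    public class ExcelReader {

      // Reads the first sheet of a .xls workbook and returns each row's first
      // two cells as "name,value" strings (column layout is an assumption).
      public List<String> read(String path) throws Exception {
        List<String> records = new ArrayList<String>();
        DataFormatter formatter = new DataFormatter();
        FileInputStream in = new FileInputStream(path);
        try {
          Workbook workbook = new HSSFWorkbook(in);
          Sheet sheet = workbook.getSheetAt(0);
          for (Row row : sheet) {
            Cell name = row.getCell(0);
            Cell value = row.getCell(1);
            records.add(formatter.formatCellValue(name) + "," + formatter.formatCellValue(value));
          }
        } finally {
          in.close();
        }
        return records;
      }
    }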

Environment: Java, J2EE, Tomcat, MySQL, Eclipse, Apache POI, Java Script, CSS, HTML.
