Bigdata Developer Resume
NJ
SUMMARY
- Highly motivated Hadoop developer with Java and database expertise and over 5 years of experience in Information Technology.
- 3+ years of hands-on Hadoop experience translating complex Big Data problems into meaningful solutions.
- 2+ years of experience in database administration and PL/SQL scripting, Web services, Java and Unix shell scripting.
- Developed databases and projects using R, Python, PL/SQL, Java, and NoSQL/MySQL.
- Designed and implemented Big Data solutions using Hadoop, Spark, Hive, Pig, Flume, Sqoop.
- Experienced in the HortonWorks distribution of Apache Hadoop (HDFS, YARN, Hive, Pig, Sqoop, Impala, and Flume) using Java and ETL.
- Experienced in building analytics for structured and unstructured data and managing large data ingestion using technologies like Kafka/Avro.
- Expertise in writing ETL Jobs for analyzing data using Pig Latin scripting.
- Working experience with HBase and good understanding of NoSQL databases like Cassandra and MongoDB.
- Hands on experience with Talend.
- Responsible for coding SQL Statements and Stored procedures for back end communication using JDBC.
- Experience in importing and exporting data using Sqoop and Kafka from HDFS to RDBMS and vice versa.
- Responsible for the field architecture and educating those new to Hadoop on its value to their organization through whiteboard sessions, demos, Technical group presentations, external sessions and reporting, proof of concepts, reference architecture, and training.
- Experience in Java, JSP, Servlets, EJB, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, Java Script, Ajax, JQuery, XML, and HTML.
- Exceptional ability to quickly master new concepts and capable of working in groups as well as independently.
- Experience in debugging, troubleshooting production systems, profiling and identifying performance bottlenecks.
- Well versed in installation, configuration, supporting and managing of Big Data and underlying infrastructure of Hadoop Cluster.
- Hands-on experience in Apache Samza.
- Used Akka to implement the actor model in application code.
- Good knowledge of Hadoop Development and various components such as HDFS, Job Tracker, Task Tracker, Data Node, Name Node and Map-Reduce concepts.
- Hands on experience on major components in Hadoop Ecosystem like Hadoop Map Reduce, HDFS, HIVE, PIG, Pentaho, HBase, Zookeeper, Sqoop, Oozie, Cassandra, Flume and Avro.
- Experience in installation, configuration, Management, supporting and monitoring Hadoop cluster using various distributions such as Apache and Cloudera.
- Experience in analyzing data using HiveQL, Pig Latin, HBase and custom Map Reduce programs in Java.
- Experience in developing Pig scripts and Hive Query Language.
- Involved in project planning, setting up standards for implementation and design of Hadoop based applications.
- Has good knowledge of virtualization and worked on VMware Virtual Center.
- Experience in setting up cluster and monitoring cluster performance based on the usage.
- Experience writing MapReduce programs with custom logic based on the requirement.
- Experience writing custom UDFs in Pig and Hive based on user requirements (see the Java UDF sketch at the end of this summary).
- Experience in storing, processing unstructured data using NOSQL databases like HBase, Cassandra and MongoDB.
- Experience in writing work flows and scheduling jobs using Oozie.
- Wrote Hive queries for data analysis and to process data for visualization.
- Experience in managing and reviewing Hadoop Log files.
- Experience in importing and exporting the different formats of data into HDFS, HBASE from different RDBMS databases and vice versa.
- Working knowledge on Data Migration applications such as Extraction, Transformation and Loading of data from multiple sources into Data Warehouse.
- Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC.
- Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
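For illustration, a minimal Hive UDF of the kind mentioned above might look like the Java sketch below; the class name and normalization rule are hypothetical examples, not taken from any of the projects described here.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes a ticker symbol by trimming whitespace
// and upper-casing it, so downstream Hive queries can join on a clean key.
public final class NormalizeTicker extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;          // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Once packaged into a JAR and added to the session, such a UDF would be registered with CREATE TEMPORARY FUNCTION and used like any built-in Hive function.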
TECHNICAL SKILLS
Operating Systems: Linux (Ubuntu, CentOS), Windows, MAC OS
Big Data Frameworks: Hadoop, Spark, HDFS, MapReduce, Hive, HBase, Cassandra, MongoDB, Impala, Pig, Sqoop, Flume, Oozie, Zookeeper, Talend, Kafka, Apache Samza, Akka
Programming Languages: Core Java, J2EE (Servlets, Spring, JSP, JDBC, Maven), R, JUnit
Web Technologies: HTML, AngularJS, CSS, XML, JavaScript, AJAX, WebServices (SOAP and REST)
Scripting Languages: Python, Bash, PL/SQL, UNIX Shell Scripting
Databases: Oracle 10g, MySQL, NoSQL, DB2, MongoDB, Cassandra
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
Web Servers: Apache Tomcat
IDEs: Eclipse, IntelliJ IDEA, Android Studio
PROFESSIONAL EXPERIENCE
Confidential, NJ
BigData Developer
Responsibilities:
- Configured Talend jobs to pull data into the Data Lake from various sources such as trading systems, reference data, market data and client reference data; configured Hive mappings to transform source data within Hive using ELT components.
- Created Talend Job Mappings for over 15 different source applications.
- Worked with Avro and Parquet file formats as well as the ORC file format in Hive.
- Developed a Query Tool on top of Hadoop DataLake to query data based on criteria which are mapped to Hive Queries.
- Developed 20 different outgoing feeds from the Data Lake to various Risk Management Systems for Risk Analysis.
- Hands-on experience with VPN, PuTTY, WinSCP, VNC Viewer, etc.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Generated both fixed-width and delimited feed files.
- Used Talend tool to develop feed mappings in a visual way using Job Designer.
- Deployed Talend Jobs onto dev/qa/production environments.
- Developed preprocessing jobs in Java on Spark for various feeds (see the sketch at the end of this section); created JUnit tests and used JIRA for task tracking.
- Analyzed data using Hadoop components Hive, Impala and Pig.
- Implemented real-time data streaming using Apache Samza.
- Used Akka actor model to write correct concurrent, parallel and distributed systems.
- Worked hands on with ETL process using Talend.
- Has good knowledge of virtualization and worked on VMware Virtual Center.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Involved in data ingestion into HDFS using Apache Sqoop from a variety of sources using connectors like JDBC and import parameters.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Provided quick response to ad hoc internal and external client requests for data.
- Loaded and transformed large sets of structured, semi structured and unstructured data.
- Involved in loading data from UNIX file system to HDFS.
- Responsible for creating Hive tables, loading data and writing hive queries.
- Wrote Hive queries for data analysis to meet the business requirements.
- Continuously monitoring and managing the Hadoop cluster through HortonWorks Manager.
- Developed Hive queries to process the data and generate the data cubes for visualizing.
- Analyzed application usage on a day-to-day basis on a sample of machine log data using Spark, Hive and Pig.
Environment: HortonWorks Hadoop, MapReduce, Hive, Pig, Sqoop, Oozie Scheduler, UNIX, Java 7.0, Java Lambdas, JSON, Apache Spark, HDFS, YARN, Flume, Zookeeper, Mahout, Apache Samza, Akka, Talend, HBase, MySQL, Spring Boot.
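A minimal sketch of the kind of Spark preprocessing job in Java mentioned in this section; the feed layout, field count and HDFS paths are hypothetical placeholders, not the actual project artifacts.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class FeedPreprocessJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("FeedPreprocess");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Hypothetical raw feed location in the Data Lake.
            JavaRDD<String> raw = sc.textFile("hdfs:///datalake/raw/trades/*.dat");

            // Drop header rows and records without the expected 12 fields,
            // then trim whitespace from every field before staging.
            JavaRDD<String> cleaned = raw
                    .filter(line -> !line.startsWith("HDR")
                            && line.split("\\|", -1).length == 12)
                    .map(line -> String.join("|",
                            Arrays.stream(line.split("\\|", -1))
                                  .map(String::trim)
                                  .toArray(String[]::new)));

            cleaned.saveAsTextFile("hdfs:///datalake/staged/trades");
        }
    }
}
```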
Confidential, NY
BigData Developer
Responsibilities:
- Work Queue is a business exception management system for various business units of a financial services firm.
- Integrated with Business Exception Management to pull Data into Hive Data Mart using Sqoop.
- Created Talend Jobs to design the extraction and transformation to load data into Data Mart.
- Created various Rules specific to the Business Unit for Data Mapping.
- Enriched the business data with reference data, coverage data and market data; interacted with margin lending, securities lending, trade completion and settlement systems to receive millions of business exceptions every day.
- Integrated Talend Big Data with Build Tools for automated Deployment.
- Generated analysis feeds from Hive reports for management at year end, month end and week end in fixed-width format.
- Worked on analyzing Hadoop cluster using different big data analytic tools including Kafka, Pig, Hive and Map Reduce.
- Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Implemented real-time data streaming using Spark with Kafka.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala (a Java sketch of this pattern appears at the end of this section).
- Worked on debugging, performance tuning of Hive & Pig Jobs.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Involved in loading data from LINUX file system to HDFS.
- Importing and exporting data into HDFS using Sqoop and Kafka.
- Experience working on processing unstructured data using Pig.
- Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
- Supported MapReduce programs running on the cluster.
- Gained experience in managing and reviewing Hadoop log files.
- Involved in scheduling Oozie workflow engine to run multiple pig jobs.
- Automated all the jobs for pulling data from FTP server to load data into Hive tables using Oozie workflows.
- Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
- Used Java MapReduce to compute metrics that define user experience, revenue, etc.
- Installed and configured Hive.
- Worked on Data Migration applications such as Extraction, Transformation and Loading of data from multiple sources into Data Warehouse.
- Used NoSQL databases Cassandra and MongoDB.
- Exported the result set from Hive to MySQL using Shell scripts.
- Implemented SQL, PL/SQL Stored Procedures.
- Actively involved in code review and bug fixing for improving the performance.
- Developed screens using JSP, DHTML, CSS, AJAX, JavaScript, Struts, Spring, Java and XML.
Environment: Hadoop, HDFS, Pig, Hive, Map Reduce, Sqoop, Kafka, LINUX, HortonWorks, Big Data, Java APIs, Java collection, SQL, NoSQL, AJAX, HBase, Talend, Spring Boot, Spring Data, Jersey WebServices, JUnit, Sybase, IBM MQ, Angular JS, JRules, Autosys Jobs.
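The Spark-with-Kafka streaming work above was done in Scala per the resume; a comparable sketch in Java (the primary language listed here) is shown below, with the broker address, consumer group, topic and HDFS path as hypothetical placeholders.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");   // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "exception-feed");           // hypothetical group id
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("business-exceptions"), // hypothetical topic
                                kafkaParams));

        // Persist each non-empty micro-batch of message payloads to HDFS.
        stream.map(ConsumerRecord::value).foreachRDD((rdd, time) -> {
            if (!rdd.isEmpty()) {
                rdd.saveAsTextFile("hdfs:///data/streams/exceptions/" + time.milliseconds());
            }
        });

        jssc.start();
        jssc.awaitTermination();
    }
}
```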
Confidential, WARREN, NJ
Big Data / Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Developed job processing scripts using Oozie workflow.
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Used Spark SQL for extracting data from and placing data into NoSQL (HBase).
- Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Involved in Hadoop cluster tasks such as commissioning and decommissioning nodes without any effect on running jobs and data.
- Wrote Map Reduce jobs in Python to discover trends in data usage by users.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Helped the team to increase the Cluster size from 22 to 30 Nodes.
- Managed jobs using the Fair Scheduler.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Involved in creating Hive tables, and loading and analyzing data using hive queries.
- Designed, developed and maintained data integration programs in a Hadoop and RDBMS environment with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis; ran Hadoop streaming jobs to process terabytes of XML-format data.
- Used NoSQL database with Cassandra and MongoDB.
- Obtained good knowledge on Scala.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Used Spark in three distinct workloads like pipelines, iterative processing and research.
- Responsible for managing data coming from different sources.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Wrote Hive queries and UDFs.
- Developed Hive queries to process the data and generate the data cubes for visualizing.
- Created Pig Latin scripts to sort, group, join and filter the enterprise wise data.
- Implemented partitioning, dynamic partitions and buckets in Hive (see the Hive JDBC sketch after this section).
- Gained experience in managing and reviewing Hadoop log files.
Environment: Hadoop Ecosystem, HortonWorks, MongoDB, Zookeeper, Spark, Scala, Python, MapReduce, Sqoop, HDFS, Hive, Pig, Oozie, Kafka, Cassandra, ElasticSearch, Oracle 10g, MySQL, QlikView.
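A minimal Java-over-Hive-JDBC sketch of the dynamic-partitioning and bucketing item in this section; the HiveServer2 endpoint, database, table names and column layout are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveDynamicPartitionLoad {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Hypothetical HiveServer2 endpoint and credentials.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver2-host:10000/default", "hive", "");
             Statement stmt = conn.createStatement()) {

            // Enable dynamic partitioning and bucket enforcement for this session.
            stmt.execute("SET hive.exec.dynamic.partition = true");
            stmt.execute("SET hive.exec.dynamic.partition.mode = nonstrict");
            stmt.execute("SET hive.enforce.bucketing = true");

            // Hypothetical partitioned, bucketed target table.
            stmt.execute("CREATE TABLE IF NOT EXISTS txn_by_day (" +
                         "  txn_id STRING, amount DOUBLE) " +
                         "PARTITIONED BY (txn_date STRING) " +
                         "CLUSTERED BY (txn_id) INTO 16 BUCKETS " +
                         "STORED AS ORC");

            // Load from a hypothetical staging table; Hive derives the partition
            // value from the last column of the SELECT.
            stmt.execute("INSERT INTO TABLE txn_by_day PARTITION (txn_date) " +
                         "SELECT txn_id, amount, txn_date FROM txn_staging");
        }
    }
}
```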
Confidential
Junior Developer
Responsibilities:
- As a Junior Developer, primarily worked on the Java server side built on Spring Boot using Jersey Web Services; also built the Android app that could be downloaded by distributors and their salesmen.
- The Spring Boot Application is hosted on Tomcat containers in AWS Server Instances.
- Experience in Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
- Implemented Singleton classes for property loading and static data from DB.
- Actively involved in backend tuning SQL queries/DB script.
- Worked in writing commands using UNIX Shell scripting.
- Involved in developing other subsystems’ server-side components.
- Developed REST calls using the Jersey REST API and Spring Boot; handled all server-side REST services for distributors, salesmen and customers.
- Sent email and SMS text notifications using Amazon SNS; wrote a stored procedure to process payments.
- Participated in build and deployment activities using MAVEN build scripts.
- Handled JDBC calls using Spring Data (JdbcTemplate) to execute SQL queries and stored procedures (see the sketch at the end of this section).
- Used the JUnit testing framework to test individual method calls; created create/update/delete calls from the front end; developed the Android app for both Prodcast customers and Prodcast distributors, which invokes the same REST calls as the jQuery front end.
- Used Log4j for application logging and debugging.
- Conducted daily scrum standup meetings to discuss the progress of the project.
Environment: Spring Boot, Spring Data, Jersey WebServices, JUnit, Amazon SNS for SMS and emailing, MySQL Database, AngularJS.
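A minimal sketch of the Spring Data JdbcTemplate usage referenced in this section; the table, columns and stored-procedure name are hypothetical.

```java
import java.util.List;

import org.springframework.jdbc.core.JdbcTemplate;

public class OrderDao {
    private final JdbcTemplate jdbcTemplate;

    public OrderDao(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Look up order ids for a distributor (hypothetical table and columns).
    public List<Long> findOrderIds(long distributorId) {
        return jdbcTemplate.query(
                "SELECT order_id FROM orders WHERE distributor_id = ?",
                (rs, rowNum) -> rs.getLong("order_id"),
                distributorId);
    }

    // Invoke a hypothetical stored procedure that processes a payment.
    public void processPayment(long orderId, double amount) {
        jdbcTemplate.update("CALL process_payment(?, ?)", orderId, amount);
    }
}
```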
Confidential
Java Developer
Responsibilities:
- Responsible for development, support and enhancement of Benefits system.
- Worked closely with the architect/lead to create technical design documents.
- Developed the user interface and implemented business processes using JSP and Servlets.
- Wrote Servlet programs and JSP scripts for the communication between web browser and server.
- Developed Java code for database connectivity.
- Responsible for coding SQL statements and stored procedures for back-end communication using JDBC (see the sketch below).
- Coded deployment descriptors using XML; generated JAR files were deployed on the Apache Tomcat server.
- Involved in the development of presentation layer and GUI framework in JSP. Client Side validations were done using JavaScript.
- Involved in code reviews and mentored the team in resolving issues.
- Participated in weekly design reviews and walkthroughs with project manager and development teams.
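A minimal sketch of calling a stored procedure over JDBC as referenced above; the connection URL, credentials and procedure name are hypothetical placeholders.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class BenefitsDao {
    // Hypothetical connection details for illustration only.
    private static final String URL = "jdbc:oracle:thin:@dbhost:1521:ORCL";

    public double fetchBenefitAmount(int employeeId) throws Exception {
        try (Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
             // Hypothetical stored procedure: IN employee id, OUT benefit amount.
             CallableStatement cs = conn.prepareCall("{call get_benefit_amount(?, ?)}")) {
            cs.setInt(1, employeeId);
            cs.registerOutParameter(2, Types.NUMERIC);
            cs.execute();
            return cs.getDouble(2);
        }
    }
}
```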