Big Data/Cloud Developer Resume
Charlotte, NC
PROFESSIONAL SUMMARY:
- 5 years of professional experience spanning the analysis, design, development, integration, deployment, and maintenance of quality software applications using Java/J2EE and Hadoop technologies.
- Hands-on experience with the major Hadoop distributions: Amazon EMR, Cloudera (CDH4 & CDH5), and Hortonworks.
- Experience with Hadoop ecosystem tools including HDFS, YARN, MapReduce, Hive, Sqoop, Kafka, Spark, ZooKeeper, and Oozie.
- Good knowledge of Amazon EMR (Elastic MapReduce) for running big data workloads on AWS.
- Knowledge of Amazon Web Services (AWS), using EC2 for compute and S3 for storage.
- Excellent understanding of Spark and its benefits in Big Data Analytics.
- Experience designing and developing POCs in Spark using Scala to compare the performance of Spark against Hive and SQL/Oracle.
- Hands-on experience using Scala and Spark Streaming to process both streaming and batch data.
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
- Hands-on experience streaming live data from DB2 to S3 buckets using Spark Streaming and Apache Kafka.
- Experienced with core Hadoop technologies (HDFS and MapReduce), Hadoop ecosystem components (HBase, Hive), and the NoSQL database Cassandra.
- Experience querying and analyzing data in Cassandra for quick searching, sorting, and grouping using CQL.
- Experience writing applications against NoSQL databases such as Cassandra.
- Experience importing and exporting data using Kafka.
- Experience developing data pipelines that use Kafka to land data in HDFS (see the streaming sketch after this summary).
- Experience configuring ZooKeeper to coordinate cluster servers and maintain data consistency.
- Expertise in loading data from different sources (Teradata and DB2) into HDFS using Sqoop and loading it into partitioned Hive tables.
- Experience migrating data between HDFS and relational database systems in both directions using Sqoop, per client requirements.
- Used CQL with the Java API to retrieve data from Cassandra tables.
- Good understanding of and working experience with cloud-based architectures.
- Experience handling various file formats such as Avro, Parquet, and SequenceFile.
- Good understanding of Linux and Linux kernel internals and debugging.
- Good experience with source control systems such as Git and SVN.
- Experience working with scripting technologies such as Python and UNIX shell scripts.
- Excellent Java development skills using J2EE, J2SE, Servlets, JDBC.
- Experience developing web page interfaces using HTML, CSS, and TypeScript.
- Experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
- Good understanding of and experience with software development methodologies such as Agile and Waterfall; performed unit, regression, white-box, and black-box testing.
- Ability to work with onsite and offshore team members.
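The following is a minimal sketch of the Kafka-to-HDFS pipelines referenced above, using Spark Structured Streaming's Kafka source from Java. The broker address, topic name, and output paths are illustrative placeholders, and the original pipelines were built in Scala, so this is an equivalent outline rather than production code.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;
import org.apache.spark.sql.streaming.Trigger;

public class KafkaToHdfsPipeline {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-to-hdfs")                  // placeholder application name
                .getOrCreate();

        // Read a Kafka topic as a streaming DataFrame (broker and topic are placeholders).
        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker1:9092")
                .option("subscribe", "client-events")
                .load()
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp");

        // Land the raw records in HDFS as Parquet, with checkpointing for recovery.
        StreamingQuery query = events.writeStream()
                .format("parquet")
                .option("path", "hdfs:///data/raw/client_events")            // placeholder path
                .option("checkpointLocation", "hdfs:///checkpoints/client_events")
                .trigger(Trigger.ProcessingTime("1 minute"))
                .start();

        query.awaitTermination();
    }
}
```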
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce, YARN, Hive, Sqoop, Kafka, HBase, Cassandra, Spark, Ambari, Hue, Impala, Oozie and Zookeeper
Hadoop Distributions: Cloudera (CDH4 & CDH5), Hortonworks, Amazon EMR
Databases: Oracle 10g, MySQL, DB2, HBase, Cassandra
Programming Languages: Java, SQL, Python and Scala
Operating System: Windows, Linux, UNIX, Mac OS
Cloud Platforms: AWS Cloud
Application Servers: Apache Tomcat 6.0/7.0
IDE Tools: Eclipse, WebStorm, Visual Studio Code
Build Tools: Maven, GitHub, JUnit, Log4j
Development Methodologies: Visual Paradigm for UML, Waterfall, Agile/Scrum
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Big Data/Cloud Developer
Responsibilities:
- Created EMR clusters for data ingestion and separate query clusters for analytics.
- Developed Spark transformation logic to extract client email and call data for retail clients to support inbound phone reporting.
- Wrote Spark transformations to extract retail divisional-unit KPIs for individual departments (see the sketch after this role).
- Built Spark-based data analytics that contributed to increased business revenue.
- Experience using AWS Service Catalog to create and manage catalogs of IT services.
- Imported data from source databases such as Oracle and IBM DB2 into Amazon S3 in file formats such as Avro and CSV using Sqoop.
- Used AWS Secrets Manager to protect the secrets applications and services need, enabling rotation, management, and retrieval of database credentials, API keys, and other secrets throughout their lifecycle.
- Created internal and external Hive tables per requirements, with appropriate static/dynamic partitions defined for efficiency.
- Experience with AWS Route 53 for routing user requests to infrastructure running on EC2 and to S3 buckets.
- Used Amazon CloudWatch to monitor and track AWS resources.
- Experience creating CloudFormation templates (CFTs) to provision buckets, roles, parameters, and other resources, and monitoring stack creation through AWS CloudFormation.
- Created email alerts for failures using Splunk.
- Experience debugging error logs.
- Experience using Jupyter Notebook for Spark SQL and scheduling cron jobs that run spark-submit.
- Scheduled daily jobs using Oozie workflows and set up crontabs so data analysts could schedule their own jobs.
- Created a web application using Angular 7 to access Tableau reports.
- Used npm (Node Package Manager) for building packages.
- Used HTML5 and Bootstrap to design the web pages.
- Used Jasmine/Karma for Unit Test cases.
- Fixed production data-loading issues that arose during production support.
- Used JIRA for bug tracking, Bitbucket for version control, and Control-M for job scheduling.
- Worked with an Agile Scrum team to deliver agreed user stories on time every sprint.
Environment: AWS (S3, EMR, EC2, CloudWatch, CloudFormation, IAM), Hadoop, Sqoop, Hive, Presto, Spring Tool Suite (STS), Oracle, DB2, Spark, Python, Jupyter Notebook
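A minimal sketch of the Spark transformation and partitioned Hive load pattern used in this role: read call records that Sqoop landed in S3, aggregate divisional KPIs, and save them to a partitioned table in the Hive metastore. The bucket, table, and column names (call_records, division, call_date, and so on) are placeholders, not the actual schema.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.*;

public class InboundCallKpiJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("inbound-call-kpis")
                .enableHiveSupport()
                .getOrCreate();

        // Call records previously landed in S3 by Sqoop (bucket and layout are placeholders);
        // Avro files would be read the same way via the spark-avro package.
        Dataset<Row> calls = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("s3://example-bucket/raw/call_records/");

        // Aggregate inbound call volume and average handle time per division and day.
        Dataset<Row> kpis = calls
                .filter(col("direction").equalTo("INBOUND"))
                .groupBy(col("division"), col("call_date"))
                .agg(count(lit(1)).alias("inbound_calls"),
                     avg("handle_time_secs").alias("avg_handle_time_secs"));

        // Save as a table registered in the Hive metastore, partitioned by call_date.
        kpis.write()
                .mode(SaveMode.Overwrite)
                .partitionBy("call_date")
                .saveAsTable("inbound_call_kpis");
    }
}
```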
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Experience with Cloudera Manager for Hadoop cluster management.
- Experience importing tables from Teradata into Hive using Sqoop jobs.
- Experienced with the Spark ecosystem, using Spark SQL and Scala queries on formats such as text and CSV files.
- Experience creating batch and real-time pipelines with Spark as the main processing framework.
- Worked on a large-scale Hadoop YARN cluster for distributed data processing and analysis using Spark.
- Analyzed large structured and unstructured datasets to find patterns and insights for the business, presented through Tableau.
- Wrote Spark transformation logic to extract healthcare-related KPIs for individual departments, for example Radiology.
- Experience loading DStream data into Spark RDDs and performing in-memory computation to generate output responses.
- Major challenges included managing resources such as memory and writing optimized Spark programs for the ETLs.
- Published results from the Hive warehouse layer into IBM DB2, where Tableau picks them up (see the sketch after this role).
- Experience with Bamboo for data migration.
- Worked with Oozie workflow engine to run multiple Hive jobs.
- Used ZooKeeper to coordinate cluster servers and maintain data consistency.
- Used JIRA for bug tracking, Bitbucket for version control, and Control-M for job scheduling.
- Worked with an Agile Scrum team to deliver agreed user stories on time every sprint.
Environment: Cloudera (CDH5), HDFS, Sqoop, Hive, Oozie, ZooKeeper, Spark, Scala, Java, Teradata, Maven, IBM DB2, Control-M, Bamboo, Bitbucket and Eclipse.
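A minimal sketch of the Hive-to-DB2 publishing step referenced above: Spark SQL queries the warehouse layer and writes the result to IBM DB2 over JDBC for Tableau to pick up. The table names, columns, JDBC URL, and credentials are placeholders, and the real job was written in Scala; this Java version is only an equivalent outline.

```java
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class HiveToDb2Publisher {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hive-to-db2-publish")
                .enableHiveSupport()
                .getOrCreate();

        // Query the Hive warehouse layer with Spark SQL (table and columns are placeholders).
        Dataset<Row> radiologyKpis = spark.sql(
                "SELECT department, report_date, COUNT(*) AS exam_count " +
                "FROM warehouse.radiology_exams " +
                "GROUP BY department, report_date");

        // Publish the result into IBM DB2 over JDBC so Tableau can read it.
        Properties jdbcProps = new Properties();
        jdbcProps.setProperty("user", "db2_user");        // placeholder credentials; a real job
        jdbcProps.setProperty("password", "changeit");    // would read these from a secure store
        jdbcProps.setProperty("driver", "com.ibm.db2.jcc.DB2Driver");

        radiologyKpis.write()
                .mode(SaveMode.Overwrite)
                .jdbc("jdbc:db2://db2-host:50000/REPORTS",    // placeholder host and database
                      "ANALYTICS.RADIOLOGY_KPIS",
                      jdbcProps);
    }
}
```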
Confidential, Lake Forest, CA
Big Data/Cloud Developer
Responsibilities:
- Worked in an AWS environment to develop and deploy custom Hadoop applications.
- Experience maintaining EC2 (Elastic Compute Cloud) and RDS (Relational Database Service) resources in Amazon Web Services.
- Created S3 buckets, managed their policies, and used S3 and Glacier for storage and backup on AWS.
- Strong experience using Amazon Athena to analyze data in Amazon S3 with standard SQL.
- Experience cataloging data from Amazon S3 buckets in the AWS Glue Data Catalog and then running AWS Glue jobs, which leverage the Apache Spark Python API (PySpark), to transform the cataloged data.
- Created internal and external Hive tables per requirements, with appropriate static/dynamic partitions and bucketing defined for efficiency (see the sketch after this role).
- Designed the ETL process and created the high-level design document covering logical data flows, source data extraction, database staging, extract creation, source archival, job scheduling, and error handling.
- Developed Hive queries to process data and generate data cubes for visualization.
- Effectively migrated data from different source systems to build a secure data warehouse.
- Built Spark-based data analytics that contributed to increased business revenue.
- Implemented big data workflows in Oozie to ingest data from various sources into Hadoop; these workflows comprise heterogeneous jobs such as Hive, Sqoop, and Python scripts.
- Implemented the project using Agile Scrum methodology and participated in daily stand-up meetings.
Environment: AWS, Hadoop, HDFS, Sqoop, Kafka, Hive, Oozie, Zookeeper, Spark-Core, Spark-SQL, Scala, Python, and Visual Studio Code.
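The sketch below illustrates the internal/external Hive table pattern described in this role, driven through Spark SQL from Java: an external staging table over S3, a managed warehouse table, and a dynamic-partition insert between them. Database, table, column, and location names are placeholders; the actual Glue transformations used PySpark, and bucketing was applied in the Hive-side DDL, so it is omitted here.

```java
import org.apache.spark.sql.SparkSession;

public class HiveTableSetup {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hive-table-setup")
                .enableHiveSupport()
                .getOrCreate();

        spark.sql("CREATE DATABASE IF NOT EXISTS staging");
        spark.sql("CREATE DATABASE IF NOT EXISTS warehouse");

        // External table over raw files landed in S3 (names and location are placeholders),
        // partitioned by load_date.
        spark.sql(
            "CREATE EXTERNAL TABLE IF NOT EXISTS staging.orders_raw (" +
            "  order_id BIGINT, customer_id BIGINT, amount DECIMAL(12,2)) " +
            "PARTITIONED BY (load_date STRING) " +
            "STORED AS PARQUET " +
            "LOCATION 's3://example-bucket/staging/orders_raw/'");

        // Managed (internal) table that downstream queries read from.
        spark.sql(
            "CREATE TABLE IF NOT EXISTS warehouse.orders (" +
            "  order_id BIGINT, customer_id BIGINT, amount DECIMAL(12,2)) " +
            "PARTITIONED BY (load_date STRING) " +
            "STORED AS PARQUET");

        // Dynamic-partition insert from the external staging table into the managed table.
        spark.sql("SET hive.exec.dynamic.partition=true");
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict");
        spark.sql(
            "INSERT OVERWRITE TABLE warehouse.orders PARTITION (load_date) " +
            "SELECT order_id, customer_id, amount, load_date FROM staging.orders_raw");
    }
}
```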
Confidential, Boston, MA
Java/Hadoop Developer (Internship)
Responsibilities:
- Handled large amounts of data from different sources and was involved in HDFS maintenance and loading of structured and unstructured data.
- Wrote MapReduce jobs using the Java API (see the sketch after this role).
- Experience in working with Hadoop clusters using Hortonworks distributions.
- Launched and set up a Hadoop cluster, including configuring its different components.
- Extracted and restructured data into MongoDB using the import and export command-line utilities.
- Developed MapReduce programs to clean and aggregate data.
- Involved in creating Hive tables and loading and analyzing data using Hive queries.
- Scheduled jobs using Oozie workflow Engine.
- Developed Hive queries to process data and generate data cubes for visualization.
- Wrote UDFs (user-defined functions) in Hive when needed.
- Hands-on experience with J2EE components in the Eclipse IDE.
- Involved in loading data from the UNIX file system and FTP into HDFS.
- Handled Avro data files using Avro tools and MapReduce.
- Used JIRA for bug tracking and Git for version control.
Environment: Hortonworks, Apache Hadoop 1.0.1, HDFS, MapReduce, Java, Hive, Sqoop, MongoDB, MySQL, Python, UNIX, shell scripts, Git and Eclipse.
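A minimal example of the MapReduce cleaning-and-aggregation jobs mentioned in this internship, written against the org.apache.hadoop.mapreduce Java API. The comma-delimited record layout and the "category in the third column" assumption are placeholders for whatever the actual source data looked like.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Cleans delimited records and counts them per category (field layout is a placeholder). */
public class CategoryCountJob {

    public static class CleanMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text category = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            // Skip malformed rows; assume the category sits in the third column.
            if (fields.length < 3 || fields[2].trim().isEmpty()) {
                return;
            }
            category.set(fields[2].trim().toLowerCase());
            context.write(category, ONE);
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable v : values) {
                total += v.get();
            }
            context.write(key, new IntWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "category-count");
        job.setJarByClass(CategoryCountJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```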
Confidential
Java Developer
Responsibilities:
- Worked across all SDLC phases: requirements, specification, design, implementation, and testing.
- Designed and developed a system framework using J2EE technologies based on MVC architecture.
- Used JUnit for testing UI frameworks.
- Worked on developing profile-view web pages (add, edit) using HTML, CSS, jQuery, JavaScript, AJAX, DHTML, and JSP custom tags, along with other front-end development.
- Tuned XML parsing with SAX and DOM parsers for production data.
- Built the application using Maven scripts.
- Used Log4j for debugging.
- Performed client-side validations using JavaScript.
- Expertise implementing the Struts MVC framework for developing J2EE web applications.
- Developed Spring and Hibernate data-layer components for the application.
- Used SVN for version control.
- Implemented JavaScript validations for the fields on the login screen and registration page.
- Good knowledge of JDBC connectivity.
- Developed the DAO layer for the application using Hibernate and JDBC (see the sketch after this role).
- Designed and developed the application using Agile methodology and followed SCRUM.
Environment: HTML, CSS, jQuery, JavaScript, AngularJS, Java/J2EE, JDBC, Struts, Spring, Hibernate, JUnit, SVN, Maven, AJAX, Apache CXF, Jenkins, Log4j, Agile, Scrum and Web Services.
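To illustrate the DAO layer mentioned above, here is a plain-JDBC sketch; the users table, its columns, and the injected DataSource are placeholders, and the real data layer also used Hibernate, so this shows only one slice of it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Minimal JDBC DAO sketch; the users table and columns are illustrative placeholders. */
public class UserDao {

    private final DataSource dataSource;

    public UserDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Inserts a registration record and returns the number of rows affected. */
    public int save(String username, String email) throws SQLException {
        String sql = "INSERT INTO users (username, email) VALUES (?, ?)";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, username);
            stmt.setString(2, email);
            return stmt.executeUpdate();
        }
    }

    /** Looks up the e-mail address registered for a username, or null if absent. */
    public String findEmailByUsername(String username) throws SQLException {
        String sql = "SELECT email FROM users WHERE username = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, username);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("email") : null;
            }
        }
    }
}
```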