Sr. Hadoop Developer Resume
Chicago, IL
PROFESSIONAL SUMMARY:
- Over 9 years of experience as a solutions-oriented IT software developer, including 5+ years of web application development using Hadoop and related Big Data technologies and 3+ years of experience as a Java developer.
- Experience in analysis, design, development, and integration using Big Data/Hadoop technologies such as MapReduce, Hive, Pig, Sqoop, Oozie, Kafka, HBase, AWS, Cloudera, Hortonworks, Impala, Avro, data processing, Java/J2EE, and SQL.
- Good knowledge of Hadoop architecture and its components such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, and DataNode.
- Hands on experience in installing, configuring, and using Hadoop ecosystem components like HDFS, Hive, Spark, Scala, Spark-SQL, MapReduce, Sqoop, Flume, HBase, Zookeeper, and Oozie.
- Extensive Hadoop experience in data storage, query writing, and the processing and analysis of data.
- Experience in extending Hive functionality with custom UDFs for data analysis and file processing using Hive Query Language.
- Experience working with the Amazon AWS cloud, including services such as EC2, S3, RDS, EBS, Elastic Beanstalk, and CloudWatch.
- Worked on data modeling using various machine learning (ML) algorithms in Python.
- Experienced in transferring data from different data sources into HDFS using Kafka.
- Experience configuring the Hive metastore with MySQL, which stores the metadata for Hive tables.
- Extensive experience in creating data pipelines for real-time streaming applications using Kafka, Flume, Storm, and Spark Streaming, including sentiment analysis on a Twitter source.
- Strong knowledge of using Flume to stream data to HDFS.
- Good knowledge of job scheduling and monitoring tools like Oozie and ZooKeeper.
- Expertise in working with various databases, writing SQL queries, stored procedures, functions, and triggers using PL/SQL and SQL.
- Experience with NoSQL column-oriented databases such as Cassandra, HBase, MongoDB, and FiloDB, and their integration with Hadoop clusters.
- Strong experience in troubleshooting operating systems such as Linux, Red Hat, and UNIX, and in resolving cluster issues and Java-related bugs.
- Experience developing Spark jobs using Scala in a test environment for faster data processing, using Spark SQL for querying (see the sketch below).
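A minimal sketch of the Scala/Spark SQL querying pattern referred to above; the table and column names (sales.orders, order_date, amount) are hypothetical stand-ins rather than details from any specific project.

```scala
import org.apache.spark.sql.SparkSession

object OrdersQuery {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so Spark SQL can query existing Hive tables
    val spark = SparkSession.builder()
      .appName("OrdersQuery")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical table and columns: daily totals via a Spark SQL aggregate
    val daily = spark.sql(
      """SELECT order_date, SUM(amount) AS total
        |FROM sales.orders
        |GROUP BY order_date""".stripMargin)

    daily.show(20)
    spark.stop()
  }
}
```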
TECHNICAL SKILLS:
Programming Languages: Python, SQL, Scala
Hadoop: HDFS, MapReduce, HBase, Hive, Pig, Impala, Sqoop, Flume, Oozie, Spark, Spark SQL, Zookeeper, AWS, Cloudera, Hortonworks, Kafka, Avro.
Web Technologies: JDBC, JavaScript, AJAX, SOAP, HTML/CSS
Scripting Languages: JavaScript, Python 2.7, and Scala.
RDBMS Languages: Oracle, Microsoft SQL Server, MySQL, PostgreSQL
NoSQL: MongoDB, HBase, Apache Cassandra, FiloDB.
SOA: Web Services (SOAP, WSDL)
IDEs: PyCharm, Eclipse
Operating Systems: Linux, Windows, UNIX, CentOS.
Methodologies: Agile, Waterfall, Kanban
Testing: Hadoop MRUnit testing, Quality Center, Hive testing.
Other Tools: SVN, Apache Ant, JUnit, StarUML, TOAD, PL/SQL Developer, JIRA, Visual Source, QC, Agile methodology
PROFESSIONAL EXPERIENCE:
Confidential, Chicago, IL
Sr. Hadoop Developer
Responsibilities:
- Wrote multiple Spark jobs to perform data quality checks on data before files were moved to the data processing layer (see the sketch after this list).
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Designed and modified database tables and used HBase queries to insert and fetch data from tables.
- Responsible for creating a data pipeline using Kafka, Flume, and Spark Streaming on a Twitter source to collect sentiment from Eaton customers' review tweets.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume 1.7.0.
- Involved in deploying the applications in AWS and maintaining EC2 (Elastic Compute Cloud) and RDS (Relational Database Service) instances in Amazon Web Services.
- Implemented the file validation framework, UDFs, UDTFs, and DAOs.
- Strong experience working in UNIX/Linux environments, writing UNIX shell scripts and Python.
- Created reporting views in Impala using Sentry policy files, and imported data from different databases such as MySQL and other RDBMSs into HDFS and HBase using Sqoop.
- Advanced knowledge in performance troubleshooting and tuning Cassandra clusters.
- Analyzed the source data to assess data quality using Talend Data Quality.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Developed REST APIs using Java, Play framework and Akka.
- Modeled and created the consolidated Cassandra, FiloDB, and Spark tables based on data profiling.
- Used Oozie 1.2.1 operational services for batch processing and scheduling workflows dynamically, and created UDFs to store specialized data structures in HBase and Cassandra.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Used Impala to read, write, and query Hadoop data in HDFS and Cassandra, and configured Kafka to read and write messages from external programs.
- Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Created a complete processing engine based on the Cloudera distribution, enhanced for performance.
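A minimal sketch of the kind of Spark data quality check described in the first bullet of this list. The landing-zone and output paths and the required customer_id/event_ts columns are hypothetical assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object DataQualityCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("DataQualityCheck").getOrCreate()

    // Hypothetical landing-zone files; customer_id and event_ts are treated as required fields
    val raw = spark.read.option("header", "true").csv("/data/landing/reviews/*.csv")

    // Rows passing the checks move on to the processing layer; the rest are kept aside for inspection
    val valid   = raw.filter(col("customer_id").isNotNull && col("event_ts").isNotNull)
    val invalid = raw.exceptAll(valid)

    valid.write.mode("overwrite").parquet("/data/processing/reviews")
    invalid.write.mode("overwrite").parquet("/data/rejected/reviews")

    spark.stop()
  }
}
```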
Environment: Hadoop, HDFS, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Kafka, Flume, Oracle 11g, FiloDB, Spark, Akka, Scala, Cloudera HDFS, Talend, Eclipse, Node.js, Unix/Linux, AWS, jQuery, Ajax, Python, Zookeeper.
Confidential, MN
Hadoop/Big Data Developer
Responsibilities:
- Developed efficient MapReduce programs for filtering out the unstructured data and developed multiple MapReduce jobs to perform data cleaning and preprocessing on Hortonworks.
- Implemented a data interface to get customer information using REST APIs and pre-processed the data using MapReduce 2.0, storing it in HDFS (Hortonworks).
- Extracted data from MySQL, Oracle, and Teradata through Sqoop 1.4.6 and placed it in HDFS (Cloudera distribution) for processing.
- Worked with various HDFS file formats like Avro 1.7.6, SequenceFile, and JSON, and various compression formats like Snappy and bzip2.
- Wrote a Spark Streaming application to read streaming Twitter data and analyze Twitter records in real time using Kafka and Flume, measuring the performance of Apache Spark Streaming (see the sketch after this list).
- Proficient in designing row keys and schemas for the NoSQL database HBase, with knowledge of another NoSQL database, Cassandra.
- Used Hive to perform data validation on the data ingested using Sqoop and Flume, and pushed the cleansed data set into HBase.
- Good understanding of Cassandra Data Modeling based on applications.
- Wrote ETL jobs to read from web APIs using REST and HTTP calls and load the data into HDFS using Java and Talend.
- Developed Pig 0.15.0 UDFs to pre-process the data for analysis and migrated ETL operations into the Hadoop system using Pig Latin scripts and Python 3.5.1 scripts.
- Used Pig as an ETL tool to do transformations, event joins, filtering, and some pre-aggregations before storing the data in HDFS.
- Troubleshot, debugged, and resolved Talend issues while maintaining the health and performance of the ETL environment.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Used Spark to parse XML files, extract values from tags, and load them into multiple Hive tables.
- Experienced in running Hadoop streaming jobs to process terabytes of formatted data using Python scripts.
- Developed small distributed applications in our projects using ZooKeeper 3.4.7 and scheduled the workflows using Oozie 4.2.0.
- Proficient in writing Unix/Linux shell commands.
- Developed an SCP simulator that emulates the behavior of intelligent networking and interacts with the SSF.
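A minimal Structured Streaming sketch of reading the tweet stream from Kafka as described above. The broker address, topic name, and the toy keyword-based "sentiment" flag are assumptions for illustration, and the job would need the spark-sql-kafka connector on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object TweetStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("TweetStream").getOrCreate()

    // Hypothetical broker address and topic carrying the ingested tweets
    val tweets = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "tweets")
      .load()
      .selectExpr("CAST(value AS STRING) AS text")

    // Toy keyword-based "sentiment" flag, just to show per-record scoring
    val scored = tweets.withColumn("positive", col("text").contains("good").cast("int"))

    // Print each micro-batch to the console for inspection
    val query = scored.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```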
Environment: Hadoop, HDFS, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Kafka, Flume, Oracle 11g, Spark, Scala, Cloudera HDFS, Talend, Eclipse, Unix/Linux, AWS, Python, Perl, Zookeeper.
Confidential, Elgin, Illinois
Hadoop/Big Data Developer
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Developed and ran MapReduce programs on the cluster.
- Involved in loading data from RDBMS and web logs into HDFS using Sqoop and Flume.
- Worked on loading data from MySQL to HBase using Sqoop where necessary.
- Configured the Hadoop cluster with the NameNode and slave nodes and formatted HDFS.
- Imported and exported data between Oracle and HDFS/Hive using Sqoop.
- Performed source data ingestion, cleansing, and transformation in Hadoop.
- Supported Map-Reduce Programs running on the cluster.
- Wrote Pig Scripts to perform ETL procedures on the data in HDFS.
- Used Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed the partitioned and bucketed data and computed various metrics for reporting.
- Created HBase tables to store various data formats of data coming from different portfolios.
- Worked on improving the performance of existing Pig and Hive Queries.
- Involved in developing Hive UDFs that were reused for other requirements, and worked on performing join operations.
- Developed fingerprinting rules in Hive that help uniquely identify a driver profile.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Exported the result set from Hive to MySQL using Sqoop after processing the data.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Used Hive to partition and bucket data (see the sketch below).
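A minimal sketch of producing a partitioned, bucketed Hive table like the one described above, expressed through Spark's DataFrameWriter rather than raw HiveQL; the staging and target table names and the chosen keys are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object PartitionAndBucket {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PartitionAndBucket")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical staging table produced by the Sqoop/Flume ingestion steps above
    val staged = spark.table("staging.customer_events")

    // Partition by load date so queries prune whole directories, and bucket by
    // customer_id so joins and aggregations on that key shuffle less data
    staged.write
      .mode("overwrite")
      .partitionBy("load_date")
      .bucketBy(16, "customer_id")
      .sortBy("customer_id")
      .saveAsTable("analytics.customer_events")

    spark.stop()
  }
}
```

Bucket count is a tuning choice; it would normally be sized against data volume and the downstream join patterns.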
Environment: Hadoop, MapReduce, HDFS, HBase, HDP (Hortonworks), Sqoop, Data Processing Layer, Hue, Azure, Erwin, MS Visio, Tableau, SQL, MongoDB, Oozie, UNIX, MySQL, RDBMS, Ambari, SolrCloud, Lily HBase, Cron.
Confidential
Java Developer
Responsibilities:
- Involved in complete requirement analysis, design, coding and testing phases of the project.
- Developed the application based on the Model View Controller (MVC) design pattern.
- Extensively used Hibernate in data access layer to perform database operations.
- Used Spring Framework for Dependency Injection and integrated it with the Struts Framework and Hibernate.
- Developed front end using Struts framework.
- Configured Struts DynaAction Forms, Message Resources, Action Messages, Action Errors, Validation.xml, and Validator-rules.xml.
- Designed and developed the front end using the Struts framework, with JSP, JavaScript, JSTL, EL, custom tag libraries, and the validations provided by Struts.
- Used Web services - WSDL and SOAP for getting credit card information from third party.
- Worked on advanced Hibernate associations with multiple levels of Caching, lazy loading.
- Created use case diagrams, sequence diagrams, functional specifications, and user interface diagrams using StarUML.
- Worked in Agile development environment in sprint cycles of two weeks by dividing and organizing tasks. Participated in daily scrum and other design related meetings.
- Designed various tables required for the project in Oracle 9i database and used Stored Procedures and Triggers in the application.
- Involved in consuming RESTful Web services to render the data to the front page.
- Performed unit testing using JUnit framework.
Environment: HTML, JSP, Servlets, JDBC, JavaScript, Java API, Spring 3.0, Spring MVC, Maven, SVN, Struts, Amazon Web Services, RESTful Web Services, Bootstrap
Confidential
Java Developer
Responsibilities:
- Created use case diagrams, sequence diagrams, functional specifications, and user interface diagrams using StarUML.
- Involved in complete requirement analysis, design, coding and testing phases of the project.
- Participated in JAD meetings to gather the requirements and understand the End Users System.
- Developed user interfaces using JSP, HTML, XML and JavaScript.
- Created Stored Procedures & Functions. Used JDBC to process database calls for DB2/AS400 and SQL Server databases.
- Developed the code which will create XML files and Flat files with the data retrieved from Databases and XML files.
- Created Data sources and Helper classes which will be utilized by all the interfaces to access the data and manipulate the data.
- Used Servlets to implement business components.
- Designed and Developed required service classes for database operation.
- Developed web application called iHUB (integration hub) to initiate all the interface processes using Struts Framework, JSP and HTML.
- Used JavaScript validation in JSP pages.
- Developed the interfaces using Eclipse 3.1.1 and JBoss 4.1. Involved in integration testing, bug fixing, and production support.
Environment: HTML, JSP, Servlets, JDBC, JavaScript, Eclipse IDE, XML, XSL, Tomcat 5