Hadoop/Java Developer Resume
Chicago, IL
SUMMARY
- 7+ years of experience in the IT industry as a Hadoop/Java developer, including master-slave architecture design as well as Scala and Java development.
- Hands-on experience with Cloudera and multi-node clusters on the Hortonworks Sandbox.
- Expertise in designing tables in Hive and MySQL, and in importing and exporting database data to and from HDFS using Sqoop.
- Experienced in data processing of different forms including structured, semi-structured and unstructured data.
- Experienced in working with data architecture, including data-ingestion pipeline design, Hadoop architecture, data modeling and advanced data processing.
- ETL: data extraction, management, aggregation and loading into HBase.
- Hands-on experience with web services using XML, CSS, HTML, JAX-RS, SOAP, REST and WSDL.
- Expertise in developing Pig Latin scripts and Hive Query Language.
- Proficient in Linux (Unix) and Windows operating systems.
- Hands-on experience with build and deployment tools like Maven and GitHub, using Bash scripting.
- Extensive knowledge of Zookeeper for various types of centralized configuration.
- Experience with the Oozie Workflow Engine, running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
- Good understanding of SDLC, STLC and PL/SQL.
- Experienced in integrating various data sources including Java, JDBC, RDBMS, shell scripting, spreadsheets and text files.
- Hands-on experience with full-stack Java technologies.
- Experience in managing and reviewing Hadoop log files using Flume and Kafka; developed Pig and Hive UDFs to pre-process the data for analysis.
- Experience with NoSQL databases like HBase and Cassandra.
- Hands-on experience with Spark for handling both batch-oriented and streaming data.
- Hands-on experience with Spring Tool Suite for developing Scala applications.
- Shell scripting to load and process data from various Enterprise Resource Planning (ERP) sources.
- Exposure to AWS and cloud/DevOps technologies like Docker, Puppet, Chef, Git, Jenkins, etc.
- Hands-on experience writing Pig Latin scripts and working with the Pig interpreter.
- Expertise in Hadoop components like YARN, Pig, Hive, HBase, Flume and Oozie, and shell scripting in Bash.
- Hands-on experience with Scala for batch processing and Spark Streaming.
- Good understanding of Hadoop architecture and its daemons, including NameNode, DataNode, JobTracker, TaskTracker and ResourceManager, as well as Apache NiFi.
- Exposure to Azure concepts like Data Factory, Data Lakes, Blob storage, Load Balancer, etc.
- Hands-on experience ingesting data into a data warehouse using various data-loading techniques.
- Exposure to various markup and data-interchange formats including XML, JSON and XHTML.
- Hands-on experience with the MapReduce programming model and with installing and configuring Hadoop, HBase, Hive, Pig, Sqoop and Flume using Linux commands.
- Handled JSON, XML and log data using Hive SerDes and Pig, filtering the data based on query criteria.
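As an illustration of the Sqoop import/export work described above, a representative pair of commands might look like the following sketch; the connection string, credentials, table names and HDFS paths are hypothetical, not from any actual engagement:

```shell
# Hypothetical Sqoop commands assembled as strings; on a cluster gateway node
# they would be run directly. All names and paths below are illustrative.
SQOOP_IMPORT="sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table customers \
  --target-dir /user/etl/customers \
  --num-mappers 4"

# Export aggregated results from HDFS back to MySQL.
SQOOP_EXPORT="sqoop export \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table customer_summary \
  --export-dir /user/etl/customer_summary"

echo "$SQOOP_IMPORT"
echo "$SQOOP_EXPORT"
```

`--num-mappers` controls the parallelism of the import; `-P` prompts for the password instead of placing it on the command line.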
TECHNICAL SKILLS
Big Data Ecosystem: Sqoop, HBase, Flume, Kafka, Hive, Zookeeper, Apache Pig, Spark, MapReduce, YARN, Cassandra
Big Data Platform: Cloudera (5.x), Hortonworks Sandbox
Java: JAXB, JAX-RS, JAX-WS, JMS, JDBC, AJAX, JSP, J2EE
Tools: NetBeans, Eclipse, Sublime
Languages: JAVA, CSS, JSON, C, SCALA
Markup: XML, JSON, CSV, XHTML, HTML, UML
Operating System: Linux, Windows, Ubuntu, Unix.
Business Tool: SSRS, Tableau, MS Excel
ETL Tools: SSIS, Informatica
Version Control: GitHub, GitLab
Web Services: REST, SOAP
Relational Databases: MySQL, MSSQL, ORACLE
NOSQL databases: HBase, Cassandra
Protocols: HTTP, FTP, SSH, HTTPs, TCP
Query Languages: Pig Latin, HiveQL, Hive (SerDe), SQL, Sqoop
PROFESSIONAL EXPERIENCE
Confidential, Chicago, IL
Hadoop/Java Developer
Responsibilities:
- Built import and export jobs to copy data to and from HDFS using Sqoop.
- Integrated data received from various providers onto HDFS using Sqoop for analysis and data processing.
- Managed the clustering environment using the Hadoop platform.
- Worked with Pig, the NoSQL database HBase and Sqoop for analyzing the Hadoop cluster as well as big data.
- Managed data using the ingestion tool Kafka.
- Wrote and implemented Apache Pig scripts to load data from, and store data into, Hive.
- Assisted the admin in extending and setting up nodes on the cluster.
- Implemented the NoSQL database HBase and managed the other tools and processes running on YARN.
- Wrote Hive UDFs to extract data from staging tables and analyzed the web log data using HiveQL.
- Used multi-threading and clustering concepts for data processing.
- Managed the cluster design and debugged issues whenever they arose.
- Involved in creating Hive tables, loading data and writing Hive queries, which run MapReduce in the backend; applied partitioning and bucketing when required.
- Used Zookeeper for various types of centralized configuration.
- Tested the data coming from the source before processing and resolved any problems found.
- Committed and pushed sample code to GitHub.
- Ingested the raw data, and developed programs to parse it, populate staging tables and store the refined data in partitioned tables.
- Created Hive queries for the market analysts to analyze emerging data and compare it with fresh data in reference tables.
- Involved in regular Hadoop cluster maintenance such as patching security holes and updating system packages.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hive and Pig.
- Used Apache Tomcat for deploying and testing the application.
- Worked with different file formats like text files, Sequence Files and Avro.
- Wrote Java programs to retrieve data from HDFS and provide REST services.
- Used build-automation tools like Maven.
- Used the Spring framework to provide RESTful services.
- Provided design recommendations and thought leadership to stakeholders that improved review processes and resolved technical problems.
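The Hive partitioning and bucketing mentioned above can be sketched as HiveQL of the following shape; the table and column names are illustrative, not from the actual project, and on a cluster this would run via `hive -f` or Beeline:

```shell
# Hypothetical HiveQL held in a shell variable so the sketch is self-contained.
# web_logs / staging_web_logs and their columns are invented for illustration.
HQL='
CREATE TABLE web_logs (
  user_id STRING,
  url     STRING,
  ts      TIMESTAMP
)
PARTITIONED BY (log_date STRING)
CLUSTERED BY (user_id) INTO 16 BUCKETS
STORED AS ORC;

-- Dynamic-partition insert from a staging table.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE web_logs PARTITION (log_date)
SELECT user_id, url, ts, to_date(ts) FROM staging_web_logs;
'
printf '%s\n' "$HQL"
```

Partitioning by date prunes whole directories at query time, while bucketing by `user_id` supports sampling and bucketed map joins.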
Environment: Java 7, Eclipse, Hadoop, Hive, HBase, Linux, MapReduce, Pig, HDFS, Oozie, Shell Scripting, MySQL.
Confidential, Long Island, NY
BIG DATA /JAVA ENGINEER
Responsibilities:
- Used Scala with Spark, Spark Streaming and Akka for customers' ongoing transactions.
- Worked with the transactional data on a daily basis as required.
- Developed Scala applications in the Spring Tool Suite environment.
- Transferred data from a data source to HDFS using Sqoop to perform analysis.
- Used shell scripting to analyze the data from the ERP source and process it for storage in HDFS.
- Stored data from HDFS into the respective Hive tables for further analysis to identify trends in the data.
- Pushed data from the data source into Spark using Flume.
- Deeply analyzed trends in customer behavior and the causes leading to that behavior.
- Developed ad-hoc Hive queries that filtered data to increase the efficiency of process execution, using functions like joins, GROUP BY and so on.
- Improved HiveQL execution times across data sets by applying compression techniques to the MapReduce jobs.
- Ingested the raw data, and developed programs to parse it, populate staging tables and store the refined data in partitioned tables.
- Created Hive queries for the market analysts to analyze emerging data and compare it with fresh data in reference tables.
- Involved in regular Hadoop cluster maintenance such as patching security holes and updating system packages.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hive and Pig.
- Used Apache Tomcat for deploying and testing the application.
- Worked with different file formats like text files, Sequence Files and Avro.
- Created Hive partitions for storing data for different trends under different partitions.
- Connected the Hive tables to data-analysis tools like Tableau for graphical representation of the trends.
- Processed and analyzed data using Hive under time-sensitive deadlines.
- Built analysis reports using software like Esri ArcGIS for graphical representation with respect to location.
- Assisted the project manager with problem solving around data integration.
- Built big data solutions using HBase, handling millions of records for the different data trends, and exported the results to Hive.
- Wrote Oozie workflows to run jobs on availability of transaction data.
- Debugged technical issues and resolved errors.
Environment: Hadoop, Spark, Hive, HBase, Hive bucketing & partitioning, Tableau, Excel, Hive-HBase, Java 6, Esri ArcGIS, Linux, Sqoop.
Confidential, Chicago, IL
BIG DATA ENGINEER
Responsibilities:
- Ingested and loaded data from various file systems into HDFS using Unix command-line utilities.
- Imported data from an RDBMS (MySQL) into HDFS using Sqoop.
- Used SSIS to perform ETL analysis on the data coming from different sources.
- Loaded and transformed large data sets of structured, semi-structured and unstructured data using the cluster.
- Led a team responsible for ingesting, processing and analyzing data until the desired records were generated.
- Created Spark SQL queries for faster data processing.
- Developed Scala applications in the Spring Tool Suite environment.
- Performed MPP-based analytics for enterprise business.
- Ingested data into the data warehouse using various data-ingestion techniques.
- Used Spark RDDs for faster data sharing.
- Used Kafka to ingest data into a data-center unit.
- Developed Hadoop streaming jobs to ingest large amounts of data.
- Converted data from Cassandra and cleaned it based on the requirements.
- Processed log data and implemented custom Hive UDFs.
- Used the JSON, XML and Avro SerDes packaged with Hive for serialization and deserialization, to parse the contents of streamed data.
- Managed data movement from various file systems into HDFS using Unix command-line utilities.
- Tested the data in these files and performed the required formatting.
- Wrote Hive queries to fetch data from Cassandra and transferred it to HDFS through Hive.
- Exported the analyzed patterns back to MySQL using Sqoop.
- Debugged the results to find anything missing from the outcome.
- Used the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
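Running Hive and Pig jobs on time and data availability, as above, is usually driven by an Oozie coordinator plus a `job.properties` file; a minimal sketch, with illustrative hosts, dates and HDFS paths, is:

```shell
# Hypothetical Oozie job.properties for a coordinator that triggers on a
# schedule and on input-data availability. Hosts and paths are illustrative.
cat > job.properties <<'EOF'
nameNode=hdfs://namenode:8020
jobTracker=resourcemanager:8032
oozie.coord.application.path=${nameNode}/user/etl/apps/daily-etl
start=2015-01-01T00:00Z
end=2015-12-31T00:00Z
EOF

# On a gateway node the job would then be submitted with something like:
#   oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
cat job.properties
```

The `${nameNode}` reference is expanded by Oozie itself, which is why the heredoc is quoted so the shell leaves it alone.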
Environment: MySQL, Hive, Hive SerDe, Pig, Hive UDFs, RDBMS, HDFS, Cassandra, MapReduce, Eclipse, NetBeans.
Confidential, Detroit, MI
HADOOP DEVELOPER
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hive and Pig.
- Used Apache Tomcat for deploying and testing the application.
- Worked with different file formats like text files, Sequence Files and Avro.
- Created Hive partitions for storing data for different trends under different partitions.
- Ran Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of different forms of data.
- Managed data coming from different sources.
- Managed jobs using the Fair Scheduler.
- Automated data jobs using Jenkins.
- Built analysis reports using software like Esri ArcGIS for graphical representation with respect to location.
- Worked with tools like Sqoop-HBase, Hive-HBase and Sqoop-Hive.
- Built big data solutions using HBase, handling millions of records for the different data trends, and exported the results to Hive.
- Strong technical and analytical skills, with a clear understanding of design and project architecture based on reporting requirements.
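A Hadoop streaming run like the XML-processing jobs above might be invoked as follows; the jar path and the mapper/reducer script names are hypothetical:

```shell
# Hypothetical Hadoop streaming invocation assembled as a string; on a cluster
# node it would be executed directly. Paths and script names are illustrative.
STREAMING_CMD="hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
  -input /data/raw/xml \
  -output /data/parsed \
  -mapper parse_xml.py \
  -reducer aggregate.py \
  -file parse_xml.py -file aggregate.py"
echo "$STREAMING_CMD"
```

Streaming lets the mapper and reducer be any executables that read stdin and write stdout, which is why it suits ad-hoc XML parsing scripts.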
Environment: Hadoop, Hive, HBase, Hive bucketing & partitioning, Tableau, Excel, Hive-HBase, Linux, Sqoop
Confidential
HADOOP DEVELOPER
Responsibilities:
- Ingested the raw data, and developed MapReduce programs to parse it, populate staging tables and store the refined data in partitioned tables in the EDW.
- Used various Java APIs including JAX-RS, JDBC and Ajax.
- Created Hive queries for the market analysts to analyze emerging data and compare it with fresh data in EDW reference tables.
- Involved in regular Hadoop cluster maintenance such as patching security holes and updating system packages.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hive and Pig.
- Designed and created ETL jobs to load the data into Cassandra and relational databases.
- Used Apache Tomcat for deploying and testing the application.
- Worked with different file formats like text files, Sequence Files and Avro.
- Wrote Java programs to retrieve data from HDFS and provide REST services.
- Managed and reviewed Hadoop log files using Flume and Kafka.
- Created and wrote SQL queries to update MySQL when files are uploaded to or deleted from HDFS.
- Extended the application to work with Hive, Pig, HBase and Sqoop.
- Enabled speedy reviews by using Oozie for automated data loading into HDFS and Pig to preprocess the data.
- Debugged the application to find and fix any defects.
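Log collection with Flume, as in the log-file work above, is wired up through an agent configuration; a minimal sketch with illustrative agent, channel and path names is:

```shell
# Hypothetical Flume agent: tail an application log into HDFS via a memory
# channel. Written out with a heredoc so the sketch is self-contained.
cat > flume-agent.conf <<'EOF'
agent.sources  = tail-src
agent.channels = mem-ch
agent.sinks    = hdfs-sink

agent.sources.tail-src.type = exec
agent.sources.tail-src.command = tail -F /var/log/app/app.log
agent.sources.tail-src.channels = mem-ch

agent.channels.mem-ch.type = memory
agent.channels.mem-ch.capacity = 10000

agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = /flume/logs/%Y-%m-%d
agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
agent.sinks.hdfs-sink.channel = mem-ch
EOF
cat flume-agent.conf
```

`hdfs.useLocalTimeStamp` is set because the exec source does not attach event timestamps, which the date-escaped HDFS path otherwise requires.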
Environment: Hive, Pig, HBase, MySQL, Flume, Eclipse, Map-Reduce, Cassandra, NetBeans
Confidential
JAVA Developer
Responsibilities:
- Hands-on experience with sprints in an Agile environment.
- Hands-on experience developing web application services using J2EE.
- Developed Ajax code to consume SOAP and REST services.
- Developed the user interface design using HTML, CSS, JavaScript, jQuery and Ajax.
- Implemented client-side validation on web forms as per the requirements.
- Used protocols like HTTP, HTTPS and FTP for connecting to the server.
- Experienced in developing code to convert JSON data to custom JavaScript objects.
- Customized pages to display large volumes of data.
- Worked with properties files using the JUnit and Selenium testing tools.
- Designed the portal GUI using master pages, login controls and client-side validation.
- Used Eclipse for development, testing and code review.
- Tested the application under the Scrum (Agile) methodology.
- Performed regression testing and was involved in debugging the scripts.
- Enhanced web forms by fixing bugs as per the requirements.
Environment: HTML, CSS, Eclipse, JavaScript, JDBC, MS SQL Server, Ajax, JAVA, JAX-RS, JAX-WS, JIRA.