Hadoop Developer Resume

Framingham, MA

SUMMARY:

  • Over 6 years of IT experience in Design, Development, Deployment, Maintenance, and Support of Java/J2EE applications.
  • 3 years of experience with the Hadoop Distributed File System (HDFS), Impala, Hive, HBase, Spark, Hue, the MapReduce framework, and Sqoop.
  • Around 1 year of experience with Spark and Scala.
  • Experienced Hadoop developer with expertise in providing end-to-end solutions to real-time big data problems by implementing distributed processing concepts such as MapReduce on HDFS and other Hadoop ecosystem components.
  • Experience working on large-scale big data implementations in production environments.
  • Hands-on experience migrating data from relational databases to the Hadoop platform using Sqoop.
  • Experienced in using Pig scripts to perform transformations, event joins, filters, and pre-aggregations before storing the data in HDFS.
  • Developed analytical components using Scala, Spark, and Spark Streaming.
  • Expertise in developing both front-end and back-end applications using Java, Servlets, JSP, Web Services, JavaScript, HTML, Spring, Hibernate, JDBC, and XML.
  • Extensively used Apache Flume to collect the logs and error messages across the cluster.
  • Experienced in creating producer and consumer APIs using Kafka (a minimal producer sketch follows this summary).
  • Used Spark Streaming APIs to perform the necessary transformations and actions on the fly.
  • Experience developing a data pipeline using Kafka to store data in HDFS; performed real-time data streaming using Spark Streaming, Kafka, and Flume.
  • Improved performance and optimized existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs.
  • Developed Spark scripts using Scala shell commands as per requirements.
  • Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.
  • Good experience writing Spark applications in Scala and Java; used sbt to build Scala projects and executed them with spark-submit.
  • Good experience in performing analytics with Impala.
  • Excellent understanding and knowledge of NoSQL databases like MongoDB, HBase, and Cassandra.
  • Good knowledge of handling analytical operations on data using Apache Kudu.
  • Very good understanding of Cassandra cluster mechanisms, including replication strategies, snitches, gossip, consistent hashing, and consistency levels.
  • Worked with ZooKeeper to manage job flow and coordination in the cluster.
  • Very good experience in the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
  • In-depth understanding of data structures and algorithms.
  • Excellent knowledge of Java and SQL in application development and deployment.
  • Strong knowledge of programming languages like C and C++.
  • Experienced in Linux administration.
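
A minimal sketch of the Kafka producer pattern mentioned in the summary above, assuming a hypothetical broker address and topic name (the serializers are the standard string serializers shipped with the Kafka client library):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class LogEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Hypothetical broker address; replace with the cluster's bootstrap servers.
            props.put("bootstrap.servers", "broker1:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Keying by source host keeps events from one host in a single
                // partition, so they arrive in order.
                producer.send(new ProducerRecord<>("log-events", "host-01", "sample log line"));
            }
        }
    }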

TECHNICAL SKILLS:

Big Data: Apache Hadoop, Apache Spark, Hive, Pig, Sqoop

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Scala

Databases: MS SQL Server, MySQL

Scripting Languages: UNIX shell script, JavaScript, Python

Web Technologies: HTML, JavaScript, jQuery

Office Tools: MS Office 2003/2007/2010

Operating Systems: Linux, Windows XP/7/8/10

PROFESSIONAL EXPERIENCE:

Hadoop Developer

Confidential - Framingham, MA

Responsibilities:

  • Handled importing data from various sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
  • Analyzed the data by performing Hive queries and running Pig scripts.
  • Developed simple to complex MapReduce jobs using Hive, Pig, and Python.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Developed data pipelines using Pig and Hive from Teradata and DB2 data sources; these pipelines used custom UDFs to extend the ETL functionality.
  • Developed applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Orchestrated hundreds of Sqoop scripts, Python scripts, and Hive queries using Oozie workflows and sub-workflows.
  • Moved all crawl data flat files generated from various retailers to HDFS for further processing.
  • Wrote script files for processing data and loading it into HDFS.
  • Worked on requirements gathering and analysis, and translated business requirements into technical designs within the Hadoop ecosystem.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala (a minimal Spark SQL sketch follows this job's environment line).
  • Created Java classes and interfaces to implement the system.
  • Loaded data from source systems into ingestion tables using Hive load commands.
  • Created Hive tables to store the processed results in a tabular format.
  • Involved in requirements gathering, analysis, design, development, and testing.
  • Created external Hive tables on top of the parsed data.
  • Developed various complex Hive queries as per business logic.
  • Developed complex Hive queries using joins and partitions for huge data sets per business requirements, loaded the filtered data from source to edge-node Hive tables, and validated the data.
  • Moved all log/text files generated by various products into HDFS.
  • Performed bucketing and partitioning of data using Apache Hive, which reduced processing time and produced proper sample insights.
  • Optimized Hive tables with partitioning and bucketing to provide better performance for Hive queries.
  • Wrote MapReduce code that takes log files as input, parses them, and structures them in tabular format to facilitate effective querying of the log data (a sketch of such a mapper also follows the environment line below).

Environment: HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, CDH5, Spark, Python.
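
A minimal sketch of running an existing Hive query through Spark, as referenced in the responsibilities above. It is shown with the Spark 2.x Java API for brevity; the table and column names are placeholders:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HiveQueryOnSpark {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("HiveQueryOnSpark")
                    .enableHiveSupport()   // lets Spark read existing Hive tables
                    .getOrCreate();

            // The same aggregation that previously ran as a Hive query,
            // executed here by Spark SQL.
            Dataset<Row> totals = spark.sql(
                    "SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id");
            totals.show();
            spark.stop();
        }
    }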
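
And a sketch of the kind of log-parsing mapper described in the last bullet, assuming hypothetical space-delimited log lines of the form "timestamp level message" (real formats vary by product):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Illustrative mapper: assumes each log line is "timestamp level message".
    public class LogParseMapper extends Mapper<LongWritable, Text, Text, Text> {

        private final Text outKey = new Text();
        private final Text outValue = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] parts = line.toString().split(" ", 3);
            if (parts.length < 3) {
                return; // skip malformed lines
            }
            // Key on the timestamp; emit level and message tab-separated,
            // yielding a tabular layout that is easy to query downstream.
            outKey.set(parts[0]);
            outValue.set(parts[1] + "\t" + parts[2]);
            context.write(outKey, outValue);
        }
    }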

Hadoop Developer

Confidential, Boston, MA

Responsibilities:

  • Troubleshot various configuration issues between different components in the ecosystem to ensure seamless performance.
  • Moved large amounts of archived historical data from the existing systems into the Hadoop data lake for future analysis.
  • Ingested data into the Hadoop data lake from traditional RDBMSs using Sqoop.
  • Developed custom MapReduce programs in Java to transform loaded data and analyzed the results for better business insights.
  • Created Hive tables and implemented partitioning to improve query performance (a minimal sketch follows this job's environment line).
  • Experimented with running various Pig commands and Pig Latin scripts on the data and analyzed the results from a business perspective.
  • Implemented various MapReduce jobs in custom environments and loaded the results into HBase tables by generating Hive queries.
  • Participated in telephone conversations with client partners and vendors to meet day-to-day deliverables.

Environment: Hadoop MapReduce, HDFS, Hive, Sqoop, Pig, Linux, MySQL.
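
A minimal sketch of creating and loading a partitioned Hive table, as referenced above, shown here through the HiveServer2 JDBC driver; the endpoint, table schema, and input path are placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class PartitionedTableSetup {
        public static void main(String[] args) throws Exception {
            // Register the Hive JDBC driver (the jar must be on the classpath).
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Hypothetical HiveServer2 endpoint; replace host, port, and database.
            String url = "jdbc:hive2://hive-host:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "hive", "");
                 Statement stmt = conn.createStatement()) {
                // Partitioning by load date lets queries that filter on dt
                // skip whole partitions instead of scanning the full table.
                stmt.execute(
                    "CREATE TABLE IF NOT EXISTS orders ("
                    + " order_id BIGINT, customer_id BIGINT, amount DOUBLE)"
                    + " PARTITIONED BY (dt STRING)"
                    + " STORED AS ORC");
                // Load one day's data into its own partition.
                stmt.execute(
                    "LOAD DATA INPATH '/data/ingest/orders/2015-06-01'"
                    + " INTO TABLE orders PARTITION (dt='2015-06-01')");
            }
        }
    }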

Java Developer

Confidential

Responsibilities:

  • Involved in different phases of Software Development Lifecycle (SDLC) like Requirements gathering, Analysis, Design and Development of the application.
  • Involved in designing and implementation of MVC design pattern using Spring framework for Web-tier.
  • Worked on web services using both SOAP and RESTful approaches.
  • Involved in developing the user interface using Struts.
  • Wrote several Action Classes and Action Forms to capture user input and created different web pages using JSTL, JSP, HTML, Custom Tags and Struts Tags.
  • Designed and developed Message Flows, Message Sets, and other service components to expose mainframe applications to enterprise J2EE applications.
  • Used standard data access technologies like JDBC and ORM tools like Hibernate.
  • Worked on various client websites that used the Struts 1 framework and Hibernate.
  • Wrote test cases using the JUnit testing framework (a minimal example follows this job's environment line) and configured applications on WebLogic Server.
  • Involved in writing stored procedures, views, user-defined functions and triggers in SQL Server database for Reports module.

Environment: Java, Spring MVC, Struts 1, RESTful web services, JSP, JUnit, Eclipse, JIRA, JDBC, Hibernate, WebLogic, Oracle 9i.
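
A minimal example of the kind of JUnit test referenced above (JUnit 4 style; ReportTotalsCalculator is a hypothetical class standing in for Reports-module logic, included so the example is self-contained):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class ReportTotalsCalculatorTest {

        @Test
        public void sumsLineItemAmounts() {
            ReportTotalsCalculator calc = new ReportTotalsCalculator();
            // 10.0 + 20.5 + 30.0 should total 60.5, within a small tolerance.
            assertEquals(60.5, calc.total(new double[] {10.0, 20.5, 30.0}), 0.001);
        }
    }

    // Hypothetical class under test.
    class ReportTotalsCalculator {
        double total(double[] amounts) {
            double sum = 0.0;
            for (double a : amounts) {
                sum += a;
            }
            return sum;
        }
    }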
