Hadoop Developer/admin Resume
Boston, MA
SUMMARY
- Over 8 years of experience with the Hadoop stack, Cloudera Certified Developer for Apache Hadoop (CCDH), passionate about working in Big Data and Analytics environments.
- Good experience in application and product development across the full SDLC, primarily using Hadoop, Java, Mainframe and ETL technologies, along with data analysis.
- Proven skills in establishing strategic direction while remaining technically strong in design, implementation, and deployment. Collected and translated business requirements into robust, scalable distributed architectures and designs.
- Experience writing MapReduce programs on Apache Hadoop to process Big Data.
- Experience in installing, configuring, supporting and monitoring Hadoop clusters using Apache and Cloudera distributions and AWS.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce model.
- Experience in using Pig, Hive, Sqoop, HBase and the Cloudera VM.
- Extensive experience with big-data ETL and query tools such as Pig Latin and HiveQL.
- Worked on the Kafka messaging system; able to ingest data from Kafka into Spark.
- Hands on experience in big data ingestion tools like Flume and Sqoop
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice versa.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Expertise in writing Hadoop Jobs for analyzing data using Hive and Pig
- Extending Hive and Pig core functionality by writing custom UDFs.
- Experienced in integrating various data sources such as RDBMS, spreadsheets, and text files using Java 1.5 and shell scripting.
- Familiar with Java virtual machine (JVM) and multi-threaded processing.
- Set up standards and processes for Hadoop based application design and implementation.
- Experience in managing and reviewing Hadoop Log files.
- Used Elasticsearch to index, fetch, and filter data.
- Worked on AWS, submitting jobs on EC2 and EMR.
- Working knowledge on Kubernetes.
- Extensive experience with SQL, PL/SQL and database concepts.
- Good knowledge on network protocols, TCP/IP configuration and network architecture.
- Worked on NoSQL databases including HBase, Cassandra.
- Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Experience in developing solutions to analyze large data sets efficiently
- Experience in designing, developing and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
- Good understanding of XML methodologies (XML, XSL, XSD) including Web Services.
TECHNICAL SKILLS
Tools and frameworks: Hive, Sqoop, Pig, Puppet, Ambari, HBase, MongoDB, Cassandra, PowerPivot, Flume, Spark, Jenkins, Vertica
Java & J2EE Technologies: Core Java 1.5, Servlets 2.4
Operating Systems: Windows 95/98/2000/XP/Vista/7/8, Unix, Linux, Solaris
IDE Tools: Eclipse 3.2.2, NetBeans 6.1, RSA, RAD, Oracle WebLogic Workshop
Methodologies: Agile/ Scrum, Waterfall
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL
Programming or Scripting Languages: C, Java, SQL, Unix Shell Scripting, Python, Scala
Databases: Oracle 11g/10g/9i, MySQL, NoSQL
PROFESSIONAL EXPERIENCE
Confidential, Dallas -TX
Hadoop Developer
Responsibilities:
- Participated in requirement gathering and converting the requirements into technical specifications
- Analyzed large data sets by running Hive queries and Pig scripts
- Analyzed log files through Hive, loaded JSON-format data into Hive, and worked on external and internal tables and Hive optimization techniques.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Performed transformations, cleaning and filtering on imported data using Hive, Map Reduce, and loaded final data into HDFS.
- Involved in scrum meetings and worked in agile methodology.
- Worked with Spark broadcast variables and RDD joins.
- Used various transformations in developing Spark code and worked on performance tuning of Spark applications.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Integrated Spark with Kafka by creating producer objects and sending data from producers to Kafka clusters.
- Stored data in Avro and Parquet formats.
- Used Elasticsearch to fetch and ingest data.
- Used KafkaUtils to consume data from Kafka in Spark.
- Worked on code review and testing of Spark code developed in Scala.
- Implemented Spark applications in both local mode (for review) and distributed mode.
- Analyzed airline data using Spark and Flume, and worked on log files.
- Worked on scripting using Chef and Puppet.
- Worked on web server logs: ingested data into HDFS using Flume and analyzed it using Spark.
- Extracted files from Cassandra through Sqoop and placed in HDFS and processed.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Responsible for overseeing and writing scripts that prepare data for the analysts.
- Worked on HBase database creation and data insertion from Pig into HBase and Hive.
- Worked on performance tuning in HBase.
- Submitted MR jobs on EMR cluster.
- Load and transform large sets of structured, semi structured and unstructured data.
- Responsible for loading data from Linux file systems to HDFS.
- Involved in creating Hive Tables, loading with data and writing Hive queries which will invoke and run Map Reduce jobs in the backend.
- Liaised with various technical teams to resolve issues.
- Created and maintained Technical documentation for launching HADOOP Clusters and for executing Hive queries and Pig Scripts
Environment: Hadoop, Java (JDK 1.6), Hive, Pig, Sqoop, MapReduce, Flat files, Oracle 11g/10g, MySQL, Linux, Spark, AWS, Hortonworks HDP 2.5, Vertica
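The Hadoop streaming and log-analysis work above can be sketched as a mapper/reducer pair. This is a minimal illustration rather than the project code: the log layout (severity level in the third whitespace-separated field) is a hypothetical stand-in for the real web server log format.

```python
"""Minimal sketch of a Hadoop-streaming-style mapper/reducer for log
analysis. Hadoop streaming pipes raw lines to the mapper on stdin and
feeds the key-sorted mapper output to the reducer, so both stages are
plain line-in/line-out functions that can also be tested locally.
The log field positions below are hypothetical."""
from itertools import groupby

def mapper(lines):
    # Emit "level\t1" for every parseable log line.
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:
            yield f"{fields[2]}\t1"

def reducer(pairs):
    # Streaming hands the reducer key-sorted input; sum counts per key.
    keyed = (p.split("\t") for p in pairs)
    for level, group in groupby(keyed, key=lambda kv: kv[0]):
        yield f"{level}\t{sum(int(kv[1]) for kv in group)}"

# On a cluster the two stages run as separate processes, roughly:
#   hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py ...
```

The same pipeline can be exercised locally with shell pipes (mapper, `sort`, reducer), which is a common way to sanity-check streaming jobs before submitting them.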
Confidential, Boston MA
Hadoop developer/Admin
Responsibilities:
- Configured Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, Zookeeper and Sqoop.
- Developed shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Involved in collecting and aggregating large amounts of log data and staging it in HBase/HDFS for further analysis.
- Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
- Used Sqoop to import and export data from HDFS to RDBMS and vice-versa.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
- Involved in writing shell scripts for rolling day-to-day processes and its automation.
- Performed data analysis and querying with Hive and Pig on Cloudera Distributed Hadoop (CDH).
- Transformed massive amounts of raw data into actionable analytics, including financial, market, and product data analysis, using BI tools.
Environment: Hadoop, Map Reduce, Hue, Hive, HDFS, PIG, Sqoop, Cloudera, ZooKeeper, CDH4 & CDH5, Oracle, PL/SQL, Linux, Tableau
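The daemon health checks described above were shell scripts; the same logic can be sketched in Python. The expected-daemon set below is illustrative, and the input format (one `pid ClassName` pair per line, as printed by `jps`) is an assumption about the node's process listing.

```python
"""Sketch of the Hadoop daemon health-check logic (the original was a
shell script). It compares a jps-style process listing against the
daemons expected on the node; the expected set here is illustrative."""

EXPECTED_DAEMONS = {"NameNode", "DataNode", "ResourceManager", "NodeManager"}

def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return, sorted, the expected daemons absent from the jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2:          # "pid ClassName" pairs only
            running.add(parts[1])
    return sorted(expected - running)

# On a live node this would wrap the real command, e.g.:
#   out = subprocess.run(["jps"], capture_output=True, text=True).stdout
# and raise an alert for each name returned by missing_daemons(out).
```

A cron job running such a check per node is one simple way to react to daemon failures before they surface as job errors.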
Confidential
Hadoop/Big Data Analyst
Responsibilities:
- Developed Map Reduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables.
- Involved in data ingestion into HDFS using Sqoop from a variety of sources, using connectors such as JDBC and appropriate import parameters.
- Responsible for managing data from various sources and their metadata.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
- Installed and configured Hive and wrote Hive UDF’s that helped spot market trends.
- Used Hadoop streaming to process terabytes of data in XML format.
- Ran Hive queries in Spark SQL for analyzing and processing the data; used Scala to perform transformations and apply business logic.
- Implemented partitioning, dynamic partitioning, indexing and bucketing in Hive.
- Loaded the dataset into Hive for ETL Operation.
- Stored processed data in parquet file format.
- Streamed data from data source using Flume.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Worked with SparkContext, Spark SQL, DataFrames, Pair RDDs, and Spark Streaming.
- Developed a Flume ETL job that handled data from an HTTP source with an HDFS sink.
- Implemented advanced procedures like text analytics and processing using the in-memory computing capabilities of Spark.
- Involved in creating Hive Tables, loading with data and writing Hive queries, which will invoke and run Map Reduce jobs in the backend.
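Converting a Hive query into Spark transformations, as mentioned above, typically amounts to rewriting a GROUP BY as a map followed by reduceByKey. The actual work used Scala RDDs; the sketch below mimics that dataflow in plain Python so it runs without a Spark installation, and the flights dataset and its column layout are hypothetical.

```python
"""Illustration of translating a Hive GROUP-BY aggregation into the
map/reduceByKey chain used in Spark RDD code. This plain-Python mimic
shows the dataflow only; real code would use Spark (Scala or PySpark).
The `rows` data below is hypothetical."""

def reduce_by_key(pairs, fn):
    # Equivalent of RDD.reduceByKey: combine values sharing a key with fn.
    acc = {}
    for k, v in pairs:
        acc[k] = fn(acc[k], v) if k in acc else v
    return sorted(acc.items())

# Hive:  SELECT carrier, SUM(delay) FROM flights GROUP BY carrier;
# Spark: flights.map(r => (r.carrier, r.delay)).reduceByKey(_ + _)
rows = [("AA", 12), ("DL", 3), ("AA", 5), ("DL", 0)]
pairs = ((carrier, delay) for carrier, delay in rows)   # the map step
totals = reduce_by_key(pairs, lambda a, b: a + b)       # the reduceByKey step
```

The point of the translation is that the reduce function must be associative and commutative, since Spark applies it per-partition before shuffling, which is also what makes it cheaper than a full GROUP BY shuffle of raw rows.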
Confidential
Java Developer
Responsibilities:
- Responsible and active in the analysis, design, implementation and deployment of full Software Development Lifecycle (SDLC) of the project.
- Designed and developed user interface using JSP, HTML and JavaScript.
- Defined the search criteria and pulled the customer's record from the database, made the required changes, and saved the updated record back to the database.
- Validated the fields of user registration screen and login screen by writing JavaScript validations.
- Used DAO and JDBC for database access.
- Developed stored procedures and triggers using PL/SQL in order to calculate and update the tables to implement business logic.
- Designed and developed XML processing components for dynamic menus in the application.
- Involved in postproduction support and maintenance of the application.
- Involved in the analysis, design, implementation, and testing of the project.
- Implemented the presentation layer with HTML, XHTML and JavaScript.
Environment: Java 1.5, Oracle 11g, HTML, XML, SQL, J2EE, JUnit