Hadoop Developer Resume
Kansas City, Missouri
PROFESSIONAL SUMMARY:
- An IT professional with 5+ years of experience, including 2 years working with Hadoop and its stack for big data analytics, with expertise in application design and development across various domains.
- Experienced Hadoop developer with a strong background in distributed file systems in the big data arena.
- Understands the complex processing needs of big data and has experience developing code and modules to address those needs.
- End-to-end experience in the design, development, maintenance and analysis of various types of applications using efficient data science methodologies and Hadoop ecosystem tools.
- Expertise in all stages of the Software Development Life Cycle (SDLC), from requirements to testing and documentation.
- Excellent communication and interpersonal skills with strong team leading capabilities.
- Highly proficient in understanding new technologies and accomplishing new goals.
- Exceptionally well organized and proactive, with a strong work ethic and willingness to achieve employer objectives.
- Well versed in Object-Oriented Programming and the Software Development Life Cycle, from project definition to post-deployment.
- Experience in Java encompasses software design, development and maintenance of custom application software, data structure manipulation, error handling and fault-tolerant systems. Good team player with the ability to work in a fast-paced environment.
- Hadoop Developer Experience:
- Hands-on experience in installing, configuring and using Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Sqoop, Flume and Oozie.
- Hands-on experience in deploying Hadoop clusters using Cloudera Manager and Hortonworks Ambari.
- Hands-on experience in the installation, configuration, management and development of big data solutions using the Apache, Cloudera and Hortonworks distributions.
- Hands-on experience in upgrading and applying patches to the Cloudera distribution.
- Experience in analyzing data using HiveQL, Pig Latin and MapReduce programs in Java, extending Hive and Pig core functionality with custom UDFs when required.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
- Good understanding of NoSQL databases such as MongoDB and Cassandra.
- Experience with job workflow scheduling tools such as Oozie.
- Hands-on experience in provisioning and managing multi-tenant Hadoop clusters on a public cloud environment (Amazon Web Services) and on private cloud infrastructure (OpenStack).
- Good understanding of the YARN / MapReduce v2 workflow.
- Good understanding of Talend and Datameer.
- Data Analytics Experience:
- Proficient in programming with Resilient Distributed Datasets (RDDs).
- Experience in tuning and debugging Spark applications running in both standalone and YARN cluster modes.
- Experience with Spark SQL, MLlib, GraphX and integrating Spark with HDFS.
- Experience in integrating Kafka with Spark for real-time data processing (a minimal sketch follows this summary).
- Excellent analytical, programming, written and verbal communication skills with the ability to interact with individuals at all levels.
- Extensive experience working on an offshore-onsite model.
- Superior analytical, time management and problem-solving skills.
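The Kafka-with-Spark integration mentioned above can be illustrated with a short Java sketch using the spark-streaming-kafka-0-10 direct stream API. This is a minimal, hypothetical example rather than code from any of the projects below; the broker address, topic name ("events"), consumer group and class name are placeholders.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaSparkSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-stream-sketch");
        // 10-second micro-batches.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // Kafka consumer settings; broker address, group id and topic are placeholders.
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "sketch-consumer");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("events"), kafkaParams));

        // Count the records in each micro-batch and print the result on the driver.
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```

With the direct stream approach, each Kafka partition maps to a Spark partition, so throughput scales with the topic's partition count.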
TECHNICAL SKILLS:
Hadoop Stack: HDFS, MapReduce, Pig, Hive, Flume, Sqoop, Oozie, Kafka
Spark: Spark Core, Spark SQL, Spark Streaming, MLlib
Cluster Management & Monitoring Tools: Cloudera Manager, Hortonworks Ambari, Ganglia, Nagios
NoSQL Database: MongoDB, HBase, Cassandra
Languages: Scala, HiveQL, Pig Latin, CQL, Java, C#, SQL
Software Engineering: SDLC (Agile & Waterfall)
Databases: Microsoft SQL Server, MySQL, Oracle
Cloud: AWS, Google Compute Engine, OpenStack, Azure
Version Control: SVN, TFS
PROFESSIONAL EXPERIENCE:
Hadoop Developer
Confidential, Kansas City, Missouri
Environment: Java, MapReduce, HDFS, Hive, Pig, Flume, Sqoop, MySQL, Oracle, UNIX Shell Scripting, Nagios, HCatalog, Hue.
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Involved in installing, configuring and managing Hadoop Ecosystem components like Hive, Pig, Sqoop and Flume.
- Migrated the existing data to Hadoop from RDBMS (SQL Server and Oracle) using Sqoop for processing the data.
- Responsible for loading unstructured and semi-structured data from different sources into the Hadoop cluster using Flume, and for managing it.
- Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform map-side joins using the distributed cache (a minimal sketch follows this list).
- Used Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
- Created internal and external tables with properly defined static and dynamic partitions for efficiency.
- Used the RegEx, JSON and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data.
- Implemented Hive custom UDFs to enable comprehensive data analysis.
- Used Pig to develop ad-hoc queries.
- Exported the required business information to an RDBMS using Sqoop so the BI team could generate reports from the data.
- Implemented daily workflow for extraction, processing and analysis of data with Oozie.
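As a concrete illustration of the map-side join mentioned in the list above, the Java sketch below ships a small lookup file through the distributed cache, loads it in the mapper's setup() and joins against it in map(). It is a generic example under assumed inputs: the lookup path /lookup/customers.csv, the comma-separated field layout and the class names are hypothetical, not the project's actual code.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapSideJoinSketch {

    // Joins each input record against a small lookup file shipped via the distributed cache.
    public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> lookup = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException {
            // The cached file is symlinked into the task's working directory under its fragment name.
            try (BufferedReader reader = new BufferedReader(new FileReader("customers.csv"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",", 2);
                    lookup.put(parts[0], parts[1]); // customerId -> customerName
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", 2); // customerId,orderDetails
            String name = lookup.get(fields[0]);
            if (name != null) {
                context.write(new Text(fields[0]), new Text(name + "," + fields[1]));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-side-join-sketch");
        job.setJarByClass(MapSideJoinSketch.class);
        job.setMapperClass(JoinMapper.class);
        job.setNumReduceTasks(0);                       // map-only join, no shuffle
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.addCacheFile(new URI("/lookup/customers.csv#customers.csv")); // distributed cache
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Because the small table is replicated to every mapper, the join happens entirely on the map side and no reduce phase is needed.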
Big Data Engineer
Confidential, Overland Park, Kansas
Responsibilities:
- Transformed the RDF data into a JSON format compatible with MongoDB.
- Designed the parser to process the SPARQL queries.
- Developed a Jena-like parser to handle complex queries containing joins.
- Converted the N-Triple dataset into JSON format in order to import it into MongoDB.
- Imported the dataset into the MongoDB data store using Java (a minimal sketch follows this list).
- Developed MapReduce jobs in Java for data processing.
- Worked on implementing Hive custom UDFs in Java to process and analyze data.
- Performed data analysis in Hive by creating tables, loading them with data and writing Hive queries.
- Used Apache Flume to load data into HDFS, using interceptors to filter data for consumption by Business Intelligence (BI) users.
- Implemented appropriate consistency levels for reads and writes based on the use case.
- Monitored the cluster and logs for issues and fixed them.
- Created the necessary keyspaces and modeled column families based on the queries.
- Implemented UDFs, UDAFs and UDTFs in Java for Hive to handle processing that cannot be done with Hive's built-in functions (see the UDF sketch after this list).
- Transformed the log files into structured data using Hive SerDes and Pig Loaders.
- Involved in creating Hive internal and external tables, loading data and writing Hive queries that run internally as MapReduce jobs.
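The N-Triple-to-JSON import into MongoDB described above might look roughly like the Java sketch below, using the MongoDB Java driver. It is a simplified illustration: the connection string, the database and collection names ("rdf", "triples") and the naive whitespace split of each N-Triple line are assumptions for the example, not the project's actual parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;

public class TripleLoaderSketch {
    public static void main(String[] args) throws IOException {
        // Connection string, database and collection names are placeholders.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017");
             BufferedReader reader = Files.newBufferedReader(Paths.get(args[0]))) {

            MongoCollection<Document> triples =
                client.getDatabase("rdf").getCollection("triples");

            String line;
            while ((line = reader.readLine()) != null) {
                // Naive split of "<subject> <predicate> <object> ." into three parts.
                String[] parts = line.trim().split("\\s+", 3);
                if (parts.length < 3) {
                    continue; // skip blank or malformed lines
                }
                String object = parts[2].replaceAll("\\s*\\.\\s*$", "");

                // Store each triple as one JSON document.
                triples.insertOne(new Document("subject", parts[0])
                        .append("predicate", parts[1])
                        .append("object", object));
            }
        }
    }
}
```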
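Likewise, the Hive UDFs mentioned above follow a standard pattern: a Java class extending Hive's UDF base class with an evaluate method, packaged into a jar and registered from the Hive shell. The sketch below is a generic, assumed example (a string-normalizing function); the function and jar names are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF that trims and lower-cases a string column.
// A typical registration from the Hive shell would be:
//   ADD JAR normalize-udf.jar;
//   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeUDF';
//   SELECT normalize_str(customer_name) FROM customers;
public final class NormalizeUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```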
Java/J2EE Consultant
Confidential, KY
Responsibilities:
- Responsible for understanding and execution of requirements.
- Developed JSPs and Servlets to dynamically generate HTML and display data on the client side (a minimal sketch follows this list).
- Responsible for designing JApplets using Swing and embedding them into web pages.
- Responsible for developing and deploying EJBs (Session and MDB).
- Consumed Web Services (WSDL, SOAP, UDDI).
- Wrote and maintained database queries, triggers, stored procedures, etc.
- Implemented SOAP for data transfer to the Web Service.
- Used web services on the front end, Servlets as front controllers and JavaScript for client-side validation.
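The dynamic HTML generation with Servlets described in the first item of this list generally follows the pattern sketched below with the javax.servlet API. This is a minimal illustrative example; the class name and request parameter are placeholders.

```java
import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal servlet that builds an HTML response from a request parameter.
public class GreetingServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String name = request.getParameter("name");
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body>");
        out.println("<h1>Hello, " + (name != null ? name : "guest") + "</h1>");
        out.println("</body></html>");
    }
}
```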