Hadoop Developer Resume
St. Louis, Missouri
SUMMARY:
- Over 5 years of professional IT experience with Analysis, Design, Development, Testing, Implementation and Production Support.
- Over 3 years of experience in Big Data and tools in Hadoop Ecosystem including Pig, Hive, Sqoop, Oozie, Zookeeper, Flume & Impala.
- Strong experience in Big Data development, testing, and deployment to production.
- Involved in all phases of the Software Development Life Cycle (SDLC).
- Requirement analysis, design, development, testing, and production deployment of J2EE business applications, web-based and n-tier applications using core technologies such as Java, Servlets, JSP, JSTL, and XML.
- Proficient in handling big data using Hadoop architecture, HDFS, MapReduce, HBase, Hive, Pig, Flume, Oozie & Sqoop.
- Working experience with Cloudera Hadoop distribution versions.
- Used the Cloudera Hue web interface for executing the respective scripts.
- Expertise in writing Map-Reduce Jobs in Java.
- Expertise in using Hadoop and its ecosystem commands.
- Expertise in moving/loading data from RDBMS (Oracle, DB2) to Hadoop HDFS using Sqoop scripts.
- Proficient in processing data using Apache Pig by registering User Defined Functions (UDFs) written in Java.
- Skilled in scheduling recurring Hadoop jobs using Apache Oozie workflows.
- Proficient in designing and querying the NoSQL databases like HBase.
- Strong proficiency in R for data transformation, filtering, and analytics.
- Experienced with the RDD architecture, implementing Spark operations on RDDs, and optimizing Spark transformations and actions (a brief Java sketch follows this summary).
- Good knowledge of Spark components such as Spark SQL, MLlib, Spark Streaming, and GraphX.
- Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
- Knowledgeable in building, configuring, monitoring, and supporting Hadoop environments using Cloudera Manager, Hortonworks, AWS, and Apache Ambari.
- Developed various Pig and Hive UDFs (User Defined Functions) to extend functionality and solve multiple big data filtering problems.
- Extensive knowledge and experience in various OS platforms like UNIX, Linux RHEL, AIX and Windows.
- Hands on experience with writing Unix shell and bash scripts.
- Strong experience in MapReduce programming, customizing the framework at various levels, and input formats such as SequenceFileInputFormat and KeyValueTextInputFormat.
- Worked on the following NoSQL databases: MongoDB, HBase, and Cassandra, and ensured faster access to data on HDFS.
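The following is a minimal sketch of the kind of RDD work described above, using the Spark Java API; the application name, input path, and filtering logic are illustrative assumptions rather than project specifics.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class RddSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("rdd-sketch");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Transformations (filter, map) are lazy; caching avoids recomputing
        // the cleaned RDD for each of the two actions below.
        JavaRDD<String> lines = sc.textFile("hdfs:///data/input");      // illustrative path
        JavaRDD<String> valid = lines.filter(l -> !l.trim().isEmpty()).cache();

        long total = valid.count();                                     // action 1
        long longRecords = valid.map(String::length)
                                .filter(n -> n > 100)
                                .count();                               // action 2

        System.out.println(total + " records, " + longRecords + " longer than 100 chars");
        sc.stop();
    }
}
```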
TECHNICAL SKILLS:
Hadoop Distributions: Apache, Cloudera CDH, Hortonworks HDP
Big Data Technologies: Apache Hadoop (MRv1, MRv2), Hive, Pig, Sqoop, HBase, Flume, Oozie, Ambari, Spark, Tez, Kafka, Storm, R, Elasticsearch, Solr
Cloud Platforms: Amazon Web Services (EC2), Google Cloud Platform
Operating Systems: Windows, Linux RHEL & Unix
Languages: C, C++, Java, PL/SQL, Unix Shell
Web Technologies: HTML, JSP, JSF, CSS, JavaScript
IDEs: Eclipse, JBoss, IBM WebSphere
Reporting Tools: SAP Business Objects, MicroStrategy, Tableau
Webservers /App Servers: Apache Tomcat 6.0/7.0, IBM WebSphere 6.0/7.0, JBoss 4.3
Databases: Oracle, MySQL, SQL Server 2008, MongoDB (NoSQL), LDA
ETL Tools: Informatica, Datastage
PROFESSIONAL EXPERIENCE:
Confidential, St. Louis, Missouri
Hadoop Developer
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Involved in installing, configuring & managing Hadoop ecosystem components like Hive, Pig, Sqoop, and Flume.
- Migrated the existing data from RDBMS to Hadoop using Sqoop for processing.
- Responsible for loading and managing unstructured and semi-structured data coming from different sources into the Hadoop cluster using Flume.
- Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform joins on the Map side using the distributed cache (see the sketch after this list).
- Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the data.
- Wrote SQL queries and performed Back-End Testing for data validation to check the data integrity during migration from back-end to front-end.
- Used Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
- Worked on setting up Pig, Hive, Redshift, and HBase on multiple nodes and developed solutions using Pig, Hive, HBase, and MapReduce.
- Used the JSON and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data.
- Implemented custom Hive UDFs to achieve comprehensive data analysis (an illustrative UDF also follows this list).
- Wrote MRUnit tests for unit testing the MapReduce jobs.
- Wrote JUnit test cases for unit testing of classes.
- Worked on parsing the XML files using DOM/SAX parsers and used JAXB for data retrieval.
- Implemented daily workflow for extraction, processing and analysis of data with Oozie.
- Responsible for troubleshooting MapReduce jobs by reviewing the log files.
- Created tables, inserted data, and executed various Cassandra Query Language (CQL 3) commands on tables using cqlsh.
- Set up the component integration testing environment and performed troubleshooting.
- Involved in deployment of application on WebSphere Application Server.
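Below is a minimal sketch of the map-side join pattern mentioned in this list, assuming the small lookup file has been added to the distributed cache by the job driver; the file layout, field positions, and class names are illustrative.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** Map-side join: the small dimension file is shipped to every mapper
 *  via the distributed cache and joined against the large fact records. */
public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Map<String, String> customerById = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException {
        // The driver is assumed to have called:
        //   job.addCacheFile(new URI("/lookup/customers.txt#customers"));
        // so the file is available locally under the symlink "customers".
        try (BufferedReader reader = new BufferedReader(new FileReader("customers"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);        // id,name
                customerById.put(parts[0], parts[1]);
            }
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");      // customerId,orderId,amount
        String name = customerById.get(fields[0]);
        if (name != null) {                                  // inner join on customerId
            context.write(new Text(fields[1]), new Text(name + "," + fields[2]));
        }
    }
}
```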
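And a small example of the Hive UDF pattern referenced above, using the classic org.apache.hadoop.hive.ql.exec.UDF base class; the function name, class name, and normalization logic are assumptions for illustration.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/** Example Hive UDF that normalizes free-text fields before analysis.
 *  Registered in Hive with statements such as:
 *    ADD JAR normalize-udf.jar;
 *    CREATE TEMPORARY FUNCTION normalize_text AS 'com.example.NormalizeText';
 */
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                      // pass NULLs through unchanged
        }
        String cleaned = input.toString().trim().toLowerCase().replaceAll("\\s+", " ");
        return new Text(cleaned);
    }
}
```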
Environment: Hadoop, Hive, MapReduce, Oozie, Core Java, JDK 6, Spring 3.0, Spring MVC, Bootstrap, Spring AOP, Spring JDBC, DB2, JSON, JAXB, XML, Oracle 11g, Sqoop, Flume, Eclipse, Hue.
Confidential, Reston, Virginia
Big Data Developer
Responsibilities:
- Worked on data loading, querying, and data extraction for the data present in the Hadoop File System.
- Automated all the jobs for extracting data from different data sources and pushing the result sets to the Hadoop Distributed File System.
- Worked on data migration from existing data stores to the Hadoop File System (HDFS).
- Responsible for loading and managing the data from different databases to HDFS.
- Worked on importing and exporting data into HDFS and Hive using Sqoop.
- Exported the data from Oracle, MySQL, and SAP files to the Hadoop File System using Sqoop.
- Used Pig to preprocess the data, performed data transformations, and then exported these metrics back to Teradata using Sqoop.
- Provided ad-hoc queries and data metrics to the business users using Hive and Pig.
- Developed Pig Latin scripts to extract and transform the data according to business needs and output files to load into HDFS.
- Created tables and views in Hive with different file formats and compression.
- Implemented dynamic partitioning and bucketing for the Hive tables (see the sketch after this list).
- Worked on Hive for analyzing the datasets.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with IDW tables and historical metrics.
- Created HBase and Hive tables to store large sets of data.
- Optimized Hive queries and provided support for the end-to-end ETL workflow.
- Worked with Business Users in providing them the HQL queries for their data analysis.
- Successfully loaded files to Hive and HDFS from HBase.
- Worked on performing Data Quality checks for the data loaded in HDFS.
- Knowledge of implementing an error-handling service for Hadoop jobs.
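A minimal sketch of the dynamic partitioning and bucketing pattern mentioned above, issued over the HiveServer2 JDBC driver; the connection URL, table names, bucket count, and storage format are illustrative assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HivePartitionedLoad {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");   // HiveServer2 JDBC driver

        // Connection details are placeholders; they would come from cluster configuration.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hiveserver:10000/default", "etl_user", "");
             Statement stmt = conn.createStatement()) {

            // Allow partition values to be derived from the SELECT instead of being hard-coded.
            stmt.execute("SET hive.exec.dynamic.partition=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");

            // Partitioned by load date, bucketed by customer id to speed up joins and sampling.
            stmt.execute("CREATE TABLE IF NOT EXISTS orders_part "
                    + "(order_id BIGINT, customer_id BIGINT, amount DOUBLE) "
                    + "PARTITIONED BY (load_dt STRING) "
                    + "CLUSTERED BY (customer_id) INTO 32 BUCKETS "
                    + "STORED AS ORC");

            // Dynamic-partition insert: load_dt is taken from the last column of the SELECT.
            stmt.execute("INSERT OVERWRITE TABLE orders_part PARTITION (load_dt) "
                    + "SELECT order_id, customer_id, amount, load_dt FROM orders_staging");
        }
    }
}
```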
Environment: Apache Hadoop (HDFS), Pig 0.14, Hive 0.14, Oozie 4.1, HBase, Apache Solr, Apache Tomcat, Jenkins, Nexus, Shell Scripts.
Confidential
ETL/Hadoop Developer
Responsibilities:
- Developed solutions to process data into HDFS (Hadoop Distributed File System) within Hadoop.
- Used Sqoop extensively to ingest data from various source systems into HDFS.
- Involved in running Hadoop jobs for processing millions of records of text data.
- Used Hive to quickly produce results based on the requested report.
- Extensively used Oozie and Zookeeper to automate the flow of jobs and coordination in the cluster respectively.
- Developed shell scripts that act as wrappers to start Hadoop jobs and set the configuration parameters (a driver-side sketch follows this list).
- Worked on stand-alone as well as distributed Hadoop applications.
- Tested the performance of the data sets on various NoSQL databases.
- Understood complex data structures of different types (structured, semi-structured) and de-normalized them for storage.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Developed shell scripts to automate routine tasks.
- Used Oozie and Zookeeper operational services for coordinating cluster and scheduling workflows.
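As a sketch of the driver side that such wrapper scripts would launch, the ToolRunner-based job below accepts -D configuration overrides passed on the command line; the class name, job name, and the use of the identity mapper are illustrative placeholders, not the actual project code.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/** A wrapper shell script might invoke this driver as, for example:
 *    hadoop jar etl.jar CleanseJobDriver -D mapreduce.job.reduces=8 <input> <output>
 *  ToolRunner parses the -D options into the Configuration before run() executes. */
public class CleanseJobDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "cleanse-job");
        job.setJarByClass(CleanseJobDriver.class);
        job.setMapperClass(Mapper.class);            // identity mapper as a placeholder
        job.setNumReduceTasks(0);                    // map-only placeholder job
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new CleanseJobDriver(), args));
    }
}
```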
Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Flume, Java.
Confidential
ETL Developer
Responsibilities:
- Involved in analyzing and designing database using Normalization techniques.
- Developed Forms, Menus, Object Libraries, and the PL/SQL Library using Oracle Form Builder 10g/9i.
- Created and maintained the development environment; resolved production issues and validated environment problems.
- Handled PL/SQL compile-time and runtime errors, debugged stored procedures for business logic modifications, and responded to system events through triggers.
- Created PL/SQL Stored Procedures, Functions, Triggers and Packages for implementing business logic.
- Implemented exception handling using autonomous transactions and locks, and used savepoints, commits, and rollbacks to maintain transactional consistency and database integrity.
- Developed Front end user interface using Forms 10g.
- Detected and corrected bugs during system integration and user acceptance testing.
- Involved in Planning, Implementation, Testing, and Documentation.
- Followed the Agile process in application development.
- Mapped the source and target databases by studying the specifications and analyzed the required transforms.
- Created numerous mappings and mapplets using transformations such as Filter, Aggregator, Lookup, Expression, Sequence Generator, Sorter, Joiner, and Update Strategy.
- Created Custom Triggers, Stored Procedures, Packages and SQL Scripts.
- Modified existing Oracle PL/SQL functions, procedures, and packages, wrote new code as per the requirements, and handled exceptions.
Environment: Oracle 9i/10g, Teradata, SQL, SQL*Plus, PL/SQL, SQL*Loader, Informatica PowerCenter 8.6/8.1, Export/Import, UNIX Server, Oracle Web Application Server, Forms and Reports 9i, Toad
Confidential
Java Developer
Responsibilities:
- Analyzed and gathered the system requirements.
- Created design documents and reviewed them with the team, in addition to assisting the business analyst/project manager with explanations to the line of business.
- Developed the web tier using JSP to show account details and summary.
- Designed and developed the UI using JSP, HTML and JavaScript.
- Utilized JPA for object/relational mapping and transparent persistence onto the SQL Server database (an illustrative entity mapping follows this list).
- Used the Tomcat web server for development purposes.
- Used Oracle as the database and Toad for query execution; also involved in writing SQL scripts and PL/SQL code for procedures and functions.
- Developed application using Eclipse.
- Used Log4J to print debug, warning, and info log messages on the server console.
- Interacted with Business Analyst for requirements gathering.
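A minimal sketch of the JPA mapping pattern referenced above; the entity, table, and column names are illustrative assumptions, not the actual account schema.

```java
import java.math.BigDecimal;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

/** Illustrative JPA entity mapped onto a SQL Server table. */
@Entity
@Table(name = "ACCOUNT_SUMMARY")
public class AccountSummary {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)   // maps to a SQL Server IDENTITY column
    @Column(name = "ACCOUNT_ID")
    private Long accountId;

    @Column(name = "ACCOUNT_NAME", nullable = false, length = 100)
    private String accountName;

    @Column(name = "BALANCE", precision = 18, scale = 2)
    private BigDecimal balance;

    public Long getAccountId() { return accountId; }
    public String getAccountName() { return accountName; }
    public void setAccountName(String accountName) { this.accountName = accountName; }
    public BigDecimal getBalance() { return balance; }
    public void setBalance(BigDecimal balance) { this.balance = balance; }
}
```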