Big Data Consultant Resume

SUMMARY:

  • 13+ years of experience in the software industry, including consulting, design, project management, client delivery, testing, implementation, and maintenance.
  • Hands-on experience with Apache Hadoop ecosystem components such as HDFS, MapReduce, Hive, Impala, HBase, Pig, Sqoop, Oozie, and Flume, and with big data analytics.
  • Experience in installing, configuring, and administering Hadoop clusters on major distributions such as Cloudera and Hortonworks.
  • Experience in successfully implementing ETL solutions for data extraction, transformation, and loading with Sqoop, Hive, Pig, and Spark. Worked with NoSQL databases such as HBase.
  • Developed and migrated numerous applications to Spark using the Spark Core and Spark Streaming APIs in Java; optimized MapReduce jobs and SQL queries into Spark transformations using RDDs (see the sketch after this summary).
  • Experience in developing and integrating Spark with Kafka topics for real-time streaming applications.
  • Good experience working with cloud environments such as Amazon Web Services (AWS).
  • Expertise in the design and development of various web and enterprise applications using Java/J2EE and big data technologies in cloud environments such as HP Helion Cloud and AWS.
  • Expertise in implementing Haven as a Service (HaaS) big data solutions at multiple client locations.
  • Committed to excellence; a self-motivated, fast-learning team player and prudent developer with strong problem-solving and communication skills.
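
A minimal Java sketch of the kind of MapReduce/SQL-to-Spark migration described above, using pair RDDs and reduceByKey. The dataset, paths, and column layout are hypothetical and for illustration only.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class OrderTotalsRddJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("OrderTotalsRddJob");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Equivalent of: SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id
        JavaRDD<String> lines = sc.textFile("hdfs:///data/orders/*.csv"); // hypothetical path

        JavaPairRDD<String, Double> totals = lines
                .mapToPair(line -> {
                    String[] fields = line.split(",");
                    // fields[0] = customer_id, fields[2] = amount (illustrative layout)
                    return new Tuple2<>(fields[0], Double.parseDouble(fields[2]));
                })
                // One transformation replaces the map + shuffle + reduce of a MapReduce job
                .reduceByKey(Double::sum);

        totals.saveAsTextFile("hdfs:///data/order_totals");
        sc.close();
    }
}
```

The same aggregation can also be expressed with Spark SQL/DataFrames, which is often the simpler choice when the input already has a schema.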

TECHNICAL SKILLS:

Big Data Components: HDFS, Hive, Impala, Sqoop, Pig, Spark, Oozie, Flume, Kafka, HBase, MapReduce

Programming Languages: JAVA, SQL, PL/SQL, Python, Linux Shell scripting

J2EE Technologies: Servlets, JSP, Maven, JDBC, JMS

Web Technologies: HTML, DHTML, XML, XSLT, CSS, DOM, SAX, AJAX

Application/Web Server: IBM WebSphere, WebLogic, JBoss, IIS, Tomcat 5.0, iPlanet, CA Workload Automation

Databases: Oracle, SQL Server, Sybase, Vertica, Hive

Operating Systems/Cloud: Windows, Red Hat Linux, Solaris, HP Helion Cloud, Amazon Web Services (AWS)

Frameworks: Struts, Spring, Hibernate, JUnit, Log4j, Apache Camel

ETL: Talend

PROFESSIONAL EXPERIENCE:

Confidential

Big Data Consultant

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsibilities included analysis of the various applications, design of the enterprise applications, coordination with the client, meetings with business users, functional and technical guidance to the team, and project management.
  • Involved in Hadoop cluster setup, including installation, configuration, and monitoring of the Hadoop cluster and its components.
  • Developed Sqoop jobs and MapReduce programs in Java to extract data from various RDBMS sources such as Oracle and SQL Server and load it into HDFS.
  • Designed and implemented Hive, Impala, Pig, and Spark queries, scripts, and functions for evaluating, filtering, loading, and storing data.
  • Wrote MapReduce and Spark code for extraction, transformation, and aggregation of data from multiple file formats, including XML, JSON, CSV, and other compressed formats.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data from various sources.
  • Extensively used the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control-flow nodes.
  • Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs (see the sketch after this list).
  • Experienced in handling large datasets during the ingestion process itself, using partitioning, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations.
  • Developed Python and Linux shell scripts for SFTP file transfer between the ODC and Tulsa networks and for EBCDIC-to-ASCII data conversion.
  • Developed monitoring scripts to test the VPN connectivity between the ODC and Tulsa networks and automated them in AutoSys.
  • Used Active Directory for Directory Services.
  • Also set up a similar HaaS environment for the State of Wisconsin in the AWS cloud.
  • Created all DDLs needed for the application in Vertica and granted access to Vertica based on roles in Active Directory (AD).
  • Created and tested all connectivity between applications (Tableau -> Vertica, Tableau -> Hive, Talend -> HDFS, Talend -> Vertica) using Kerberos authentication.
  • Configured an SVN server on a Linux server for Talend and also worked with GitHub for code management.
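
A minimal Java sketch of the Spark SQL/DataFrame work referenced above: reading JSON and CSV inputs and enriching the large dataset with a broadcast join so the large side is never shuffled. All paths, column names, and the reference table are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.broadcast;
import static org.apache.spark.sql.functions.col;

public class EventEnrichmentJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("EventEnrichmentJob")
                .getOrCreate();

        // Large fact data arriving as JSON (hypothetical HDFS layout)
        Dataset<Row> events = spark.read().json("hdfs:///landing/events/*.json");

        // Small reference table, loaded from CSV and shipped to every executor
        Dataset<Row> regions = spark.read()
                .option("header", "true")
                .csv("hdfs:///reference/regions.csv");

        // Broadcast join avoids shuffling the large events dataset
        Dataset<Row> enriched = events.join(broadcast(regions),
                events.col("region_code").equalTo(regions.col("code")));

        enriched.filter(col("status").equalTo("ACTIVE"))
                .write()
                .mode("overwrite")
                .partitionBy("region_code")
                .parquet("hdfs:///curated/events/");

        spark.stop();
    }
}
```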

Confidential

Big Data Consultant

Responsibilities:

  • Managed a fully distributed Hadoop cluster as an additional responsibility; trained to take over the duties of a Hadoop administrator, including managing the cluster, upgrades, and installation of tools in the Hadoop ecosystem.
  • Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis.
  • Hands-on experience installing, configuring, and using ecosystem components such as MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume on the Cloudera distribution.
  • Worked on a POC comparing the processing time of Impala with Apache Hive for batch applications in order to adopt the former in the project. Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Extended Hive and Pig core functionality with custom user-defined functions (UDFs), user-defined table-generating functions (UDTFs), and user-defined aggregate functions (UDAFs); see the UDF sketch after this list.
  • Involved in creating Hive tables and loading and analyzing data with Hive queries; implemented partitioning, dynamic partitions, and bucketing in Hive.
  • Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig, Spark, and MapReduce jobs.
  • Worked on POCs with Apache Spark using Scala to introduce Spark into the project; consumed data from Kafka using Spark (see the streaming sketch after this list).
  • Actively involved in code review and bug fixing for improving the performance.
  • Involved in various phases of the Software Development Life Cycle (SDLC), such as requirements gathering, data modeling, analysis, and architecture design and development for the project.
  • Read streaming social media data from DataSift/GNIP.
  • The streamed data consists of nested JSON objects, which are split into multiple records, converted into canonical XML, and ingested into topics for further processing. DataSift provides social media data from sources such as Facebook, Twitter, and YouTube.
  • RSS feeds are also pulled in a separate flow; the raw RSS feed is converted into the canonical format for further processing and stored in HDFS.
  • Used regular expressions to split the XML data stored in a file in order to reprocess records.
  • Created Java components to connect to Autonomy IDOL, validate each individual tweet, list all hashtags, and identify the sentiment score for the given value.
  • Updated a Vertica table with all hashtags and sentiment scores for live report generation.
  • The entire flow, from social media post to report, takes just 90 seconds. (TIBCO Spotfire is used to create reports such as trending drivers and race sentiment analysis, broadcast live during the race.)
  • Used Maven to build the applications and created build scripts, including the dependent JARs needed for the build.
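
A minimal Java sketch of a Hive UDF of the kind mentioned above. The class, function name, and normalization logic are hypothetical examples (here, cleaning up hashtags before aggregation).

```java
package com.example.hive;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: normalizes hashtag text before aggregation.
// Register with:
//   ADD JAR hashtag-udf.jar;
//   CREATE TEMPORARY FUNCTION clean_tag AS 'com.example.hive.CleanTagUDF';
public class CleanTagUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Strip leading '#' characters, trim whitespace, and lower-case the tag
        String cleaned = input.toString().trim().toLowerCase().replaceAll("^#+", "");
        return new Text(cleaned);
    }
}
```

Once registered, the function can be used like any built-in, for example: SELECT clean_tag(tag), COUNT(*) FROM tweets GROUP BY clean_tag(tag).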
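
The Kafka consumption above was done as a Scala POC; the sketch below shows the same idea in Java with the spark-streaming-kafka-0-10 integration. The broker address, topic name, and consumer group are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class SocialFeedStreamJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("SocialFeedStreamJob");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");        // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "social-feed-consumer");         // hypothetical group

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("social-feed"), kafkaParams));

        // Count records per micro-batch; the real pipeline would parse the nested JSON payload here
        stream.map(ConsumerRecord::value)
              .foreachRDD(rdd -> System.out.println("records in batch: " + rdd.count()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```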

Confidential

Team Lead

Responsibilities:

  • Involved in Release Management of the project
  • Leading the offshore development team
  • Planning, Scheduling and controlling the build in addition to testing and deploying releases
  • Worked with multi-disciplinary team to understand the impact analysis of upstream and downstream data.
  • Scheduled the jobs using the Linux software utility Cron.
  • Worked on Migration of the applications from WebSphere 5.1 to 7.0
  • Generated Use case diagrams, Class diagrams, and Sequence diagrams using Microsoft Visio
  • Documented all release procedures for the CAB meetings
  • Used LDAP for authentication

Environment: IBM RAD 7.5, WebSphere Application server 7, LDAP, Sybase, Linux Shell scripting, Cron

Confidential

Project Manager / Release Manager

Responsibilities:

  • Responsible for timely delivery of the project and the quality of the application.
  • Planned and defined scope for the project
  • Task planning, sequencing, and resource allocation
  • Monitored and reported progress to management
  • Worked with the different vendors involved in the delivery of this project
  • Documented all release timelines and planned release activities as release manager
  • Planned, scheduled, and controlled the build in addition to testing and deployment of the release
  • Followed all documentation processes for the release (moving the application through the DEV / SIT / UAT / PROD environments)

Environment: .NET, IIS server, JavaScript, HTML

Confidential

Team Lead/ Onsite Coordinator

Responsibilities:

  • Client relationship and project management including regular meetings with our customer to ensure scope and quality management.
  • Responsible for on-time delivery and the quality of the application
  • Managed a team of 5 offshore in Chennai and 3 offshore in China
  • Responsible for onsite-offshore coordination with both the Chennai and China teams.
  • Developed a web application using Flex, Java, JSP, and Hibernate
  • Created stored procedures in Oracle for some business rules
  • Managed installation of the application to different environments and support activities

Environment: J2EE, Flex, Hibernate, Oracle 10g, Unix

Confidential

Team Lead

Responsibilities:

  • Being new to TIBCO, ramped up very quickly on the new technology
  • Involved in requirements gathering with the business analysts from the Netherlands
  • Followed Agile methodology to capture the correct requirements
  • Mentored a team of 6 people in Chennai
  • Designed and developed the various TIBCO BW components needed for the application
  • Documented as per the process throughout the SDLC.
  • Deployed the application to the different environments (Test / UAT / PROD)
  • Set up a support team to support the application once it was migrated to PROD

Environment: TIBCO BW, iProcess

Confidential

Team Lead / Onsite Coordinator

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as requirements gathering, data modeling, analysis, and architecture design and development for the project
  • Mentored a team of 3 offshore
  • Created requirement / design documents for the offshore team.
  • Alchemy is a third-party tool used by the Confidential legal department to store all legal documents, which are scanned through an MFP (multi-functional printer).
  • Integration of the MFP with the tool was very challenging; also ported the old documents stored in a different system, developing Java routines to read the PDF files.
  • Managed client expectations, whether technical or communication-related, and explained them to the offshore team.
  • Tracked all task assignments with the offshore team and reported to the client manager.

Environment: J2EE, Oracle, MFP (multi-functional printer)

Confidential

Senior Team Member / Onsite Coordinator

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as requirements gathering, data modeling, analysis, and architecture design and development for the project
  • Developed the User interface using JSP, JavaScript, CSS, HTML
  • Developed the Java Code using Eclipse as IDE
  • Developed JSPs and Servlets to dynamically generate HTML and display the data to the client side. Extensively used JSP tag libraries.
  • Worked with Struts as a unified MVC framework and developed Tag Libraries.
  • Used Struts framework in UI designing and validations
  • Developed Action classes, which act as the controller in the Struts framework (see the sketch after this list)
  • Used PL/SQL to write complex queries and retrieve data from the Oracle database
  • Used ANT scripts to build the application and deploy it on WebLogic Application Server
  • Designed, wrote, and maintained complex reusable methods that invoke stored procedures to fetch data from the database
  • Prepared the unit test case document / user handbook for test cases.
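
A minimal sketch of a Struts 1 Action class of the kind described above. The action name, request attribute, and forward name are hypothetical; the actual view resolution lives in struts-config.xml.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

// The Action acts as the controller: it prepares view data and
// delegates view selection to the forward configured in struts-config.xml.
public class HealthCheckAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // Put view data on the request for the JSP to render
        request.setAttribute("serverTime", new java.util.Date());
        // "success" maps to a JSP via the <forward> element in struts-config.xml
        return mapping.findForward("success");
    }
}
```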

Environment: WebLogic, Actano (COTS product), Oracle, Web services, SOAP, XSD, J2EE, Struts, Servlets, PL/SQL stored procedures, Unix

Confidential

Team Member

Responsibilities:

  • Involved in support of KRS System in Production and enhancement releases
  • Migrated the application from the iPlanet server to WebLogic.
  • Created multilingual support for this application
  • Developed high-quality code in J2EE using JSP / Servlets
  • Performed unit testing and bug fixes
  • Created documentation for all QA processes and maintained it in SharePoint
  • Managed and controlled access to the code repositories
  • Resolved and troubleshot problems and complex issues

Environment: WebLogic, J2EE, JSP, Servlets, Oracle, iPlanet, Unix, IDOL (indexing server)

Confidential

Team Member

Responsibilities:

  • Developed a proof of concept to compare two MQ products (Sonic MQ and IBM WebSphere MQ); see the JMS sketch after this list
  • Developed high-quality code in J2EE using JSP / Servlets
  • Performed unit testing and bug fixes
  • Created documentation for all QA processes and maintained it in SharePoint
  • Managed and controlled access to the code repositories
  • Resolved and troubleshot problems and complex issues
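
A minimal, vendor-neutral JMS sender sketch of the kind used in such an MQ comparison POC: because the ConnectionFactory and Queue are looked up through JNDI, the same code can be pointed at Sonic MQ or IBM WebSphere MQ by changing only configuration. The JNDI names below are hypothetical.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class QueueSenderPoc {
    public static void main(String[] args) throws Exception {
        // Provider-specific settings (Sonic MQ or WebSphere MQ) come from jndi.properties
        InitialContext jndi = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) jndi.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) jndi.lookup("jms/OrdersQueue");   // hypothetical queue name

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage("hello from the MQ POC");
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}
```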

Environment: J2EE, JMS, IBM WebSphere MQ
