Pre-Upgrade Procedure

You must complete all the following tasks, in the order specified, before you begin the upgrade procedure:

  1. Take the Backup
  2. Update the Repositories
  3. Stop the Running Jobs

Take the Backup

Before you install this patch and begin the upgrade of MURAL to version 5.0.2.p6, ensure that you have backed up all critical files from the previously installed version, 5.0.2.p5.

  1. Run the following commands to back up the inventory configuration directories on the management node:

    mkdir ~/mural_5.0.2.p5_backup
    cd ~/mural_5.0.2.p5_backup/
    cp -r -p /etc/reflex-provisioner/inventory/generated/prod/mural/group_vars/all/mrx .
    cp -r -p /etc/reflex-provisioner/inventory/generated/prod/mural/vars/customer/common .
  2. Run the following command to back up the streaming directory from HDFS on the management node:

    hdfs dfs -get /data/streaming data_streaming
  3. Run the following commands to back up the /opt/etc/scripts directory on both master nodes (a verification sketch follows these steps):

    mkdir ~/mural_5.0.2.p5_backup
    cd ~/mural_5.0.2.p5_backup/
    cp -r -p /opt/etc/scripts .
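
Optionally, verify that the backups exist before you proceed. The following is a minimal sketch; it assumes the paths used in the preceding steps and that the hdfs dfs -get command in step 2 was run from inside ~/mural_5.0.2.p5_backup/:

  # On the management node: configuration directories and the HDFS streaming copy
  ls -l ~/mural_5.0.2.p5_backup/mrx ~/mural_5.0.2.p5_backup/common
  du -sh ~/mural_5.0.2.p5_backup/data_streaming

  # On each master node: the scripts backup
  ls -l ~/mural_5.0.2.p5_backup/scripts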
    

Update the Repositories

Before updating the repositories, ensure that all the packages have been downloaded from the SFTP server. For more information, refer to Prerequisites. Then, perform the following steps to update the repository with the updated RPMs:

Update the Docker Images

Perform the following steps to extract the Docker images (a verification sketch follows these steps):

  1. Navigate to the /opt/repos/mrx/5.6/ repository directory:

    cd /opt/repos/mrx/5.6/
  2. Run the following commands:

    mv 5.6.2.rc1/mrx-docker-release-5.6.2.rc1-312.tar.gz  .
    
    tar -zxvf mrx-docker-release-5.6.2.rc1-312.tar.gz -C \
    /opt/repos/mrx/5.6/mrx-docker-5.6.2.rc1/
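
The tar command above fails if the directory passed to -C does not exist. The following is a minimal sketch, assuming the same paths as above, that creates the directory beforehand and then confirms the extraction:

  # Create the target directory if it is missing (run this before the tar command above)
  mkdir -p /opt/repos/mrx/5.6/mrx-docker-5.6.2.rc1/

  # After extraction, confirm that the Docker image files are present
  ls /opt/repos/mrx/5.6/mrx-docker-5.6.2.rc1/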

Update the Repository

Perform the following steps to update the repository (a verification sketch follows these steps):

  1. Run the following command:

    createrepo /opt/repos/mrx/5.6/5.6.2.rc1/

    The output may resemble the following sample:

    Spawning worker 0 with 8 pkgs
    Spawning worker 1 with 8 pkgs
    Spawning worker 2 with 8 pkgs
    Spawning worker 3 with 8 pkgs
    Spawning worker 4 with 8 pkgs
    Spawning worker 5 with 8 pkgs
    Spawning worker 6 with 8 pkgs
    Spawning worker 7 with 8 pkgs
    Workers Finished
    Saving Primary metadata
    Saving file lists metadata
    Saving other metadata
    Generating sqlite DBs
    Sqlite DBs complete
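
After createrepo completes, the metadata is written to a repodata subdirectory. As an optional check based on the same path as above, verify that the metadata was generated and, if the repository is already configured in yum on this node, clear the cached metadata so that the updated packages are picked up:

  # Freshly generated metadata files should appear under repodata/
  ls -l /opt/repos/mrx/5.6/5.6.2.rc1/repodata/

  # Optional: clear cached yum metadata so the updated repository is re-read
  yum clean metadata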

Install Solution Provisioner Package

Run the following command to install the Solution Provisioner package:

yum install -y /opt/repos/mrx/5.6/5.6.2.rc1/reflex-solution-provisioner-5.6.2.rc1-312.el7.centos.x86_64.rpm
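
To confirm that the package installed successfully, you can query the RPM database; a minimal check based on the package name used above:

  # Should report reflex-solution-provisioner-5.6.2.rc1-312.el7.centos.x86_64 on success
  rpm -q reflex-solution-provisioner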

Update Variables

Perform the following steps to update the values of variables such as install_type and the tag version number (a grep-based spot check is sketched after these steps):

  1. Change install_type variable:

    1. Open the file /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/all.yml
    2. Change install_type value to upgrade.
  2. Edit /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/ingestion/main.yml file to update the following properties:

    source_file_mask: "*http*.gz"
    talend_http_topic: talendhttp
    talend_nonhttp_topic: talendnonhttp
    hdfs_dir: /user/mrx/ingestion/
    hdfs_dir2: /user/mrx/ingestion/
    
  3. Edit the /etc/reflex-provisioner/work_dir/reflex-configuration-module/conf/generate_inventory/conf_inventory/prod/mural/extra_vars/solution.yml file to update or add the following properties:

    source_file_mask: "*http*.gz"
    source_file_mask2: "*flow*.gz"
    odsClassification: true
  4. Edit the extract.conf.j2 file available at the /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/roles/mrx/extractJobs/deploy/templates/opt/mrx/ingestion/etc2 location to update the following property:

    source_file_mask = {{ source_file_mask2 }}

    Note: Update the table name property values for dme, 5minAgg, hourlyAgg, dailyAgg, and monthlyAgg as per your setup.

  5. Edit /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/dme/main.yml file to update the following properties:

    kafkaTopicHttpPDM: httpPDM_new
    kafkaTopicNonHttpPDM: nonhttpPDM_new
  6. Edit /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/5minAgg/main.yml file to update the following property:

    pdm_agg_output_table: 5min_points_new
    
  7. Edit the /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/hourlyAgg/main.yml file to update the following property:

    hourly_output_tableName: hourly_points_new
  8. Edit /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/dailyAgg/main.yml file to update the following property:

    daily_output_tableName: daily_points_new
  9. Edit /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/monthlyAgg/main.yml file to update the following property:

    monthly_output_tableName: monthly_points_new
  10. Note: Perform the following two steps only if the Grafana UI opens over the secure web protocol (HTTPS).

  11. Edit the /etc/reflex-provisioner/work_dir/reflex-platform-module/inventory/templates/group_vars/global/all/platform/grafana.yml file to update the platform_grafana_api_url property value from http to https, as follows:

    platform_grafana_api_url: "https://{{ platform_grafana_access_ip }}:{{ platform_grafana_web_ui_port }}"
    
  12. Edit the /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/roles/mrx/grafana/deploy/tasks/main.yml file to add the validate_certs: no parameter below the url: parameter line, as follows:

    url: '{{ platform_grafana_api_url }}/api/dashboards/db'
    validate_certs: no
    method: POST
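
After editing the files, you can spot-check a few of the updated values with grep. This is an optional sketch that assumes the same file paths used in the preceding steps:

  # install_type should now be set to upgrade
  grep install_type /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/all.yml

  # The ingestion and DME properties should reflect the new values
  grep -E 'source_file_mask|talend_http_topic|hdfs_dir' /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/ingestion/main.yml
  grep -E 'kafkaTopicHttpPDM|kafkaTopicNonHttpPDM' /etc/reflex-provisioner/work_dir/reflex-solution-provisioner/inventory/templates/group_vars/global/all/mrx/dme/main.yml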

Refresh the Inventory

Perform the following steps to refresh the inventory (a verification sketch follows these steps):

  1. Navigate to the reflex-provisioner directory:

    cd /etc/reflex-provisioner
  2. Run the following command to refresh the inventory:

    ./scripts/composition/refresh.sh -i mural -s prod

    Output:

    -i mural was triggered!
    -s prod was triggered!
    Refreshing init inventory
    Refreshing mural inventory
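
Optionally, confirm that the refreshed inventory picked up the changes. The sketch below assumes that the refresh regenerates the files under /etc/reflex-provisioner/inventory/generated/prod/mural/ (the same path that was backed up earlier) and that install_type appears in the generated group variables:

  # install_type should read upgrade in the regenerated inventory
  grep -r install_type /etc/reflex-provisioner/inventory/generated/prod/mural/group_vars/all/mrx/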

Stop the Running Jobs

  1. Stop the input data flow.

  2. Stop all the running jobs before executing the solution installer (a consolidated check is sketched after these steps):

    1. Log in to the name node where the jobs are running. For example, jobs in the MURAL 5 lab run from the active name node, NN2.
      # ssh <NN2 FQDN>
    2. Stop the Talend jobs on the active name node. Perform the following steps to check whether the Talend jobs are running and to stop them:

      1. Check the status of the Talend HTTP process:

        # ps -ef | grep talend-http | grep -v grep

        The output may resemble the following sample:

        root 51484 45694 0 06:59 pts/2 00:00:00 sh /root/jobs/ingestion_jobs/run-talend-http-job.sh
      2. Kill the Talend HTTP process if it is running:

        # kill -9 <PIDs>

        Example:

        # kill -9 51484
      3. Check the status of the Talend non-HTTP process:

        # ps -ef | grep talend-nonhttp | grep -v grep

        The output may resemble the following sample:

        root 51483 45693 0 06:59 pts/2 00:00:00 sh /root/jobs/ingestion_jobs/run-talend-nonhttp-job.sh
      4. Kill the Talend non-HTTP process if it is running:

        # kill -9 <PIDs>

        Example:

        # kill -9 51483
    3. Verify that the jobs are killed:

      # ps -ef | egrep 'talend-http|talend-nonhttp' | grep -v grep
    4. Stop the master jobs:

      # ps -ef | grep master_http |grep -v grep

      The output may resemble the following sample:

      root 6420 60609  3 07:05 pts/2    00:09:22 /usr/java/latest/bin/java -cp /usr/lib/spark2/jars/netty-all-4.0.42.Final.jar:/usr/lib/spark2/jars/*:/opt/tms/java/DataMediationEngine/WEB-INF/classes/:/opt/tms/java/dme-with-dependencies.jar:/opt/tms/java/ddj-with-dependencies.jar:/usr/lib/hive/lib/*:/usr/lib/spark2/conf/:/usr/lib/spark2/jars/*:/etc/hadoop/conf/:/etc/hadoop/conf/:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//* -Xmx2g -XX:-ResizePLAB org.apache.spark.deploy.SparkSubmit --master yarn-client --conf spark.scheduler.allocation.file=/opt/tms/java/DataMediationEngine/WEB-INF/classes/poolConfig.xml --conf spark.driver.extraJavaOptions=-XX:-ResizePLAB --properties-file /opt/tms/java/DataMediationEngine/WEB-INF/classes/spark.properties --class com.guavus.reflex.marketing.dme.job.MRXMasterJob --name master_http --queue jobs.dme --files /opt/tms/java/DataMediationEngine/WEB-INF/classes/log4j-executor.properties,/opt/tms/java/DataMediationEngine/WEB-INF/classes/streaming.ini --jars /opt/tms/java/dme-with-dependencies.jar /opt/tms/java/dme-with-dependencies.jar
      root 60609 45694  0 07:02 pts/2    00:00:00 sh /root/jobs/streaming_jobs/master_http_wrapper.sh
      • Obtain the process IDs (PIDs) from the preceding output and run the following command:
      # kill -9 <PIDs>

      Example:

      # kill -9 6420 60609
      • Wait for the counters in the /var/log/mural_logs/master_http.out file. Once the counters show 0, proceed to stop the master_nonhttp job:
      # ps -ef | grep master_nonhttp |grep -v grep

      The output may resemble the following sample:

      root 61263 45694  0 07:03 pts/2    00:00:00 sh /root/jobs/streaming_jobs/master_nonhttp_wrapper.sh
      root 61349 61263  9 07:03 pts/2    00:24:55 /usr/java/latest/bin/java -cp /usr/lib/spark2/jars/netty-all-4.0.42.Final.jar:/usr/lib/spark2/jars/*:/opt/tms/java/DataMediationEngine2/WEB-INF/classes/:/opt/tms/java/dme-with-dependencies.jar:/opt/tms/java/ddj-with-dependencies.jar:/usr/lib/hive/lib/*:/usr/lib/spark2/conf/:/usr/lib/spark2/jars/*:/etc/hadoop/conf/:/etc/hadoop/conf/:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//* -Xmx2g -XX:-ResizePLAB org.apache.spark.deploy.SparkSubmit --master yarn-client --conf spark.scheduler.allocation.file=/opt/tms/java/DataMediationEngine2/WEB-INF/classes/poolConfig.xml --conf spark.driver.extraJavaOptions=-XX:-ResizePLAB --properties-file /opt/tms/java/DataMediationEngine2/WEB-INF/classes/spark.properties --class com.guavus.reflex.marketing.dme.job.MRXMasterJob --name master_nonhttp --queue jobs.dme --files /opt/tms/java/DataMediationEngine2/WEB-INF/classes/log4j-executor.properties,/opt/tms/java/DataMediationEngine2/WEB-INF/classes/streaming.ini --jars /opt/tms/java/dme-with-dependencies.jar /opt/tms/java/dme-with-dependencies.jar
      • Obtain the process IDs (PIDs) from the preceding output and run the following command:
      # kill -9 <PIDs>

      Example:

      # kill -9 61263 61349
      • Verify that the jobs are killed:
      ps -ef | egrep 'master_http|master_nonhttp' | grep -v grep

      The output may resemble the following sample:

      # ps -ef | egrep 'master_http|master_nonhttp' | grep -v grep
      • Wait for the counters in the /var/log/mural_logs/master_nonhttp.out file. Once the counters show 0, proceed to stop the Aggregation jobs.
    5. Stop the CONV and SDR jobs:

      # ps -ef | grep conv_config |grep -v grep

      The output may resemble the following sample:

      # ps -ef | grep conv_config | grep -v grep
      root 38891 1 0 Apr22 ? 00:00:00 sh /root/jobs/aggregation_jobs/run-conv_config_file.sh
      • Obtain the process ID (PID) from the preceding output and run the following command:
      # kill -9 <PID>

      Example:

      # kill -9 38891
      • Find the SDR process:
      ps -ef | grep sdr_config |grep -v grep

      The output may resemble the following sample:

      # ps -ef | grep sdr_config | grep -v grep
      root 55161 1 0 Apr22 ? 00:00:00 /bin/bash /root/jobs/aggregation_jobs/run-sdr_config_file.sh
      • Obtain the process ID (PID) from the preceding output and run the following command:
      # kill -9 <PID>

      Example:

      # kill -9 55161
    6. Stop the 5-minute Aggregation Job:

      ps -ef | grep 5min-agg |grep -v grep

      The output may resemble the following sample:

      # ps -ef | grep 5min-agg |grep -v grep
      root 5131 45694 0 07:05 pts/2 00:00:00 sh /root/jobs/aggregation_jobs/run-5min-agg-mgr_sh.sh
      • Obtain the process ID (PID) from the preceding output and run the following command:
      # kill -9 <PID>

      Example:

      # kill -9 5131
    7. Stop the hourly Aggregation Job:

      ps -ef | grep hourly-agg |grep -v grep

      The output may resemble the following sample:

      # ps -ef | grep hourly-agg | grep -v grep
      root 17868 45694 0 07:10 pts/2 00:00:00 sh /root/jobs/aggregation_jobs/run-hourly-agg-mgr_sh.sh
      • Obtain the process ID (PID) from the preceding output and run the following command:
      # kill -9 <PID>

      Example:

      # kill -9 17868
    8. Stop the daily Aggregation Job:

      ps -ef | grep daily-agg |grep -v grep

      The output may resemble the following sample:

      # ps -ef | grep daily-agg | grep -v grep
      root     19338 74594  0 07:10 pts/2    00:00:00 sh /root/jobs/aggregation_jobs/run-daily-agg-mgr_sh.sh
      • Obtain the process ID (PID) from the preceding output and run the following command:
      # kill -9 <PID>

      Example:

      # kill -9 19338
    9. Stop the monthly Aggregation Job:

      ps -ef | grep monthly-agg |grep -v grep

      The output may resemble the following sample:

      # ps -ef | grep monthly-agg | grep -v grep
      root 16543 55644 0 07:10 pts/2 00:00:00 sh /root/jobs/aggregation_jobs/run-monthly-agg-mgr_sh.sh
      • Obtain the process ID (PID) from the preceding output and run the following command:
      # kill -9 <PID>

      Example:

      # kill -9 16543
    10. Verify that the aggregation jobs are killed:

      ps -ef | egrep '5min-agg|hourly-agg|daily-agg|monthly-agg'| grep -v grep

      The output may resemble the following sample:

      # ps -ef | egrep '5min-agg|hourly-agg|daily-agg|monthly-agg'| grep -v grep
    11. Stop the cleanup job:

      # ps -ef | grep cleanup |grep -v grep

      The output may resemble the following sample:

      # ps -ef | grep cleanup | grep -v grep
      root    11249   1  0 Jun23 ?   00:00:00 sh /root/jobs/misc_jobs/run_cleanup_job.sh

      Obtain the process ID (PID) from the preceding output and run the following command:

      # kill -9 <PID>
      

      Example:

      # kill -9 11249
      
    12. Run the following command to ensure that no jobs are running in YARN:

      # yarn application -list

      The output may resemble the following sample:

      # yarn application -list
      Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):2
      Application-Id    Application-Name    Application-Type    User    Queue    State    Final-State    Progress    Tracking-URL

      If any applications are listed, kill them by using the application ID:

      # yarn application -kill <Application-Id>
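
As an optional consolidated check before starting the upgrade, the following sketch loops over the process name patterns used in the preceding steps and reports anything still running on the active name node; uncomment the kill line to stop the reported processes. It assumes the same job names shown above:

  # Report any MURAL jobs that are still running
  for pattern in talend-http talend-nonhttp master_http master_nonhttp \
                 conv_config sdr_config 5min-agg hourly-agg daily-agg monthly-agg cleanup; do
    pids=$(pgrep -f "$pattern")
    if [ -n "$pids" ]; then
      echo "Still running: $pattern (PIDs: $pids)"
      # kill -9 $pids
    fi
  done

  # Also confirm that no applications are left running in YARN
  yarn application -list -appStates RUNNING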