
[2020-12-15] Exadata Write-Back Flash Cache - FAQ


APPLIES TO:

Oracle Exadata Storage Server Software - Version 11.2.3.2.1 and later
Information in this document applies to any platform.

PURPOSE

This document provides answers to frequently asked questions about Write-Back Flash Cache.

QUESTIONS AND ANSWERS

1. What is Write back Flash cache?

Write back flash cache provides the ability to cache write I/Os directly to PCI flash in addition to read I/Os.
Exadata storage software version 11.2.3.2.1 is the minimum version required to use write back flash cache.
As of Exadata storage software version 11.2.3.2.1, Exadata Smart Flash Cache is persistent across Exadata Storage Server restarts.

 


2. What software and hardware versions are supported?

 

The Write-Back Smart Flash Cache requirement has been amended to state that both the Grid Infrastructure and database homes must run 11.2.0.3.9 or later.

Database homes running 11.2.0.2 that cannot promptly upgrade to 11.2.0.3 to meet the amended requirement must install Patch 17342825.

As long as the minimum software requirements are met, any Exadata hardware with flashcache (V2 and later) can take advantage of this feature.

 

Since April 2017, Oracle Exadata Deployment Assistant (OEDA) enables Write-Back Flash Cache by default if the following conditions are met:

1. GI and DB homes must be running one of the following:

  • 11.2.0.4.1 or higher
  • 12.1.0.2 or higher
  • 12.2.0.2 or higher

AND

2. DATA diskgroup has HIGH redundancy


Additional Notes:

  • Extreme Flash is always checked by default and cannot be modified.
  • If not using Extreme Flash, a user can check or uncheck the WBFC checkbox to manually enable or disable it.

 



3. When should I use the write back flash cache?

 

If your application is write-intensive and you see significant "free buffer waits" or high I/O times indicating a write bottleneck in AWR reports, then you should consider using the write back flash cache.

 

4. How to determine if you have write back flash cache enabled?

Execute:

# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"

Results:

flashCacheMode: WriteBack  -> write back flash cache is enabled
flashCacheMode: WriteThrough  -> write back flash cache is not enabled
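The dcli output can also be checked across all cells in one go. A minimal sketch, assuming sample output captured from the command above (the cell names are hypothetical):

```shell
# Sample output as dcli might return it; cell names are made up for illustration.
output='dbm01celadm01: flashCacheMode: WriteBack
dbm01celadm02: flashCacheMode: WriteBack
dbm01celadm03: flashCacheMode: WriteThrough'

# Count cells still in WriteThrough mode; 0 means WBFC is enabled everywhere.
not_enabled=$(printf '%s\n' "$output" | grep -c 'WriteThrough')
echo "Cells not in WriteBack mode: $not_enabled"
```

On a real system, replace the sample text with the live output of the dcli command shown above.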

5. How can I enable the write back flash cache?

Preparation:

Before performing the steps below, you should run some prerequisite checks to ensure that all cells are in the proper state. Perform the following checks as root from one of the compute nodes:

Check the griddisk attributes asmdeactivationoutcome and asmmodestatus to ensure that all griddisks on all cells are "Yes" and "ONLINE" respectively:
# dcli -g cell_group -l root cellcli -e list griddisk attributes asmdeactivationoutcome, asmmodestatus

Check that all of the flash caches are in the "normal" state and that no flash disks are in a degraded or critical state:
# dcli -g cell_group -l root cellcli -e list flashcache detail
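The griddisk readiness check can be scripted against the dcli output. A minimal sketch, assuming hypothetical sample output (the cell and griddisk names are made up):

```shell
# Hypothetical sample of the griddisk check output from dcli:
# <cell>: <griddisk name> <asmdeactivationoutcome> <asmmodestatus>
griddisks='cel01: DATA_CD_00_cel01 Yes ONLINE
cel01: DATA_CD_01_cel01 Yes ONLINE
cel02: DATA_CD_00_cel02 Yes OFFLINE'

# Print any griddisk that is not both "Yes" and "ONLINE"; no output means safe to proceed.
problems=$(printf '%s\n' "$griddisks" | awk '$3 != "Yes" || $4 != "ONLINE"')
if [ -n "$problems" ]; then
  echo "Not ready:"
  printf '%s\n' "$problems"
else
  echo "All griddisks ready"
fi
```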

 

You may instead use the setWBFC.sh script (version 1.0.0.21_20160602) to enable or disable Write-Back Flash Cache on Exadata Database Machine DB nodes or Sun SuperCluster DB nodes. This script now supports X5-2 High Capacity storage cells. It automates all of the steps described below and supports both rolling and non-rolling methods. Please refer to the README for further details.

The script is supported for Storage cell software versions:

- 11.2.3.2.1 and above
- 11.2.3.3.1 and above (enhancements no longer require a cellsrv shutdown and restart; see the README)
- 12.1.1.1.1
- 12.1.2.1.0 (Except for X5-2 with Extreme Flash as WBFC will be enabled automatically)

GI versions
-  11.2.0.3.9 and above
-  12.1

 

 

Enabling Write Back Flash Cache when Storage cell software version is 11.2.3.3.1 or higher

Note: 

  • With 11.2.3.3.1 or higher it is not required to stop cellsrv process or inactivate griddisk.
  • To reduce performance impact on the application, execute the change during a period of reduced workload.

 

  1. Validate that all the Physical Disks are in NORMAL state before modifying FlashCache. The following command should return no rows.

# dcli -l root -g cell_group cellcli -e "list physicaldisk attributes name,status" | grep -v normal

2. Drop the flash cache 

# dcli -l root -g cell_group cellcli -e drop flashcache

3. Set flashCacheMode attribute to writeback

# dcli -l root -g cell_group cellcli -e "alter cell flashCacheMode=writeback"

4. Re-create the flash cache

# dcli -l root -g cell_group cellcli -e create flashcache all


5. Check attribute flashCacheMode is WriteBack:

# dcli -l root -g cell_group cellcli -e list cell detail | grep flashCacheMode

6. Validate griddisk attributes cachingPolicy and cachedby

# cellcli -e list griddisk attributes name,cachingpolicy,cachedby
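The core of the sequence above (drop, switch mode, recreate, verify) can be sketched as a single script. This is only a dry-run sketch: the run() wrapper prints each dcli command rather than executing it (drop the echo on a real system), and the cell_group file is an assumption:

```shell
# Dry-run wrapper: prints the command it would run; replace the echo with
# an actual invocation on a real system. cell_group is an assumed file
# listing all storage cells.
run() { echo "dcli -l root -g cell_group cellcli -e $1"; }

run "drop flashcache"                          # step 2: drop the flash cache
run "\"alter cell flashCacheMode=writeback\""  # step 3: switch the mode
run "create flashcache all"                    # step 4: recreate the cache
run "list cell detail"                         # step 5: verify flashCacheMode
```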

 

Enabling Write Back Flash Cache when Storage cell software version is lower than 11.2.3.3.1

Note: With versions lower than 11.2.3.3.1, changing the flashCacheMode attribute requires stopping the cellsrv process. This introduces the option of making the change cell by cell (rolling) or on all cells at once (non-rolling) with CRS down.

A. Enable Write Back Flash Cache using a ROLLING method 

(RDBMS & ASM instance is up - enabling write-back flashcache one cell at a time)

    Log onto the first cell that you wish to enable write-back FlashCache

1. Drop the flash cache on that cell
# cellcli -e drop flashcache

2. Check if ASM will be OK if the grid disks go OFFLINE. The following command should return 'Yes' for the grid disks being listed:
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

3. Inactivate the griddisk on the cell
# cellcli -e alter griddisk all inactive

4. Shut down cellsrv service
# cellcli -e alter cell shutdown services cellsrv 

5. Set the cell flashcache mode to writeback 
# cellcli -e "alter cell flashCacheMode=writeback"

6. Restart the cellsrv service 
# cellcli -e alter cell startup services cellsrv 

7. Reactivate the griddisks on the cell
# cellcli -e alter griddisk all active

8. Verify all grid disks have been successfully put online using the following command:
# cellcli -e list griddisk attributes name, asmmodestatus

9. Recreate the flash cache 
# cellcli -e create flashcache all 

10. Check the status of the cell to confirm that it's now in WriteBack mode:
# cellcli -e list cell detail | grep flashCacheMode 

11. Repeat the same steps on the next cell. However, before taking another storage server offline, execute the following and make sure 'asmdeactivationoutcome' displays YES:
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
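The per-cell loop above can be sketched as follows. This is a dry-run sketch that only prints the commands: the CELLS list and the use of ssh are assumptions (on a real system, read the list from ~/cell_group and run cellcli on each cell as the steps above describe):

```shell
# Hypothetical cell list; read from ~/cell_group on a real system.
CELLS="cel01 cel02 cel03"

rolling_enable() {
  for cell in $CELLS; do
    echo "ssh $cell cellcli -e drop flashcache"
    echo "ssh $cell cellcli -e alter griddisk all inactive"
    echo "ssh $cell cellcli -e alter cell shutdown services cellsrv"
    echo "ssh $cell cellcli -e \"alter cell flashCacheMode=writeback\""
    echo "ssh $cell cellcli -e alter cell startup services cellsrv"
    echo "ssh $cell cellcli -e alter griddisk all active"
    echo "ssh $cell cellcli -e create flashcache all"
    # Steps 2, 8 and 11 (checking asmmodestatus/asmdeactivationoutcome
    # before and after each cell) are deliberately omitted from this
    # dry-run sketch; they are mandatory on a real system.
  done
}
rolling_enable
```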

B. Enable Write Back Flash Cache using a NON-ROLLING method

    (RDBMS & ASM instances are down while enabling write-back flashcache)

1. Drop the flash cache on that cell
# cellcli -e drop flashcache 

2. Shut down cellsrv service
# cellcli -e alter cell shutdown services cellsrv 

3. Set the cell flashcache mode to writeback 
# cellcli -e "alter cell flashCacheMode=writeback" 

4. Restart the cellsrv service 
# cellcli -e alter cell startup services cellsrv 

5. Recreate the flash cache 
# cellcli -e create flashcache all

 

6. How can I disable the write back flash cache?

* Please note, disabling write back flash cache is not a typical operation and should only be done under the guidance of Oracle Support.


Disabling the write back flash cache requires flushing the dirty blocks to disk before changing the flashCacheMode attribute to writethrough. The flush can be performed in parallel across all cells using the dcli command. Once the flush begins, all caching to the flash cache stops, so applications will experience some performance impact that varies with the nature of the workload. The following steps are common to all cells and can be performed in parallel using the dcli utility from one of the compute nodes, as shown below.

Disabling Write Back Flash Cache when Storage cell software version is 11.2.3.3.1 or higher

 

Note: 

  • With 11.2.3.3.1 or higher there is no need to stop cellsrv process or inactivate griddisk.
  • To reduce performance impact on the application, execute the change during a period of reduced workload.

 

1. Validate that all the Physical Disks are in NORMAL state before modifying FlashCache. The following command should return no rows.

# dcli -l root -g cell_group cellcli -e "list physicaldisk attributes name,status" | grep -v normal

2. Determine the amount of dirty data in the flash cache.

# cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' "

3. Flush the flash cache  

# dcli -g cell_group -l root cellcli -e "alter flashcache all flush"

4. Check the progress of the flushing of flash cache  

The flushing process is complete when FC_BY_DIRTY is 0 MB

# dcli -g cell_group -l root cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' "

or flushstatus attribute is "Completed"

# dcli -g cell_group -l root cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD
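Waiting for the flush can be automated with a polling loop. A minimal sketch: get_dirty_mb is mocked here with a shrinking counter, while on a real system it would run the FC_BY_DIRTY dcli command above and sum the per-cell values:

```shell
# Mocked dirty-data counter; replace get_dirty_mb with a function that
# parses the FC_BY_DIRTY metric from dcli on a real system.
dirty=3
get_dirty_mb() { echo "$dirty"; }

while [ "$(get_dirty_mb)" -gt 0 ]; do
  echo "FC_BY_DIRTY: $(get_dirty_mb) MB remaining"
  dirty=$((dirty - 1))   # simulated progress; a real loop would sleep and re-poll
done
echo "Flush complete"
```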

5. Once the flush is complete, drop the flash cache

# dcli -g cell_group -l root cellcli -e drop flashcache

6. Set the flashCacheMode attribute to writethrough    

# dcli -g cell_group -l root cellcli -e "alter cell flashCacheMode=writethrough"

7. Re-create the flash cache   

# dcli -g cell_group -l root cellcli -e create flashcache all

8. Check that the flashCacheMode attribute is WriteThrough
# dcli -g cell_group -l root cellcli -e list cell detail | grep flashCacheMode
 

Disabling Write Back Flash Cache when Storage cell software version is lower than 11.2.3.3.1

Note: With versions lower than 11.2.3.3.1, changing the flashCacheMode attribute requires stopping the cellsrv process. This introduces the option of making the change cell by cell (rolling) or on all cells at once (non-rolling) with CRS down.

 A.  Disable Write Back Flash Cache using a ROLLING method

   (RDBMS & ASM instance is up - disabling write-back flashcache one cell at a time)

1. Check griddisk status by verifying that the griddisk attribute "asmdeactivationoutcome" = "Yes" for all griddisks on this cell. Do not proceed if any griddisk is returned. The following command should return no rows.
# dcli -g cell_group -l root cellcli -e "list griddisk where asmdeactivationoutcome != 'Yes' attributes name,asmmodestatus,asmdeactivationoutcome"

2. Determine the amount of data to be flushed by checking how much dirty data is in the flash cache. This provides the number of bytes per cell that must be de-staged to disk, which gives an indication of how long the flush will take.
# dcli -g cell_group -l root cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' "

3. Flush the flashcache across all cells

To perform this step, it is recommended to have two separate sessions: one to execute the command below and the other to monitor its progress in the next step.

Issue the following command to begin the flush in one of the two sessions:
# dcli -g cell_group -l root cellcli -e "alter flashcache all flush"

If any errors occur, they will be displayed in this session, otherwise, this session will show a successful flush across all cells.

4. Check the flush status across all cells

a. Execute the following command every few minutes in the second session to monitor the progress. As dirty blocks are de-staged to disk, this count will decrease to zero (0). This will take some time; you can estimate the remaining time by running the command repeatedly:
# dcli -g cell_group -l root cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' "

b. The following command returns "working" for each flash disk on each cell while the cache is being flushed and "completed" when it is finished. Execute it in the second session:
# dcli -g cell_group -l root cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD

DO NOT CONTINUE until flushstatus is "completed" for all the celldisks

The following steps are to be executed individually on each cell, one at a time.  All steps that must be performed directly on a cell use the cellcli utility.
 

 Log onto the first cell that will have the write back flash cache disabled.

5. Drop the flashcache for this cell after the flush completes 
# cellcli -e drop flashcache 

6. Inactivate all griddisks on the cell
# cellcli -e alter griddisk all inactive

7. Shut down the cellsrv service 
# cellcli -e alter cell shutdown services cellsrv 

8. Reset the cell flash cache state to writethrough 
# cellcli -e "alter cell flashCacheMode=writethrough" 

9. Restart the cellsrv service 
# cellcli -e alter cell startup services cellsrv 

10. Reactivate the griddisks on the cell
# cellcli -e alter griddisk all active

11. Recreate the flash cache 
# cellcli -e create flashcache all 

12. Check the status of this cell flash cache state
# cellcli -e list cell detail | grep flashCacheMode

13. Check the griddisks of the cell

Before moving on to the next cell, check the attribute "asmModeStatus" of all of the griddisks and make sure they are all "ONLINE" and the attribute "asmdeactivationoutcome" is "Yes". It may be necessary to execute the following command several times until "asmModeStatus" shows "ONLINE".
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

 

B. Disable Write Back Flash Cache using a NON-ROLLING method

   (RDBMS & ASM instances are down while disabling write-back flashcache)

To reduce the time for a total outage, the flashcache flush operation can be performed in advance of shutting down the entire cluster.  

1. Determine the amount of data to be flushed by checking how much dirty data is in the flash cache.
# dcli -g cell_group -l root cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' " 

2. Flush the flashcache across all cells

To perform this step, it is recommended to have two separate sessions: one to execute the command below and the other to monitor its progress in the next step.

Issue the following command to begin the flush in one of the two sessions:
# dcli -g cell_group -l root cellcli -e "alter flashcache all flush"

If any errors occur, they will be displayed in this session, otherwise, this session will show a successful flush across all cells.
Once the flush begins, all caching to the flash cache is stopped.  Therefore, applications will experience some performance impact that will vary depending upon the nature of the workload.

3. Check the flush status across all cells  

a. Execute the following command in the second session every few minutes to monitor the progress. As dirty blocks are de-staged to disk, this count will decrease to zero (0). This will take some time; you can estimate the remaining time by running the command repeatedly:
# dcli -g cell_group -l root cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' " 

b. Check the status. For each flash disk, the command returns "working" while the cache is being flushed and "completed" when it is finished:
# dcli -g cell_group -l root cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD 

4. Shutdown all databases and the entire cluster 

Ensure the entire cluster is shut down (all databases and CRS). Not doing so will result in a crash of all databases and the cluster when the cellsrv service is shut down and restarted. It is recommended to shut down each database with the srvctl command. However, you can shut down the entire cluster at once, logged on as root on one of the compute nodes, as follows:

# cd $GRID_HOME/bin
# ./crsctl stop cluster -all

$GRID_HOME refers to the location where the Grid Infrastructure is installed.  Typically: /u01/app/11.2.0.3/grid.

5. Drop the flash cache across all cells after flush completes 
# dcli -g cell_group -l root cellcli -e drop flashcache 

6. Shut down the cellsrv service across all cells
# dcli -g cell_group -l root cellcli -e alter cell shutdown services cellsrv 

7. Reset the cell flashcache state to writethrough across all cells
# dcli -g cell_group -l root cellcli -e "alter cell flashCacheMode=writethrough" 

8. Restart the cellsrv service across all cells
# dcli -g cell_group -l root cellcli -e alter cell startup services cellsrv 

9. Recreate the flash cache across all cells 
# dcli -g cell_group -l root cellcli -e create flashcache all 

10. Check the status of all cells’ flash cache state 
# dcli -g cell_group -l root cellcli -e list cell detail | grep flashCacheMode

* The flashCacheMode of all cells should now be WriteThrough.

11. Start up the cluster and all databases.  As root on one of the compute nodes, issue the following command:

# cd $GRID_HOME/bin
# ./crsctl start cluster -all

 

7. What is the performance benefit of enabling the write back flash cache?
 

   Write-back flash cache significantly improves write-intensive operations because writing to flash cache is faster than writing to hard disks.
   Depending on your application, write performance can be improved by up to 20x more write IOPS than disk on X3-2 machines, and up to 10x more write IOPS than disk on V2 and X2.

   
8. Can I disable write back Flash cache on certain disk groups?

  Yes. When using Write Back Flash Cache, you can disable cell-level flash caching for grid disks that do not need it. For further details, refer to the Exadata best practice recommendations.
  
9. Is Write-back flash cache persistent upon reboots?

  Yes. As noted in Question 1, Exadata Smart Flash Cache is persistent across Exadata storage server restarts as of storage software 11.2.3.2.1.
