Bobcares

High or Full Disk Usage with Amazon Redshift – Factors & Fixes

Aug 12, 2021

Stuck with High or Full Disk Usage with Amazon Redshift? We can help you.

Here, at Bobcares, we assist our customers with several AWS queries as part of our AWS Support Services.

Today, let us see the factors that lead to high disk usage errors and how to troubleshoot them.

 

High or Full Disk Usage with Amazon Redshift

High disk usage errors can stem from several factors. Let us discuss each of them in detail.

 

  • Distribution and sort key

Here, we review the table’s distribution style, distribution key, and sort key selection.

Tables with distribution skew, where more data sits on one node than on the others, can cause a full-disk error on that node.

In this case, we need to change the distribution style to one that spreads data more uniformly.

To determine the cardinality of the distribution key, we run:

SELECT <distkey column>, COUNT(*)
FROM <schema name>.<table with distribution skew>
GROUP BY <distkey column>
HAVING COUNT(*) > 1
ORDER BY 2 DESC;
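If the counts confirm heavy skew, one option is to switch the table to a more uniform distribution style. A minimal sketch, assuming a hypothetical schema and table name:

```sql
-- Switch a skewed table to EVEN distribution so rows are spread
-- round-robin across slices (schema and table names are placeholders)
ALTER TABLE myschema.skewed_table ALTER DISTSTYLE EVEN;
```

ALTER DISTSTYLE AUTO is another option if we prefer to let Amazon Redshift manage the distribution style for us.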

 

  • Query processing

While a query is processing, intermediate results are stored in temporary blocks. Without enough memory, these tables spill to disk, and because intermediate results are not compressed, the spill consumes additional disk space.

Amazon Redshift creates temporary tables with EVEN distribution and no column encoding by default. If we use the SELECT…INTO syntax, we should switch to a CREATE TABLE statement so we can specify distribution and encoding explicitly.

In case of insufficient memory, we may see a step in SVL_QUERY_SUMMARY where is_diskbased shows the value “true”.

To resolve this, we increase the number of query slots to allocate more memory.
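The steps above can be sketched as follows; the query ID and slot count are placeholders:

```sql
-- Find steps of a query that spilled to disk (query ID is hypothetical)
SELECT query, step, label, is_diskbased
FROM svl_query_summary
WHERE query = 12345
ORDER BY step;

-- Temporarily claim extra WLM query slots to give the next query more memory
SET wlm_query_slot_count TO 3;
-- ...run the memory-intensive query here...
RESET wlm_query_slot_count;
```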

 

  • Tables with VARCHAR(MAX) columns

We check VARCHAR or CHARACTER VARYING columns for trailing blanks, which may be omitted when the data is stored on disk.

During query processing, however, trailing blanks can occupy the full declared column length in memory. So, our Support Techs recommend using the smallest possible column size.

We can generate a list of tables with maximum column widths using:

SELECT database, schema || '.' || "table" AS "table", max_varchar
FROM svv_table_info
WHERE max_varchar > 150
ORDER BY 2;

Then to identify and display the true widths of the wide VARCHAR table columns, we run:

SELECT max(octet_length(rtrim(column_name))) FROM table_name;
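Amazon Redshift can widen a VARCHAR column in place but not narrow it, so shrinking an oversized column means copying its data. A minimal sketch with hypothetical table and column names:

```sql
-- Add a narrower column, copy trimmed data into it, then swap names
ALTER TABLE my_table ADD COLUMN narrow_col VARCHAR(150);
UPDATE my_table SET narrow_col = rtrim(wide_col);
ALTER TABLE my_table DROP COLUMN wide_col;
ALTER TABLE my_table RENAME COLUMN narrow_col TO wide_col;
```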

 

  • High column compression

For optimal column encoding, we encode all columns except the first column of the sort key. To do so, we use ANALYZE COMPRESSION or the Amazon Redshift column encoding utility.

Amazon Redshift provides column encoding to increase read performance while reducing overall storage consumption.
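ANALYZE COMPRESSION reports the suggested encoding and the estimated space savings per column; the schema and table names below are placeholders:

```sql
-- Suggest optimal encodings without changing the table
ANALYZE COMPRESSION myschema.my_table;
```

Note that ANALYZE COMPRESSION acquires an exclusive table lock, so it is best run outside busy periods.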

 

  • Maintenance operations

We need to analyze and vacuum the database tables regularly.

The best way to maintain the health of the database is to identify any missing or outdated table statistics.

This prevents unnecessary scans of table rows and helps the planner optimize query processing.

For example, here we can identify outdated stats in Amazon Redshift:

SELECT * FROM svv_table_info WHERE stats_off > 10 ORDER BY size DESC;
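After finding tables with stale statistics, we vacuum and analyze them; the schema and table names here are placeholders:

```sql
-- Reclaim deleted space, re-sort rows, and refresh planner statistics
VACUUM FULL myschema.my_table;
ANALYZE myschema.my_table;
```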

 

  • Cartesian products with cross-joins

Cartesian products can result in higher memory utilization and more table spills. To find queries that produce Cartesian products, we inspect the query's EXPLAIN plan.

If the joined tables do not share a JOIN condition, the join produces a Cartesian product: every row of one table joins to every row of the other table.
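In the EXPLAIN output, a missing join predicate typically shows up as a nested loop join. A sketch with hypothetical table names:

```sql
-- No join condition between the tables: the plan shows a Nested Loop
EXPLAIN
SELECT *
FROM table_a a, table_b b;
```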

 

  • Minimum table size

It is possible for the same table to have different sizes in different clusters.

The minimum table size is determined by the number of columns, whether the table has a SORTKEY, and the number of slices populated.

A change in the number of slices, for example after a cluster resize, can change the overall disk storage the table consumes.
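We can compare table sizes, reported in 1 MB blocks, across the cluster using svv_table_info:

```sql
-- Largest tables first; size is reported in 1 MB blocks
SELECT "table", size, tbl_rows
FROM svv_table_info
ORDER BY size DESC
LIMIT 20;
```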

 

  • Tombstone blocks

Generally, tombstone blocks occur when a write transaction to an Amazon Redshift table runs while a concurrent read is in progress.

Suppose long-running table transactions occur regularly and across several loads. Then enough tombstones can accumulate to cause a Disk Full error.

With Amazon Redshift, tombstone blocks are normally released at commit time.

If long-running queries are still active, we need to terminate them so the commit can release the blocks. For example, the following sequence can leave tombstone blocks behind:

begin;
create table a (id int);
insert into a values(1);
commit;
drop table a;

To confirm tombstone blocks, we run:

select trim(name) as tablename,
       count(case when tombstone > 0 then 1 else null end) as tombstones
from svv_diskusage
group by 1
having count(case when tombstone > 0 then 1 else null end) > 0
order by 2 desc;

 

  • COPY of a large file

During a COPY operation, we might receive a Disk Full error even if there is enough storage available.

This happens if the sorting operation spills to the disk, creating temporary blocks.

In this case, we check the STL_DISK_FULL_DIAG table:

select '2000-01-01'::timestamp + (currenttime/1000000.0) * interval '1 second' as currenttime,
       node_num, query_id, temp_blocks
from pg_catalog.stl_disk_full_diag;

 

Additional troubleshooting

Initially, we check the percentage of disk space used under the cluster's Performance tab.

In case of a sudden spike in utilization, we query STL_QUERY to identify the activities and jobs that were running at the time:

select * from stl_query where starttime between '2018-01-01 00:30:00' and '2018-01-01 00:40:00';

We can identify the top 20 disk spill queries using:

select A.userid, A.query, blocks_to_disk, trim(B.querytxt) as text
from stl_query_metrics A, stl_query B
where A.query = B.query
  and segment = -1 and step = -1
  and max_blocks_to_disk > 0
order by 3 desc
limit 20;

To identify queries with disk-based steps (hash, sort, aggregate, and so on) and display their full text, we run:

SELECT q.query, trim(q.cat_text)
FROM (
SELECT query,
replace( listagg(text,' ') WITHIN GROUP (ORDER BY sequence), '\\n', ' ') AS cat_text
FROM stl_querytext
WHERE userid>1
GROUP BY query) q
JOIN (
SELECT distinct query
FROM svl_query_summary
WHERE is_diskbased='t' AND (LABEL ILIKE 'hash%' OR LABEL ILIKE 'sort%' OR LABEL ILIKE 'aggr%' OR LABEL ILIKE 'save%' OR LABEL ILIKE 'window%' OR LABEL ILIKE 'unique%')
AND userid > 1) qs
ON qs.query = q.query;

[Stuck in between? We are here for you]

 

Conclusion

In short, we saw how our Support Techs troubleshoot high or full disk usage with Amazon Redshift.
