Microsoft Optimal method for columns into 1 word: XML Raw or Concat
What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?
Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.
SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))
SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))
We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.
Background will be utilized to apply hashing of row later.
sql-server database-design sql-server-2016 performance-tuning xml
add a comment |
What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?
Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.
SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))
SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))
We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.
Background will be utilized to apply hashing of row later.
sql-server database-design sql-server-2016 performance-tuning xml
add a comment |
What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?
Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.
SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))
SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))
We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.
Background will be utilized to apply hashing of row later.
sql-server database-design sql-server-2016 performance-tuning xml
What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?
Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.
SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))
SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))
We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.
Background will be utilized to apply hashing of row later.
sql-server database-design sql-server-2016 performance-tuning xml
sql-server database-design sql-server-2016 performance-tuning xml
edited Dec 20 '18 at 16:17
asked Dec 20 '18 at 16:12
user162241
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:
Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:
SELECT s1.*
INTO #source_data
FROM master..spt_values s1
CROSS JOIN master..spt_values t2;
This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
GO
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW
technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW
method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225493%2fmicrosoft-optimal-method-for-columns-into-1-word-xml-raw-or-concat%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:
Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:
SELECT s1.*
INTO #source_data
FROM master..spt_values s1
CROSS JOIN master..spt_values t2;
This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
GO
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW
technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW
method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.
add a comment |
In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:
Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:
SELECT s1.*
INTO #source_data
FROM master..spt_values s1
CROSS JOIN master..spt_values t2;
This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
GO
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW
technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW
method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.
add a comment |
In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:
Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:
SELECT s1.*
INTO #source_data
FROM master..spt_values s1
CROSS JOIN master..spt_values t2;
This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
GO
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW
technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW
method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.
In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:
Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:
SELECT s1.*
INTO #source_data
FROM master..spt_values s1
CROSS JOIN master..spt_values t2;
This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
GO
DROP TABLE IF EXISTS #t;
SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
INTO #t
FROM #source_data
OPTION (MAXDOP 1);
This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW
technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW
method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.
answered Dec 20 '18 at 17:17
Joe Obbish
20.6k32881
20.6k32881
add a comment |
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225493%2fmicrosoft-optimal-method-for-columns-into-1-word-xml-raw-or-concat%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown