Microsoft Optimal method for columns into 1 word: XML Raw or Concat












2














What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?



Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.



SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))

SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))


We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.



Background will be utilized to apply hashing of row later.










share|improve this question





























    2














    What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?



    Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.



    SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))

    SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))


    We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.



    Background will be utilized to apply hashing of row later.










    share|improve this question



























      2












      2








      2







      What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?



      Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.



      SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))

      SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))


      We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.



      Background will be utilized to apply hashing of row later.










      share|improve this question















      What is fastest, performance method to make columns into one word? Does Microsoft have recommendation, or is there performance Big-O studies on FOR XML RAW vs Concat?



      Company has multiple tables with different column types including int, varchar, string, different number of columns, some not/not null, maybe anywhere from 5-25 columns in a table.



      SELECT (Column1, Column2, Column3, Column4... FOR XML RAW)))

      SELECT CONCAT(Column1,'|',Column2,'|',Column3,'|',Column4...))


      We are seeing different timing results on our server, however both seem to be relatively close, curious if there is any general optimal to apply, rather than reviewing 1000 tables.



      Background will be utilized to apply hashing of row later.







      sql-server database-design sql-server-2016 performance-tuning xml






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Dec 20 '18 at 16:17

























      asked Dec 20 '18 at 16:12







      user162241





























          1 Answer
          1






          active

          oldest

          votes


















          5














          In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:



          query plans



          Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:



          SELECT s1.*
          INTO #source_data
          FROM master..spt_values s1
          CROSS JOIN master..spt_values t2;


          This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:



          DROP TABLE IF EXISTS #t;

          SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
          INTO #t
          FROM #source_data
          OPTION (MAXDOP 1);

          GO

          DROP TABLE IF EXISTS #t;

          SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
          INTO #t
          FROM #source_data
          OPTION (MAXDOP 1);


          This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.






          share|improve this answer





















            Your Answer








            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "182"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225493%2fmicrosoft-optimal-method-for-columns-into-1-word-xml-raw-or-concat%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown
























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            5














            In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:



            query plans



            Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:



            SELECT s1.*
            INTO #source_data
            FROM master..spt_values s1
            CROSS JOIN master..spt_values t2;


            This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:



            DROP TABLE IF EXISTS #t;

            SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
            INTO #t
            FROM #source_data
            OPTION (MAXDOP 1);

            GO

            DROP TABLE IF EXISTS #t;

            SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
            INTO #t
            FROM #source_data
            OPTION (MAXDOP 1);


            This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.






            share|improve this answer


























              5














              In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:



              query plans



              Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:



              SELECT s1.*
              INTO #source_data
              FROM master..spt_values s1
              CROSS JOIN master..spt_values t2;


              This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:



              DROP TABLE IF EXISTS #t;

              SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
              INTO #t
              FROM #source_data
              OPTION (MAXDOP 1);

              GO

              DROP TABLE IF EXISTS #t;

              SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
              INTO #t
              FROM #source_data
              OPTION (MAXDOP 1);


              This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.






              share|improve this answer
























                5












                5








                5






                In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:



                query plans



                Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:



                SELECT s1.*
                INTO #source_data
                FROM master..spt_values s1
                CROSS JOIN master..spt_values t2;


                This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:



                DROP TABLE IF EXISTS #t;

                SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
                INTO #t
                FROM #source_data
                OPTION (MAXDOP 1);

                GO

                DROP TABLE IF EXISTS #t;

                SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
                INTO #t
                FROM #source_data
                OPTION (MAXDOP 1);


                This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.






                share|improve this answer












                In general, if you truly need to know what will be the fastest for your workload then you need to do your own benchmarking. Take a look at the following query plans:



                query plans



                Which one do you think would be faster if you had to pick one without any additional context? I would pick the bottom one. It simply appears to do less work because there's no nested loop join and there's no UDX operator. To generate the above query plans, put 6.5 million rows into a temp table:



                SELECT s1.*
                INTO #source_data
                FROM master..spt_values s1
                CROSS JOIN master..spt_values t2;


                This table has six columns with different data types. Some allow nulls and some don't. So it seems to meet your criteria. The top query takes about 3X longer on my machine than the bottom one:



                DROP TABLE IF EXISTS #t;

                SELECT HASHBYTES('SHA2_256', (SELECT name, number, type, low, high, status FOR XML RAW)) hash_value
                INTO #t
                FROM #source_data
                OPTION (MAXDOP 1);

                GO

                DROP TABLE IF EXISTS #t;

                SELECT HASHBYTES('SHA2_256', CONCAT(name, '|', number, '|', type, '|', low, '|', high, '|', status)) hash_value
                INTO #t
                FROM #source_data
                OPTION (MAXDOP 1);


                This seems like a perfectly reasonable result. It will add overhead to convert data to XML and to have a nested loop join. Why do that when you don't have to? In your question you suggest that sometimes the FOR XML RAW technique is faster for your data. If so, there's something about your data or testing methodology that you aren't telling us. My suggestion is to come up with a sharable reproduction that demonstrates that the FOR XML RAW method is significantly faster and to ask a new question. As is we have to guess what's going on, which isn't helpful for anyone.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Dec 20 '18 at 17:17









                Joe Obbish

                20.6k32881




                20.6k32881






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Database Administrators Stack Exchange!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.





                    Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                    Please pay close attention to the following guidance:


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225493%2fmicrosoft-optimal-method-for-columns-into-1-word-xml-raw-or-concat%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Probability when a professor distributes a quiz and homework assignment to a class of n students.

                    Aardman Animations

                    Are they similar matrix