How to distinguish 'syscall' from 'int 80h' when using ptrace



























As far as I know, ptrace can only obtain the syscall number via PTRACE_SYSCALL, but syscall numbers differ between x86 and x64. Is there any way to figure out which entry path a syscall really came through?



I am writing a program that restricts other programs' syscalls by syscall number. I know the syscall numbers on both x86 and x64, but some programs use 'int 80h' instead of 'syscall', so they can do dangerous things that I have restricted on x64. For example, I banned fork() on x64; a program can call fork() through 'int 80h' with number 2, and my tracer believes it is calling open() through 'syscall' with number 2, so it bypasses the restriction. Although ptrace can trace both kinds of call and report the syscall number, I cannot tell which mechanism the syscall actually came from.
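The collision described above can be made concrete with a minimal sketch. The table is a four-entry excerpt of the two syscall tables, and the names `abi`, `rule`, `syscall_name` and `is_denied` are hypothetical helpers invented for this illustration. It only shows that a filter has to be keyed on the pair (ABI, number); it assumes the tracer has some other way to learn the ABI, which is exactly what the question asks for.

```c
#include <stddef.h>
#include <string.h>

/* Entry path of a syscall; the tracer must determine this separately. */
enum abi { ABI_X86_64, ABI_I386 };

struct rule { enum abi abi; long nr; const char *name; };

/* Tiny excerpt of the x86_64 and i386 syscall tables. */
static const struct rule table[] = {
    { ABI_X86_64,  2, "open" },
    { ABI_X86_64, 57, "fork" },
    { ABI_I386,    2, "fork" },
    { ABI_I386,    5, "open" },
};

/* Return the syscall name for (abi, nr), or NULL if not in the excerpt. */
const char *syscall_name(enum abi abi, long nr)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].abi == abi && table[i].nr == nr)
            return table[i].name;
    return NULL;
}

/* Hypothetical policy: deny fork no matter which ABI was used. */
int is_denied(enum abi abi, long nr)
{
    const char *name = syscall_name(abi, nr);
    return name != NULL && strcmp(name, "fork") == 0;
}
```

With this keying, `is_denied(ABI_I386, 2)` is true while `is_denied(ABI_X86_64, 2)` is false, even though both calls carry the number 2.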

















      linux kernel chroot sandbox














      asked Oct 31 '14 at 6:10 by Criyle, edited Nov 7 '14 at 0:44






















          2 Answers
































          As of writing (2019-02-08), this appears to be impossible.



          Even strace gets it wrong.



          Edit: Linus Torvalds talks about it here, also analysing possible (but commented-out) workarounds in the strace code that look directly at the instructions in the binary. This code was removed here as part of the patchset I mention below. It says "It works, but is too complicated, and strictly speaking, unreliable", but it is unclear to me in which cases "strictly speaking, unreliable" applies: only when a multi-threaded executable rewrites itself at runtime (which would already make it unsuitable for forbidding certain syscalls in security use cases), or in other cases as well.



          Edit: The "unreliable" part was added in this commit.



          Edit: I have now tried out strace's opcode-peeking implementation (version v4.25), and suspect that it was buggy: when activating that code path by changing this line to #if 0 and this line to #elif 1, no syscalls are printed because scno is not set at all. I added scno = x86_64_regs.orig_rax; after this line to make it work.
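To illustrate what opcode peeking amounts to, here is a minimal sketch; it is not strace's code. It assumes the tracer has already read the machine word at address RIP - 2 of the stopped tracee (for example with PTRACE_PEEKTEXT), so that the two lowest bytes of the word are the two bytes just before the reported instruction pointer. The names `entry_insn`, `classify_entry` and `classify_word` are hypothetical. As the answer notes, the result is not trustworthy for security purposes, because the bytes may have changed between execution and inspection.

```c
#include <stdint.h>

enum entry_insn { INSN_UNKNOWN, INSN_SYSCALL, INSN_INT80 };

/* Classify the two bytes immediately before the reported instruction pointer. */
enum entry_insn classify_entry(const uint8_t prev[2])
{
    if (prev[0] == 0x0f && prev[1] == 0x05)
        return INSN_SYSCALL;            /* 0f 05 = syscall  */
    if (prev[0] == 0xcd && prev[1] == 0x80)
        return INSN_INT80;              /* cd 80 = int 0x80 */
    return INSN_UNKNOWN;
}

/* Same check on a word read at address RIP - 2. x86 is little-endian,
   so the byte at the lowest address is the least significant one. */
enum entry_insn classify_word(uint64_t word)
{
    uint8_t prev[2] = {
        (uint8_t)(word & 0xff),
        (uint8_t)((word >> 8) & 0xff),
    };
    return classify_entry(prev);
}
```

A tracer would treat `INSN_UNKNOWN` as a reason to deny or kill rather than to guess.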



          See the presentation How to make strace happy, slide 2, problem 2:




          There is no reliable way to distinguish between x86_64 and x86 syscalls.




          Details shown on slides 4-6. There is a proposed solution to be added to the kernel:




          Extend the ptrace API with PTRACE_GET_SYSCALL_INFO request




          But this solution has not been merged into the kernel yet.



          The patchset is called ptrace: add PTRACE_GET_SYSCALL_INFO request and it's still being worked on in January 2019. Hopefully it will soon be merged.





          strace has supported it since release 4.26 (but it won't work unless you apply the kernel patch manually):




          Implemented obtainment of system call information using PTRACE_GET_SYSCALL_INFO ptrace API.
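On a kernel that provides PTRACE_GET_SYSCALL_INFO (the ptrace(2) man page lists it as available since Linux 5.3), a tracer could read the ABI from the `arch` field of the returned structure. The following is a sketch under assumptions, not a definitive implementation: `struct syscall_info` is a local mirror of the kernel's `struct ptrace_syscall_info`, the two `ARCH_*` constants repeat the values of AUDIT_ARCH_X86_64 and AUDIT_ARCH_I386 from `<linux/audit.h>`, and the names `entry_abi`, `abi_from_arch` and `get_entry` are hypothetical.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/ptrace.h>

/* Request number from <linux/ptrace.h>, in case libc headers predate it. */
#ifndef PTRACE_GET_SYSCALL_INFO
#define PTRACE_GET_SYSCALL_INFO 0x420e
#endif

/* Values of AUDIT_ARCH_X86_64 and AUDIT_ARCH_I386 from <linux/audit.h>. */
#define ARCH_X86_64 0xC000003Eu
#define ARCH_I386   0x40000003u

/* Local mirror of the kernel's struct ptrace_syscall_info (assumed layout). */
struct syscall_info {
    uint8_t  op;                  /* 1 = syscall entry, 2 = exit, 3 = seccomp */
    uint8_t  pad[3];
    uint32_t arch;                /* AUDIT_ARCH_* value of the syscall ABI */
    uint64_t instruction_pointer;
    uint64_t stack_pointer;
    union {
        struct { uint64_t nr; uint64_t args[6]; } entry;
        struct { int64_t rval; uint8_t is_error; } exit;
        struct { uint64_t nr; uint64_t args[6]; uint32_t ret_data; } seccomp;
    } u;
};

enum entry_abi { ENTRY_UNKNOWN, ENTRY_X86_64, ENTRY_I386 };

/* Map the arch field to the ABI the syscall was made through. */
enum entry_abi abi_from_arch(uint32_t arch)
{
    if (arch == ARCH_X86_64)
        return ENTRY_X86_64;
    if (arch == ARCH_I386)
        return ENTRY_I386;
    return ENTRY_UNKNOWN;
}

/* Query a tracee stopped at syscall entry; returns 0 on success, -1 otherwise. */
int get_entry(pid_t pid, enum entry_abi *abi, uint64_t *nr)
{
    struct syscall_info info = { 0 };
    /* For this request, addr carries the buffer size and data the buffer. */
    long n = ptrace(PTRACE_GET_SYSCALL_INFO, pid, (void *)sizeof info, &info);
    if (n < 0 || info.op != 1)
        return -1;
    *abi = abi_from_arch(info.arch);
    *nr = info.u.entry.nr;
    return 0;
}
```

A filter would then look up the pair (`*abi`, `*nr`) instead of the number alone, and treat `ENTRY_UNKNOWN` or a failed call as a denial.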







          answered Feb 8 at 4:38 by nh2, edited Feb 9 at 1:55


























          • Related: In "Can ptrace tell if an x86 system call used the 64-bit or 32-bit ABI?" I suggested that you could disassemble the code at RIP and check for the 0f 05 syscall instruction. IDK if that would really work, and of course it would be slower to use an extra ptrace system call or two to fetch registers and peek at the process memory. (And for security use-cases like this there'd be a race condition where another thread could rewrite those bytes after they execute, fooling the filter.)

            – Peter Cordes
            Feb 8 at 4:42






          • @PeterCordes Looks like code that does that actually exists/existed in strace (commented out); Linus Torvalds analysed it in a thread I've now linked (edited). I've added to the edit a new question about the "unreliability" of this method -- do you know more?

            – nh2
            Feb 8 at 5:06






          • I think Linus is just talking about the same race condition I pointed out: another thread in the process that made the syscall could modify the instruction or unmap the page before strace can read it. Most of his message is proposing mechanisms to sneak in extra signalling from the kernel to the tracing process without breaking old user-space. (e.g. upper bits of the 8-byte space for CS, or of RFLAGS.) Oh, the parent message shows that a single thread can bypass SMC flush with another mapping.

            – Peter Cordes
            Feb 8 at 5:48






          • Normally modern x86 CPUs snoop stores anywhere near any instruction address that's in the pipeline (see "Observing stale instruction fetching on x86 with self-modifying code"), but apparently using a different virtual page bypasses that on at least some CPUs. The answer on that SO question already mentions that it's not guaranteed in that case, but that Intel has a patent on a mechanism for snooping based on physical address (with finer than 1 page granularity), so on most actual CPUs you maybe couldn't work around this with a single thread.

            – Peter Cordes
            Feb 8 at 5:49








          • yup, that's definitely true. The discussion leading up to questions about reliability was all about obfuscation attempts. Even if a single thread could create stale instruction fetch (e.g. on Atom or Silvermont?), that would only be a problem for intentional obfuscation. And if you're worried about obfuscation, there are ways that can work on mainstream Intel (e.g. cross-modifying instead of self-modifying, on a multicore), so it turns out that self-modifying code is probably only relevant on a single-core machine. (Where only very lucky preemption could let another thread in then.)

            – Peter Cordes
            Feb 9 at 2:45

































          It's the system call sys_rt_sigtimedwait (available since kernel 2.2). See its man page with:



          man 2 rt_sigtimedwait


          That syscall suspends execution until a signal (or one of a set of signals) specified by the argument is delivered. A timeout can also be given.



          To be 100% sure, look at the file unistd_64.h. Search your system for it; it is usually in the include folder (/usr/include/x86_64-linux-gnu/asm/unistd_64.h). The numbers are defined there. Here is the relevant line in my case (also a 64-bit system, kernel 3.2.0-58):



          #define __NR_rt_sigtimedwait                    128
          __SYSCALL(__NR_rt_sigtimedwait, sys_rt_sigtimedwait)


          Note 128 is decimal for 80 in hex.
































          • This does not answer the question of how to distinguish, when ptrace()ing, whether int 0x80 or syscall was used.

            – nh2
            Feb 8 at 4:40

























































































































































          Related: In Can ptrace tell if an x86 system call used the 64-bit or 32-bit ABI? I suggested that you could disassemble the code at RIP and check for the 0f 05 syscall instruction. IDK if that would really work, and of course it would be slower to use an extra ptrace system call or two to fetch registers and peek at the process memory. (And for security use-cases like this there'd be a race condition where another thread could rewrite those bytes after they execute, fooling the filter.)

          – Peter Cordes
          Feb 8 at 4:42
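
The byte check suggested in that comment can be sketched roughly as below. This is only an illustration, not what strace or any real tracer does: `classify_entry` and the `entry_insn` enum are hypothetical names, and the sketch assumes the tracer has already copied the tracee's code bytes (for example with PTRACE_PEEKTEXT) so that the buffer ends at the reported RIP, and that RIP points just past a 2-byte entry instruction. It inherits the race condition mentioned in the comment, so it is not suitable as a security boundary on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Entry instructions this sketch tries to recognise (hypothetical enum). */
enum entry_insn { ENTRY_UNKNOWN, ENTRY_SYSCALL, ENTRY_INT80 };

/*
 * Classify the two bytes that end at the tracee's instruction pointer.
 * `code` is a buffer of bytes copied from the tracee, ending at RIP;
 * `len` is the number of bytes in the buffer.
 */
enum entry_insn classify_entry(const uint8_t *code, size_t len)
{
    if (code == NULL || len < 2)
        return ENTRY_UNKNOWN;

    const uint8_t a = code[len - 2];
    const uint8_t b = code[len - 1];

    if (a == 0x0f && b == 0x05)
        return ENTRY_SYSCALL;   /* 0f 05: syscall (64-bit ABI)  */
    if (a == 0xcd && b == 0x80)
        return ENTRY_INT80;     /* cd 80: int 0x80 (32-bit ABI) */
    return ENTRY_UNKNOWN;       /* anything else: treat as suspicious */
}
```

A filter built on this would pick the 64-bit or the 32-bit syscall table depending on the result, and would probably want to deny the call outright on ENTRY_UNKNOWN.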










          @PeterCordes Looks like code that does that actually exists/existed in strace (commented out); Linus Torvalds analysed it in a thread I've now linked (edited). I've added to the edit a new question about the "unreliability" of this method -- do you know more?

          – nh2
          Feb 8 at 5:06


















          -2














          It's the system call sys_rt_sigtimedwait (available since kernel 2.2). See its man page with:



          man 2 rt_sigtimedwait


          That syscall suspends execution of the calling thread until one of the signals in the set given as an argument is pending, or until the given timeout expires.



          To be 100% sure, look in the header unistd_64.h, which defines the syscall numbers. Search your system for that file; it's usually in the include folder (/usr/include/x86_64-linux-gnu/asm/unistd_64.h). Here are the relevant lines in my case (also a 64-bit system, kernel 3.2.0-58):



          #define __NR_rt_sigtimedwait                    128
          __SYSCALL(__NR_rt_sigtimedwait, sys_rt_sigtimedwait)


          Note that 128 in decimal is 80 in hex (0x80).
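
As a rough illustration of why the number alone is ambiguous for a tracer, the sketch below looks up a syscall name from both the ABI and the number. The table is a tiny hand-copied excerpt, and the names `syscall_abi`, `nr_entry` and `syscall_name` are hypothetical; a real filter would presumably use the full tables from unistd_32.h and unistd_64.h.

```c
#include <stddef.h>

/* Which syscall table a number should be looked up in (hypothetical enum). */
enum syscall_abi { ABI_I386, ABI_X86_64 };

struct nr_entry {
    enum syscall_abi abi;
    long nr;
    const char *name;
};

/* Small excerpt only; the real tables have hundreds of entries. */
static const struct nr_entry nr_table[] = {
    { ABI_X86_64,   2, "open" },
    { ABI_X86_64,  57, "fork" },
    { ABI_X86_64, 128, "rt_sigtimedwait" },
    { ABI_I386,     2, "fork" },
    { ABI_I386,     5, "open" },
    { ABI_I386,   128, "init_module" },
    { ABI_I386,   177, "rt_sigtimedwait" },
};

/* Return the syscall name for (abi, nr), or "?" if not in the excerpt. */
const char *syscall_name(enum syscall_abi abi, long nr)
{
    for (size_t i = 0; i < sizeof nr_table / sizeof nr_table[0]; i++) {
        if (nr_table[i].abi == abi && nr_table[i].nr == nr)
            return nr_table[i].name;
    }
    return "?";
}
```

With this excerpt, number 2 resolves to open under the 64-bit table but to fork under the 32-bit one, which is the confusion described in the question.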
































          • This does not answer the question how to distinguish, when ptrace()ing, whether int 0x80 or syscall was used.

            – nh2
            Feb 8 at 4:40
























          edited Oct 31 '14 at 7:19

























          answered Oct 31 '14 at 7:08









          chaos































