forked from OSchip/llvm-project
				
			
		
			
				
	
	
		
			150 lines
		
	
	
		
			4.9 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
			
		
		
	
	
			150 lines
		
	
	
		
			4.9 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
=============================================
 | 
						|
Enable std::unique_ptr [[clang::trivial_abi]]
 | 
						|
=============================================
 | 
						|
 | 
						|
Background
 | 
						|
==========
 | 
						|
 | 
						|
Consider the follow snippets
 | 
						|
 | 
						|
 | 
						|
.. code-block:: cpp
 | 
						|
 | 
						|
    void raw_func(Foo* raw_arg) { ... }
 | 
						|
    void smart_func(std::unique_ptr<Foo> smart_arg) { ... }
 | 
						|
 | 
						|
    Foo* raw_ptr_retval() { ... }
 | 
						|
    std::unique_ptr<Foo*> smart_ptr_retval() { ... }
 | 
						|
 | 
						|
 | 
						|
 | 
						|
The argument ``raw_arg`` could be passed in a register but ``smart_arg`` could not, due to current
 | 
						|
implementation.
 | 
						|
 | 
						|
Specifically, in the ``smart_arg`` case, the caller secretly constructs a temporary ``std::unique_ptr``
 | 
						|
in its stack-frame, and then passes a pointer to it to the callee in a hidden parameter.
 | 
						|
Similarly, the return value from ``smart_ptr_retval`` is secretly allocated in the caller and
 | 
						|
passed as a secret reference to the callee.
 | 
						|
 | 
						|
 | 
						|
Goal
 | 
						|
===================
 | 
						|
 | 
						|
``std::unique_ptr`` is passed directly in a register.
 | 
						|
 | 
						|
Design
 | 
						|
======
 | 
						|
 | 
						|
* Annotate the two definitions of ``std::unique_ptr``  with ``clang::trivial_abi`` attribute.
 | 
						|
* Put the attribuate behind a flag because this change has potential compilation and runtime breakages.
 | 
						|
 | 
						|
 | 
						|
This comes with some side effects:
 | 
						|
 | 
						|
* ``std::unique_ptr`` parameters will now be destroyed by callees, rather than callers.
 | 
						|
  It is worth noting that destruction by callee is not unique to the use of trivial_abi attribute.
 | 
						|
  In most Microsoft's ABIs, arguments are always destroyed by the callee.
 | 
						|
 | 
						|
  Consequently, this may change the destruction order for function parameters to an order that is non-conforming to the standard.
 | 
						|
  For example:
 | 
						|
 | 
						|
 | 
						|
  .. code-block:: cpp
 | 
						|
 | 
						|
    struct A { ~A(); };
 | 
						|
    struct B { ~B(); };
 | 
						|
    struct C { C(A, unique_ptr<B>, A) {} };
 | 
						|
    C c{{}, make_unique<B>, {}};
 | 
						|
 | 
						|
 | 
						|
  In a conforming implementation, the destruction order for C::C's parameters is required to be ``~A(), ~B(), ~A()`` but with this mode enabled, we'll instead see ``~B(), ~A(), ~A()``.
 | 
						|
 | 
						|
* Reduced code-size.
 | 
						|
 | 
						|
 | 
						|
Performance impact
 | 
						|
------------------
 | 
						|
 | 
						|
Google has measured performance improvements of up to 1.6% on some large server macrobenchmarks, and a small reduction in binary sizes.
 | 
						|
 | 
						|
This also affects null pointer optimization
 | 
						|
 | 
						|
Clang's optimizer can now figure out when a `std::unique_ptr` is known to contain *non*-null.
 | 
						|
(Actually, this has been a *missed* optimization all along.)
 | 
						|
 | 
						|
 | 
						|
.. code-block:: cpp
 | 
						|
 | 
						|
    struct Foo {
 | 
						|
      ~Foo();
 | 
						|
    };
 | 
						|
    std::unique_ptr<Foo> make_foo();
 | 
						|
    void do_nothing(const Foo&)
 | 
						|
 | 
						|
    void bar() {
 | 
						|
      auto x = make_foo();
 | 
						|
      do_nothing(*x);
 | 
						|
    }
 | 
						|
 | 
						|
 | 
						|
With this change, ``~Foo()`` will be called even if ``make_foo`` returns ``unique_ptr<Foo>(nullptr)``.
 | 
						|
The compiler can now assume that ``x.get()`` cannot be null by the end of ``bar()``, because
 | 
						|
the deference of ``x`` would be UB if it were ``nullptr``. (This dereference would not have caused
 | 
						|
a segfault, because no load is generated for dereferencing a pointer to a reference. This can be detected with ``-fsanitize=null``).
 | 
						|
 | 
						|
 | 
						|
Potential breakages
 | 
						|
-------------------
 | 
						|
 | 
						|
The following breakages were discovered by enabling this change and fixing the resulting issues in a large code base.
 | 
						|
 | 
						|
- Compilation failures
 | 
						|
 | 
						|
 - Function definitions now require complete type ``T`` for parameters with type ``std::unique_ptr<T>``. The following code will no longer compile.
 | 
						|
 | 
						|
   .. code-block:: cpp
 | 
						|
 | 
						|
       class Foo;
 | 
						|
       void func(std::unique_ptr<Foo> arg) { /* never use `arg` directly */ }
 | 
						|
 | 
						|
 - Fix: Remove forward-declaration of ``Foo`` and include its proper header.
 | 
						|
 | 
						|
- Runtime Failures
 | 
						|
 | 
						|
 - Lifetime of ``std::unique_ptr<>`` arguments end earlier (at the end of the callee's body, rather than at the end of the full expression containing the call).
 | 
						|
 | 
						|
   .. code-block:: cpp
 | 
						|
 | 
						|
     util::Status run_worker(std::unique_ptr<Foo>);
 | 
						|
     void func() {
 | 
						|
        std::unique_ptr<Foo> smart_foo = ...;
 | 
						|
        Foo* owned_foo = smart_foo.get();
 | 
						|
        // Currently, the following would "work" because the argument to run_worker() is deleted at the end of func()
 | 
						|
        // With the new calling convention, it will be deleted at the end of run_worker(),
 | 
						|
        // making this an access to freed memory.
 | 
						|
        owned_foo->Bar(run_worker(std::move(smart_foo)));
 | 
						|
                  ^
 | 
						|
                 // <<<Crash expected here
 | 
						|
     }
 | 
						|
 | 
						|
 - Lifetime of local *returned* ``std::unique_ptr<>`` ends earlier.
 | 
						|
 | 
						|
   Spot the bug:
 | 
						|
 | 
						|
    .. code-block:: cpp
 | 
						|
 | 
						|
     std::unique_ptr<Foo> create_and_subscribe(Bar* subscriber) {
 | 
						|
       auto foo = std::make_unique<Foo>();
 | 
						|
       subscriber->sub([&foo] { foo->do_thing();} );
 | 
						|
       return foo;
 | 
						|
     }
 | 
						|
 | 
						|
   One could point out this is an obvious stack-use-after return bug.
 | 
						|
   With the current calling convention, running this code with ASAN enabled, however, would not yield any "issue".
 | 
						|
   So is this a bug in ASAN? (Spoiler: No)
 | 
						|
 | 
						|
   This currently would "work" only because the storage for ``foo`` is in the caller's stackframe.
 | 
						|
   In other words, ``&foo`` in callee and ``&foo`` in the caller are the same address.
 | 
						|
 | 
						|
ASAN can be used to detect both of these.
 |