This was broken by 16295d521e, when
instructions started being handled and not just constant
expressions. This was re-inserting an equivalent bitcast to the
original memcpy operand, which made a non-functional IR change on
every iteration.
This also fixes a secondary problem where it was inserting
addrspacecasts which may not have been legal (i.e. it changed the
source address space). Start visiting all pointer users and fail out
if we can't process them. Also start handling the relevant memory
intrinsic users. These cases can be dealt with by running
InferAddressSpaces separately.
The langref already states it does, but this wasn't implemented. Also
covers inalloca and preallocated. Also helps fix a dependence on
pointer element types.
Consider any constant memory type, not just global constants. AMDGPU
kernel parameters are effectively global constants, but appear as
either reads from an intrinsic derived pointer or function argument.