Commit Graph

70264 Commits

Author SHA1 Message Date
wenhu1024 8dee421bf2
[VENTUS][fix] modify rounding mode, functions, instructions to follow OpenCL2.0 conversions specifications (#183)
This commit ensures strict compliance with OpenCL 2.0 floating-point
conversion specifications:

* **RISCVISelLowering.cpp**:
  - Map FRINT to RNE (round-to-nearest-even) instead of DYN (dynamic)
  - Add proper FRM save/restore for vector floating-point operations

* **VentusInstrInfoV.td**:
  - Enable VFCVT_RTZ_* instructions for truncation-based conversions
  - Use RTZ (round-to-zero) mode for fp-to-int conversions to match
    OpenCL spec
  - Replace dynamic rounding with explicit RTZ for integer conversions

* **gen_convert.py**:
  - Improve saturation handling in type conversions
  - Add proper edge case handling for integer source saturation
  - Distinguish between integer and float source conversion logic

* **Test Updates**:
  - Update float.ll to expect RTZ instructions for fp-to-int conversions
  - Add fround.ll test cases for ceil/floor/rint operations

These changes ensure that Ventus GPGPU backend produces OpenCL 2.0 compliant
floating-point conversion behavior, particularly for rounding modes and
saturation handling.
2025-07-05 17:34:04 +08:00
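The conversion rules named in commit 8dee421bf2 (round-toward-zero for fp-to-int, saturation at the destination range, round-to-nearest-even for rint) can be illustrated with a small host-side model. This is only a sketch: the function names `convert_int_sat_rtz` and `rint_rne` are hypothetical, it is not the code emitted by gen_convert.py or the backend, and mapping NaN to 0 in saturated mode is taken from the OpenCL conversion rules rather than from this commit.

```python
import math


def convert_int_sat_rtz(x: float, bits: int = 32, signed: bool = True) -> int:
    """Model of a saturating float-to-int conversion with RTZ rounding.

    Hypothetical helper for illustration; not part of the Ventus sources.
    """
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(x):
        return 0  # assumption: saturated conversions map NaN to 0
    if math.isinf(x):
        return hi if x > 0 else lo
    # math.trunc drops the fractional part, i.e. rounds toward zero (RTZ).
    return max(lo, min(hi, math.trunc(x)))


def rint_rne(x: float) -> float:
    """Model of rint with round-to-nearest-even (RNE)."""
    if not math.isfinite(x):
        return x
    # Python's built-in round() rounds halfway cases to the even integer.
    return float(round(x))
```

For example, `convert_int_sat_rtz(-2.7)` truncates toward zero rather than flooring, and `rint_rne(2.5)` resolves the tie to the even neighbour.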
wenhu1024 fbabe94747
Add ventus-fix-mixed-phi-pass and set isConvergent=1 in RVInst (#178)
* add ventus-fix-mixed-phi-pass and set isConvergent=1 in RVInst

* add ventus-fix-mixed-phi pass test file

---------

Co-authored-by: ivan lei <ivanlei0@163.com>
2025-06-18 11:41:27 +08:00
wenhu1024 f977b6eeb2
[Ventus][fix] fix barrier builtins support (#176)
- Fix barrier builtin to take single parameter per OpenCL spec
- Add sub_group_barrier builtin support
- Update codegen and related test cases
2025-06-18 02:14:40 +08:00
wenhu1024 5ecbb82d24
[Ventus][fix] remove incorrect pat (#175)
The FP_CONTRACT pragma can be used to allow (if the state is on) or disallow (if the state is off) the implementation to contract expressions.

The removed patterns previously matched ternary floating-point expressions
such as fadd(fmul(...)), enabling contraction via FMA even when not
explicitly allowed.

These patterns were incorrect because they ignored the current floating-point
contraction state (e.g., controlled by the FP_CONTRACT pragma).
2025-06-17 16:13:11 +08:00
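Why unconditional contraction is observable can be shown numerically. The sketch below models a fused multiply-add by computing `a*b + c` exactly with rationals and rounding once, then compares it with the unfused form that rounds after the multiply and again after the add. The helper names are hypothetical and the model runs on the host in double precision; it is not the removed TableGen patterns.

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    """fadd(fmul(a, b), c): the product is rounded before the add."""
    return a * b + c


def mul_add_fused(a: float, b: float, c: float) -> float:
    """Model of an FMA: exact a*b + c, rounded once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# a * b is exactly 1 - 2**-54, which is not representable as a double
# and rounds to 1.0 in the unfused form.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = mul_add_unfused(a, b, c)  # 0.0
fused = mul_add_fused(a, b, c)      # -2**-54
```

The two results differ, so a backend that contracts regardless of the FP_CONTRACT state can change program output.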
wenhu1024 cebe3b1b46 [VENTUS][fix] disable BranchFolderPass, MachineBlockPlacement pass and remove checkJoinMBB from Insert-join-to-VBranch pass 2025-04-02 10:14:05 +08:00
wenhu1024 01f22ac6cf [VENTUS][fix] Fix regexti instruction bug 2025-03-10 11:30:11 +08:00
Jules-Kong 9b1565821d [VENTUS][NFC] Remove invalid code about divergence 2025-03-07 11:04:28 +08:00
Jules-Kong 61eafc4714 [VENTUS][Printf] Add opencl printf pass 2025-02-12 11:01:04 +08:00
ziliangzl 84b7c666eb [Ventus][fix] Fix flw/fsw instruction pattern 2024-08-20 09:59:25 +08:00
zhoujingya aeee8ee171 [VENTUS][fix] Fix memory flags set in tablegen #129
Previously the default memory access flag was 0b00, which caused every
instruction unrelated to local or private memory to return true from
`RISCVInstrInfo::isUniformMemoryAccess`.
2024-06-24 23:01:47 +08:00
zhoujing 625facb350 [VENTUS][fix] Add memory access flags in tablegen
This makes it easier to determine which memory scope a load/store
instruction accesses.
2024-06-14 09:46:00 +08:00
ziliangzl 0b03e6b411 [Ventus][fix]Fix missing vmv instruction for FrameReg in divergent path 2024-06-06 16:16:57 +08:00
zhoujingya 7c78b29815
Merge pull request #122 from THU-DSP-LAB/compress_instruction_disassemble
[VENTUS][fix] Disable compress instruction disassemble
2024-05-30 14:38:50 +08:00
ziliangzl 11b55acb48 [VENTUS][fix]Fix missing regext instruction for vmsle instruction
This bug meant that no regext was inserted for the PseudoVMSLT_VI node.
The OPENCL-CTS relationals test now passes.
2024-05-21 17:09:48 +08:00
ziliangzl 60f388930d [VENTUS][fix]Assign initial value for VastartStoreFrameIndex
VastartStoreFrameIndex had no initial value, which caused issue THU-DSP-LAB/llvm-project#117
2024-05-15 13:39:07 +08:00
qinfan b465e58817 [VENTUS][fix] Add a switch to the C extension
Add a switch for the C extension; the C extension is now turned off by default.
2024-05-14 17:26:24 +08:00
qinfan 408ed74df2 [VENTUS][fix] Modify the disassembly result of a compress instruction error
Modify the disassembly result of a compress instruction error.
2024-05-13 15:57:55 +08:00
zhoujing 573ae5e8ee [VENTUS][workaround] Fix flw/fsw assembly errors
Signed-off-by: zhoujing <jing.zhou@terapines.com>

This is just a workaround; revert this commit when the new instruction is added by HW.
2024-04-26 15:14:22 +08:00
ziliangzl 8be3150696 [#112][fix]Remove flw/fsw InstAlias
1. Removed the flw/fsw InstAlias for now, because flw/fsw could not be matched correctly.
2. Modified the kernel_arg test case.
2024-04-26 10:59:35 +08:00
ziliangzl 4354b039f3 [VENTUS][fix]Fix FLW/FSW instruction coding conflict
Replace FLW/FSW instruction with PseudoFLW/PseudoFSW
2024-04-25 10:44:57 +08:00
zhoujingya 1347c06d50
Merge pull request #109 from ziliangzl/divergent-analyse
[VENTUS][fix] Fix kernel divergent analysis
2024-04-22 15:58:52 +08:00
ziliangzl d138bdacf6 [VENTUS][fix]Fix kernel divergent analysis 2024-04-22 15:53:36 +08:00
qinfan f781479b52 [VENTUS][RISCV] Fix move instructions after JOIN move forward bug
1. If the move instruction needs to be moved forward, it will only be inserted after the last corresponding move instruction in the predecessor basic block.
2. The first instruction of the predecessor is also counted as a possible insertion point.
2024-03-29 16:15:12 +08:00
zhoujing 797c85d829 [patch] Add a fix patch from terapines_dev branch 2024-03-08 18:23:47 +08:00
zhoujingya efef613b61
Merge pull request #83 from THU-DSP-LAB/34_local_addressed_variables_into_stack
[VENTUS][fix] Put local variables declared in kernel function into shared memory
2024-03-06 09:19:25 +08:00
zhoujing 87fe5f3ce8 [VENTUS][fix] Put local variables declared in kernel function into shared memory 2024-03-05 16:32:59 +08:00
zhoujing a909be0434 [VENTUS][fix] Fix insert vmv instruction bug when vmv instruction is in JOIN MBB 2024-03-05 15:26:43 +08:00
qinfan c42c00f67e [VENTUS][fix] Modified the resource statistics interface
1. The original interface is not called at the -O0 optimization level.
2. New interfaces added to epilogue pass.
2024-03-04 15:44:30 +08:00
zhoujingya 49c039a902
Merge pull request #89 from THU-DSP-LAB/eliminate_call_frame
[VENTUS][fix] Fix framelowering and calculation method of stack offset
2024-02-01 14:54:42 +08:00
zhoujingya 6b17accc5f
Merge pull request #70 from THU-DSP-LAB/resource_manage
[VENTUS][fix] Fix the mechanism of statistical register resources
2024-02-01 13:17:45 +08:00
zhoujingya 965f8c1fb6
Merge branch 'main' into eliminate_call_frame 2024-02-01 13:15:03 +08:00
zhoujingya 0b7be4b4a5
Merge branch 'main' into 39_parameter_types 2024-01-24 11:43:08 +08:00
zhoujingya a87bae445c
Merge pull request #49 from THU-DSP-LAB/instructions-remove
[VENTUS][fix] Remove instructions not supported by hardware
2024-01-24 09:41:33 +08:00
qinfan 71caf2361b [VENTUS][fix] Fix register extension
Fix register extension.
2024-01-23 09:59:51 +08:00
zhoujingya 7adec4402a [VENTUS][fix] Support the regexti instruction
Support the regexti instruction with or, xor, sub, and, setne, and seteq; add a test file.
2024-01-23 09:59:51 +08:00
qinfan 93c99240db [VENTUS][fix] Fix the calculation of stack size
Fix the calculation of stack size.
2023-12-25 13:26:25 +08:00
qinfan d809d3a2bd [VENTUS][fix] Fix the Offset of private variable offset on stack
Fix the offset of private variables on the stack.
2023-12-22 16:47:19 +08:00
qinfan 755797e27c [VENTUS][fix] Fix framelowering and calculation method of stack offset
1. Add VMV_V_X in emitEpilogue.
2. Change all the positive numbers added by TP to negative numbers (in LowerCall).
3. Fix the LowerCall function to generate the correct store instructions for passing function parameters.
4. Fix hasReservedCallFrame function to return false.
5. Align the convention between caller and callee in the case of passing parameters by stack.
6. Change the stack offset calculation method of TP.
7. Unify the calculation of TP stack and SP stack offset.
8. Note that the calculation of the sp offset in workitem.S must be modified manually. Since the growth direction of the stack differs from that of traditional RISCV, it is now stipulated that, for both the SP stack and the TP stack, data is stored where the stack pointer is not offset.
9. The eliminateFrameIndex function contains an SPAdj check, but this value is not needed at all, so a getSPAdjust function is added that returns zero.
10. V33 holds a wrong value when parameters are pushed to the TP stack, so an MV instruction must refresh V33 after ADJCALLSTACKDOWN.
2023-12-20 17:03:01 +08:00
qinfan e35b2e4fed [VENTUS][fix] Distinguish the resource usage of each kernel function
Distinguish the resource usage of each kernel function in the same source file.
2023-12-14 17:18:20 +08:00
qinfan 304a2c1284 [VENTUS][fix] Fix the mechanism of statistical register resources
1. Fix the bug of repeated calculation of register resources.
2. Add resource calculation with stack register.
2023-11-28 11:29:13 +08:00
zhoujingya d32d735ea4 [VENTUS][fix] Remove instructions not supported by hardware
The removed instructions are:

* float load/store instructions
* vfmv instruction
* "Single-Width Floating-Point/Integer Type-Convert Instructions" in RISCV manual
2023-11-24 17:26:50 +08:00
zhoujingya f85215d671
Revert "[VENTUS][fix] Add subregclass and flag to distinguish GPR and GPRF32" (#68)
This reverts commit 5e424e2b64.
2023-11-24 15:18:18 +08:00
qinfan 6b002e6c5d [#39][fix] Modify the code failed to merge
Fix the code that failed to merge.
2023-11-09 13:31:31 +08:00
qinfan 99a81dd407 [#39][fix] Fix scalar and vector kernel parameter bugs
The alignment method for kernel function parameters is now determined by the software.
2023-11-09 13:18:48 +08:00
qinfan 8dfd3561c4 [VENTUS][RISCV][feat] Legalized vector parameters
Summary: Legalized vector parameters; FileCheck tests have not been added yet.

Test Plan: Legalized vector parameters

Differential Revision: http://www.tpt.com/D740
2023-11-09 13:09:54 +08:00
zhoujingya 1e0fad0aef [#54][fix] Fix integer compare instructions pattern 2023-11-08 14:56:10 +08:00
qinfan 8fb493873a [VENTUS][RISCV][fix] Add subregclass and flag to distinguish GPR and GPRF32
Add subregclass and flag to distinguish GPR and GPRF32
2023-10-26 09:46:07 +08:00
zhoujingya 5c2738eb80 [VENTUS][fix] Fix address space mapping error for constant address 2023-10-16 17:30:49 +08:00
zhoujingya f48875e9fe Revert "[VENTUS][fix] Add subregclass and flag to distinguish GPR and GPRF32"
This reverts commit 5e424e2b64.
2023-10-16 16:45:15 +08:00
qinfan 5e424e2b64 [VENTUS][fix] Add subregclass and flag to distinguish GPR and GPRF32
Summary: fix float COPY instruction bug

Test Plan: fix float COPY instruction bug

Reviewers: zhoujing

Differential Revision: http://www.tpt.com/D747
2023-10-16 13:51:10 +08:00