LLVM 공부/기록/공유 고민

Notice

LLVM 관련 잡일 의뢰 받습니다.

Recent Posts

Recent Comments

Link

« 2025/04 »
일	월	화	수	목	금	토
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Tags more

Archives

Today

Total

관리 메뉴

Why should I know this?

LLVM 공부/기록/공유 고민 본문

LLVM-STUDY/TODO

LLVM 공부/기록/공유 고민

die4taoam 2023. 2. 27. 06:58

LLVM Optimizing 과정을 공부하면서 효율을 높히고 기록을 남기기 위해서 고민을 좀 해봤습니다.

예로 들려는 최적화는 LoopFlatten 입니다.

llvm/lib/Transforms/Scalar/LoopFlatten.cpp 에 구현되어 있습니다.

첫 번째, 최적화 과정에서 활용되는 로직을 모은 실행파일 견본으로 모아보려고 합니다.

LoopFlatten에서 큰 비중을 차지하는 static bool findLoopComponents 함수에는 다음과 같은 내용이 있습니다.

static bool findLoopComponents(
    Loop *L, SmallPtrSetImpl<Instruction *> &IterationInstructions,
    PHINode *&InductionPHI, Value *&TripCount, BinaryOperator *&Increment,
    BranchInst *&BackBranch, ScalarEvolution *SE, bool IsWidened) {

  // Find increment and trip count.
  // There are exactly 2 incoming values to the induction phi; one from the
  // pre-header and one from the latch. The incoming latch value is the
  // increment variable.
  Increment =
      cast<BinaryOperator>(InductionPHI->getIncomingValueForBlock(Latch));
  if (Increment->hasNUsesOrMore(3)) {
    dbgs() << "Could not find valid increment\n";
    return false;
  }
  // The trip count is the RHS of the compare. If this doesn't match the trip
  // count computed by SCEV then this is because the trip count variable
  // has been widened so the types don't match, or because it is a constant and
  // another transformation has changed the compare (e.g. icmp ult %inc,
  // tripcount -> icmp ult %j, tripcount-1), or both.
  Value *RHS = Compare->getOperand(1);

이는 Loop 최적화에서 중요한 TripCount를 찾아내는 로직입니다.
TripCount를 찾아내는 로직을 골라서 테스트 가능한 함수를 만듭니다.

TEST(LoopInfoTest, HowToFindTripCount) {

...
if (!Compare || !IsValidPredicate(Compare->getUnsignedPredicate()) ||
      Compare->hasNUsesOrMore(2)) {
    LLVM_DEBUG(dbgs() << "Could not find valid comparison\n");
    FAIL();
  }

  // Find increment and trip count.
  // There are exactly 2 incoming values to the induction phi; one from the
  // pre-header and one from the latch. The incoming latch value is the
  // increment variable.
  BinaryOperator *Increment =
      cast<BinaryOperator>(InductionPHI->getIncomingValueForBlock(Latch));
  if (Increment->hasNUsesOrMore(3)) {
    LLVM_DEBUG(dbgs() << "Could not find valid increment\n");
    FAIL();
  }

  // The trip count is the RHS of the compare. If this doesn't match the trip
  // count computed by SCEV then this is because the trip count variable
  // has been widened so the types don't match, or because it is a constant and
  // another transformation has changed the compare (e.g. icmp ult %inc,
  // tripcount -> icmp ult %j, tripcount-1), or both.
  Value *RHS = Compare->getOperand(1);

  dbgs() << "TripCount : "; RHS->dump();
}

위 코드를 컴파일하여 나온 바이너리를 실행하면 다음과 같은 결과를 볼 수 있습니다.

[       OK ] LoopInfoTest.LoopFlatten (25 ms)
[ RUN      ] LoopInfoTest.HowToFindTripCount
Found induction PHI:   %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
TripCount : i32 %n
[       OK ] LoopInfoTest.HowToFindTripCount (0 ms)
[----------] 3 tests from LoopInfoTest (29 ms total)

[       OK ] LoopInfoTest.LoopFlatten (25 ms)
[ RUN      ] LoopInfoTest.HowToFindTripCount
Found induction PHI:   %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]

TripCount : i32 %n

[ OK ] LoopInfoTest.HowToFindTripCount (0 ms)
[----------] 3 tests from LoopInfoTest (29 ms total)

TripCount 는 i32 %n 입니다.

이런 식으로 최적화 패스에서 사용되는 로직 일부를 떼와서 파트별로(LoopFlatten->Loop)
한군데 모으고 상세주석을 달아 로직에 대한 이해도를 증진시키는데 도움이 될 수 있게 만들 수 있지 않을까 합니다.

두 번째, 최적화 PASS를 추출해 주석과 디버깅메시지를 덕지덕지 붙이기.

기본적으로 LLVM에서 사용되는 Pass들은 clang / opt 모두 사용 가능할 수 있는 구조로 되어 있습니다.

llvm/lib/Transforms/Scalar/LoopFlatten.cpp 를 분석하고자 할 때,

해당 파일을 통채로 복붙하고 다음과 같은 라인을 추가해줍니다.

static RegisterPass<LoopFlattenLegacyPass> X("loop-flatten-study", "Loop Flatten Pass");

이 한 줄을 추가하면 opt에서 사용할 수 있는 plugin으로 만들 수 있습니다.

마찬가지로 컴파일을 하고,

/build/bin/clang++ -o /work/mystudy/llvm-study/Loop/LoopFlatten.o /work/mystudy/llvm-study/Loop/LoopFlatten.cpp -fPIC -c -Wall -I/work/llvm-project/llvm/include -I/build/include -std=c++14 -fno-exceptions -fno-rtti -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
/build/bin/clang++ -o /work/mystudy/llvm-study/libLoopStudy.so /work/mystudy/llvm-study/Loop/LoopFlatten.o -shared -Wl,-O1 -I/work/llvm-project/llvm/include -I/build/include -std=c++14 -fno-exceptions -fno-rtti -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -L/build/lib

다음처럼 실행해줍니다.

opt -enable-new-pm=0 -load /work/mystudy/llvm-study/loop-flatten.so --loop-flatten-study loop-flatten.bc > loop-flatten-passed.bin
llvm-dis loop-flatten-passed.bin

llvm-dis로 디스어셈블해서 보면 다음처럼 LoopFlatten이 적용되었음을 알 수 있습니다.

define i32 @test1(i32 %val, i16* nocapture %A) {
entry:
  %flatten.tripcount = mul i32 20, 10
  br label %for.body

for.body:                                         ; preds = %for.inc6, %entry
  %i.018 = phi i32 [ 0, %entry ], [ %inc7, %for.inc6 ]
  %mul = mul nuw nsw i32 %i.018, 20
  br label %for.body3

그럼 이제 뭘하면 좋을까요?

여기저기 마구 출력문을 찍어서 전승지식을 만들면 되겠습니다!

다음은 제가 붙인 주식문입니다.

from :   %j.017 = phi i32 [ 0, %for.body ], [ %inc, %for.body3 ]
removed :   %j.017 = phi i32 [ 0, %for.body ]
OutterBranch Condition change from:   %exitcond19 = icmp ne i32 %inc7, 10
OutterBranch Condition chnage to :   %exitcond19 = icmp ne i32 %inc7, %flatten.tripcount
erase InnerExitingBlock :
for.body3:                                        ; preds = %for.body3, %for.body
  %j.017 = phi i32 [ 0, %for.body ]
  %add = add nuw nsw i32 %j.017, %mul
  %arrayidx = getelementptr inbounds i16, i16* %A, i32 %add
  %0 = load i16, i16* %arrayidx, align 2
  %conv16 = zext i16 %0 to i32
  %add4 = add i32 %conv16, %val
  %conv5 = trunc i32 %add4 to i16
  store i16 %conv5, i16* %arrayidx, align 2
  %inc = add nuw nsw i32 %j.017, 1
  %exitcond = icmp ne i32 %inc, 20
  br i1 %exitcond, label %for.body3, label %for.inc6

erase InnerExitingBlock from :   br i1 %exitcond, label %for.body3, label %for.inc6
create new Branch between InnerExitBlock & InnerExitingBlock  br label %for.inc6
Replacing:   %add = add nuw nsw i32 %j.017, %mul
with:        %i.018 = phi i32 [ 0, %entry ], [ %inc7, %for.inc6 ]

위처럼 추출한뒤 최적화 과정에서 출력되는 로그를 통해 최적화 패스를 분석하는거죠.
위 로그를 통해 LoopFlatten을 분석한 결과 입니다.

https://die4taoam.tistory.com/98

LLVM Optimization study - LoopFlatten

LoopFlatten은 Loop 문을 평탄화 하는 최적화인데, 말로는 느낌이 잘 안오니 다음 주석을 참고하자. LoopFlatten.cpp // for (int i = 0; i getLoopPreheader()->getTerminator() 에 추가. ; 다음 BasicBlock 중 PHI Remove Incoming V

die4taoam.tistory.com

어떨지 모르겠습니다만, 이런 식으로 두 방향의 접근을 진행해보려고 합니다.

위의 코드는 다음 github 에서 확인하실 수 있습니다.

https://github.com/ParkHanbum/mystudy/tree/master/llvm-study

GitHub - ParkHanbum/mystudy

Contribute to ParkHanbum/mystudy development by creating an account on GitHub.

github.com

'LLVM-STUDY > TODO' 카테고리의 다른 글

LLVM Optimization study - simplify with ChatGPT (0)	2023.03.19
LLVM instruction 추가 PR (0)	2023.03.17
LLVM Optimization study - DCE (0)	2023.02.17
DEX to IR 개발일지 - 간단한 bytecode <> IR 변환, mangling 필요?? (3)	2023.01.17
DEX to IR 개발일지 - Class 변환 (0)	2022.12.28

'LLVM-STUDY/TODO' Related Articles

Comments

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

Why should I know this?

Why should I know this?

LLVM 공부/기록/공유 고민 본문

LLVM 공부/기록/공유 고민

첫 번째, 최적화 과정에서 활용되는 로직을 모은 실행파일 견본으로 모아보려고 합니다.

두 번째, 최적화 PASS를 추출해 주석과 디버깅메시지를 덕지덕지 붙이기.

'LLVM-STUDY > TODO' 카테고리의 다른 글

티스토리툴바

단축키

내 블로그

블로그 게시글

모든 영역